P - Patient/Problem: most important characteristics of the patient (e.g. age, disease/condition, gender)
I - Intervention: main intervention (e.g. drug treatment, diagnostic/screening test)
C - Comparison: main alternative (e.g. placebo, standard therapy, no treatment, gold standard)
O - Outcome: what you are trying to accomplish, measure, improve, affect (e.g. reduced mortality or morbidity, improved memory)
Richardson, WS, Wilson, MC, Nishikawa, J & Hayward, RS 1995, 'The well-built clinical question: A key to evidence-based decisions', ACP Journal Club, vol. 123, no. 3, p. A12.
A variant of PICO is PICOS. The S stands for Study design: it establishes which study designs are appropriate for answering the question, e.g. a randomised controlled trial (RCT). There is also PICOC (C for context) and PICOT (T for timeframe).
You may find this document on PICO / PIO / PEO useful:
Framing a PICO / PIO / PEO question, developed by Teesside University
SPIDER - to search for qualitative and mixed methods research studies
S - Sample
PI - Phenomenon of Interest
D - Design
E - Evaluation
R - Research type
Cooke, A, Smith, D & Booth, A 2012, 'Beyond PICO: The SPIDER tool for qualitative evidence synthesis', Qualitative Health Research, vol. 22, no. 10, pp. 1435-1443.
SPICE - to search for qualitative evidence
S - Setting (where?)
P - Perspective (for whom?)
I - Intervention (what?)
C - Comparison (compared with what?)
E - Evaluation (with what result?)
Cleyle, S & Booth, A 2006, 'Clear and present questions: Formulating questions for evidence based practice', Library Hi Tech, vol. 24, no. 3, pp. 355-368.
ECLIPSE - to search for health policy/management information
E - Expectation (improvement or information or innovation)
C - Client group (at whom the service is aimed)
L - Location (where is the service located?)
I - Impact (outcomes)
P - Professionals (who is involved in providing/improving the service)
Se - Service (for which service are you looking for information?)
Wildridge, V & Bell, L 2002, 'How CLIP became ECLIPSE: A mnemonic to assist in searching for health policy/management information', Health Information & Libraries Journal, vol. 19, no. 2, pp. 113-115.
There are many more techniques available. See the below guide from the CQUniversity Library for an extensive list:
Question frameworks overview from Framing your research question guide, developed by CQUniversity Library
This is the specific research question used in the example:
"Is animal-assisted therapy more effective than music therapy in managing aggressive behaviour in elderly people with dementia?"
Within this question are the four PICO concepts:
P - elderly patients with dementia
I - animal-assisted therapy
C - music therapy
O - aggressive behaviour
S - Study design
This is a therapy question. The best study design to answer a therapy question is a randomised controlled trial (RCT). You may decide to include only studies that used an RCT in the systematic review; see Step 8.
See source of example
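The PICO concepts above are what feed a database search: synonyms for the same concept are ORed together, and the concept groups are ANDed. A toy sketch of that combination step follows; the synonym lists and the generic OR/AND syntax are illustrative only, not taken from any particular database.

```python
# Hypothetical sketch: combining PICO concepts into a boolean search string.
# Synonym lists are invented for illustration; real searches would use
# database-specific thesaurus terms and field tags.

pico = {
    "P": ["dementia", "alzheimer*"],
    "I": ["animal-assisted therapy", "pet therapy"],
    "C": ["music therapy"],
    "O": ["aggression", "aggressive behaviour"],
}

def build_query(concepts):
    """OR synonyms within each concept, then AND the concept groups together."""
    groups = []
    for terms in concepts.values():
        # Quote multi-word phrases so they are searched as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)

print(build_query(pico))
```

Note that the Comparison group is often left out of the actual search string, since comparators are rarely indexed consistently; the sketch includes it only to mirror the full framework.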
Last Updated: Aug 6, 2024 10:44 AM
URL: https://utas.libguides.com/SystematicReviews
UCL Library Services: Formulating a research question
Systematic reviews address clear and answerable research questions, rather than a general topic or problem of interest. Clarifying the review question leads to specifying what types of studies can best address it and to setting out criteria for including such studies in the review, often called inclusion criteria or eligibility criteria. The criteria could relate to the review topic, the research methods of the studies, specific populations, settings, date limits, geographical areas, types of interventions, or something else.
Six examples of types of question are listed below, with example questions a review might address on the topic of influenza vaccination. Structuring questions in this way aids thinking about the different types of research that could address each type of question, and mnemonics can help in thinking about the criteria that research must fulfil to address it.
Examples of review questions
Needs - What do people want? Example: What are the information needs of healthcare workers regarding vaccination for seasonal influenza?
Impact or effectiveness - What is the balance of benefit and harm of a given intervention? Example: What is the effectiveness of strategies to increase vaccination coverage among healthcare workers? What is the cost-effectiveness of interventions that increase immunisation coverage?
Process or explanation - Why does it work (or not work)? How does it work (or not work)? Example: What factors are associated with uptake of vaccinations by healthcare workers? What factors are associated with inequities in vaccination among healthcare workers?
Correlation - What relationships are seen between phenomena? Example: How does influenza vaccination of healthcare workers vary with morbidity and mortality among patients? (Note: correlation does not in itself indicate causation).
Views / perspectives - What are people's experiences? Example: What are the views and experiences of healthcare workers regarding vaccination for seasonal influenza?
Service implementation - What is happening? Example: What is known about the implementation and context of interventions to promote vaccination for seasonal influenza among healthcare workers?
Examples in practice: Seasonal influenza vaccination of health care workers: evidence synthesis / Lorenc et al. 2017
Example of eligibility criteria
Research question: What are the views and experiences of UK healthcare workers regarding vaccination for seasonal influenza?
Inclusion criteria:
Population: healthcare workers, any type, including those without direct contact with patients.
Context: seasonal influenza vaccination for healthcare workers.
Study design: qualitative data including interviews, focus groups, ethnographic data.
Date of publication: all.
Country: all UK regions.
Exclusion criteria:
Studies focused on influenza vaccination for the general population and pandemic influenza vaccination.
Studies using survey data with only closed questions; studies that only report quantitative data.
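Eligibility criteria like these are applied record by record at the screening stage. A minimal sketch of that check, with an invented record structure and simplified criteria (real screening is done by human reviewers, usually in pairs), might look like:

```python
# Hypothetical sketch: applying example eligibility criteria to records.
# The record fields and criteria checks are illustrative only.

UK_REGIONS = {"England", "Scotland", "Wales", "Northern Ireland"}
QUALITATIVE_DESIGNS = {"interview", "focus group", "ethnography"}

def is_eligible(record):
    """Return True if a record meets the example inclusion criteria."""
    if record["population"] != "healthcare workers":
        return False  # excludes general-population studies
    if record["country"] not in UK_REGIONS:
        return False
    if record["design"] not in QUALITATIVE_DESIGNS:
        return False  # excludes closed-question surveys / quantitative-only studies
    return True

records = [
    {"population": "healthcare workers", "country": "Scotland", "design": "interview"},
    {"population": "general population", "country": "England", "design": "interview"},
]
included = [r for r in records if is_eligible(r)]
print(len(included))  # prints 1: only the first record passes all criteria
```

Writing the criteria out this explicitly, even on paper, helps expose ambiguities (e.g. what counts as a "healthcare worker") before screening begins.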
Consider the research boundaries
It is important to consider the reasons that the research question is being asked. Any research question has ideological and theoretical assumptions around the meanings and processes it is focused on. A systematic review should either specify definitions and boundaries around these elements at the outset, or be clear about which elements are undefined.
For example, if we are interested in the topic of homework, there are likely to be preconceived ideas about what is meant by 'homework'. If we want to know the impact of homework on educational attainment, we need to set boundaries on the age range of children and on how educational attainment is measured. There may also be a particular setting or context: type of school, country, gender, the timeframe of the literature, or the study designs of the research.
Research question: What is the impact of homework on children's educational attainment?
Inclusion criteria:
Scope: homework - tasks set by school teachers for students to complete out of school time, in any format or setting.
Population: children aged 5-11 years.
Outcomes: measures of literacy or numeracy from tests administered by researchers, school or other authorities.
Study design: studies with a comparison control group.
Context: OECD countries, all settings within mainstream education.
Date limit: 2007 onwards.
Exclusion criteria:
Any context not in mainstream primary schools.
Non-English language studies.
Mnemonics for structuring questions
Some mnemonics can help to formulate research questions, set the boundaries of the question, and inform a search strategy.
Intervention effects
PICO: Population – Intervention – Comparison – Outcome
Variations: add 'T' for time, 'C' for context, or 'S' for study type.
Policy and management issues
ECLIPSE: Expectation – Client group – Location – Impact – Professionals involved – Service
Expectation encourages reflection on what the information is needed for, i.e. improvement, innovation or information. Impact looks at what you would like to achieve, e.g. improved team communication.
How CLIP became ECLIPSE: a mnemonic to assist in searching for health policy/management information / Wildridge & Bell, 2002
Analysis tool for management and organisational strategy
PESTLE: Political – Economic – Social – Technological – Legal – Environmental
An analysis tool that organisations can use to identify external factors which may influence their strategic development, marketing strategies, new technologies or organisational change.
PESTLE analysis / CIPD, 2010
Service evaluations with qualitative study designs
A well-developed and answerable question is the foundation for any systematic review.
Systematic review questions typically follow a PICO format: patient or population, intervention, comparison, and outcome.
Using the PICO framework can help team members clarify and refine the scope of their question. For example, if the population is breast cancer patients, is it all breast cancer patients or just a segment of them?
When formulating your research question, you should also consider how it could be answered. If it is not possible to answer your question (the research would be unethical, for example), you'll need to reconsider what you're asking.
Typically, systematic review protocols include a list of studies that will be included in the review. These studies, known as exemplars, guide the search development and also serve as proof of concept that your question is answerable. If you are unable to find studies to include, you may need to reconsider your question.
Other Question Frameworks
PICO is a helpful framework for clinical research questions, but it may not be the best fit for other types of research questions. Did you know there are at least 25 other question frameworks besides variations of PICO? Frameworks like PEO, SPIDER, SPICE, and ECLIPSE can help you formulate a focused research question. The tables and examples below were created by the Medical University of South Carolina (MUSC) Libraries.
The PEO question framework is useful for qualitative research topics. PEO questions identify three concepts: population, exposure, and outcome.
Research question: What are the daily living experiences of mothers with postnatal depression?
Population - Who is my question focused on? Example: mothers
Exposure - What is the issue I am interested in? Example: postnatal depression
Outcome - What, in relation to the issue, do I want to examine? Example: daily living experiences
The SPIDER question framework is useful for qualitative or mixed methods research topics focused on "samples" rather than populations. SPIDER questions identify five concepts: sample, phenomenon of interest, design, evaluation, and research type.
Research question: What are the experiences of young parents in attendance at antenatal education classes?
Sample - Who is the group of people being studied? Example: young parents
Phenomenon of Interest - What are the reasons for behavior and decisions? Example: attendance at antenatal education classes
Design - How has the research been collected (e.g., interview, survey)? Example: interviews
Evaluation - What is the outcome being impacted? Example: experiences
Research type - What type of research (qualitative or mixed methods)? Example: qualitative studies
The SPICE question framework is useful for qualitative research topics evaluating the outcomes of a service, project, or intervention. SPICE questions identify five concepts: setting, perspective, intervention/exposure/interest, comparison, and evaluation.
Research question: For teenagers in South Carolina, what is the effect of provision of Quit Kits to support smoking cessation on number of successful attempts to give up smoking compared to no support ("cold turkey")?
Setting - the context for the question (where). Example: South Carolina
Perspective - the users, potential users, or stakeholders of the service (for whom). Example: teenagers
Intervention / Exposure - the action taken for the users, potential users, or stakeholders (what). Example: provision of Quit Kits to support smoking cessation
Comparison - the alternative actions or outcomes (compared to what). Example: no support or "cold turkey"
Evaluation - the result or measurement that will determine the success of the intervention (what is the result, how well). Example: number of successful attempts to give up smoking with Quit Kits compared to number of successful attempts with no support
The ECLIPSE framework is useful for qualitative research topics investigating the outcomes of a policy or service. ECLIPSE questions identify six concepts: expectation, client group, location, impact, professionals, and service.
Research question: How can I increase access to wireless internet for hospital patients?
Expectation - What are you looking to improve or change? What is the information going to be used for? Example: to increase access to wireless internet in the hospital
Client group - Who is the service or policy aimed at? Example: patients and families
Location - Where is the service or policy located? Example: hospitals
Impact - What is the change in service or policy that the researcher is investigating? Example: clients have easy access to free internet
Professionals - Who is involved in providing or improving the service or policy?
Define Your Research Question
A well-developed research question will inform the entirety of your review process, including:
The development of your inclusion and exclusion criteria.
The terms used in your search strategies.
The tool(s) used to assess the quality of included studies.
The data pulled from the included studies.
The analysis completed in your review.
The target journal(s) for your review's publication.
If your question is too broad, you may have trouble completing the review. If your topic is too narrow, there may not be sufficient literature to warrant a review.
How the MSK Library Can Help
One of the first conversations you will have with your MSK librarian will be about your topic.
Your MSK librarian will:
Work with you to determine whether a systematic review on your topic has been published or planned by searching databases like PubMed, Embase, and Epistemonikos and registries like PROSPERO, Protocols.io, and Open Science Framework (OSF) Registries.
Ask you for a sample set of relevant publications (also known as seed articles) that you know you want your review to capture. This helps provide a better sense of the scope of your research question. If your topic is too broad or narrow, your MSK librarian can help improve the focus. This sample set will later inform the construction of the search strategy.
Using a Question Framework
What if my topic does not fit a framework?
PICO is a model commonly used for clinical and healthcare-related questions, and is often, although not exclusively, used for searching for quantitatively designed studies.
Example question: In elderly patients, does patient handwashing compared to no handwashing impact rates of hospital-acquired infections?
Population - Any characteristics that define your patient or population group. Example: elderly people
Intervention - What do you want to do with the patient or population? Example: handwashing
Comparison (if relevant) - What are the alternatives to the main intervention? Example: no handwashing
Outcome - Any specific outcomes or effects of your intervention. Example: hospital-acquired infection rates
Richardson, W.S., Wilson, M.C., Nishikawa, J. and Hayward, R.S.A. (1995). "The well-built clinical question: a key to evidence-based decisions." ACP Journal Club, 123(3), A12.
Question framework content adapted from The University of Plymouth Library .
PEO is useful for qualitative research questions.
Example question: In homeless populations, do addiction services impact housing rates?
Population - Who are the users: patients, family, practitioners or community being affected? What are the symptoms, condition, health status, age, gender, ethnicity? What is the setting, e.g. acute care, community, mental health? Example: homeless persons
Exposure - Exposure to a condition or illness, a risk factor (e.g. smoking), screening, rehabilitation, service etc. Example: drug and alcohol addiction services
Outcome - Experiences, attitudes, feelings, improvement in condition, mobility, responsiveness to treatment, care, quality of life or daily living. Example: rates of homelessness
Moola, S., Munn, Z., Sears, K., Sfetcu, R., Currie, M., Lisy, K., Tufanaru, C., Qureshi, R., Mattis, P. and Mu, P. (2015). "Conducting systematic reviews of association (etiology): The Joanna Briggs Institute's approach." International Journal of Evidence-Based Healthcare, 13(3), 163-9. doi: 10.1097/XEB.0000000000000064.
PCC is useful for both qualitative and quantitative (mixed methods) topics, and is commonly used in scoping reviews.
Example question: What patient-led models of care are used to manage chronic disease in high income countries?
Population - "Important characteristics of participants, including age and other qualifying criteria. You may not need to include this element unless your question focuses on a specific condition or cohort." Example: N/A, as our example considers chronic diseases broadly, not a specific condition/population such as women with chronic obstructive pulmonary disorder.
Concept - "The core concept examined by the scoping review should be clearly articulated to guide the scope and breadth of the inquiry. This may include details that pertain to elements that would be detailed in a standard systematic review, such as the 'interventions' and/or 'phenomena of interest' and/or 'outcomes'." Example: patient-led care models
Context - Example: chronic disease in high income countries
Peters MDJ, Godfrey C, McInerney P, Munn Z, Tricco AC, Khalil, H. "Chapter 11: Scoping Reviews" (2020 version). In: Aromataris E, Munn Z (Editors). JBI Manual for Evidence Synthesis, JBI, 2020. Available from https://synthesismanual.jbi.global . https://doi.org/10.46658/JBIMES-20-12
Question framework content adapted from The University of Plymouth Library .
SPIDER is a model useful for qualitative and mixed method type research questions.
Example question: What are young parents’ experiences of attending antenatal education?
Sample - The group you are focusing on. Example: young parents
Phenomenon of Interest - The behaviour or experience your research is examining. Example: experience of antenatal classes
Design - How will the research be carried out? Example: interviews, questionnaires
Evaluation - What are the outcomes you are measuring? Example: experiences and views
Research type - What is the research type you are undertaking? Example: qualitative
Cooke, A., Smith, D. and Booth, A. (2012)."Beyond PICO: the SPIDER tool for qualitative evidence synthesis." Qualitative Health Research , 22(10), 1435-1443.
SPICE is a model useful for qualitative and mixed method type research questions.
Example question: Does mindfulness therapy in a counseling service impact the attitudes of patients diagnosed with cancer?
Setting - The setting or the context. Example: counseling service
Population or perspective - Which population or perspective will the research be conducted for/from? Example: patients diagnosed with cancer
Intervention - The intervention being studied. Example: mindfulness-based cognitive therapy
Comparison - Is there a comparison to be made? Example: no comparison
Evaluation - How well did the intervention work; what were the results? Example: assess patients' attitudes to see if the intervention changed their quality of life
Example question adapted from: Tate, KJ., Newbury-Birch, D., and McGeechan, GJ. (2018). "A systematic review of qualitative evidence of cancer patients’ attitudes to mindfulness." European Journal of Cancer Care , 27(2), 1-10.
ECLIPSE is a model useful for qualitative and mixed method type research questions, especially for questions examining particular services or professions.
Example question: Can cross-service communication impact the support of adults with learning difficulties?
Expectation - Purpose of the study: what are you trying to achieve? Example: how communication can be improved between services to create better care
Client group - Which group are you focusing on? Example: adults with learning difficulties
Location - Where is that group based? Example: community
Impact - If your research is looking for service improvement, what is this and how is it being measured? Example: better support services for adults with learning difficulties through joined-up, cross-service working
Professionals - What professional staff are involved? Example: community nurses, social workers, carers
Service - Which service are you focusing on? Example: adult support services
You might find that your topic does not always fall into one of the models listed on this page. You can always modify a model to make it work for your topic, and either remove or incorporate additional elements.
The important thing is to ensure that you have a high quality question that can be separated into its component parts.
Using a framework to structure your research question
Your systematic review or systematic literature review will be defined by your research question. A well formulated question will help:
Frame your entire research process
Determine the scope of your review
Provide a focus for your searches
Help you identify key concepts
Guide the selection of your papers
There are different models you can use to help structure a question, which will help with searching.
Selecting a framework
What if my topic doesn't fit a framework?
A model commonly used for clinical and healthcare-related questions, often, although not exclusively, used for searching for quantitatively designed studies.
Example question: Does handwashing reduce hospital acquired infections in elderly people?
Population - Any characteristics that define your patient or population group. Example: elderly people
Intervention - What do you want to do with the patient or population? Example: handwashing
Comparison (if relevant) - What are the alternatives to the main intervention? Example: no handwashing
Outcome - Any specific outcomes or effects of your intervention. Example: reduced infection
Richardson, W.S., Wilson, M.C., Nishikawa, J. and Hayward, R.S.A. (1995) 'The well-built clinical question: a key to evidence-based decisions.' ACP Journal Club, 123(3), p. A12.
PEO is useful for qualitative research questions.
Example question: How does substance addiction play a role in homelessness?
Population - Who are the users: patients, family, practitioners or community being affected? What are the symptoms, condition, health status, age, gender, ethnicity? What is the setting, e.g. acute care, community, mental health? Example: homeless persons
Exposure - Exposure to a condition or illness, a risk factor (e.g. smoking), screening, rehabilitation, service etc. Example: drug and alcohol addiction services
Outcome - Experiences, attitudes, feelings, improvement in condition, mobility, responsiveness to treatment, care, quality of life or daily living. Example: reduced homelessness
Moola, S., Munn, Z., Sears, K., Sfetcu, R., Currie, M., Lisy, K., Tufanaru, C., Qureshi, R., Mattis, P. and Mu, P. (2015) 'Conducting systematic reviews of association (etiology): The Joanna Briggs Institute's approach.' International Journal of Evidence-Based Healthcare, 13(3), pp. 163-9. doi: 10.1097/XEB.0000000000000064.
PCC is useful for both qualitative and quantitative (mixed methods) topics, and is commonly used in scoping reviews.
Example question: What patient-led models of care are used to manage chronic disease in high income countries?
Population - "Important characteristics of participants, including age and other qualifying criteria. You may not need to include this element unless your question focuses on a specific condition or cohort." Example: N/A, as our example considers chronic diseases broadly, not a specific condition/population such as women with chronic obstructive pulmonary disorder.
Concept - "The core concept examined by the scoping review should be clearly articulated to guide the scope and breadth of the inquiry. This may include details that pertain to elements that would be detailed in a standard systematic review, such as the 'interventions' and/or 'phenomena of interest' and/or 'outcomes'." Example: patient-led care models
Context - Example: chronic disease in high income countries
Peters MDJ, Godfrey C, McInerney P, Munn Z, Tricco AC, Khalil, H. Chapter 11: Scoping Reviews (2020 version). In: Aromataris E, Munn Z (Editors). JBI Manual for Evidence Synthesis, JBI, 2020. Available from https://synthesismanual.jbi.global . https://doi.org/10.46658/JBIMES-20-12
A model useful for qualitative and mixed method type research questions.
Example question: What are young parents’ experiences of attending antenatal education? (Cooke et al., 2012)
Sample - The group you are focusing on. Example: young parents
Phenomenon of Interest - The behaviour or experience your research is examining. Example: experience of antenatal classes
Design - How will the research be carried out? Example: interviews, questionnaires
Evaluation - What are the outcomes you are measuring? Example: experiences and views
Research type - What is the research type you are undertaking? Example: qualitative
Cooke, A., Smith, D. and Booth, A. (2012) 'Beyond PICO: the SPIDER tool for qualitative evidence synthesis.' Qualitative Health Research , 22(10) pp. 1435-1443
A model useful for qualitative and mixed method type research questions.
Example question: How effective is mindfulness used as a cognitive therapy in a counseling service in improving the attitudes of patients diagnosed with cancer?
Setting - The setting or the context. Example: counseling service
Population or perspective - Which population or perspective will the research be conducted for/from? Example: patients diagnosed with cancer
Intervention - The intervention being studied. Example: mindfulness-based cognitive therapy
Comparison - Is there a comparison to be made? Example: no comparison
Evaluation - How well did the intervention work; what were the results? Example: assess patients' attitudes to see if the intervention improved their quality of life
Example question taken from: Tate, KJ., Newbury-Birch, D., and McGeechan, GJ. (2018) 'A systematic review of qualitative evidence of cancer patients' attitudes to mindfulness.' European Journal of Cancer Care, 27(2), pp. 1-10.
A model useful for qualitative and mixed-method research questions, especially for questions examining particular services or professions.
Example question: How can cross-service communication support adults with learning difficulties?
Expectation - Purpose of the study: what are you trying to achieve? Example: how communication can be improved between services to create better care
Client group - Which group are you focusing on? Example: adults with learning difficulties
Location - Where is that group based? Example: community
Impact - If your research is looking for service improvement, what is this and how is it being measured? Example: better support services for adults with learning difficulties through joined-up, cross-service working
Professionals - What professional staff are involved? Example: community nurses, social workers, carers
Service - Which service are you focusing on? Example: adult support services
You might find that your topic does not always fall into one of the models listed on this page. You can always modify a model to make it work for your topic, and either remove or incorporate additional elements.
The important thing is to ensure that you have a high quality question that can be separated into its component parts.
Defining the research question and developing a protocol are the essential first steps in your systematic review. The success of your systematic review depends on a clear and focused question, so take the time to get it right.
A framework may help you to identify the key concepts in your research question and to organise your search terms in one of the Library's databases.
Several frameworks or models exist to help researchers structure a research question and three of these are outlined on this page: PICO, SPICE and SPIDER.
It is advisable to conduct some scoping searches in a database to look for any existing reviews on your research topic and to establish whether your topic is an original one.
You will need to identify the relevant database(s) to search; your choice will depend on your topic and the research question you need to answer.
By scanning the titles, abstracts and references retrieved in a scoping search, you will reveal the terms used by authors to describe the concepts in your research question, including synonyms or abbreviations that you may wish to add to a database search.
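This term-harvesting step can be mimicked programmatically when you have many records to scan. A toy sketch follows, using invented abstracts and a deliberately naive tokeniser; real term harvesting would also consider phrases and controlled vocabulary.

```python
from collections import Counter

# Toy abstracts standing in for scoping-search results (invented text).
abstracts = [
    "Animal-assisted therapy reduced agitation in dementia patients.",
    "Pet therapy and music therapy were compared for agitation in dementia.",
    "Music therapy eased aggressive behaviour among older adults with dementia.",
]

stopwords = {"and", "in", "for", "the", "were", "with", "among"}

# Count how often each non-stopword term appears across the records.
counts = Counter(
    word.strip(".,").lower()
    for text in abstracts
    for word in text.split()
    if word.strip(".,").lower() not in stopwords
)

# Frequent terms suggest synonyms to add to the search (e.g. "agitation"
# as a synonym for aggressive behaviour).
print(counts.most_common(5))
```

Even this crude frequency count surfaces author vocabulary ("agitation", "pet therapy") that a searcher might not have thought to include from the research question alone.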
The Library can help you to search for existing reviews: make an appointment with your Subject Librarian to learn more.
The PICO framework
PICO may be the best-known framework: it has its origins in epidemiology and is now widely used for evidence-based practice and systematic reviews.
PICO normally stands for Population (or Patient or Problem) - Intervention - Comparator - Outcome.
Population defines the group you are studying. It may for example be healthy adults, or adults with dementia, or children under 5 years of age with asthma.
Intervention is the type of treatment you aim to study, e.g. a medicine or a physical therapy.
Comparator is another type of treatment you aim to compare the first treatment with, or perhaps a placebo.
Outcome is the result you intend to measure, for example (increased or decreased) life expectancy, or (cessation of) pain.
The SPICE framework
SPICE is used mostly in social science and healthcare research. It stands for Setting - Population (or Perspective) - Intervention - Comparator - Evaluation. It is similar to PICO and was devised by Booth (2004).
Setting: the location or environment relevant to your research (e.g. accident and emergency unit)
Population (or perspective): the type of group that you are studying (e.g. older people)
Intervention: the intervention/practice/treatment that you are evaluating (e.g. initial examination of patients by allied health staff)
Comparator: an intervention with which you compare the intervention above (e.g. initial examination by medical staff)
Evaluation: the hypothetical result you intend to evaluate (e.g. lower mortality rates)
The examples in the SPICE table are based on the following research question: Can mortality rates for older people be reduced if a greater proportion are examined initially by allied health staff in A&E? Source: Booth, A (2004) Formulating answerable questions. In Booth, A & Brice, A (Eds) Evidence Based Practice for Information Professionals: A handbook. (pp. 61-70) London: Facet Publishing.
The SPIDER framework
SPIDER was adapted from the PICO framework in order to include searches for qualitative and mixed-methods research. SPIDER was developed by Cooke, Smith and Booth (2012).
Sample: qualitative research may have fewer participants than quantitative research and findings may not be generalised to the entire population.
Phenomenon of Interest: experiences, behaviours or decisions may be of more interest to the qualitative researcher, rather than an intervention.
Design: the research method may be an interview or a survey.
Evaluation: outcomes may include more subjective ones, e.g. attitudes.
Research type: the search can encompass qualitative and mixed-methods research, as well as quantitative research.
Source: Cooke, A., Smith, D. & Booth, A. (2012). Beyond PICO: The SPIDER tool for qualitative evidence synthesis. Qualitative Health Research, 22(10), 1435-1443. https://doi.org/10.1177/1049732312452938
More advice about formulating a research question
Module 1 in Cochrane Interactive Learning explains the importance of the research question, some types of review question and the PICO framework. The Library subscribes to Cochrane Interactive Learning.
Systematic Review | Definition, Example & Guide
Published on June 15, 2022 by Shaun Turney . Revised on November 20, 2023.
A systematic review is a type of review that uses repeatable methods to find, select, and synthesize all available evidence. It answers a clearly formulated research question and explicitly states the methods used to arrive at the answer.
In the worked example used throughout this article, Boyle and colleagues answered the question “What is the effectiveness of probiotics in reducing eczema symptoms and improving quality of life in patients with eczema?”
In this context, a probiotic is a health product that contains live microorganisms and is taken by mouth. Eczema is a common skin condition that causes red, itchy skin.
A review is an overview of the research that’s already been completed on a topic.
What makes a systematic review different from other types of reviews is that the research methods are designed to reduce bias . The methods are repeatable, and the approach is formal and systematic:
Formulate a research question
Develop a protocol
Search for all relevant studies
Apply the selection criteria
Extract the data
Synthesize the data
Write and publish a report
Although multiple sets of guidelines exist, the Cochrane Handbook for Systematic Reviews is among the most widely used. It provides detailed guidelines on how to complete each step of the systematic review process.
Systematic reviews are most commonly used in medical and public health research, but they can also be found in other disciplines.
Systematic reviews typically answer their research question by synthesizing all available evidence and evaluating the quality of the evidence. Synthesizing means bringing together different information to tell a single, cohesive story. The synthesis can be narrative ( qualitative ), quantitative , or both.
Systematic reviews often quantitatively synthesize the evidence using a meta-analysis . A meta-analysis is a statistical analysis, not a type of review.
A meta-analysis is a technique to synthesize results from multiple studies. It’s a statistical analysis that combines the results of two or more studies, usually to estimate an effect size .
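The pooling step can be sketched with the inverse-variance (fixed-effect) method; the function name and the three effect sizes below are invented for illustration, not taken from any real review:

```python
import math

def fixed_effect_pool(effects, std_errors):
    """Inverse-variance (fixed-effect) pooling of study effect sizes."""
    # Each study is weighted by the inverse of its variance,
    # so more precise studies count for more.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    # 95% confidence interval under a normal approximation
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci

# Made-up log odds ratios and standard errors from three hypothetical trials
pooled, se, (lo, hi) = fixed_effect_pool(
    effects=[-0.30, -0.10, -0.20],
    std_errors=[0.15, 0.20, 0.10],
)
print(f"pooled effect = {pooled:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```

Note that the pooled standard error comes out smaller than any single study's: combining studies is what gives a meta-analysis its extra precision.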
A literature review is a type of review that uses a less systematic and formal approach than a systematic review. Typically, an expert in a topic will qualitatively summarize and evaluate previous work, without using a formal, explicit method.
Although literature reviews are often less time-consuming and can be insightful or helpful, they have a higher risk of bias and are less transparent than systematic reviews.
Similar to a systematic review, a scoping review is a type of review that tries to minimize bias by using transparent and repeatable methods.
However, a scoping review isn’t a type of systematic review. The most important difference is the goal: rather than answering a specific question, a scoping review explores a topic. The researcher tries to identify the main concepts, theories, and evidence, as well as gaps in the current research.
Sometimes scoping reviews are an exploratory preparation step for a systematic review, and sometimes they are a standalone project.
A systematic review is a good choice of review if you want to answer a question about the effectiveness of an intervention , such as a medical treatment.
To conduct a systematic review, you’ll need the following:
A precise question , usually about the effectiveness of an intervention. The question needs to be about a topic that’s previously been studied by multiple researchers. If there’s no previous research, there’s nothing to review.
If you’re doing a systematic review on your own (e.g., for a research paper or thesis ), you should take appropriate measures to ensure the validity and reliability of your research.
Access to databases and journal archives. Often, your educational institution provides you with access.
Time. A professional systematic review is a time-consuming process: it will take the lead author about six months of full-time work. If you’re a student, you should narrow the scope of your systematic review and stick to a tight schedule.
Bibliographic, word-processing, spreadsheet, and statistical software . For example, you could use EndNote, Microsoft Word, Excel, and SPSS.
Systematic reviews have many pros.
They minimize research bias by considering all available evidence and evaluating each study for bias.
Their methods are transparent , so they can be scrutinized by others.
They’re thorough : they summarize all available evidence.
They can be replicated and updated by others.
Systematic reviews also have a few cons .
They’re time-consuming .
They’re narrow in scope : they only answer the precise research question.
The 7 steps for conducting a systematic review are explained with an example.
Step 1: Formulate a research question
Formulating the research question is probably the most important step of a systematic review. A clear research question will:
Allow you to more effectively communicate your research to other researchers and practitioners
Guide your decisions as you plan and conduct your systematic review
A good research question for a systematic review has four components, which you can remember with the acronym PICO :
Population(s) or problem(s)
Intervention(s)
Comparison(s)
Outcome(s)
You can rearrange these four components to write your research question:
What is the effectiveness of I versus C for O in P ?
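As a sketch of how the four components slot into that template, here is a small hypothetical helper (the `PICO` class is invented for illustration, not part of any review tool):

```python
from dataclasses import dataclass

@dataclass
class PICO:
    """The four PICO components of a review question."""
    population: str
    intervention: str
    comparison: str
    outcome: str

    def question(self) -> str:
        # "What is the effectiveness of I versus C for O in P?"
        return (f"What is the effectiveness of {self.intervention} "
                f"versus {self.comparison} for {self.outcome} "
                f"in {self.population}?")

# The probiotics-and-eczema example used in this article
eczema = PICO(
    population="patients with eczema",
    intervention="probiotics",
    comparison="no treatment, a placebo, or a non-probiotic treatment",
    outcome="reducing eczema symptoms and improving quality of life",
)
print(eczema.question())
```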
Sometimes, you may want to include a fifth component, the type of study design. In this case, the acronym is PICOT:
Type of study design(s)
In the probiotics example, the components were:
The population of patients with eczema
The intervention of probiotics
In comparison to no treatment, placebo , or non-probiotic treatment
The outcome of changes in participant-, parent-, and doctor-rated symptoms of eczema and quality of life
Randomized control trials, a type of study design
Their research question was:
What is the effectiveness of probiotics versus no treatment, a placebo, or a non-probiotic treatment for reducing eczema symptoms and improving quality of life in patients with eczema?
Step 2: Develop a protocol
A protocol is a document that contains your research plan for the systematic review. This is an important step because having a plan allows you to work more efficiently and reduces bias.
Your protocol should include the following components:
Background information: Provide the context of the research question, including why it’s important.
Research objective(s): Rephrase your research question as an objective.
Selection criteria: State how you’ll decide which studies to include or exclude from your review.
Search strategy: Discuss your plan for finding studies.
Analysis: Explain what information you’ll collect from the studies and how you’ll synthesize the data.
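A protocol is ultimately structured data, so one hedged way to picture the five components above is as a record with a completeness check before registration; every field value below is invented for illustration:

```python
# A minimal, hypothetical protocol skeleton mirroring the five components
protocol = {
    "background": "Eczema is common; probiotics are a proposed treatment.",
    "objective": "Assess the effectiveness of probiotics for eczema symptoms.",
    "selection_criteria": {
        "include": ["randomized controlled trials", "patients with eczema"],
        "exclude": ["animal studies", "non-randomized designs"],
    },
    "search_strategy": ["EMBASE", "PsycINFO", "AMED", "LILACS", "Web of Science"],
    "analysis": "Meta-analysis of symptom scores; narrative synthesis otherwise.",
}

# A quick completeness check before submitting the protocol for registration
required = {"background", "objective", "selection_criteria",
            "search_strategy", "analysis"}
missing = required - protocol.keys()
print("complete" if not missing else f"missing: {sorted(missing)}")
```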
If you’re a professional seeking to publish your review, it’s a good idea to bring together an advisory committee . This is a group of about six people who have experience in the topic you’re researching. They can help you make decisions about your protocol.
It’s highly recommended to register your protocol. Registering your protocol means submitting it to a database such as PROSPERO or ClinicalTrials.gov .
Step 3: Search for all relevant studies
Searching for relevant studies is the most time-consuming step of a systematic review.
To reduce bias, it’s important to search for relevant studies very thoroughly. Your strategy will depend on your field and your research question, but sources generally fall into these four categories:
Databases: Search multiple databases of peer-reviewed literature, such as PubMed or Scopus . Think carefully about how to phrase your search terms and include multiple synonyms of each word. Use Boolean operators if relevant.
Handsearching: In addition to searching the primary sources using databases, you’ll also need to search manually. One strategy is to scan relevant journals or conference proceedings. Another strategy is to scan the reference lists of relevant studies.
Gray literature: Gray literature includes documents produced by governments, universities, and other institutions that aren’t published by traditional publishers. Graduate student theses are an important type of gray literature, which you can search using the Networked Digital Library of Theses and Dissertations (NDLTD) . In medicine, clinical trial registries are another important type of gray literature.
Experts: Contact experts in the field to ask if they have unpublished studies that should be included in your review.
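The database-search advice above (synonyms joined with OR, concepts joined with AND) can be sketched as a tiny query builder; the helper name and terms are illustrative, and each real database has its own query syntax:

```python
def boolean_query(concept_groups):
    """Join synonyms with OR inside each concept, AND across concepts."""
    clauses = []
    for synonyms in concept_groups:
        clauses.append("(" + " OR ".join(f'"{term}"' for term in synonyms) + ")")
    return " AND ".join(clauses)

# Hypothetical concept groups for the probiotics-and-eczema question
query = boolean_query([
    ["probiotic", "probiotics", "lactobacillus"],
    ["eczema", "atopic dermatitis"],
])
print(query)
# ("probiotic" OR "probiotics" OR "lactobacillus") AND ("eczema" OR "atopic dermatitis")
```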
At this stage of your review, you won’t read the articles yet. Simply save any potentially relevant citations using bibliographic software.
Databases: EMBASE, PsycINFO, AMED, LILACS, and ISI Web of Science
Handsearch: Conference proceedings and reference lists of articles
Gray literature: The Cochrane Library, the metaRegister of Controlled Trials, and the Ongoing Skin Trials Register
Experts: Authors of unpublished registered trials, pharmaceutical companies, and manufacturers of probiotics
Step 4: Apply the selection criteria
Applying the selection criteria is a three-person job. Two of you will independently read the studies and decide which to include in your review based on the selection criteria you established in your protocol . The third person’s job is to break any ties.
To increase inter-rater reliability , ensure that everyone thoroughly understands the selection criteria before you begin.
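Inter-rater reliability between the two screening reviewers is often quantified with Cohen’s kappa; a minimal sketch with invented include/exclude decisions:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two reviewers' include/exclude decisions."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items where the raters match
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(lab) / n) * (rater_b.count(lab) / n) for lab in labels
    )
    return (observed - expected) / (1 - expected)

# Hypothetical screening decisions for ten abstracts
a = ["include", "exclude", "include", "exclude", "exclude",
     "include", "exclude", "exclude", "include", "exclude"]
b = ["include", "exclude", "include", "include", "exclude",
     "include", "exclude", "exclude", "exclude", "exclude"]
print(f"kappa = {cohens_kappa(a, b):.2f}")
```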
If you’re writing a systematic review as a student for an assignment, you might not have a team. In this case, you’ll have to apply the selection criteria on your own; you can mention this as a limitation in your paper’s discussion.
You should apply the selection criteria in two phases:
Based on the titles and abstracts : Decide whether each article potentially meets the selection criteria based on the information provided in the abstracts.
Based on the full texts: Download the articles that weren’t excluded during the first phase. If an article isn’t available online or through your library, you may need to contact the authors to ask for a copy. Read the articles and decide which articles meet the selection criteria.
It’s very important to keep a meticulous record of why you included or excluded each article. When the selection process is complete, you can summarize what you did using a PRISMA flow diagram .
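The record-keeping behind a PRISMA flow diagram boils down to a running tally whose stage counts must be internally consistent; the numbers below are invented for illustration:

```python
# A hypothetical tally of the selection process, in the order a PRISMA
# flow diagram reports it
flow = [
    ("records identified through database searching", 1120),
    ("records after duplicates removed", 860),
    ("records screened (titles and abstracts)", 860),
    ("records excluded", 790),
    ("full-text articles assessed for eligibility", 70),
    ("full-text articles excluded", 58),
    ("studies included in the review", 12),
]
for stage, count in flow:
    print(f"{stage}: {count}")

# Sanity checks that the counts are internally consistent
counts = dict(flow)
assert counts["records screened (titles and abstracts)"] \
    - counts["records excluded"] \
    == counts["full-text articles assessed for eligibility"]
assert counts["full-text articles assessed for eligibility"] \
    - counts["full-text articles excluded"] \
    == counts["studies included in the review"]
```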
Next, Boyle and colleagues found the full texts for each of the remaining studies. Boyle and Tang read through the articles to decide if any more studies needed to be excluded based on the selection criteria.
When Boyle and Tang disagreed about whether a study should be excluded, they discussed it with Varigos until the three researchers came to an agreement.
Step 5: Extract the data
Extracting the data means collecting information from the selected studies in a systematic way. There are two types of information you need to collect from each study:
Information about the study’s methods and results . The exact information will depend on your research question, but it might include the year, study design , sample size, context, research findings , and conclusions. If any data are missing, you’ll need to contact the study’s authors.
Your judgment of the quality of the evidence, including risk of bias .
You should collect this information using forms. You can find sample forms in The Registry of Methods and Tools for Evidence-Informed Decision Making and the Grading of Recommendations, Assessment, Development and Evaluations Working Group .
Extracting the data is also a three-person job. Two people should do this step independently, and the third person will resolve any disagreements.
They also collected data about possible sources of bias, such as how the study participants were randomized into the control and treatment groups.
Step 6: Synthesize the data
Synthesizing the data means bringing together the information you collected into a single, cohesive story. There are two main approaches to synthesizing the data:
Narrative ( qualitative ): Summarize the information in words. You’ll need to discuss the studies and assess their overall quality.
Quantitative : Use statistical methods to summarize and compare data from different studies. The most common quantitative approach is a meta-analysis , which allows you to combine results from multiple studies into a summary result.
Generally, you should use both approaches together whenever possible. If you don’t have enough data, or the data from different studies aren’t comparable, then you can take just a narrative approach. However, you should justify why a quantitative approach wasn’t possible.
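Whether the data from different studies are comparable enough to pool is often judged with Cochran’s Q and the I² statistic; a minimal sketch with invented numbers:

```python
def heterogeneity(effects, std_errors):
    """Cochran's Q and the I-squared statistic, used to judge whether
    studies are similar enough for a pooled (quantitative) estimate."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted squared deviations of each study from the pooled effect
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of variability beyond what chance alone would explain
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i_squared

# Illustrative (made-up) effects from three trials
q, i2 = heterogeneity([-0.30, -0.10, -0.20], [0.15, 0.20, 0.10])
print(f"Q = {q:.2f}, I^2 = {i2:.0f}%")
```

Here Q falls below its degrees of freedom, so I² is zero: under these invented numbers, the studies look homogeneous enough to pool.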
Boyle and colleagues also divided the studies into subgroups, such as studies about babies, children, and adults, and analyzed the effect sizes within each group.
Step 7: Write and publish a report
The purpose of writing a systematic review article is to share the answer to your research question and explain how you arrived at this answer.
Your article should include the following sections:
Abstract: A summary of the review
Introduction: Including the rationale and objectives
Methods: Including the selection criteria, search method, data extraction method, and synthesis method
Results: Including results of the search and selection process, study characteristics, risk of bias in the studies, and synthesis results
Discussion: Including interpretation of the results and limitations of the review
Conclusion: The answer to your research question and implications for practice, policy, or research
To verify that your report includes everything it needs, you can use the PRISMA checklist .
Once your report is written, you can publish it in a systematic review database, such as the Cochrane Database of Systematic Reviews , and/or in a peer-reviewed journal.
In their report, Boyle and colleagues concluded that probiotics cannot be recommended for reducing eczema symptoms or improving quality of life in patients with eczema.
Note: Generative AI tools like ChatGPT can be useful at various stages of the writing and research process and can help you to write your systematic review. However, we strongly advise against trying to pass AI-generated text off as your own work.
A literature review is a survey of scholarly sources (such as books, journal articles, and theses) related to a specific topic or research question .
It is often written as part of a thesis, dissertation , or research paper , in order to situate your work in relation to existing knowledge.
A literature review is a survey of credible sources on a topic, often used in dissertations , theses, and research papers . Literature reviews give an overview of knowledge on a subject, helping you identify relevant theories and methods, as well as gaps in existing research. Literature reviews are set up similarly to other academic texts , with an introduction , a main body, and a conclusion .
An annotated bibliography is a list of source references that has a short description (called an annotation ) for each of the sources. It is often assigned as part of the research process for a paper .
A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.
v.15(12); 2023 Dec
PMC10828625
Ten Steps to Conduct a Systematic Review
Ernesto Calderon Martinez
1 Digital Health, Universidad Nacional Autónoma de México, Ciudad de Mexico, MEX
Jose R Flores Valdés
2 General Medicine, Universidad Autonoma de Guadalajara, Guadalajara, MEX
Jaqueline L Castillo
Jennifer V Castillo
3 General Medicine, Universidad Autónoma de Guadalajara, Guadalajara, MEX
Ronald M Blanco Montecino
4 Research, University of Texas Southwestern Medical Center, Dallas, USA
Julio E Morin Jimenez
5 General Medicine, Universidad Autónoma del Estado de México, Ciudad de Mexico, MEX
David Arriaga Escamilla
6 Internal Medicine, Universidad Justo Sierra, Ciudad de Mexico, MEX
Edna Diarte
7 Medicine, Universidad Autonoma de Sinaloa, Culiacan, MEX
This article introduces a concise 10-step guide tailored for researchers engaged in systematic reviews within the field of medicine and health, aligning with the imperative for evidence-based healthcare. The guide underscores the importance of integrating research evidence, clinical proficiency, and patient preferences. It emphasizes the need for precision in formulating research questions, utilizing tools such as PICO(S) (Population, Intervention, Comparator, Outcome, and optionally Study design), PEO (Population, Exposure, Outcome), SPICE (Setting, Perspective, Intervention/Exposure/Interest, Comparison, Evaluation), and SPIDER (Sample, Phenomenon of Interest, Design, Evaluation, Research type), and advocates for the validation of research ideas through preliminary investigations. The guide prioritizes transparency by recommending the documentation and registration of protocols on various platforms. It highlights the significance of a well-organized literature search, encouraging the involvement of experts to ensure a high-quality search strategy. The critical stages of screening titles and abstracts are navigated using different tools, each with its specific advantages; this diverse approach aims to enhance the effectiveness of the systematic review process. In conclusion, this 10-step guide provides a practical framework for the rigorous conduct of systematic reviews in the domain of medicine and health. It addresses the unique challenges inherent in this field, emphasizing the values of transparency, precision, and ongoing efforts to improve primary research practices, and aims to contribute to a robust evidence base that facilitates informed decision-making in healthcare.
Introduction
The necessity of evidence-based healthcare, which prioritizes the integration of top-tier research evidence, clinical proficiency, and patient preferences, is increasingly recognized [ 1 , 2 ]. Due to the extensive amount and varied approaches of primary research, secondary research, particularly systematic reviews, is required to consolidate and interpret this information with minimal bias [ 3 , 4 ]. Systematic reviews, structured to reduce bias in the selection, examination, and consolidation of pertinent research studies, are highly regarded in the research evidence hierarchy. The aim is to enable objective, repeatable, and transparent healthcare decisions by reducing systematic errors.
To guarantee the quality and openness of systematic reviews, protocols are formulated, registered, and published prior to the commencement of the review process. Platforms such as PROSPERO (International Prospective Register of Systematic Reviews) aid in the registration of systematic review protocols, thereby enhancing transparency in the review process [ 5 ]. High-standard reviews comply with stringent peer review norms, ensuring that methodologies are revealed beforehand, thus reducing post hoc alterations for objective, repeatable, and transparent outcomes [ 6 ].
Nonetheless, the practical execution of systematic reviews, particularly in the field of medicine and health, poses difficulties for researchers. To address this, a succinct 10-step guide is offered to both seasoned and novice researchers, with the goal of improving the rigor and transparency of systematic reviews.
Technical report
Step 1: structure of your topic
When developing a research question for a systematic review or meta-analysis (SR/MA), it is essential to precisely outline the objectives of the study, taking into account potential effect modifiers. The research question should concisely capture the scientific elements of interest and encapsulate the aim of the project.
Instruments such as PICO(S) (Population, Intervention, Comparator, Outcome, and optionally Study design), PEO (Population, Exposure, Outcome), SPICE (Setting, Perspective, Intervention/Exposure/Interest, Comparison, Evaluation), and SPIDER (Sample, Phenomenon of Interest, Design, Evaluation, Research type) assist in structuring research questions for evidence-based clinical practice, qualitative research, and mixed-methods research [ 7 - 9 ]. A joint strategy employing both SPIDER and PICO is suggested for exhaustive searches, subject to time and resource constraints. PICO and SPIDER are the most frequently utilized tools; the selection between them is contingent on the research’s nature. The ability to frame and address research questions is crucial in evidence-based medicine. The PICO format extends to the PICOTS (Population, Intervention, Comparator, Outcome, Time frame, Study design) design (Table 1 ). Explicit delineation of these components is critical for systematic reviews, ensuring a balanced and pertinent research question with broad applicability.
This table gives a breakdown of the mnemonic for the elements required to formulate an adequate research question. Utilizing this mnemonic leads to a proper and non-biased search. Examples extracted from “The use and efficacy of oral phenylephrine versus placebo on adults treating nasal congestion over the years in a systematic review” [ 10 ].
RCT, randomized control trial; PICOTS, Population Intervention Comparator Outcome Time Setting
P (Population and/or Patient and/or Problem)
Meaning: The people in/for whom the systematic review is expected to be applied.
Example: Adult population >18 years and <65 years
Inclusion criteria: Adults between 18 and 65 years
Exclusion criteria: Elderly, pediatric, and pregnant patients

I (Intervention)
Meaning: The treatment whose effects the systematic review examines: medicines, procedures, health education, public health measures, or bundles/combinations of these. It also includes preventive measures like vaccination, prophylaxis, health education tools, and packages of such interventions. In some cases, the investigators do not administer the intervention but merely observe its effects; (I) is then better expressed as Exposure, abbreviated (E). Diagnostic tests, prognostic markers, and condition prevalence can represent exposures.
Example: Administration of oral phenylephrine
Inclusion criteria: Oral administration of phenylephrine
Exclusion criteria: IV administration of phenylephrine, nasal phenylephrine

C (Comparison)
Meaning: The comparison of two groups: people not receiving the intervention and those receiving an alternate intervention, a placebo, or nothing. For some study designs and/or research questions, including a comparison may not be feasible.
Example: Placebo, standard care, or no treatment
Inclusion criteria: Phenylephrine vs. placebo
Exclusion criteria: Phenylephrine in combination with another medication; phenylephrine in comparison with another medication

O (Outcome)
Meaning: The effect the intervention (I) has on the selected population (P) in comparison to the comparison (C). Most systematic reviews focus on efficacy, safety, and sometimes cost; when a systematic review focuses on diagnostic tests, the aim is to identify accuracy, reliability, and cost.
Example: Symptoms like nasal congestion and nasal airway resistance
Inclusion criteria: Nasal congestion management
Exclusion criteria: Other allergy-related symptoms

T (Time Frame)
Meaning: Outcomes are only relevant when evaluated over a specific time frame.
Example: Over the years
Inclusion criteria: Taking medication over some period of time
Exclusion criteria: One day, one week

S (Study Design)
Meaning: The specific protocol under which the study is conducted, allowing the investigator to translate the conceptual research question into an operational one.
Example: RCTs
Inclusion criteria: RCT
Exclusion criteria: Letters to the editor, case-control trials, observational studies
While there are various formats like SPICE and ECLIPSE, PICO continues to be favored due to its adaptability across research designs. The research question should be stated in the introduction of a systematic review, laying the groundwork for impartial interpretations. The PICOTS template is applicable to systematic reviews that tackle a variety of research questions.
Validation of the Idea
To bolster the solidity of our research, we advocate for the execution of preliminary investigations and the validation of ideas. An initial exploration, especially in esteemed databases like PubMed, is vital. This process serves several functions, including the discovery of pertinent articles, the verification of the suggested concept, the prevention of revisiting previously explored queries, and the assurance of a sufficient collection of articles for review.
Moreover, it is crucial to concentrate on topics that tackle significant healthcare challenges, align with worldwide necessities and principles, mirror the present scientific comprehension, and comply with established review methodologies. Gaining a profound comprehension of the research field through pertinent videos and discussions is crucial for enhancing result retrieval. Overlooking this step could lead to the unfortunate unearthing of a similar study published earlier, potentially leading to the termination of our research, a scenario where precious time would be squandered on an issue already thoroughly investigated.
For example, during our initial exploration using the terms “Silymarin AND Liver Enzyme Levels” on PubMed, we discovered a systematic review and meta-analysis discussing the impact of Silymarin on liver enzyme levels in humans [ 11 ]. This discovery acts as a safety net because we will not pursue this identical idea/approach and face rejection; instead, we can rephrase a more sophisticated research question or objective, shifting the focus on evaluating different aspects of the same idea by just altering a part of the PICOTS structure. We can evaluate a different population, a different comparator, and a different outcome and arrive at a completely novel idea. This strategic method guarantees the relevance and uniqueness of our research within the scientific community.
Step 2: databases
The steps in this phase are typically carried out in parallel. A well-orchestrated, orderly team is essential for the primary tasks of literature review, screening, and risk-of-bias evaluation by independent reviewers. If disagreements arise during the study-inclusion phase, the involvement of a third independent reviewer often becomes vital for resolution. The team’s composition should strive to include individuals with a variety of skills.
The intricacy of the research question and the expected number of references dictate the team’s size. The final team structure is decided after the definitive search, with the participation of independent reviewers dependent on the number of hits obtained. It is crucial to maintain a balance of expertise among team members to avoid undue influence from a specific group of experts. Importantly, a team requires a competent leader who may not necessarily be the most senior member or a professor. The leader plays a central role in coordinating the project, ensuring compliance with the study protocol, keeping all team members updated, and promoting their active involvement.
Establishing solid selection criteria is the foundational step in a systematic review. These criteria act as the guiding principles during the screening process, ensuring a focused approach that conserves time, reduces errors, and maintains transparency and reproducibility; they are a primary component of all systematic review protocols. Carefully designed to align with the research question, as in Table 1, the selection criteria cover a range of study characteristics, including design, publication date, and geographical location. Importantly, they incorporate details about the study population, exposure and outcome measures, and methodological approaches.
Concurrently, researchers must develop a comprehensive search strategy to retrieve eligible studies. A well-organized strategy using various terms and Boolean operators is typically required (Figure 1). It involves crafting specific search queries for different online databases, such as Embase, MEDLINE, Web of Science, and Google Scholar. These searches can include singulars and plurals of the terms, common misspellings, and related terms, among others. However, it is crucial to strike a balance, avoiding overly extensive searches that yield unnecessary results and insufficient searches that may miss relevant evidence. Collaborating with a librarian or search specialist improves the quality and reproducibility of the search; for this, it is important to understand the basic characteristics of the main databases (Table 2).
The team should also record in its methodology how the data will be collected and which tools will be used throughout the protocol, so that there is consensus among all members.
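When several databases are searched, overlapping hits must be removed before screening; a crude first-pass deduplication keyed on normalized title and year might look like this (the helper and records are hypothetical):

```python
def deduplicate(records):
    """Collapse records retrieved from several databases, keyed on a
    normalized (title, year) pair -- a crude but common first pass."""
    seen = {}
    for rec in records:
        key = (rec["title"].strip().lower(), rec["year"])
        seen.setdefault(key, rec)  # keep the first copy of each record
    return list(seen.values())

# Hypothetical hits from two databases, with one overlapping record
hits = [
    {"title": "Probiotics for eczema", "year": 2008, "source": "MEDLINE"},
    {"title": "probiotics for eczema ", "year": 2008, "source": "Embase"},
    {"title": "Silymarin and liver enzymes", "year": 2017, "source": "Embase"},
]
unique = deduplicate(hits)
print(f"{len(hits)} records -> {len(unique)} after deduplication")
```

In practice, reference managers also match on DOI or PMID, since titles can differ slightly between databases.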
Principal databases from which the main articles of the whole body of research can be gathered. The examples cover several specialties so that researchers have a variety of databases to work with.
NLM, National Library of Medicine; ICTRP, International Clinical Trials Registry Platform; LILACS, Literatura Latino-Americana e do Caribe em Ciências da Saúde
Database
Principal characteristics
PubMed [ , ]
A free search engine accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. It is maintained by the United States NLM at the National Institutes of Health
EMBASE [ ]
A biomedical and pharmacological database containing bibliographic records with citations, abstracts, and indexing derived from biomedical articles in peer-reviewed journals. It is especially strong in its coverage of drug and pharmaceutical research.
Cochrane [ ]
A database of systematic reviews. It includes reliable evidence from Cochrane and other systematic reviews of clinical trials.
Google Scholar [ ]
A freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines.
Web of Science [ ]
A research database used for citation analysis. It provides access to multiple databases including the Science Citation Index, the Social Sciences Citation Index, and the Arts and Humanities Citation Index.
Science Direct [ ]
A full-text scientific database offering journal articles and book chapters from more than 2,500 peer-reviewed journals and more than 11,000 books.
PsycINFO [ ]
An electronic bibliographic database providing abstracts and citations to the scholarly literature in the psychological, social, behavioral, and health sciences.
ICTRP [ ]
A database of clinical trials being conducted around the world. It is maintained by the World Health Organization.
Clinical Trials [ ]
A database of privately and publicly funded clinical studies conducted around the world. It is provided by the United States NLM.
LILACS [ ]
The LILACS is an online bibliographic database of scientific and medical publications maintained by the Latin American and Caribbean Center on Health Sciences Information.
Boolean operators help structure and narrow the search. "AND" narrows your search so you get fewer results: it tells the database that your search results must include every one of your search terms. "OR" means MORE results: it tells the database that you want results that mention one or both of your search terms. "NOT" tells the database that you want results related to the first term but not the second.
Image credits to authors of the articles (Created on www.canva.com )
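As a sketch, the Boolean logic above can be expressed programmatically. The topic, search terms, and helper function below are purely illustrative assumptions, not part of any real search strategy; actual queries should be tailored to each database, ideally with a librarian.

```python
# Illustrative only: building a Boolean search string for a hypothetical
# review question (population AND intervention AND outcome), where each
# concept is an OR-block of synonyms.

population = ["adults", "adult patients"]
intervention = ["metformin", "biguanides"]
outcome = ["cardiovascular outcomes", "myocardial infarction", "stroke"]

def or_block(terms):
    """Join synonyms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

# Concepts are combined with AND to require all of them.
query = " AND ".join(or_block(block) for block in (population, intervention, outcome))
print(query)
```

The same OR-blocks can usually be reused across databases, while the field tags and quoting rules change per database.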
Documenting and registering the protocol early in the research process is crucial for transparency and avoiding duplication. The protocol serves as recorded guidance, encompassing elements like the research question, eligibility criteria, intervention details, quality assessment, and the analysis plan. Before uploading to registry sites, such as PROSPERO, it is advisable to have the protocol reviewed by the principal investigator. The comprehensive study protocol outlines research objectives, design, inclusion/exclusion criteria, electronic search strategy, and analysis plan, providing a framework for reviewers during the screening process. These are steps previously established in our process. Registration can be done on platforms like PROSPERO 5 for health and social care reviews or Cochrane 3 for interventions.
Step 3: search
In the process of conducting a systematic review, a well-organized literature search is a pivotal step. It is suggested to incorporate at least two to four online databases, such as Embase, MEDLINE, Web of Science, and Cochrane. As mentioned earlier, formulating search strategies for each database is crucial due to their distinct requirements. In line with AMSTAR (A Measurement Tool to Assess Systematic Reviews) guidelines, a minimum of two databases should be explored in systematic reviews/meta-analyses (SR/MA), but increasing this number improves the accuracy of the results [ 22 ]. We advise including databases from China, as most reviews exclude databases from this region [ 9 ]. The choice of databases, like Cochrane or ICTRP, depends on the review question, especially in the case of clinical trials. These databases cater to various health-related aspects, and researchers should select them based on the research subject. Additionally, it is important to consider the unique search methods of each database, as some may not support Boolean operators or quotation marks. Detailed search strategies for each database, including customization based on specific attributes, are provided for guidance. In general, systematic reviews involve searching multiple databases and exploring additional sources, such as reference lists, clinical trial registries, and databases of non-indexed journals, to ensure a comprehensive review of both published and, in some instances, unpublished literature.
It is important to note that the extraction of information will also vary among databases. However, our goal is to obtain a RIS, BibTeX (.bib), CSV, or plain-text (.txt) file to import into any of the tools we will use in subsequent steps.
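As an illustration of what these export files contain, the following minimal sketch parses the simple `TAG  - value` line format used by RIS files. It assumes a well-formed export and handles only the basics; in practice, the screening tools listed in the next step import these formats for you.

```python
# Minimal sketch of an RIS parser. RIS lines look like "TI  - Some title",
# with a two-letter tag, the separator "  - ", and a value; "ER" ends a record.

def parse_ris(text):
    """Parse RIS-formatted text into a list of record dicts (tag -> values)."""
    records, current = [], {}
    for line in text.splitlines():
        if len(line) >= 5 and line[2:5] == "  -":
            tag, value = line[:2], line[6:].strip()
            if tag == "ER":                     # end-of-record marker
                records.append(current)
                current = {}
            else:
                current.setdefault(tag, []).append(value)
    return records

sample = """TY  - JOUR
TI  - Example title
AU  - Smith, J
PY  - 2020
ER  -
"""
print(parse_ris(sample))
```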
Step 4: tools
It is necessary to upload all our reference files into a predetermined tool like Rayyan, Covidence, EPPI, CADIMA, and DistillerSR for the collection and management of records (Table 3 ). The subsequent step entails the elimination of duplicates. Duplicates are recognized if they have the same title and author published in the same year, or the same title and author published in the same journal. Tools such as Rayyan or Covidence assist in automatically identifying duplicates. Removing duplicate records is vital for reducing the workload during the screening of titles and abstracts.
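The duplicate rule just described (same title and author in the same year, or same title and author in the same journal) can be sketched as follows. The record field names are hypothetical, and real tools apply fuzzier matching (e.g. near-identical titles) than this exact-key approach:

```python
# Sketch: removing duplicate references using the rule described above.
import re

def key(record, extra_field):
    """Normalized (title, author, extra) tuple for duplicate detection."""
    norm = lambda s: re.sub(r"[^a-z0-9]+", " ", str(s).lower()).strip()
    return (norm(record["title"]), norm(record["author"]), norm(record[extra_field]))

def deduplicate(records):
    seen, unique = set(), []
    for r in records:
        keys = {key(r, "year"), key(r, "journal")}
        if keys & seen:
            continue          # duplicate of an earlier record
        seen |= keys
        unique.append(r)
    return unique

refs = [
    {"title": "A Trial of X", "author": "Smith J", "year": 2020, "journal": "BMJ"},
    {"title": "A trial of X.", "author": "Smith, J", "year": 2020, "journal": "Lancet"},
]
print(len(deduplicate(refs)))  # 1: same normalized title/author/year
```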
The tools described above use artificial intelligence to suggest keywords based on the inclusion and exclusion criteria previously defined by the researchers. This helps reduce the time needed to rule studies in or out efficiently.
Tool
Description
Key Features
Usage
Cost
Duplicate removal
Article screening
Critical appraisal
Assist with reporting
Covidence [ ]
Web-based software for managing systematic review projects.
Streamlined screening and data extraction processes; collaboration features for team members; integration with reference management tools; real-time project tracking.
Systematic reviews and evidence synthesis projects.
Subscription-based, pricing varies.
Yes
Yes
Yes
Yes
Rayyan [ ]
A web application designed for systematic review screening and study selection.
User-friendly interface for importing, screening, and organizing studies; collaboration tools for multiple reviewers; supports a variety of file formats.
Screening and study selection in systematic reviews.
Free with limitations; Premium plans available.
No
Yes
No
Limited
EPPI-Reviewer [ ]
Software for managing the review process, with a focus on systematic reviews and other forms of evidence synthesis.
Comprehensive data extraction and synthesis capabilities; customizable review processes; integration with reference management tools.
Systematic reviews, evidence synthesis, and meta-analysis.
Subscription-based, pricing varies.
Yes
Yes
Yes
Yes
CADIMA [ ]
A web-based systematic review software platform.
Customizable review workflow; collaboration tools for team members; integrated data extraction and synthesis features; real-time project tracking.
Systematic reviews and evidence synthesis projects.
Subscription-based, pricing varies.
Yes
Yes
Yes
Limited
DistillerSR [ ]
Online systematic review software for data extraction and synthesis.
Streamlined data extraction and synthesis tools; collaboration features for team members; real-time progress tracking; integration with reference management tools.
Systematic reviews and evidence synthesis projects.
Subscription-based, pricing varies.
Yes
Yes
Yes
Yes
Step 5: title and abstract screening
The process of a systematic review encompasses several steps, which include screening titles and abstracts and applying selection criteria. During the phase of title and abstract screening, a minimum of two reviewers independently evaluate the pertinence of each reference. Tools like Rayyan, Covidence, and DistillerSR are suggested for this phase due to their effectiveness. The decisions to further assess retrieved articles are made based on the selection criteria. It is recommended to involve at least three reviewers to minimize the likelihood of errors and resolve disagreements.
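The dual-review logic described above can be sketched as a simple reconciliation step: decisions on which both reviewers agree are accepted, and disagreements are routed to a third reviewer. The reference IDs and decisions below are invented for illustration.

```python
# Sketch: reconciling title/abstract screening decisions from two independent
# reviewers; conflicts go to a third reviewer, as described above.

decisions = {
    "ref1": ("include", "include"),
    "ref2": ("include", "exclude"),   # disagreement -> third reviewer
    "ref3": ("exclude", "exclude"),
}

agreed, conflicts = {}, []
for ref, (reviewer1, reviewer2) in decisions.items():
    if reviewer1 == reviewer2:
        agreed[ref] = reviewer1
    else:
        conflicts.append(ref)

print(agreed)      # {'ref1': 'include', 'ref3': 'exclude'}
print(conflicts)   # ['ref2']
```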
In the following stages of the systematic review process, the focus is on acquiring full-text articles. Numerous search engines provide links for free access to full-text articles, and in situations where this is not feasible, alternative routes such as ResearchGate are pursued for direct requests from authors. Additionally, a manual search is carried out to decrease bias, using methods like searching references from included studies, reaching out to authors and experts, and exploring related articles in PubMed and Google Scholar. This manual search is vital for identifying reports that might have been initially overlooked. The approach involves independent reviewing by assigning specific methods to each team member, with the results gathered for comparison, discussion, and minimizing bias.
Step 6: full-text screening
The second phase in the screening process is full-text screening. This involves a thorough examination of the study reports that were selected after the title and abstract screening stage. To prevent bias, it is essential that three individuals participate in the full-text screening. Two individuals will scrutinize the entire text to ensure that the initial research question is being addressed and that none of the previously determined exclusion criteria are present in the articles. They have the option to "include" or "exclude" an article. If an article is "excluded," the reviewer must provide a justification for its exclusion. The third reviewer is responsible for resolving any disagreements, which could arise if one reviewer "excludes" an article that another reviewer "includes." The articles that are "included" will be used in the systematic review.
The process of seeking additional references following the full-text screening in a systematic review involves identifying other potentially relevant studies that were not found in the initial literature search. This can be achieved by reviewing the reference lists of the studies that were included after the full-text screening. This step is crucial as it can help uncover additional studies that are relevant to your research question but might have been overlooked in the initial database search due to variations in keywords, indexing terms, or other factors [ 15 ].
A PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) chart, also referred to as a PRISMA flow diagram, is a visual tool that illustrates the steps involved in an SR/MA. These steps encompass the identification, screening, evaluation of eligibility, and inclusion of studies.
The PRISMA diagram provides a detailed overview of the information flow during the various stages of an SR/MA. It displays the count of records that were identified, included, and excluded, along with the reasons for any exclusions.
The typical stages represented on a PRISMA chart are as follows: 1) identification: this is where records are discovered through database searches. 2) screening: this stage involves going through the records after removing any duplicates. 3) eligibility: at this stage, full-text articles are evaluated for their suitability. 4) included: this refers to the studies that are incorporated into the qualitative and quantitative synthesis. The PRISMA chart serves as a valuable tool for researchers and readers alike, aiding in understanding the process of study selection in the review and the reasons for the exclusion of certain studies. It is usually the initial figure presented in the results section of your systematic review [ 4 ].
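As a small worked example, the counts reported on a PRISMA flow diagram follow simple arithmetic: each stage equals the previous stage minus its exclusions. The numbers below are invented for illustration only.

```python
# Sketch: tallying the counts shown at each stage of a PRISMA flow diagram.
# All numbers are hypothetical.

identified = {"databases": 1250, "registers": 40}   # stage 1: identification
duplicates_removed = 310

screened = sum(identified.values()) - duplicates_removed          # stage 2
excluded_title_abstract = 860

full_text_assessed = screened - excluded_title_abstract           # stage 3
full_text_excluded = {"wrong population": 50, "wrong outcome": 35,
                      "full text unavailable": 15}

included = full_text_assessed - sum(full_text_excluded.values())  # stage 4

print(f"Screened: {screened}, assessed full text: {full_text_assessed}, "
      f"included: {included}")
```

Keeping these counts as computed values rather than hand-typed numbers makes it easy to check that the diagram is internally consistent.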
Step 7: data extraction
As the systematic review advances, the subsequent crucial steps involve data extraction from the studies included. This process involves a structured data extraction from the full texts included, guided by a pilot-tested Excel sheet, which aids two independent reviewers in meticulously extracting detailed information from each article [ 28 ]. This thorough process offers an initial comprehension of the common characteristics within the evidence body and sets the foundation for the following analytical and interpretive synthesis. The participation of two to three independent reviewers ensures a holistic approach, including the extraction of both adjusted and non-adjusted data to account for potential confounding factors in future analyses. Moreover, numerical data extracted, such as dichotomous or continuous data in intervention reviews or information on true and false results in diagnostic test reviews, undergoes a thorough process. The extracted data might be suitable for pooled analysis, depending on sufficiency and compatibility. Difficulties in harmonizing data formats might occur, and systematic review authors might resort to communication with study authors to resolve these issues and enhance the robustness of the synthesis. This multi-dimensional data extraction process ensures a comprehensive and nuanced understanding of the included studies, paving the way for the subsequent analysis and synthesis phases.
Step 8: risk of bias assessment
To conduct a risk of bias assessment in medical research, it is crucial to adhere to a specific sequence: choose tools that are specifically designed for systematic reviews. These tools should have proven acceptable validity and reliability, specifically address items related to methodological quality (internal validity), and ideally be based on empirical evidence of bias [ 29 ]. These tools should be chosen once the full texts are obtained. For easy organization, it helps to compile a list of the retrieved articles and note each study's design, since the appropriate tool depends on the type of study. The most common tools to evaluate the risk of bias can be found in Table 4 .
The table summarizes some of the different tools to appraise the different types of studies and their main characteristics.
ROB, risk of bias; RRB, risk of reporting bias; AMSTAR, A Measurement Tool to Assess Systematic Reviews; GRADE, Grading of Recommendations Assessment, Development and Evaluations; ROBINS, risk of bias in non-randomized studies; RCT, randomized controlled trials
Tool
Description of the appraisal studies
Cochrane RoB2 Tool [ ]
Widely used in both Cochrane and other systematic reviews. It replaces the notion of assessing study quality with that of assessing the risk of bias (RoB). The RoB 2 tool considers biases arising at different stages of a trial (randomization process, deviation from intended intervention, missing outcome data, measurement of the outcome, and selection of the reported result). It assesses RCTs individually and in clusters; it also assesses crossover RCTs and cluster RCTs.
AHQR RRB [ ]
Evaluates the risk of reporting bias and outcome reporting bias in a systematic review
AMSTAR 2 [ ]
Assesses the methodological quality of systematic reviews, including both randomized and non-randomized studies of healthcare interventions. Useful in the context of real-world observational evidence.
Evaluates case-control studies and assesses the quality of non-randomized studies. Useful in evaluating the methodological quality of case-control studies. It provides a semi-quantitative measure of study quality that can be used to inform the interpretation of findings in a systematic review.
GRADE [ ]
It is used to assess the quality of evidence and the strength of recommendations in healthcare
ROBINS [ ]
Tool used to assess the risk of bias in non-randomized studies. There are two versions of this tool (ROBINS-I and ROBINS-E). ROBINS-I assesses the risk of bias in the results of non-randomized studies that compare the health effects of two or more interventions; it evaluates estimates of the effectiveness or safety (benefit or harm) of an intervention from studies that did not use randomization to allocate interventions. ROBINS-E provides a structured approach to assessing the risk of bias in observational epidemiological studies, designed primarily for use in the context of a systematic review; it evaluates the effects of exposures (including environmental, occupational, and behavioral exposures) on human health. Both tools share many characteristics with the RoB 2 tool. They are structured into a fixed set of bias domains, with signaling questions that inform the risk of bias judgment within each domain and an overall risk of bias judgment. The seven domains of bias addressed are confounding, selection of participants, classification of interventions, deviations from intended interventions, missing data, measurement of outcomes, and selection of reported results. After completing all seven bias domains, an overall judgment is made for the study.
After choosing the suitable tool for the type of study, you should know that a good risk of bias assessment should be transparent and easily replicable. This requires the review protocol to include clear definitions of the biases that will be evaluated [ 30 ].
The subsequent step in determining the risk of bias is to understand its different categories. The assessment should explicitly address the risk of selection, performance, attrition, detection, and selective outcome reporting biases. It allows for separate risk of bias ratings by outcome, to account for outcome-specific variations in detection bias and selective outcome reporting bias.
Keep in mind that assessing the risk of bias based on study design and conduct rather than reporting is very important. Poorly reported studies may be judged as unclear risk of bias. Avoid presenting the risk of bias assessment as a composite score. Finally, classifying the risk of bias as "low," "medium," or "high" is a more practical way to proceed. Methods for determining an overall categorization for the study limitations should be established a priori and documented clearly.
To summarize, the purpose of the risk of bias assessment is to evaluate the internal validity of the studies included in the systematic review. This process helps to ensure that the conclusions drawn from the review are based on high-quality, reliable evidence.
Step 9: synthesis
This step can be broken down to simplify the concept of conducting a descriptive synthesis of a systematic review. 1) inclusion of studies: the final count of primary studies included in the review is established based on the screening process. 2) flowchart: the systematic review process flow is summarized in a flowchart. This includes the number of references discovered, the number of abstracts and full texts screened, and the final count of primary studies included. 3) study description: the characteristics of the included studies are detailed in a table in the main body of the manuscript. This includes the populations studied, types of exposures, intervention details, and outcomes. 4) results: if a meta-analysis is not possible, the results of the included studies are described. This includes the direction and magnitude of the effect, consistency of the effect across studies, and the strength of evidence for the effect. 5) reporting bias check: reporting bias is a systematic error that can influence the results of a systematic review. It happens when the nature and direction of the results affect the dissemination of research findings. Checking for this bias is an important part of the review process. 6) result verification: the results of the included studies should be verified for accuracy and consistency [ 36 , 37 ]. The descriptive synthesis primarily relies on words and text to summarize and explain the findings, necessitating careful planning and meticulous execution.
Step 10: manuscript
When working on a systematic review and meta-analysis for submission, it is essential to keep the bibliographic database search current if more than six to 12 months have passed since the initial search to capture newly published articles. Guidelines like PRISMA and MOOSE provide flowcharts that visually depict the reporting process for systematic reviews and meta-analyses, promoting transparency, reproducibility, and comparability across studies [ 4 , 38 ]. The submission process requires a comprehensive PRISMA or MOOSE report with these flowcharts. Moreover, consulting with subject matter experts can improve the manuscript, and their contributions should be recognized in the final publication. A last review of the results' interpretation is suggested to further enhance the quality of the publication.
The composition process is organized into four main scientific sections: introduction, methods, results, and discussion, typically ending with a concluding section. After the manuscript, characteristics table, and PRISMA flow diagram are finalized, the team should forward the work to the principal investigator (PI) for comprehensive review and feedback. Finally, choosing an appropriate journal for the manuscript is vital, taking into account factors like impact factor and relevance to the discipline. Adherence to the author guidelines of journals is crucial before submitting the manuscript for publication.
The report emphasizes the increasing recognition of evidence-based healthcare, underscoring the integration of research evidence. The acknowledgment of the necessity for systematic reviews to consolidate and interpret extensive primary research aligns with the current emphasis on minimizing bias in evidence synthesis. The report highlights the role of systematic reviews in reducing systematic errors and enabling objective and transparent healthcare decisions. The detailed 10-step guide for conducting systematic reviews provides valuable insights for both experienced and novice researchers. The report emphasizes the importance of formulating precise research questions and suggests the use of tools for structuring questions in evidence-based clinical practice.
The validation of ideas through preliminary investigations is underscored, demonstrating a thorough approach to prevent redundancy in research efforts. The report provides a practical example of how an initial exploration of PubMed helped identify an existing systematic review, highlighting the importance of avoiding duplication. The systematic and well-coordinated team approach in the establishment of selection criteria, development of search strategies, and an organized methodology is evident. The detailed discussion on each step, such as data extraction, risk of bias assessment, and the importance of a descriptive synthesis, reflects a commitment to methodological rigor.
Conclusions
The systematic review process is a rigorous and methodical approach to synthesizing and evaluating existing research on a specific topic. The 10 steps we followed, from defining the research question to interpreting the results, ensured a comprehensive and unbiased review of the available literature. This process allowed us to identify key findings, recognize gaps in the current knowledge, and suggest areas for future research. Our work contributes to the evidence base in our field and can guide clinical decision-making and policy development. However, it is important to remember that systematic reviews are dependent on the quality of the original studies. Therefore, continual efforts to improve the design, reporting, and transparency of primary research are crucial.
The authors have declared that no competing interests exist.
Author Contributions
Concept and design: Ernesto Calderon Martinez, Jennifer V. Castillo, Julio E. Morin Jimenez, Jaqueline L. Castillo, Edna Diarte
Acquisition, analysis, or interpretation of data: Ernesto Calderon Martinez, Ronald M. Blanco Montecino , Jose R. Flores Valdés, David Arriaga Escamilla, Edna Diarte
Drafting of the manuscript: Ernesto Calderon Martinez, Julio E. Morin Jimenez, Ronald M. Blanco Montecino , Jaqueline L. Castillo, David Arriaga Escamilla
Critical review of the manuscript for important intellectual content: Ernesto Calderon Martinez, Jennifer V. Castillo, Jose R. Flores Valdés, Edna Diarte
Supervision: Ernesto Calderon Martinez
Human Ethics
Consent was obtained or waived by all participants in this study
Animal Ethics
Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue.
How to Do a Systematic Review: A Best Practice Guide for Conducting and Reporting Narrative Reviews, Meta-Analyses, and Meta-Syntheses
Affiliations.
1 Behavioural Science Centre, Stirling Management School, University of Stirling, Stirling FK9 4LA, United Kingdom; email: [email protected].
2 Department of Psychological and Behavioural Science, London School of Economics and Political Science, London WC2A 2AE, United Kingdom.
3 Department of Statistics, Northwestern University, Evanston, Illinois 60208, USA; email: [email protected].
PMID: 30089228
DOI: 10.1146/annurev-psych-010418-102803
Systematic reviews are characterized by a methodical and replicable methodology and presentation. They involve a comprehensive search to locate all relevant published and unpublished work on a subject; a systematic integration of search results; and a critique of the extent, nature, and quality of evidence in relation to a particular research question. The best reviews synthesize studies to draw broad theoretical conclusions about what a literature means, linking theory to evidence and evidence to theory. This guide describes how to plan, conduct, organize, and present a systematic review of quantitative (meta-analysis) or qualitative (narrative review, meta-synthesis) information. We outline core standards and principles and describe commonly encountered problems. Although this guide targets psychological scientists, its high level of abstraction makes it potentially relevant to any subject area or discipline. We argue that systematic reviews are a key methodology for clarifying whether and how research findings replicate and for explaining possible inconsistencies, and we call for researchers to conduct systematic reviews to help elucidate whether there is a replication crisis.
Mayo Clinic Libraries
Evidence Synthesis Guide
Develop & Refine Your Research Question
A clear, well-defined, and answerable research question is essential for any systematic review, meta-analysis, or other form of evidence synthesis. The question must be answerable. Spend time refining your research question.
Focused question frameworks.
The PICO mnemonic is frequently used for framing quantitative clinical research questions. 1
Patient or problem being addressed
Intervention or exposure being studied
Comparison intervention or exposure
Clinical Outcome
The PEO acronym is appropriate for studies of diagnostic accuracy 2
Patient
Exposure (the test that is being evaluated)
Outcome
The SPICE framework is effective “for formulating questions about qualitative or improvement research.” 3
Setting of your project
Population being studied
Intervention (drug, therapy, improvement program)
Comparison
Evaluation (how were outcomes evaluated?)
The SPIDER search strategy was designed for framing questions best answered by qualitative and mixed-methods research. 4
Sample: what groups are of interest?
Phenomenon of Interest: what behaviors, decisions, or experience do you want to study?
Design: are you applying a theoretical framework or specific research method?
Evaluation: how were outcomes evaluated and measured?
Research type: qualitative or mixed-methods?
References & Recommended Reading
1. Anastasiadis E, Rajan P, Winchester CL. Framing a research question: The first and most vital step in planning research. Journal of Clinical Urology. 2015;8(6):409-411.
2. Speckman RA, Friedly JL. Asking Structured, Answerable Clinical Questions Using the Population, Intervention/Comparator, Outcome (PICO) Framework. PM&R. 2019;11(5):548-553.
3. Knowledge Into Action Toolkit. NHS Scotland. http://www.knowledge.scot.nhs.uk/k2atoolkit/source/identify-what-you-need-to-know/spice.aspx . Accessed April 23, 2021.
4. Cooke A, Smith D, Booth A. Beyond PICO: the SPIDER tool for qualitative evidence synthesis. Qualitative health research. 2012;22(10):1435-1443.
Types of questions: PICO, SPICE, SPIDER and ECLIPSE frameworks
A systematic review is an in-depth attempt to answer a specific, focused question in a methodical way.
Start with a clearly defined, researchable question that accurately and succinctly sums up the review's line of inquiry.
A well-formulated review question will guide your inclusion and exclusion criteria, the creation of your search strategy, the collection of data, and the presentation of your findings.
It is important to ensure the question:
relates to what you really need to know about your topic
is answerable, specific and focused
strikes a suitable balance between being too broad and too narrow in scope
has been formulated with care so as to avoid missing relevant studies or collecting a potentially biased result set
Is the research question justified?
Do healthcare providers, consumers, researchers, and policy makers need this evidence for their healthcare decisions?
Is there a gap in the current literature? The question should be worthy of an answer.
Has a similar review been done before?
Question types
To help focus the question and determine the most appropriate type of evidence, consider the type of question. Is there a study design (e.g. randomised controlled trial, meta-analysis) that would provide the best answer?
Will your research question focus on:
Diagnosis : How to select and interpret diagnostic tests
Intervention/Therapy : How to select treatments to offer patients that do more good than harm and that are worth the efforts and costs of using them
Prediction/Prognosis : How to estimate the patient’s likely clinical course over time and anticipate likely complications of disease
Exploration/Etiology : How to identify causes for disease, including genetics
If appropriate, use a framework to help in the development of your research question. A framework will assist in identifying the important concepts in your question.
A good question will combine several concepts. Identifying the relevant concepts is crucial to successful development and execution of your systematic search. Your research question should provide you with a checklist for the main concepts to be included in your search strategy.
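The idea of treating your question's main concepts as a checklist for the search strategy can be sketched programmatically. The snippet below is a minimal illustration only (the concepts, synonyms, and function name are invented for this example, not part of any database's API): each framework concept becomes an OR-group of synonyms, and the groups are AND-ed together into a single Boolean search string.

```python
# Minimal sketch: turn a concept checklist into a Boolean search string.
# The concepts and synonyms below are illustrative, not a real strategy.

def build_search_string(concepts: dict[str, list[str]]) -> str:
    """OR together the synonyms for each concept, then AND the groups."""
    groups = []
    for synonyms in concepts.values():
        # Quote multi-word phrases so they are searched as phrases.
        quoted = [f'"{term}"' if " " in term else term for term in synonyms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)

pico_concepts = {
    "Population": ["children", "child", "paediatric"],
    "Intervention": ["talk therapy", "psychotherapy"],
    "Outcome": ["ADHD symptoms", "attention"],
}

print(build_search_string(pico_concepts))
# (children OR child OR paediatric) AND ("talk therapy" OR psychotherapy) AND ("ADHD symptoms" OR attention)
```

In practice each database has its own field tags and truncation syntax, so a string like this is only a starting point for translation into each database's own search language.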
Using a framework to aid in the development of a research question can be useful. The more you understand your question the more likely you are to obtain relevant results for your review. There are a number of different frameworks available.
A technique often used in research for formulating a clinical research question is the PICO model. PICO is explored in more detail in this guide. Slightly different versions of this concept are used to search for quantitative and qualitative reviews.
Is there a specific population you need to focus on?
Describe the most important characteristics of the patient, population or problem.
What treatment or changes are you looking to explore?
What do you want to do with this patient?
What factor may influence the prognosis of the patient?
Is there a comparison treatment to be considered?
The comparison may be with another medication, another form of treatment, or no treatment at all.
Your clinical question does not always have to include a specific comparison. Include one if you are comparing multiple interventions, or compare against no intervention (e.g. placebo) if you are testing an intervention against no treatment.
What are you trying to accomplish, measure, improve or affect?
What are you trying to do for the patient? Relieve or eliminate the symptoms? Reduce the number of adverse events? Improve function or test scores?
What results will you consider to determine if, or how well, the intervention is working?
For qualitative reviews:
PICo = Population or Problem, Interest, Context
Population or Problem
Interest
Context
What are the characteristics of the Population or the Patient?
What is the Problem, condition or disease you are interested in?
Interest relates to a defined event, activity, experience or process
Context is the setting or distinct characteristics
For qualitative evidence:
SPICE = Setting, Perspective, Intervention or Exposure or Interest, Comparison, Evaluation
Setting
Perspective
Intervention, Exposure or Interest
Comparison
Evaluation
Setting is the context for the question
Perspective is the users, potential users, or stakeholders of the service
Intervention is the action taken for the users, potential users, or stakeholders
Comparison is the alternative actions or outcomes
Evaluation is the result or measurement that will determine the success of the intervention
Booth, A. (2006). Clear and present questions: Formulating questions for evidence based practice. Library hi tech, 24(3), 355-368.
SPIDER = Sample, Phenomenon of Interest, Design, Evaluation, Research Type
Sample
Phenomenon of Interest
Design
Evaluation
Research Type
Sample sizes may vary between qualitative and quantitative studies
Phenomena of Interest include behaviours, experiences and interventions
Design influences the strength of the study analysis and findings
Evaluation outcomes may include more subjective outcomes - such as views, attitudes, etc.
Research types include qualitative, quantitative or mixed method studies
Cooke, A., Smith, D., & Booth, A. (2012). Beyond PICO: The SPIDER tool for qualitative evidence synthesis. Qualitative Health Research, 22(10), 1435-1443.
ECLIPSE = Expectation, Client, Location, Impact, Professionals, Service
Expectation
Client
Location
Impact
Professionals
Service
Improvement or information or innovation
At whom the service is aimed
Where is the service located?
Outcomes
Who is involved in providing/improving the service?
For which service are you looking for information?
Wildridge, V., & Bell, L. (2002). How CLIP became ECLIPSE: A mnemonic to assist in searching for health policy/management information. Health Information & Libraries Journal, 19(2), 113-115.
Last Updated: Aug 13, 2024 11:30 AM
URL: https://rmit.libguides.com/systematicreviews
Systematic Reviews
Developing A Research Question
There are several different methods researchers might use in developing a research question. The best method to use depends on the discipline and nature of the research you hope to review. Consider the following example question templates.
Variations to PICO
Using PICO can help you define and narrow your research question so that it is specific.
P - Patient, population, or problem
I - Intervention
C - Comparison or Control
O - Outcome
Think about whether your question is relevant to practitioners, and whether the answer will help people (doctors, patients, nurses) make better informed health care decisions.
The PICO method is used frequently, though there are some variations that exist to add other specifications to studies collected. Some variations include PICOSS, PICOT, and PICOC.
In addition to the fundamental components of PICO, PICOSS adds criteria for study design (S) and setting (S).
In PICOT, the T represents timeframe. This variant can be used to narrow down the length of treatment or intervention in health research.
In PICOC, used in research where there may not be a comparison, Co instead denotes the context of the population and intervention being studied.
Using SPIDER can help you define and narrow your research question so that it is specific. This is typically used in qualitative research (Cooke, Smith, & Booth, 2012).
S - Sample
PI - Phenomenon of Interest
D - Design
E - Evaluation
R - Research type
Yet another search framework relating to Evidence-Based Practice (EBP) is SPICE. This framework builds on PICO by considering two additional axes: perspective and setting (Booth, 2006).
S - Setting
P - Perspective
I - Intervention
C - Comparison
E - Evaluation
Inclusion and Exclusion Criteria
Setting inclusion and exclusion criteria is a critical step in the systematic review process.
Inclusion criteria determine what characteristics are needed for a study to be included in a systematic review.
Exclusion criteria denote what attributes disqualify a study from consideration in a systematic review.
Knowing what to exclude or include helps speed up the review process.
These criteria will be used at different parts of the review process, including in search statements and the screening process.
Has this review been done?
After developing the research question, it is necessary to confirm that the review has not previously been conducted (or is currently in progress).
Make sure to check for both published reviews and registered protocols (to see if the review is in progress). Do a thorough search of appropriate databases; if additional help is needed, consult a librarian for suggestions.
Formulating a strong research question for a systematic review can be a lengthy process. While you may have an idea about the topic you want to explore, your specific research question is what will drive your review and requires some consideration.
You will want to conduct preliminary or exploratory searches of the literature as you refine your question. In these searches you will want to:
Determine if a systematic review has already been conducted on your topic and if so, how yours might be different, or how you might shift or narrow your anticipated focus.
Scope the literature to determine if there is enough literature on your topic to conduct a systematic review.
Identify key concepts and terminology.
Identify seminal or landmark studies.
Identify key studies that you can test your search strategy against (more on that later).
Begin to identify databases that might be useful to your search question.
Types of Research Questions for Systematic Reviews
A narrow and specific research question is required in order to conduct a systematic review. The goal of a systematic review is to provide an evidence synthesis of ALL research performed on one particular topic. Your research question should be clearly answerable from the studies included in your review.
Another consideration is whether the question has been answered enough to warrant a systematic review. If there have been very few studies, there won't be enough qualitative and/or quantitative data to synthesize. You then have to adjust your question... widen the population, broaden the topic, reconsider your inclusion and exclusion criteria, etc.
When developing your question, it can be helpful to consider the FINER criteria (Feasible, Interesting, Novel, Ethical, and Relevant). Read more about the FINER criteria on the Elsevier blog.
If you have a broader question or aren't certain that your question has been answered enough in the literature, you may be better served by pursuing a systematic map, also known as a scoping review. Scoping reviews are conducted to give a broad overview of a topic, to review the scope and themes of the prior research, and to identify the gaps and areas for future research.
What is the effectiveness of talk therapy in treating ADHD in children?
Systematic Review
What treatments are available for treating children with ADHD?
Systematic Map/Scoping Review
Are animal-assisted therapies as effective as traditional cognitive behavioral therapies in treating people with depressive disorders?
Systematic Review
CEE Example Questions: the Collaboration for Environmental Evidence Guidelines contain Table 2.2, outlining answers sought and example questions in environmental management.
Learn More . . .
Cochrane Handbook Chapter 2 - Determining the scope of the review and the questions it will address
Frameworks for Developing your Research Question
PICO: Patient/Population, Intervention, Comparison, Outcome
PEO: Population, Exposure, Outcomes
SPIDER: Sample, Phenomenon of Interest, Design, Evaluation, Research Type
For more frameworks and guidance on developing the research question, check out:
1. Advanced Literature Search and Systematic Reviews: Selecting a Framework. City University of London Library
2. Select the Appropriate Framework for your Question. Tab "1-1" from PIECES: A guide to developing, conducting, & reporting reviews [Excel workbook]. Margaret J. Foster, Texas A&M University. CC-BY-3.0 license.
3. Formulating a Research Question. University College London Library. Systematic Reviews .
I = Identify: search for studies that match your criteria
E = Evaluate: exclude or include studies
C = Collect: extract and synthesize key data
E = Explain: give context and rate the strength of the studies
S = Summarize: write and publish your final report
Biomedical & Public Health Reviews
Congratulations!
You've decided to conduct a Systematic Review! Please see the associated steps below. You can follow the P-I-E-C-E-S = Plan, Identify, Evaluate, Collect, Explain, Summarize system or any number of systematic review processes available (Foster & Jewell, 2017) .
P = Plan: decide on your search methods
Determine your Research Question
By now you should have identified gaps in the field and have a specific question you are seeking to answer. This will likely have taken several iterations and is the most important part of the Systematic Review process.
Identify Relevant Systematic Reviews
Once you've finalized a research question, you should be able to locate existing systematic reviews on or similar to your topic. Existing systematic reviews will be your clues to mine for keywords and sample searches in various databases, and will help your team finalize your review question and develop your inclusion and exclusion criteria.
Decide on a Protocol and Reporting Standard
Your protocol is essentially a project plan and data management strategy for an objective, reproducible, sound methodology for peer review. The reporting standard or guidelines are not protocols, but rather a set of standards, often including checklists, to guide the development of your systematic review. Following a reporting standard is not required, but is highly recommended.
Protocol registry: Reviewing existing systematic reviews and registering your protocol will increase transparency, minimize bias, and reduce the redundancy of groups working on the same topics ( PLoS Medicine Editors, 2011 ). Protocols can serve as internal or external documents. Protocols can be made public prior to the review. Some registries allow for keeping a protocol private for a set period of time.
Cochrane Database of Systematic Reviews (UGA Login) (Health Sciences)
A collection of regularly updated, systematic reviews of the effects of health care. New reviews are added with each issue of The Cochrane Library. Reviews are mainly of randomized controlled trials. All reviews have protocols.
PROSPERO (General)
This is an international register of systematic reviews and is public.
Campbell Collaboration (Education & Social Sciences)
Topics covered include Ageing; Business and Management; Climate Solutions; Crime and Justice; Disability; Education; International Development; Knowledge Translation and Implementation; Methods; Nutrition and Food Security; Sexual Orientation and Gender Identity; Social Welfare; and Training.
Systematic Review for Animals and Food (Vet Med & Animal Science)
Reporting Standards:
Campbell MECCIR Standards (Education & Social Sciences)
Cochrane Guides & Handbooks (Health & Medical Sciences)
Institute of Medicine of the National Academies: Finding What Works in Healthcare: Standards for Systematic Reviews (healthcare)
PRISMA for Systematic Review Protocols (General)
PRISMA Checklist (General)
PRISMA for Scoping Reviews (General)
Decide on Databases and Grey Literature for Systematic Review Research
Because the purpose of an SR is to find all studies related to your research question, you will need to search multiple databases. You should be able to name the databases you are already familiar with. Your librarian will be able to recommend additional databases, including some of the following:
PubMed (Health Sciences)
Web of Science
Cochrane Database (Biomedical)
National and Regional Databases (i.e. WHO LILACS scientific health information from Latin America and the Caribbean countries)
CINAHL (Health Sciences)
PsycINFO (Psychology)
Depending on your topic, you may want to search clinical trials and grey literature. See this guide for more on grey literature.
Develop Keywords and Write a Search Strategy
Go here for help with writing your search strategy
Translate Search Strategies
Each database you use will have different search methods and syntax, which affect the resulting search strings. Ideally, you will create one master keyword list and translate it for each database. Below are tools to assist with translating search strings.
Includes syntax for Cochrane Library, EBSCO, ProQuest, Ovid, and POPLINE.
The IEBH SR-Accelerator is a suite of tools to assist in speeding up portions of the Systematic Review process, including the Polyglot tool which translates searches across databases.
University of Michigan Search 101 - SR Database Cheat Sheet
Storing, Screening and Full-Text Screening of Your Citations
Because systematic review literature searches may produce thousands of citations and abstracts, the research team will be screening and systematically reviewing large amounts of results. During screening , you will remove duplicates and remove studies that are not relevant to your topic based on a review of titles and abstracts. Of what remains, the full-text screening of the studies will then need to be conducted to confirm that they fit within your inclusion/exclusion criteria.
The results of the literature review and screening processes are best managed by various tools and software. You can also use a simple form or table to log the relevant information from each study. Consider whether you will be coding your data during the extraction process in your decision on which tool or software to use. Your librarian can consult on which of these is best suited to your research needs.
EndNote Guide (UGA supported citation tracking software) - great for storing, organizing, and de-duplication
RefWorks Guide (UGA supported citation tracking software) - great for storing, organizing, and de-duplication
Rayyan (free service) - great for initial title/abstract screening OR full-text screening, as it cannot differentiate between the two; not ideal for de-duplication
Covidence (requires a subscription) - full suite of systematic review tools including meta-analysis
Combining Software (EndNote, Google Forms, Excel)
Forms such as Qualtrics (UGA EITS software) can record who the coder is and generate charts and tables; useful when your review includes multiple types of studies
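As a rough illustration of what de-duplication features in these tools do, the sketch below (hypothetical record fields and logic, not any tool's actual algorithm) keeps the first record seen for each DOI, falling back to a normalised title when the DOI is missing.

```python
import re

def dedupe(records: list[dict]) -> list[dict]:
    """Keep the first record per DOI; fall back to a normalised title key."""
    seen, unique = set(), []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            key = ("doi", doi)
        else:
            # Normalise the title: lowercase, strip punctuation and extra spaces.
            title = re.sub(r"[^a-z0-9 ]", "", rec.get("title", "").lower())
            key = ("title", " ".join(title.split()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

# Illustrative records: the second has the same title but no DOI, so a
# DOI-first strategy cannot see that it duplicates the first.
citations = [
    {"title": "Sleep and mental health", "doi": "10.1093/sleepadvances/zpae028"},
    {"title": "Sleep and Mental Health!", "doi": ""},
    {"title": "Sleep and mental health", "doi": "10.1093/SLEEPADVANCES/ZPAE028"},
]
print(len(dedupe(citations)))  # 2: only the exact-DOI duplicate is removed
```

The example also shows why automated de-duplication is imperfect: records missing identifiers can slip through, which is why a manual check of the deduplicated set is still recommended.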
Data Extraction
Data extraction processes differ between qualitative and quantitative evidence syntheses. In both cases, you must provide the reader with a clear overview of the studies you have included, their similarities and differences, and the findings. Extraction should be done in accordance with pre-established guidelines, such as PRISMA.
Some systematic reviews contain a meta-analysis of the quantitative findings of the results. Consider including a statistician on your team to complete the analysis of all individual study results. Meta-analysis estimates the overall effect across the studies, along with a measure of variance, and the results are typically displayed in a forest plot.
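For a sense of the arithmetic a meta-analysis involves, the sketch below computes a fixed-effect (inverse-variance) pooled estimate from fabricated effect sizes. This is only the simplest pooling model; a real analysis would also assess heterogeneity and often use a random-effects model, which is one reason to involve a statistician.

```python
import math

def pooled_effect(effects, std_errors):
    """Fixed-effect inverse-variance pooling: weight each study by 1/SE^2."""
    weights = [1 / se**2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    # 95% confidence interval for the pooled estimate
    return pooled, (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)

# Fabricated study results for illustration: mean difference and standard error
effects = [0.30, 0.10, 0.25]
ses = [0.10, 0.05, 0.20]
est, (lo, hi) = pooled_effect(effects, ses)
print(f"pooled estimate {est:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
# pooled estimate 0.145, 95% CI (0.060, 0.231)
```

Note how the most precise study (smallest standard error) dominates the pooled estimate; a forest plot visualises exactly these per-study estimates, weights, and the pooled result.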
Systematic review pricing models have changed over the years. Previously, you had to depend on departmental access to software that could cost several hundred dollars. Now that the software is cloud-based, tiered payment systems are available. Sometimes there is a free tier, but costs go up with functionality, number of users, or both. Depending on the organization's model, payments may be monthly, annual, or per project/review.
Always check your departmental resources before making a purchase.
View all training videos and other resources before starting your project.
If your access is limited to a specific amount of time, wait to purchase until the appropriate work stage.
Software list
Tool created by Brown University to assist with screening for systematic reviews.
Cochrane's RevMan
Review Manager (RevMan) is the software used for preparing and maintaining Cochrane Reviews.
Systematic review tool intended to assist with the screening and extraction process. (Requires subscription)
Distiller SR
DistillerSR is an online application designed specifically for the screening and data extraction phases of a systematic review (Requires subscription) Student and Faculty tiers have monthly pricing with a three month minimum. Number of projects is limited by pricing.
EPPI Reviewer (requires subscription, free trial)
It includes features such as text mining, data clustering, classification and term extraction.
Rayyan is a free web-based application that can be used to screen titles, abstracts, and full text. Allows for multiple simultaneous users.
AHRQ's SRDR tool (free) which is web-based and has a training environment, tutorials, and example templates of systematic review data extraction forms
"System for the Unified Management, Assessment and Review of Information, the Joanna Briggs Institutes premier software for the systematic review of literature."
Syras: Pricing is based on both number of abstracts and number of collaborators. The free tier is limited to 300 abstracts and two collaborators. Rather than monthly pricing, payment is one-time per project.
Evidence Synthesis or Critical Appraisal
PRISMA guidelines suggest including critical appraisal of the included studies to assess the risk of bias and to include the assessment in your final manuscripts. There are several appraisal tools available depending on your discipline and area of research.
Simple overview of risk of bias assessment, including examples of how to assess and present your conclusions.
CASP is an organization that provides resources for healthcare professionals, but their appraisal tools can be used for varying study types across disciplines.
From the Joanna Briggs Institute: "JBI’s critical appraisal tools assist in assessing the trustworthiness, relevance and results of published papers."
Johns Hopkins Evidence-Based Practice Model (health sciences)
National Academies of Sciences, Engineering, and Medicine
Document the search; 5.1.6. Include a methods section
List of additional critical appraisal tools from Cardiff University.
Synthesize, Map, or Describe the Results
Prepare your process and findings in a final manuscript. Be sure to check your PRISMA checklist or other reporting standard. You will want to include the full formatted search strategy in the appendix, as well as documentation of your search methodology. A convenient way to illustrate this process is through a PRISMA Flow Diagram.
Attribution: Unless noted otherwise, this section of the guide was adapted from Texas A&M's "Systematic Reviews and Related Evidence Syntheses"
Last Updated: Aug 23, 2024 1:53 PM
URL: https://guides.libs.uga.edu/SystematicReview
Systematic Reviews - Research Guide
Defining your review question
Review question
A systematic review aims to answer a clear and focused clinical question. The question guides the rest of the systematic review process. This includes determining inclusion and exclusion criteria, developing the search strategy, collecting data and presenting findings. Therefore, developing a clear, focused and well-formulated question is critical to successfully undertaking a systematic review.
A good review question:
allows you to find information quickly
allows you to find information that is relevant (applicable to the patient) and valid (accurately measures stated objectives)
provides a checklist for the main concepts to be included in your search strategy.
How to define your systematic review question and create your protocol
Starting the process
Defining the question
Creating a protocol
Types of clinical questions
PICO/PICo framework
Other frameworks
Research topic vs review question
A research topic is the area of study you are researching, and the review question is the straightforward, focused question that your systematic review will attempt to answer.
Developing a suitable review question from a research topic can take some time. You should:
perform some scoping searches
use a framework such as PICO
consider the FINER criteria; review questions should be Feasible, Interesting, Novel, Ethical and Relevant
check for existing or prospective systematic reviews.
When considering the feasibility of a potential review question, there should be enough evidence to answer the question whilst ensuring that the quantity of information retrieved remains manageable. A scoping search will aid in defining the boundaries of the question and determining feasibility.
For more information on FINER criteria in systematic review questions, read Section 2.1 of the Cochrane Handbook .
Check for existing or prospective systematic reviews
Before finalising your review question, you should determine if any other systematic review is in progress or has been completed on your intended question (i.e. consider if the review is Novel).
To find systematic reviews you might search specialist resources such as the Cochrane Library , Joanna Briggs Institute EBP Database or the Campbell Collaboration . "Systematic review" can also be used as a search term or limit when searching the recommended databases .
You should appraise any systematic reviews you find to assess their quality. An article may include ‘systematic review’ in its title without correctly following the systematic review methodology. Checklists, including those developed by AMSTAR and JBI , are useful tools for appraisal.
You may undertake a review on a similar question if a previously published review had methodological issues, such as lacking a comprehensive search strategy. You may also choose to narrow the parameters of a previously conducted search, or to update the review if it was published some years ago.
Searching a register of prospective systematic reviews such as PROSPERO will allow you to check that you are not duplicating research already underway.
Once you have performed scoping searches and checked for other systematic reviews on your topic, you can focus and refine your review question. Any PICO elements identified during the initial development of the review question from the research topic should now be further refined.
The review question should always be:
unambiguous
structured.
Review questions may be broad or narrow in focus; however, you should consider the FINER criteria when determining the breadth of the PICO elements of your review question.
A question that is too broad may present difficulty with searching, data collection, analysis, and writing, as the number of studies retrieved would be unwieldy. A broad review question could be more suited to another type of review .
A question that is too narrow may not have enough evidence to allow you to answer your review question. Table 2.3.a in the Cochrane Handbook summarises the advantages and disadvantages of broad versus narrow reviews and provides examples of how you could broaden or narrow different PICO elements.
It is essential to formulate your research question with care to avoid missing relevant studies or collecting a potentially biased result set.
A systematic review protocol is a document that describes the rationale, question, and planned methods of a systematic review. Creating a protocol is an essential part of the systematic review process, ensuring careful planning and detailed documentation of what is planned before undertaking the review.
The Preferred Reporting Items for Systematic review and Meta-Analysis Protocols (PRISMA-P) checklist outlines recommended items to address in a systematic review protocol, including:
review question, with PICO elements defined
eligibility criteria
information sources (e.g. planned databases, trial registers, grey literature sources, etc.)
draft search strategy.
Systematic reviews must have pre-specified criteria for including and excluding studies in the review. The Cochrane Handbook states that "predefined, unambiguous eligibility criteria are a fundamental prerequisite for a systematic review."
The first step in developing a protocol is determining the PICO elements of the review question and how the intervention produces the expected outcomes in the specified population. You should then specify the types of studies that will provide the evidence to answer your review question. Then outline the inclusion and exclusion criteria based on these PICO elements.
For more information on defining eligibility criteria, see Chapter 3 of the Cochrane Handbook .
A key purpose of a protocol is to make plans to minimise bias in the findings of the review; where possible, changes should not be made to the eligibility criteria of a published protocol. Where such changes are made, they must be justified and documented in the review. Appropriate time and consideration should be given to creating the protocol.
You may wish to register your protocol in a publicly accessible way. This will help prevent other people from completing a review on your topic.
If you intend to publish a systematic review in the health sciences, it should conform to the IOM Standards for Reporting Systematic Reviews .
If you intend to publish a systematic review in the Cochrane Database of Systematic Reviews, it should conform to the Methodological Expectations in Cochrane Intervention Reviews (MECIR).
A clinical question needs to be directly relevant to the patient or problem and phrased to facilitate the search for an answer. A clear and focused question is more likely to lead to a credible and useful answer, whereas a poorly formulated question can lead to an uncertain answer and create confusion.
The population and intervention should be specific, but if either or both are described too narrowly, it may be difficult to find relevant studies or sufficient data to demonstrate a reliable answer.
Question types and the evidence required to answer them:
Diagnosis - Questions about the ability of a test or procedure to differentiate between those with and without a disease or condition. Evidence: randomised controlled trial (RCT) or cohort study.
Etiology (causation) - Questions about the harmful effect of an intervention or exposure on a patient. Evidence: cohort study.
Meaning - Questions about patients' experiences and concerns. Evidence: qualitative study.
Prevention - Questions about the effectiveness of an intervention or exposure in preventing morbidity and mortality; these are similar to treatment questions. When assessing preventive measures, it is essential to evaluate potential harms as well as benefits. Evidence: randomised controlled trial (RCT) or prospective study.
Prognosis (forecast) - Questions about the probable course of a patient's disease or the likelihood that they will develop an illness. Evidence: cohort study and/or case-control series.
Therapy (treatment) - Questions about the effectiveness of interventions in improving outcomes in patients suffering from an illness, disease or condition. This is the most frequently asked type of clinical question; treatments may include medications, surgical procedures, exercise and counselling about lifestyle changes. Evidence: randomised controlled trial (RCT).
PICO is a framework for developing a focused clinical question.
Slightly different versions of this framework are used to search for quantitative and qualitative evidence; examples are given below:
PICO for quantitative studies
What are the characteristics of the Population or Patient?
What is the Problem, condition or disease you are interested in?
How do you wish to Intervene? What do you want to do with this patient - treat, diagnose, observe, etc.?
What is the Comparison or alternative to the intervention - placebo, different drug or therapy, surgery, etc.?
What are the possible Outcomes - morbidity, death, complications, etc.?
PICo for qualitative studies
P
I
Co
What are the characteristics of the Population or Patient?
What is the Problem, condition or disease you are interested in?
Interest relates to a defined event, activity, experience or process
Context is the setting or distinct characteristics
Two other mnemonics may be used to frame questions for qualitative and quantitative studies - SPIDER and SPICE .
SPIDER for qualitative or quantitative studies
SPIDER can be used for both qualitative and quantitative studies:
Sample size may vary in quantitative and qualitative studies
Phenomena of Interest include behaviours, experiences and interventions
Design influences the strength of the study analysis and findings
Evaluation outcomes may include more subjective outcomes such as views, attitudes, etc.
Research types include qualitative, quantitative, or mixed-method studies
Within social sciences research, SPICE may be more appropriate for formulating research questions:
Setting is the context for the question - where?
Perspective is the users, potential users or stakeholders of the service - for whom?
Intervention is the action taken for the users, potential users or stakeholders - what?
Comparison is the alternative actions or outcomes - compared with what?
Evaluation is the result or measurement that will determine the success of the intervention - with what result?
More question frameworks
For more question frameworks, see the following:
Table 1 'Types of reviews', from 'What kind of systematic review should I conduct? A proposed typology and guidance for systematic reviewers in the medical and health sciences.'
Framing your research question , CQ University
Asking focused questions (Centre for Evidence-Based Medicine) - tips and examples for formulating focused questions
Cochrane Handbook, Chapter 2: Determining the scope of the review and the questions it will address - discusses the formulation of review questions in detail
PICO for Evidence-Based Veterinary Medicine EBVM Toolkit from the RCVS
PICO worksheet
PICo worksheet
Last Updated: Aug 7, 2024 10:22 AM
URL: https://libguides.murdoch.edu.au/systematic
Systematic Reviews & Meta-Analysis
Identifying Your Research Question
The first step in performing a Systematic Review is to develop your research question. The guidance provided on how to develop your research question for literature reviews will still apply here. The difference with a systematic review research question is that you must have a clearly defined question and consider what problem you are trying to address by conducting the review. The most important point is to focus the question and design it so that it is answerable by the research you will be systematically examining.
Once you have developed your research question, it should not be changed after the review process has begun, as the review protocol is formed around the question.
Literature Review Question - Can be broad; may highlight only particular pieces of literature, or support a particular viewpoint.
Systematic Review Question - Must be well-defined and focused so that it is possible to answer.
To help develop and focus your research question you may use one of the question frameworks below.
Methods for Refining a Research Topic
PICO questions can be useful in the health or social sciences. PICO stands for:
Patient, Population, or Problem : What are the characteristics of the patient(s) or population, e.g. their ages, genders, or other demographics? What is the situation, disease, etc., that you are interested in?
Intervention or Exposure : What do you want to do with the patient, person, or population (e.g. observe, diagnose, treat)?
Comparison : What is the alternative to the intervention (e.g. a different drug, a different assignment in a classroom)?
Outcome : What are the relevant outcomes (e.g. complications, morbidity, grades)?
Additionally, the following are variations on the PICO framework:
PICO(T) : The 'T' stands for Timing: define the duration of treatment and the follow-up schedule that matter to patients, considering both short- and long-term outcomes.
PICO(S) : The 'S' stands for Study type (e.g. randomised controlled trial); it is sometimes also used to stand for Setting or Sample size.
PPAARE is a useful question framework for patient care:
Problem - Description of the problem related to the disease or condition
Patient - Description of the patient related to their demographics and risk factors
Action - Description of the action related to the patient’s diagnosis, diagnostic test, etiology, prognosis, treatment or therapy, harm, prevention, or patient education
Alternative - Description of the alternative to the action, when there is one. (Not required)
Results - Identify the result of the action in producing, improving, or reducing the outcome for the patient
Evidence - Identify the level of evidence available after searching
SPIDER is a useful question framework for qualitative evidence synthesis:
Sample - The group of participants, population, or patients being investigated. Qualitative research findings are not easily generalised, which is why 'sample' is preferred over 'patient'.
Phenomenon of Interest - The reasons for behavior and decisions, rather than an intervention.
Design - The research method and study design used for the research, such as interview or survey.
Evaluation - The end result of the research or outcome measures.
Research type - The research type: qualitative, quantitative, and/or mixed methods.
SPICE is a particularly useful method in the social sciences. It stands for:
Setting (e.g. United States)
Perspective (e.g. adolescents)
Intervention (e.g. text message reminders)
Comparisons (e.g. telephone message reminders)
Evaluation (e.g. number of homework assignments turned in after text message reminder compared to the number of assignments turned in after a telephone reminder)
CIMO is a useful method in the social sciences or in organisational contexts. It stands for:
Context - Which individuals, relationships, institutional settings, or wider systems are being studied?
Intervention - The effects of what event, action, or activity are being studied?
Mechanism - What are the mechanisms that explain the relationship between interventions and outcomes? Under what circumstances are these mechanisms activated or not activated?
Outcomes - What are the effects of the intervention? How will the outcomes be measured? What are the intended and unintended effects?
Has Your Systematic Review Already Been Done?
Once you have a reasonably well defined research question, it is important to check if your question has already been asked, or if there are other systematic reviews that are similar to that which you're preparing to do.
In the context of conducting a review, even if you do find one on your topic, it may be sufficiently out of date, or there may be other defensible reasons to undertake a new or updated review. In addition, locating an existing systematic review may also provide a starting point for selecting a review topic, help you refocus your question, or redirect your research toward other gaps in the literature.
You may locate existing systematic reviews or protocols on the following resources:
Cochrane Library - A database collection containing high-quality, independent evidence, including systematic reviews and controlled trials, to inform healthcare decision-making.
MEDLINE (EBSCO) - Produced by the U.S. National Library of Medicine, MEDLINE is the premier database of biomedicine and the health sciences, covering the life sciences including biology, environmental science, marine biology, plant and animal science, biophysics and chemistry. Coverage: 1950-present.
PsycINFO - Contains over 5 million citations and summaries of peer-reviewed journal articles, book chapters, and dissertations from the behavioral and social sciences, in 29 languages from 50 countries. Coverage: 1872-present.
NICE process and methods [PMG6] Published: 30 November 2012
4.1 Number of review questions
4.2 Developing review questions from the scope
4.3 Formulating and structuring review questions
4.4 Planning the systematic review
4.5 Further reading
At the start of guideline development, the key clinical issues listed in the scope need to be translated into review questions. In some instances, this may be done as part of the scoping process (see chapter 2 ). The review questions must be clear, focused and closely define the boundaries of the topic. They are important both as the starting point for the systematic literature review and as a guide for the development of recommendations by the Guideline Development Group (GDG). The development of the review questions should be completed soon after the GDG is convened.
This chapter describes how review questions are developed, formulated and agreed. It describes the different types of review question that may be used, and provides examples. It also provides information on how to plan the systematic review.
The exact number of review questions for each clinical guideline depends on the topic and the breadth of the scope (see chapter 2 ). However, the number of review questions must be manageable for the GDG and the National Collaborating Centre (NCC) or the NICE Internal Clinical Guidelines Programme [ 6 ] within the agreed timescale. For standard clinical guidelines that take 10–18 months to develop (from the time the scope is signed off to submission of the draft guideline), between 15 and 20 review questions is a reasonable number. This number is based on the estimate that, on average, it is feasible for a maximum of two systematic reviews to be presented at any one GDG meeting. However, review questions vary considerably in the number of relevant studies and the complexity of the question and analyses, and the numbers of questions given here are only a guide. For example, a single review question might involve a complex comparison of several treatment options with many individual studies. At the other extreme, a question might address the effects of a single intervention and have few relevant studies.
Review questions should address all areas covered in the scope, and should not introduce new aspects not specified in the scope. They will contain more detail than, and should be seen as building on, the key clinical issues in the scope.
Review questions are usually drafted by the NCC team. They should then be refined and agreed by all GDG members through discussions at GDG meetings. The different perspectives among GDG members will help to ensure that the right review questions are identified, thus enabling the literature search to be planned efficiently. On occasion the questions may need refining once the evidence has been searched; such changes should be documented.
Review questions then inform the development of protocols used by NCCs to detail how questions will be addressed.
4.2.1 Economic aspects
This chapter relates to the specification of questions for reviewing the clinical evidence. Evidence about economic aspects of the key clinical issues should also be sought from published economic evaluations and by conducting new modelling where appropriate. Methods for identifying and reviewing the economic literature are discussed in chapters 5 and 6 ; health economics modelling is discussed in chapter 7 . When developing review questions, it is important to consider what information is required for any planned economic modelling. This might include, for example, information about quality of life, rates of adverse effects or use of health services.
A good review question is clear and focused. It should relate to a specific patient problem, because this helps to identify the clinically relevant evidence. The exact structure of the review question will depend on what is being asked, but it is likely to fall into one of three main areas:
intervention
diagnosis
prognosis.
Patient experience is a component of each of these and should inform the development of a structured review question. In addition, review questions that focus on a specific element of patient experience may merit consideration in their own right.
4.3.1 Review questions about interventions
Usually, most review questions for a particular clinical guideline relate to interventions. Each intervention listed in the scope is likely to require at least one review question, and possibly more depending on the populations and outcomes of interest.
A helpful structured approach for developing questions about interventions is the PICO ( population, intervention, comparator and outcome) framework (see box 4.1). This divides each question into four components:
population (the population under study)
intervention (what is being done)
comparators (other main treatment options)
outcome (measures of how effective the interventions have been).
Box 4.1 Features of a well-formulated review question on the effectiveness of an intervention using the PICO framework
Population: Which populations of patients are we interested in? How can they be best described? Are there subgroups that need to be considered?
Intervention: Which intervention, treatment or approach should be used?
Comparators: What is/are the main alternative(s) to compare with the intervention being considered?
Outcome: What is really important for the patient? Which outcomes should be considered? Examples include intermediate or short-term outcomes; mortality; morbidity and quality of life; treatment complications; adverse effects; rates of relapse; late morbidity and re-admission; return to work, physical and social functioning; resource use.
For each review question, the GDG should take into account the various confounding factors that may influence the outcomes and effectiveness of an intervention. They should also specify the healthcare setting for the question if necessary. To facilitate this process, outcomes and other key criteria that the GDG considers to be important should be listed. Once the review question has been framed, key words can be identified as potential search terms for the systematic review. Examples of review questions on the effectiveness of interventions are presented in box 4.2.
Box 4.2 Examples of review questions on the effectiveness of interventions
For people with IBS (irritable bowel syndrome), are antimuscarinics or smooth muscle relaxants effective compared with placebo or no treatment for the long-term control of IBS symptoms? Which is the most effective antispasmodic?
Which first-line opioid maintenance treatments are effective and cost-effective in relieving pain in patients with advanced and progressive disease who require strong opioids?
Review questions about drugs will usually only consider drugs with a UK marketing authorisation for some indication. Use of a drug outside its licensed indication (off-label use) may be considered if this use of the drug is common in the UK (see also section 9.3.6.3 ). Drugs with no UK marketing authorisation for any indication will not usually be considered in a guideline.
A review question relating to an intervention is usually best answered by a randomised controlled trial (RCT), because this is most likely to give an unbiased estimate of the effects of an intervention. Further information on the side effects of a drug may be obtained from other sources. Some advice on finding data on the adverse effects of an intervention is available in the Cochrane handbook for systematic reviews of interventions .
There are, however, circumstances in which an RCT is not necessary to confirm the effectiveness of a treatment (for example, giving insulin to a person in a diabetic coma compared with not giving insulin) because we are sufficiently certain from non-randomised evidence that an important effect exists. This is the case only if all of the following criteria are fulfilled:
An adverse outcome is likely if the person is not treated (evidence from, for example, studies of the natural history of a condition).
The treatment gives a dramatic benefit that is large enough to be unlikely to be a result of bias (evidence from, for example, historically controlled studies).
The side effects of the treatment are acceptable (evidence from, for example, case series).
There is no alternative treatment.
There is a convincing pathophysiological basis for treatment.
4.3.2 Review questions about diagnosis
Review questions about diagnosis are concerned with the performance of a diagnostic test or test strategy. A diagnostic test is a means of determining whether a patient has a particular condition (disease, stage of disease or subtype of disease). Diagnostic tests can include physical examination, history taking, laboratory or pathological examination and imaging tests.
Broadly, review questions that can be asked about a diagnostic test are of three types:
questions about the diagnostic accuracy of a test or a number of tests individually against a comparator (the reference standard)
questions about the diagnostic accuracy of a test strategy (such as serial testing) against a comparator (the reference standard)
questions about the clinical value of using the test.
Questions about a diagnostic test consider the ability of the test to predict the presence or absence of disease. In studies of the accuracy of a diagnostic test, the results of the test under study (the index test[s]) are compared with those of the best available test (the reference standard) in a sample of patients. It is important to be clear when deciding on the question what the exact proposed use of the test is; for example, as an initial 'triage' test or after other tests.
The PICO framework described in the previous section is useful when formulating review questions about diagnostic test accuracy (see box 4.3). The healthcare setting of the test should be specified. The intervention is the test under investigation (the index test[s]), the comparison is the reference standard, and the outcome is a measure of the presence or absence of the particular disease or disease stage that the index test is intended to identify (for example, sensitivity or specificity). The target condition that the test is intended to identify should be specified in the review question.
Box 4.3 Features of a well-formulated review question on diagnostic test accuracy using the PICO framework
Population: To which populations of patients would the test be applicable? How can they be best described? Are there subgroups that need to be considered?
Index test(s): The test or test strategy being evaluated.
Comparator: The test with which the index test(s) is/are being compared, usually the reference standard (the test that is considered to be the best available method to establish the presence or absence of the condition of interest – this may not be the one that is routinely used in practice).
Target condition: The disease, disease stage or subtype of disease that the index test(s) and the reference standard are being used to establish.
Outcome: The diagnostic accuracy of the test or test strategy for detecting the target condition. This is usually reported as test parameters, such as sensitivity, specificity, predictive values, likelihood ratios, or – where multiple cut-off values are used – a receiver operating characteristic (ROC) curve.
Examples of review questions on the accuracy of a diagnostic test are given in box 4.4. A review question relating to diagnostic test accuracy is usually best answered by a cross-sectional study in which both the index test(s) and the reference standard are performed on the same sample of patients. Case–control studies are also used to assess diagnostic test accuracy, but this type of study design is more prone to bias (and often results in inflated estimates of diagnostic test accuracy). Further advice on conducting reviews of diagnostic test accuracy can be found in the Cochrane handbook for diagnostic test accuracy reviews .
Box 4.4 Examples of review questions on diagnostic test accuracy
In children and young people under 16 years of age with a petechial rash, can non-specific laboratory tests (C-reactive protein, white blood cell count, blood gases) help to confirm or refute the diagnosis of meningococcal disease?
Population: All children and young people from birth up to their 16th birthday who have or are suspected of having bacterial meningitis or meningococcal septicaemia.
Index test(s): Non-specific laboratory tests (C-reactive protein, white blood cell count, blood gases).
Reference standard: Microscopy, lumbar puncture or clinical follow-up.
Although the assessment of test accuracy is an important component of establishing the usefulness of a diagnostic test, the clinical value of a test lies in its usefulness in guiding treatment decisions, and ultimately in improving patient outcomes. 'Test and treat' studies compare outcomes of patients who undergo a new diagnostic test (in combination with a management strategy) with those of patients who receive the usual diagnostic test and management strategy. These types of study are not very common. If there is a trade-off between costs, benefits and harms of the tests, a decision-analytic model may be useful (see Lord et al. 2006).
Review questions aimed at establishing the clinical value of a diagnostic test in practice can be structured in the same way as questions about interventions. The best study design is an RCT. Review questions about the safety of a diagnostic test should also be structured in the same way as questions about interventions.
4.3.3 Review questions about prognosis
Prognosis describes the likelihood of a particular outcome, such as the progression of a disease, or the survival time for a patient after the diagnosis of a disease or with a particular set of risk markers. A prognosis is based on the characteristics of the patient ('prognostic factors'). These prognostic factors may be disease-specific (such as the presence or absence of a particular disease feature) or demographic (such as age or sex), and may also include the likely response to treatment and the presence of comorbidities. A prognostic factor does not need to be the cause of the outcome, but should be associated with (in other words, predictive of) that outcome.
Prognostic information can be used within clinical guidelines to:
provide information to patients about their prognosis
classify patients into risk categories (for example, cardiovascular risk) so that different interventions can be applied
define subgroups of populations that may respond differently to interventions
identify factors that can be used to adjust for case mix (for example, in explorations of heterogeneity)
help determine longer-term outcomes not captured within the timeframe of a clinical trial (for example, for use in an economic model).
Review questions about prognosis address the likelihood of an outcome for patients from a population at risk for that outcome, based on the presence of a proposed prognostic factor.
Review questions about prognosis may be closely related to questions about aetiology (cause of a disease) if the outcome is viewed as the development of the disease itself based on a number of risk factors. They may also be closely related to questions about interventions if one of the prognostic factors is treatment. However, questions about interventions are usually better addressed by controlling for prognostic factors.
Examples of review questions relating to prognosis are given in box 4.5.
Box 4.5 Examples of review questions on prognosis
Are there factors related to the individual (characteristics either of the individual or of the act of self-harm) that predict outcome (including suicide, non-fatal repetition, other psychosocial outcomes)?
(From: NICE clinical guideline 16 [2004].)
For women in the antenatal and postnatal periods, what factors predict the development or recurrence of particular mental disorders?
(From: NICE clinical guideline 45 [2007].)
For people who are opioid dependent, are there particular groups that are more likely to benefit from detoxification?
(From: NICE clinical guideline 52 [2007].)
A review question relating to prognosis is best answered using a prospective cohort study. A cohort of people who have not experienced the outcome in the review question (but for whom the outcome is possible) is followed to monitor the number of outcome events occurring over time. The cohort will contain people who possess or have been exposed to the prognostic factor, and people who do not possess or have not been exposed to it. The cohort may be taken from one arm (usually the control arm) of an RCT, although this often results in a highly selected, unrepresentative group. Case–control studies are not suitable for answering questions about prognosis, because they give only an odds ratio for the occurrence of the event for people with and without the prognostic factor – they give no estimate of the baseline risk.
4.3.4 Using patient experience to inform review questions
The PICO framework should take into account the patient experience. Patient experience, which may vary for different patient populations ('P'), covers a range of dimensions, including:
patient views on the effectiveness and acceptability of given interventions ('I')
patient preferences for different treatment options, including the option of foregoing treatment ('C')
patient views on what constitutes a desired, appropriate or acceptable outcome ('O').
The integration of relevant patient experiences into each review question therefore helps to make the question patient-centred as well as clinically appropriate. For example, a review question that looks at the effectiveness of aggressive chemotherapy for a terminal cancer is more patient-centred if it integrates patient views on whether it is preferable to prolong life or to have a shorter life but of better quality.
It is also possible for review questions to ask about specific elements of the patient experience in their own right, although the PICO framework may not provide a helpful structure if these do not involve an intervention designed to treat a particular condition. Such review questions should be clear and focused, and should address relevant aspects of the patient experience at specific points in the care pathway that are considered to be important by the patient and carer members and others on the GDG. Such questions can address a range of issues, such as:
patient information and support needs
elements of care that are of particular importance to patients
the specific needs of groups of patients who may be disadvantaged compared with others
which outcomes reported in intervention studies are most important to patients.
As with the development of all structured review questions, questions that are broad in scope and lack focus (for example, 'What is the patient experience of living with condition X?') should be avoided. Examples of review questions relating to patient information and support needs are given in box 4.6.
Box 4.6 Examples of review questions on patient experience
What information and support should be offered to children with atopic eczema and their families/carers?
(From: NICE clinical guideline 57 [2007].)
What elements of care on the general ward are viewed as important by patients following their discharge from critical care areas?
(From: NICE clinical guideline 50 [2007].)
Are there cultural differences that need to be considered in delivering information and support on breast or bottle-feeding?
(From: NICE clinical guideline 37 [2006].)
A review question relating to patient experience is likely to be best answered using qualitative studies and cross-sectional surveys, although information on patient experience is also becoming increasingly available as part of wider intervention studies.
4.3.5 Review questions about service delivery
Clinical guidelines may cover issues of service delivery. Examples of review questions relating to service delivery are given in box 4.7.
Box 4.7 Examples of review questions on service delivery
In patients with hip fractures what is the clinical and cost effectiveness of early surgery (within 24, 36 or 48 hours) on the incidence of complications such as mortality, pneumonia, pressure sores, cognitive dysfunction and increased length of hospital stay?
In patients with hip fracture what is the clinical and cost effectiveness of hospital-based multidisciplinary rehabilitation on functional status, length of stay in secondary care, mortality, place of residence/discharge, hospital readmission and quality of life?
What is the clinical and cost effectiveness of surgeon seniority (consultant or equivalent) in reducing the incidence of mortality, the number of patients requiring reoperation, and poor outcome in terms of mobility, length of stay, wound infection and dislocation?
(From: NICE clinical guideline 124 [2011].)
The most appropriate study design to answer review questions about service delivery is an RCT. However, a wide variety of methodological approaches and study designs have been used.
For each systematic review, the systematic reviewer (with input from other technical staff at the NCC) should prepare a review protocol that outlines the background, the objectives and the planned methods. This protocol will explain how the review is to be carried out and will help the reviewer to plan and think through the different stages, as well as providing some protection against the introduction of bias. In addition, the review protocol should make it possible for the review to be repeated by others at a later date. A protocol should also make it clear how equality issues have been considered in planning the review work, if appropriate.
4.4.1 Structure of the review protocol
The protocol should be short (no longer than one page) and should describe any differences from the methods described in this guidelines manual (chapters 5–7), rather than duplicating the methodology stated here. It should include the components outlined in table 4.1.
Table 4.1 Components of the review protocol
Review question
The review question as agreed by the GDG.
Objectives
Short description; for example 'To estimate the effectiveness and cost effectiveness of…' or 'To estimate the diagnostic accuracy of…'.
Criteria for considering studies for the review
Using the PICO framework.
Including the study designs selected.
How the information will be searched
The sources to be searched and any limits that will be applied to the search strategies; for example, publication date, study design, language. (Searches should not necessarily be restricted to RCTs.)
The review strategy
The methods that will be used to review the evidence, outlining exceptions and subgroups.
Indicate if meta-analysis will be used and how it will be conducted.
The review protocol is an important opportunity to look at issues relating to equalities that were identified in the scope, and to plan how these should be addressed. For example, if it is anticipated that the effects of an intervention might vary with patient age, the review protocol should outline the plan for addressing this in the review strategy.
4.4.2 Process for developing the review protocol
The review protocol should be produced after the review question has been agreed by the GDG and before starting the review (that is, usually between two GDG meetings). The protocol should be approved by the GDG at the next meeting.
All review protocols should be included as appendices in the draft of the full guideline that is prepared for consultation (see also chapter 10 ). Any changes made to a protocol in the course of the work should be described. Review protocols will also be published on the NICE website 5–7 weeks before consultation on the guideline starts.
Centre for Reviews and Dissemination (2009) Systematic reviews: CRD's guidance for undertaking reviews in health care. Centre for Reviews and Dissemination, University of York
Cochrane Diagnostic Test Accuracy Working Group (2008) Cochrane handbook for diagnostic test accuracy reviews, version 1.0.1 (updated March 2009). The Cochrane Collaboration
Higgins JPT, Green S, editors (2008) Cochrane handbook for systematic reviews of interventions, version 5.1.0 (updated March 2011). The Cochrane Collaboration
Lord SJ, Irwig L, Simes RJ (2006) When is measuring sensitivity and specificity sufficient to evaluate a diagnostic test, and when do we need randomized trials? Annals of Internal Medicine 144: 850–5
National Institute for Health and Clinical Excellence (2011) Diagnostics assessment programme manual. London: National Institute for Health and Clinical Excellence
Richardson WS, Wilson MS, Nishikawa J et al. (1995) The well-built clinical question: a key to evidence-based decisions. American College of Physicians Journal Club 123: A12–3
[6] Information throughout this manual relating to the role of the National Collaborating Centres in guideline development also applies to the Internal Clinical Guidelines Programme at NICE.
Systematic Reviews: Formulating Your Research Question
Types of Questions
Research questions should be answerable and should fill important gaps in knowledge. Developing a good question takes time and may not fit neatly into a traditional framework. Questions can be broad or narrow, and there are advantages and disadvantages to each type.
Questions can be about interventions, diagnosis, screening, measuring, patient/student/customer experiences, or even management strategies. They can also be about policies. As the field of systematic reviews grows, more and more people in the humanities and social sciences are embracing systematic reviews and creating questions that fit within their fields of practice.
More information can be found here:
Thomas J, Kneale D, McKenzie JE, Brennan SE, Bhaumik S. Chapter 2: Determining the scope of the review and the questions it will address. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.0 (updated July 2019). Cochrane, 2019. Available from www.training.cochrane.org/handbook .
Frameworks are used to develop the question being asked. The specific framework matters less than the question itself.
Think of these frameworks as you would for a house or building. A framework is there to provide support and to be a scaffold for the rest of the structure. In the same way, a research question framework can also help structure your evidence synthesis question.
Organizing Your Question
Formulating non-PICO questions: Although the PICO formulation applies easily to the majority of effectiveness questions, and a great number besides, you may encounter questions that are not easily accommodated within this particular framework. Below you will find a number of acceptable alternatives:
Using The PICOS Model To Design And Conduct A Systematic Search: A Speech Pathology Case Study
7 STEPS TO THE PERFECT PICO SEARCH
Searching for high-quality clinical research evidence can be a daunting task, yet it is an integral part of the evidence-based practice process. One way to streamline and improve the research process for nurses and researchers of all backgrounds is to utilize the PICO search strategy. PICO is a format for developing a good clinical research question prior to starting one's research. It is a mnemonic used to describe the four elements of a sound clinical foreground question (Yale University's Cushing/Whitney Medical Library).
PICO - to search for quantitative review questions
P: Patient or Population
I: Intervention (or Exposure)
C: Comparison (or Control)
O: Outcome
Variations Include:
S: Study Design
T: Timeframe
SPICE - to search for qualitative evidence
S: Setting (where?)
P: Perspective (for whom?)
I: Intervention (what?)
C: Comparison (compared with what?)
E: Evaluation (with what result?)
SPIDER - to search for qualitative and mixed methods research studies
S: Sample
PI: Phenomenon of Interest
D: Design
E: Evaluation
R: Research type
ECLIPSE - to search for health policy/management information
E: Expectation (improvement or information or innovation)
C: Client group (at whom the service is aimed)
L: Location (where is the service located?)
I: Impact (outcomes)
P: Professionals (who is involved in providing/improving the service)
Se: Service (for which service are you looking for information)
PICO Template Questions
Try words from your topic in these templates. Your PICO should fit only one type of question in the list.
For an intervention/therapy:
In _______(P), what is the effect of _______(I) on ______(O) compared with _______(C) within ________ (T)?
For etiology:
Are ____ (P) who have _______ (I) at ___ (Increased/decreased) risk for/of_______ (O) compared with ______ (P) with/without ______ (C) over _____ (T)?
Diagnosis or diagnostic test:
Are (is) _________ (I) more accurate in diagnosing ________ (P) compared with ______ (C) for _______ (O)?
Prevention:
For ________ (P) does the use of ______ (I) reduce the future risk of ________ (O) compared with _________ (C)?
Prognosis/Predictions:
In__________ (P) how does ________ (I) compared to _______(C) influence _______ (O) over ______ (T)?
How do ________ (P) diagnosed with _______ (I) perceive ______ (O) during _____ (T)?
Template taken from Southern Illinois University- Edwardsville
Example PICO Questions
Intervention/Therapy:
In school-age children (P), what is the effect of a school-based physical activity program (I) on a reduction in the incidence of childhood obesity (O) compared with no intervention (C) within a 1 year period (T)?
In high school children (P), what is the effect of a nurse-led presentation on bullying (I) on a reduction in reported incidences of bullying (O) compared with no intervention (C) within a 6 month time frame (T)?
Etiology:
Are males 50 years of age and older (P) who have a history of 1 year of smoking or less (I) at an increased risk of developing esophageal cancer (O) compared with males age 50 and older (P) who have no smoking history (C)?
Are women ages 25-40 (P) who take oral contraceptives (I) at greater risk for developing blood clots (O) compared with women ages 25-40 (P) who use IUDs for contraception (C) over a 5 year time frame (T)?
Diagnosis/Diagnostic Test:
Is a yearly mammogram (I) more effective in detecting breast cancer (O) compared with a mammogram every 3 years (C) in women under age 50 (P)?
Is a colonoscopy combined with fecal occult blood testing (I) more accurate in detecting colon cancer (O) compared with a colonoscopy alone (C) in adults over age 50 (P)?
Prevention:
For women under age 60 (P), does the daily use of 81mg low-dose Aspirin (I) reduce the future risk of stroke (O) compared with no usage of low-dose Aspirin (C)?
For adults over age 65 (P) does a daily 30 minute exercise regimen (I) reduce the future risk of heart attack (O) compared with no exercise regimen (C)?
Prognosis/Predictions:
Does daily home blood pressure monitoring (I) influence compliance with medication regimens for hypertension (O) in adults over age 60 who have hypertension (P) during the first year after being diagnosed with the condition (T)?
Does monitoring blood glucose 4 times a day (I) improve blood glucose control (O) in people with Type 1 diabetes (P) during the first six months after being diagnosed with the condition (T)?
How do teenagers (P) diagnosed with cancer (I) perceive chemotherapy and radiation treatments (O) during the first 6 months after diagnosis (T)?
How do first-time mothers (P) of premature babies in the NICU (I) perceive bonding with their infant (O) during the first month after birth (T)?
Formulating a question.
Formulating a strong research question for a systematic review can be a lengthy process. While you may have an idea about the topic you want to explore, your specific research question is what will drive your review and requires some consideration.
You will want to conduct preliminary or exploratory searches of the literature as you refine your question. In these searches you will want to:
Determine if a systematic review has already been conducted on your topic and if so, how yours might be different, or how you might shift or narrow your anticipated focus
Scope the literature to determine if there is enough literature on your topic to conduct a systematic review
Identify key concepts and terminology
Identify seminal or landmark studies
Identify key studies that you can test your research strategy against (more on that later)
Begin to identify databases that might be useful to your search question
Systematic review vs. other reviews
Systematic reviews require a narrow and specific research question. The goal of a systematic review is to provide an evidence synthesis of ALL research performed on one particular topic. So, your research question should be clearly answerable from the data you gather from the studies included in your review.
Ask yourself if your question even warrants a systematic review (has it been answered before?). If your question is broader in scope, or you aren't sure whether it has been answered, you might look into performing a systematic map or scoping review instead.
Learn more about systematic reviews versus scoping reviews:
CEE. (2022). Section 2: Identifying the need for evidence, determining the evidence synthesis type, and establishing a Review Team. Collaboration for Environmental Evidence. https://environmentalevidence.org/information-for-authors/2-need-for-evidence-synthesis-type-and-review-team-2/
DistillerSR. (2022). The difference between systematic reviews and scoping reviews. DistillerSR. https://www.distillersr.com/resources/systematic-literature-reviews/the-difference-between-systematic-reviews-and-scoping-reviews
Nalen, CZ. (2022). What is a scoping review? AJE. https://www.aje.com/arc/what-is-a-scoping-review/
A well-formulated research question will:
Frame your entire research process
Determine the scope of your review
Provide a focus for your searches
Help you identify key concepts
Guide the selection of your papers
There are different frameworks you can use to help structure a question.
PICO / PECO
What if my topic doesn't fit a framework?
The PICO or PECO framework is typically used in clinical and health sciences-related research, but it can also be adapted for other quantitative research.
P — Patient / Problem / Population
I / E — Intervention / Indicator / phenomenon of Interest / Exposure / Event
C — Comparison / Context / Control
O — Outcome
Example topic: Health impact of hazardous waste exposure
Population: People living near hazardous waste sites
Exposure: Exposure to hazardous waste
Comparators: All comparators
Outcomes: All diseases/health disorders
Fazzo, L., Minichilli, F., Santoro, M., Ceccarini, A., Della Seta, M., Bianchi, F., Comba, P., & Martuzzi, M. (2017). Hazardous waste and health impact: A systematic review of the scientific literature. Environmental Health , 16 (1), 107. https://doi.org/10.1186/s12940-017-0311-8
The SPICE framework is useful for both qualitative and mixed-method research. It is often used in the social sciences.
S — Setting (where?)
P — Perspective (for whom?)
I — Intervention / Exposure (what?)
C — Comparison (compared with what?)
E — Evaluation (with what result?)
Learn more : Booth, A. (2006). Clear and present questions: Formulating questions for evidence based practice. Library Hi Tech , 24 (3), 355-368. https://doi.org/10.1108/07378830610692127
The SPIDER framework is useful for both qualitative and mixed-method research. It is most often used in health sciences research.
S — Sample
PI — Phenomenon of Interest
D — Design
E — Evaluation
R — Research type
Learn more : Cooke, A., Smith, D., & Booth, A. (2012). Beyond PICO: The SPIDER tool for qualitative evidence synthesis. Qualitative Health Research, 22 (10), 1435-1443. https://doi.org/10.1177/1049732312452938
The CIMO framework is used to understand complex social and organizational phenomena, most useful for management and business research.
C — Context (the social and organizational setting of the phenomenon)
I — Intervention (the actions taken to address/influence the phenomenon)
M — Mechanisms (the underlying processes or mechanisms that drive change within the phenomenon)
O — Outcomes (the resulting changes that occur due to intervention/mechanisms)
Learn more : Denyer, D., Tranfield, D., & van Aken, J. E. (2008). Developing design propositions through research synthesis. Organization Studies, 29 (3), 393-413. https://doi.org/10.1177/0170840607088020
An exhaustive list of research question frameworks is available from the University of Maryland Libraries.
You might find that your topic does not always fall into one of the models listed on this page. You can always modify a model to make it work for your topic, and either remove or incorporate additional elements. Be sure to document in your review the established framework that yours is based on and how it has been modified.
There are many ways of framing questions depending on the topic, discipline, or type of questions.
Try to generate a few options for your initial research topic and narrow it down to a specific population, geographical location, disease, etc. You may also explore similar tools to identify additional search terms.
Several frameworks are listed in the table below.
Source: Foster, M. & Jewell, S. (Eds). (2017). Medical Library Association, Lanham: Rowman & Littlefield. p. 38, Table 3.
Discipline: Medicine
Source: Dawes, M., Pluye, P., Shea, L., Grad, R., Greenberg, A., & Nie, J.-Y. (2007). (1), 9–16.

PerSPEcTiF: Perspective, Setting, Phenomenon of interest/Problem, Environment, Comparison (optional), Time/Timing, Findings
Discipline: Qualitative research
Source: Booth, A., Noyes, J., Flemming, K., Moore, G., Tunçalp, Ö., & Shakibazadeh, E. (2019). (Suppl 1).

PESICO: Person, Environments, Stakeholders, Intervention, Comparison, Outcome
Discipline: Augmentative and alternative communication
Source: Schlosser, R. W., & O'Neil-Pirozzi, T. (2006). , 5-10.

PICO: Patient, Intervention, Comparison, Outcome
Discipline: Clinical medicine
Source: Richardson, W. S., Wilson, M. C., Nishikawa, J., & Hayward, R. S. (1995). (3), A12-A12.

Patient, Intervention, Comparison, Outcome, plus context, patient values, and preferences
Discipline: Occupational therapy
Source: Bennett, S., & Bennett, J. W. (2000). (4), 171-180.

PICOC: Patient, Intervention, Comparison, Outcome, Context
Discipline: Social sciences
Source: Petticrew, M., & Roberts, H. (2006). Malden, MA: Blackwell Publishers.

PICOS: Patient, Intervention, Comparison, Outcome, Study type
Discipline: Medicine
Source: Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & PRISMA Group. (2009). (7), e1000097.

PICOT: Patient, Intervention, Comparison, Outcome, Time
Discipline: Education, health care
Source: Richardson, W. S., Wilson, M. C., Nishikawa, J., & Hayward, R. S. (1995). (3), A12-A12.

Patient/participants/population, Index tests, Comparator/reference tests, Outcome
Discipline: Diagnostic questions
Source: Kim, K. W., Lee, J., Choi, S. H., Huh, J., & Park, S. H. (2015). (6), 1175-1187.

PIPOH: Population, Intervention, Professionals, Outcomes, Health care setting/context
Discipline: Screening
Source: ADAPTE Collaboration. (2009). Version 2.0.

Problem, Phenomenon of interest, Time
Sources: Booth, A., Noyes, J., Flemming, K., Gerhardus, A., Wahlster, P., van der Wilt, G. J., ... & Rehfuess, E. (2016). [Technical Report]. https://doi.org/10.13140/RG.2.1.2318.0562; Booth, A., Sutton, A., & Papaioannou, D. (2016). (2nd ed.). London: Sage.

SPIDER: Sample, Phenomenon of interest, Design, Evaluation, Research type
Discipline: Health, qualitative research
Source: Cooke, A., Smith, D., & Booth, A. (2012). (10), 1435-1443.

Who, What, How:
What was done? (intervention, exposure, policy, phenomenon)
How does the what affect the who?
Further reading:
Methley, A. M., Campbell, S., Chew-Graham, C., McNally, R., & Cheraghi-Sohi, S. (2014). PICO, PICOS and SPIDER: A comparison study of specificity and sensitivity in three search tools for qualitative systematic reviews. BMC Health Services Research, 14 (1), 579.
Last Updated: Jul 11, 2024 6:38 AM
URL: https://lib.guides.umd.edu/SR
McGill Library
Systematic Reviews, Scoping Reviews, and other Knowledge Syntheses
Identifying the research question
Constructing a good research question
Formulating a well-constructed research question is essential for a successful review. You should have a draft research question before you choose the type of knowledge synthesis that you will conduct, as the type of answers you are looking for will help guide your choice of knowledge synthesis.
Examples of systematic review and scoping review questions
A systematic review question is typically a focused research question with narrow parameters, and usually fits into the PICO question format. Example: "In people with multiple sclerosis, what is the extent to which a walking intervention, compared to no intervention, improves self-report fatigue?"
A scoping review question is often a broad question that looks at answering larger, more complex, exploratory research questions, and often does not fit into the PICO question format. Example: "What rehabilitation interventions are used to reduce fatigue in adults with multiple sclerosis?"
Process of formulating a question
Developing a good research question is not a straightforward process and requires engaging with the literature as you refine and rework your idea.
Some questions that might be useful to ask yourself as you are drafting your question:
Does the question fit into the PICO question format?
What age group?
What type or types of conditions?
What intervention? How else might it be described?
What outcomes? How else might they be described?
What is the relationship between the different elements of your question?
Do you have several questions lumped into one? If so, should you split them into more than one review? Alternatively, do you have many questions that could be lumped into one review?
A good knowledge synthesis question will:
Be focused on a specific question with a meaningful answer
Retrieve a number of results that is manageable for the research team (is the number of results on your topic feasible for you to finish the review? Your initial literature searches should give you an idea, and a librarian can help you understand the size of your question).
Considering the inclusion and exclusion criteria
It is important to think about which studies will be included in your review when you are writing your research question. The Cochrane Handbook chapter (linked below) offers guidance on this aspect.
McKenzie, J. E., Brennan, S. E., Ryan, R. E., Thomson, H. J., Johnston, R. V, & Thomas, J. (2021). Chapter 3: Defining the criteria for including studies and how they will be grouped for the synthesis. Retrieved from https://training.cochrane.org/handbook/current/chapter-03
Once you have a reasonably well defined research question, it is important to make sure your project has not already been recently and successfully undertaken. This means it is important to find out if there are other knowledge syntheses that have been published or that are in the process of being published on your topic.
If you are submitting your review or study for funding, for example, you may want to make a good case that your review or study is needed and not duplicating work that has already been successfully and recently completed—or that is in the process of being completed. It is also important to note that what is considered “recent” will depend on your discipline and the topic.
In the context of conducting a review, even if you do find one on your topic, it may be sufficiently out of date or you may find other defendable reasons to undertake a new or updated one. In addition, looking at other knowledge syntheses published around your topic may help you refocus your question or redirect your research toward other gaps in the literature.
PROSPERO Search PROSPERO is an international, searchable database that allows free registration of systematic reviews, rapid reviews, and umbrella reviews with a health-related outcome in health & social care, welfare, public health, education, crime, justice, and international development. Note: PROSPERO does not accept scoping review protocols.
Open Science Framework (OSF) At present, OSF does not allow for Boolean searching on their site. However, you can search via https://share.osf.io/, an aggregator, that allows you to search for major keywords using Boolean and truncation. Add "review*" to your search to narrow results down to scoping, systematic, umbrella or other types of reviews. Be sure to click on the drop-down menu for "Source" and select OSF and OSF Registries (search separately as you can't combine them). This will search for ongoing and/or registered reviews in OSF.
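As an illustration (the topic keywords below are placeholders for your own concepts, not a recommended strategy), a SHARE query combining Boolean operators with truncation might look like:

```
(hiv AND circumcision) AND review*
```

Here review* retrieves review, reviews, and similar variants, narrowing results toward scoping, systematic, umbrella, or other review types.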
The Cochrane Library (including systematic reviews of interventions, diagnostic studies, prognostic studies, and more) is an excellent place to start, even if Cochrane reviews are also indexed in MEDLINE/PubMed.
By default, the Cochrane Library will display “ Cochrane Reviews ” (Cochrane Database of Systematic Reviews, aka CDSR). You can ignore the results which show up in the Trials tab when looking for systematic reviews: They are records of controlled trials.
For example, you can count the Cochrane Reviews that contain hiv AND circumcision in the title, abstract, or keywords.
Google Scholar
Subject-specific databases you can search to find existing or in-process reviews
Alternatively, you can use a search hedge/filter; for example, the filter used by BMJ Best Practice to find systematic reviews in Embase (can be copied and pasted into the Embase search box then combined with the concepts of your research question):
(exp review/ or (literature adj3 review$).ti,ab. or exp meta analysis/ or exp "Systematic Review"/) and ((medline or medlars or embase or pubmed or cinahl or amed or psychlit or psyclit or psychinfo or psycinfo or scisearch or cochrane).ti,ab. or RETRACTED ARTICLE/) or (systematic$ adj2 (review$ or overview)).ti,ab. or (meta?anal$ or meta anal$ or meta-anal$ or metaanal$ or metanal$).ti,ab.
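To illustrate how such a filter is combined with a topic, here is a hedged sketch only: the topic lines are illustrative placeholders, and the subject headings and line numbering would need checking in Embase itself before use.

```
1. exp multiple sclerosis/ or "multiple sclerosis".ti,ab.
2. (walking adj3 (intervention$ or program$ or therap$)).ti,ab.
3. 1 and 2
4. <the BMJ Best Practice systematic review filter, pasted as quoted above>
5. 3 and 4
```

Line 4 stands in for the full filter quoted above; the final set (line 5) restricts the topic results to likely systematic reviews and meta-analyses.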
Alternative interface to PubMed: You can also search MEDLINE on the Ovid platform, which we recommend for systematic searching. Perform a sufficiently developed search strategy (be as broad in your search as is reasonably possible) and then, from Additional Limits, select the publication type Systematic Reviews, or select the subject subset Systematic Reviews Pre 2019 for more sensitive/less precise results.
The subject subset for Systematic Reviews is based on the filter version used in PubMed .
Perform a sufficiently developed search strategy (be as broad in your search as is reasonably possible) and then, from Additional Limits, select, under Methodology, 0830 Systematic Review.
See Systematic Reviews Search Strategy Applied in PubMed for details.
healthevidence.org Database of thousands of "quality-rated reviews on the effectiveness of public health interventions"
See also: Evidence-informed resources for Public Health
Munn Z, Stern C, Aromataris E, Lockwood C, Jordan Z. What kind of systematic review should I conduct? A proposed typology and guidance for systematic reviewers in the medical and health sciences. BMC Med Res Methodol. 2018;18(1):5. doi: 10.1186/s12874-017-0468-4
Scoping reviews: Developing the title and question. In: Aromataris E, Munn Z (Editors). JBI Manual for Evidence Synthesis. JBI; 2020. https://doi.org/10.46658/JBIMES-20-01
Online training resources.
Advanced Research Skills: Conducting Literature and Systematic Reviews A short course for graduate students to increase their proficiency in conducting research for literature and systematic reviews developed by the Toronto Metropolitan University (formerly Ryerson).
Compétences avancées en matière de recherche : Effectuer des revues de la littérature et des revues systématiques (2e édition) A French-language course for university students aimed at refining their skills in conducting literature and systematic reviews, in support of their own research during their studies and eventual careers.
The Art and Science of Searching in Systematic Reviews Self-paced course on search strategies, information sources, project management, and reporting (National University of Singapore)
CERTaIN: Knowledge Synthesis: Systematic Reviews and Clinical Decision Making "Learn how to interpret and report systematic review and meta-analysis results, and define strategies for searching and critically appraising scientific literature" (MDAndersonX)
Cochrane Interactive Learning Online modules that walk you through the process of working on a Cochrane intervention review. Module 1 is free (login to access) but otherwise payment is required to complete the online training
Evidence Synthesis for Librarians and Information Specialists Introduction to core components of evidence synthesis. Developed by the Evidence Synthesis Institute. Free for a limited time as of July 10, 2024.
Introduction to Systematic Review and Meta-Analysis Free coursera MOOC offered by Johns Hopkins University; covers the whole process of conducting a systematic review; week 3 focuses on searching and assessing bias
Mieux réussir un examen de la portée en sciences de la santé : une boîte à outils A French-language open educational resource (OER) designed to support university health sciences students in preparing a quality scoping review.
Online Methods Course in Systematic Review and Systematic Mapping "This step-by-step course takes time to explain the theory behind each part of the review process, and provides guidance, tips and advice for those wanting to undertake a full systematic review or map." Developed using an environmental framework (Collaboration for Environmental Evidence, Stockholm Environment Institute)
Scoping Review Methods for Producing Research Syntheses Two-part, online workshop sponsored by the Center on Knowledge Translation for Disability and Rehabilitation Research (KTDRR)
Systematic Reviews and Meta-Analysis Online overview of the steps involved in systematic reviews of quantitative studies, with options to practice. Courtesy of the Campbell Collaboration and the Open Learning Initiative (Carnegie Mellon University). Free pilot
Systematic Searches Developed by the Harvey Cushing/John Hay Whitney Medical Library (Yale University)
Systematic Reviews of Animal Studies (SYRCLE) Introduction to systematic reviews of animal studies
Chapter 2: Determining the scope of the review and the questions it will address
James Thomas, Dylan Kneale, Joanne E McKenzie, Sue E Brennan, Soumyadeep Bhaumik
Key Points:
Systematic reviews should address answerable questions and fill important gaps in knowledge.
Developing good review questions takes time, expertise and engagement with intended users of the review.
Cochrane Reviews can focus on broad questions, or be more narrowly defined. There are advantages and disadvantages of each.
Logic models are a way of documenting how interventions, particularly complex interventions, are intended to ‘work’, and can be used to refine review questions and the broader scope of the review.
Using priority-setting exercises, involving relevant stakeholders, and ensuring that the review takes account of issues relating to equity can be strategies for ensuring that the scope and focus of reviews address the right questions.
Cite this chapter as: Thomas J, Kneale D, McKenzie JE, Brennan SE, Bhaumik S. Chapter 2: Determining the scope of the review and the questions it will address. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook .
2.1 Rationale for well-formulated questions
As with any research, the first and most important decision in preparing a systematic review is to determine its focus. This is best done by clearly framing the questions the review seeks to answer. The focus of any Cochrane Review should be on questions that are important to people making decisions about health or health care. These decisions will usually need to take into account both the benefits and harms of interventions (see MECIR Box 2.1.a ). Good review questions often take time to develop, requiring engagement with not only the subject area, but with a wide group of stakeholders (Section 2.4.2 ).
Well-formulated questions will guide many aspects of the review process, including determining eligibility criteria, searching for studies, collecting data from included studies, structuring the syntheses and presenting findings (Cooper 1984, Hedges 1994, Oliver et al 2017) . In Cochrane Reviews, questions are stated broadly as review ‘Objectives’, and operationalized in terms of the studies that will be eligible to answer those questions as ‘Criteria for considering studies for this review’. As well as focusing review conduct, the contents of these sections are used by readers in their initial assessments of whether the review is likely to be directly relevant to the issues they face.
The FINER criteria have been proposed as encapsulating the issues that should be addressed when developing research questions. These state that questions should be Feasible, Interesting, Novel, Ethical, and Relevant (Cummings et al 2007). All of these criteria raise important issues for consideration at the outset of a review and should be borne in mind when questions are formulated.
A feasible review is one that asks a question that the author team is capable of addressing using the evidence available. Issues concerning the breadth of a review are discussed in Section 2.3.1 , but in terms of feasibility it is important not to ask a question that will result in retrieving unmanageable quantities of information; up-front scoping work will help authors to define sensible boundaries for their reviews. Likewise, while it can be useful to identify gaps in the evidence base, review authors and stakeholders should be aware of the possibility of asking a question that may not be answerable using the existing evidence (i.e. that will result in an ‘empty’ review, see also Section 2.5.3 ).
Embarking on a review that authors are interested in is important because reviews are a significant undertaking and review authors need sufficient commitment to see the work through to its conclusion.
A novel review will address a genuine gap in knowledge, so review authors should be aware of any related or overlapping reviews. This reduces duplication of effort, and also ensures that authors understand the wider research context to which their review will contribute. Authors should check for pre-existing syntheses in the published research literature and also for ongoing reviews in the PROSPERO register of systematic reviews before beginning their own review.
Given the opportunity cost involved in undertaking an activity as demanding as a systematic review, authors should ensure that their work is relevant by: (i) involving relevant stakeholders in defining its focus and the questions it will address; and (ii) writing up the review in such a way as to facilitate the translation of its findings to inform decisions. The GRADE framework aims to achieve this, and should be considered throughout the review process, not only when it is being written up (see Chapter 14 and Chapter 15 ).
Consideration of opportunity costs is also relevant in terms of the ethics of conducting a review, though ethical issues should also be considered primarily in terms of the questions that are prioritized for answering and the way that they are framed. Research questions are often not value-neutral, and the way that a given problem is approached can have political implications which can result in, for example, the widening of health inequalities (whether intentional or not). These issues are explored in Section 2.4.3 and Chapter 16 .
MECIR Box 2.1.a Relevant expectations for conduct of intervention reviews
Formulating review questions ( )
Cochrane Reviews are intended to support clinical practice and policy, not just scientific curiosity. The needs of consumers are central to Cochrane Reviews, and consumers can play an important role in defining the review question. Qualitative research, i.e. studies that explore the experience of those involved in providing and receiving interventions, and studies evaluating factors that shape the implementation of interventions, might be used in the same way.
Considering potential adverse effects ( )
It is important that adverse effects are addressed in order to avoid one-sided summaries of the evidence. At a minimum, the review will need to highlight the extent to which potential adverse effects have been evaluated in any included studies. Sometimes data on adverse effects are best obtained from non-randomized studies, or qualitative research studies. This does not mean, however, that all reviews must include non-randomized studies.
2.2 Aims of reviews of interventions
Systematic reviews can address any question that can be answered by a primary research study. This Handbook focuses on a subset of all possible review questions: the impact of intervention(s) implemented within a specified human population. Even within these limits, systematic reviews examining the effects of intervention(s) can vary quite markedly in their aims. Some will focus specifically on evidence of an effect of an intervention compared with a specific alternative, whereas others may examine a range of different interventions. Reviews that examine multiple interventions and aim to identify which might be the most effective can be broader and more challenging than those looking at single interventions. These can also be the most useful for end users, where decision making involves selecting from a number of intervention options. The incorporation of network meta-analysis as a core method in this edition of the Handbook (see Chapter 11 ) reflects the growing importance of these types of reviews.
As well as looking at the balance of benefit and harm that can be attributed to a given intervention, reviews within the ambit of this Handbook might also aim to investigate the relationship between the size of an intervention effect and other characteristics, such as aspects of the population, the intervention itself, how the outcome is measured, or the methodology of the primary research studies included. Such approaches might be used to investigate which components of multi-component interventions are more or less important or essential (and when). While it is not always necessary to know how an intervention achieves its effect for it to be useful, many reviews will aim to articulate an intervention’s mechanisms of action (see Section 2.5.1 ), either by making this an explicit aim of the review itself (see Chapter 17 and Chapter 21 ), or when describing the scope of the review. Understanding how an intervention works (or is intended to work) can be an important aid to decision makers in assessing the applicability of the review to their situation. These investigations can be assisted by the incorporation of results from process evaluations conducted alongside trials (see Chapter 21 ). Further, many decisions in policy and practice are at least partially constrained by the resources available, so review authors often need to consider the economic context of interventions (see Chapter 20 ).
2.3 Defining the scope of a review question
Studies comparing healthcare interventions, notably randomized trials, use the outcomes of participants to compare the effects of different interventions. Statistical syntheses (e.g. meta-analysis) focus on comparisons of interventions, such as a new intervention versus a control intervention (which may represent conditions of usual practice or care), or the comparison of two competing interventions. Throughout the Handbook we use the terminology experimental intervention versus comparator intervention. This implies a need to identify one of the interventions as experimental, and is used only for convenience since all methods apply to both controlled and head-to-head comparisons. The contrast between the outcomes of two groups treated differently is known as the ‘effect’, the ‘treatment effect’ or the ‘intervention effect’; we generally use the last of these throughout the Handbook .
A statement of the review’s objectives should begin with a precise statement of the primary objective, ideally in a single sentence ( MECIR Box 2.3.a ). Where possible the style should be of the form ‘To assess the effects of [ intervention or comparison ] for [ health problem ] in [ types of people, disease or problem and setting if specified ]’. This might be followed by one or more secondary objectives, for example relating to different participant groups, different comparisons of interventions or different outcome measures. The detailed specification of the review question(s) requires consideration of several key components (Richardson et al 1995, Counsell 1997) which can often be encapsulated by the ‘PICO’ mnemonic, an acronym for Population, Intervention, Comparison(s) and Outcome. Equal emphasis in addressing, and equal precision in defining, each PICO component is not necessary. For example, a review might concentrate on competing interventions for a particular stage of breast cancer, with stage and severity of the disease being defined very precisely; or alternatively focus on a particular drug for any stage of breast cancer, with the treatment formulation being defined very precisely.
Throughout the Handbook we make a distinction between three different stages in the review at which the PICO construct might be used. This division is helpful for understanding the decisions that need to be made:
The review PICO (planned at the protocol stage) is the PICO on which eligibility of studies is based (what will be included and what excluded from the review).
The PICO for each synthesis (also planned at the protocol stage) defines the question that each specific synthesis aims to answer, determining how the synthesis will be structured and specifying the planned comparisons (including intervention and comparator groups, and any grouping of outcomes and population subgroups).
The PICO of the included studies (determined at the review stage) is what was actually investigated in the included studies.
Reaching the point where it is possible to articulate the review’s objectives in the above form – the review PICO – requires time and detailed discussion between potential authors and users of the review. It is important that those involved in developing the review’s scope and questions have a good knowledge of the practical issues that the review will address as well as the research field to be synthesized. Developing the questions is a critical part of the research process. As such, there are methodological issues to bear in mind, including: how to determine which questions are most important to answer; how to engage stakeholders in question formulation; how to account for changes in focus as the review progresses; and considerations about how broad (or narrow) a review should be.
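The three uses of PICO described above can be sketched as a simple structured record. This is an illustrative aside, not part of the Handbook's guidance: the class, field names and example values are invented for the sketch, which also renders the 'To assess the effects of … for … in …' objective style suggested earlier.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PICO:
    """One PICO specification: Population, Intervention, Comparison, Outcome."""
    population: str
    intervention: str
    comparison: str
    outcome: str

    def objective(self) -> str:
        """Render the question in the suggested objective style."""
        return (f"To assess the effects of {self.intervention} versus "
                f"{self.comparison} for {self.outcome} in {self.population}.")

# Review PICO: sets the eligibility boundary for the whole review.
review_pico = PICO(
    population="people with a previous history of stroke",
    intervention="any antiplatelet agent",
    comparison="placebo or no treatment",
    outcome="thrombotic events",
)

# PICO for one planned synthesis: a narrower question nested within that scope.
synthesis_pico = PICO(
    population="elderly people with a previous history of stroke",
    intervention="aspirin",
    comparison="placebo",
    outcome="stroke",
)
```

For example, `synthesis_pico.objective()` returns 'To assess the effects of aspirin versus placebo for stroke in elderly people with a previous history of stroke.' The PICO of the included studies, by contrast, can only be filled in at the review stage, once the studies themselves have been examined.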
MECIR Box 2.3.a Relevant expectations for conduct of intervention reviews
Predefining objectives ( )
Objectives give the review focus and must be clear before appropriate eligibility criteria can be developed. If the review will address multiple interventions, clarity is required on how these will be addressed (e.g. summarized separately, combined or explicitly compared).
2.3.1 Broad versus narrow reviews
The questions addressed by a review may be broad or narrow in scope. For example, a review might address a broad question regarding whether antiplatelet agents in general are effective in preventing all thrombotic events in humans. Alternatively, a review might address whether a particular antiplatelet agent, such as aspirin, is effective in decreasing the risks of a particular thrombotic event, stroke, in elderly persons with a previous history of stroke. Increasingly, reviews are becoming broader, aiming, for example, to identify which intervention – out of a range of treatment options – is most effective, or to investigate how an intervention varies depending on implementation and participant characteristics.
Overviews of reviews (see Chapter V ), in which multiple reviews are summarized, can be one way of addressing the need for breadth when synthesizing the evidence base, since they can summarize multiple reviews of different interventions for the same condition, or multiple reviews of the same intervention for different types of participants. It may be considered desirable to plan a series of reviews with a relatively narrow scope, alongside an Overview to summarize their findings. Alternatively, it may be more useful – particularly given the growth in support for network meta-analysis – to combine comparisons of different treatment options within the same review (see Chapter 11 ). When deciding whether or not an overview might be the most appropriate approach, review authors should take account of the breadth of the question being asked and the resources available. Some questions are simply too broad for a review of all relevant primary research to be practicable, and if a field has sufficient high-quality reviews, then the production of another review of primary research that duplicates the others might not be a sensible use of resources.
Some of the advantages and disadvantages of broad and narrow reviews are summarized in Table 2.3.a . While having a broad scope in terms of the range of participants has the potential to increase generalizability, the extent to which findings are ultimately applicable to broader (or different) populations will depend on the participants who have actually been recruited into research studies. Likewise, heterogeneity can be a disadvantage when the expectation is for homogeneity of effects between studies, but an advantage when the review question seeks to understand differential effects (see Chapter 10 ).
A distinction should be drawn between the scope of a review and the precise questions within it, since it is possible to have a broad review that addresses quite narrow questions. In the antiplatelet agents for preventing thrombotic events example, a systematic review with a broad scope might include all available treatments. Rather than combining all the studies into one comparison, though, specific treatments would be compared with one another in separate comparisons, thus breaking a heterogeneous set of treatments into narrower, more homogeneous groups. This relates to the three levels of PICO outlined in Section 2.3 : the review PICO defines the broad scope of the review, and the PICO for each comparison defines the specific treatments that will be compared with one another; Chapter 3 elaborates on the use of PICOs.
In practice, a Cochrane Review may start (or have started) with a broad scope, and be divided up into narrower reviews as evidence accumulates and the original review becomes unwieldy. This may be done for practical and logistical reasons, for example to make updating easier as well as to make it easier for readers to see which parts of the evidence base are changing. Individual review authors must decide if there are instances where splitting a broader focused review into a series of more narrowly focused reviews is appropriate and implement appropriate methods to achieve this. If a major change is to be undertaken, such as splitting a broad review into a series of more narrowly focused reviews, a new protocol must be written for each of the component reviews that documents the eligibility criteria for each one.
Ultimately, the selected breadth of a review depends upon multiple factors including perspectives regarding a question’s relevance and potential impact; supporting theoretical, biologic and epidemiological information; the potential generalizability and validity of answers to the questions; and available resources. As outlined in Section 2.4.2 , authors should consider carefully the needs of users of the review and the context(s) in which they expect the review to be used when determining the optimal scope for their review.
Table 2.3.a Some advantages and disadvantages of broad versus narrow reviews

Breadth of population: e.g. corticosteroid injection for shoulder tendonitis (narrow) or corticosteroid injection for any tendonitis (broad)
Advantages of a broad review:
Comprehensive summary of the evidence.
Opportunity to explore consistency of findings (and therefore generalizability) across different types of participants.
Advantages of a narrow review:
Manageability for review team.
Ease of reading.
Disadvantages of a broad review:
Searching, data collection, analysis and writing may require more resources.
Interpretation may be difficult for readers if the review is large and lacks a clear rationale (such as examining consistency of findings) for including diverse types of participants.
Disadvantages of a narrow review:
Evidence may be sparse.
Unable to explore whether an intervention operates differently in other settings or populations (e.g. inability to explore differential effects that could lead to inequity).
Increased burden for decision makers if multiple reviews must be accessed (e.g. if evidence is sparse for the population of interest).
Scope could be chosen by review authors to produce a desired result.

Breadth of intervention: e.g. supervised running for depression (narrow) or any exercise for depression (broad)
Advantages of a broad review:
Comprehensive summary of the evidence.
Opportunity to explore consistency of findings across different implementations of the intervention.
Advantages of a narrow review:
Manageability for review team.
Ease of reading.
Disadvantages of a broad review:
Searching, data collection, analysis and writing may require more resources.
Interpretation may be difficult for readers if the review is large and lacks a clear rationale (such as examining consistency of findings) for including different modes of an intervention.
Disadvantages of a narrow review:
Evidence may be sparse.
Unable to explore whether different modes of an intervention modify the intervention effects.
Increased burden for decision makers if multiple reviews must be accessed (e.g. if evidence is sparse for a specific mode).
Scope could be chosen by review authors to produce a desired result.

Breadth of comparisons: e.g. oxybutynin compared with desmopressin for preventing bed-wetting (narrow) or interventions for preventing bed-wetting (broad)
Advantages of a broad review:
Comprehensive summary of the evidence.
Opportunity to compare the effectiveness of a range of different intervention options.
Advantages of a narrow review:
Manageability for review team.
Relative simplicity of objectives and ease of reading.
Disadvantages of a broad review:
Searching, data collection, analysis and writing may require more resources.
May be unwieldy, and more appropriate to present as an Overview of reviews (see ).
Disadvantages of a narrow review:
Increased burden for decision makers if not included in an Overview since multiple reviews may need to be accessed.
2.3.2 ‘Lumping’ versus ‘splitting’
It is important not to confuse the issue of the breadth of the review (determined by the review PICO) with concerns about between-study heterogeneity and the legitimacy of combining results from diverse studies in the same analysis (determined by the PICOs for comparison).
Broad reviews have been criticized as ‘mixing apples and oranges’, and one of the inventors of meta-analysis, Gene Glass, has responded “Of course it mixes apples and oranges… comparing apples and oranges is the only endeavour worthy of true scientists; comparing apples to apples is trivial” (Glass 2015). In fact, the two concepts (‘broad reviews’ and ‘mixing apples and oranges’) are different issues. Glass argues that broad reviews, with diverse studies, provide the opportunity to ask interesting questions about the reasons for differential intervention effects.
The ‘apples and oranges’ critique refers to the inappropriate mixing of studies within a single comparison, where the purpose is to estimate an average effect. In situations where good biologic or sociological evidence suggests that various formulations of an intervention behave very differently or that various definitions of the condition of interest are associated with markedly different effects of the intervention, the uncritical aggregation of results from quite different interventions or populations/settings may well be questionable.
Unfortunately, determining the situations where studies are similar enough to combine with one another is not always straightforward, and it can depend, to some extent, on the question being asked. While the decision is sometimes characterized as ‘lumping’ (where studies are combined in the same analysis) or ‘splitting’ (where they are not) (Squires et al 2013), it is better to consider these issues on a continuum, with reviews that have greater variation in the types of included interventions, settings and populations, and study designs being towards the ‘lumped’ end, and those that include little variation in these elements being towards the ‘split’ end (Petticrew and Roberts 2006).
While specification of the review PICO sets the boundary for the inclusion and exclusion of studies, decisions also need to be made when planning the PICO for the comparisons to be made in the analysis as to whether they aim to address broader (‘lumped’) or narrower (‘split’) questions (Caldwell and Welton 2016). The degree of ‘lumping’ in the comparisons will be primarily driven by the review’s objectives, but will sometimes be dictated by the availability of studies (and data) for a particular comparison (see Chapter 9 for discussion of the latter). The former is illustrated by a Cochrane Review that examined the effects of newer-generation antidepressants for depressive disorders in children and adolescents (Hetrick et al 2012).
Newer-generation antidepressants include multiple different compounds (e.g. paroxetine, fluoxetine). The objectives of this review were to (i) estimate the overall effect of newer-generation antidepressants on depression, (ii) estimate the effect of each compound, and (iii) examine whether the compound type and age of the participants (children versus adolescents) are associated with the intervention effect. Objective (i) addresses a broad, ‘in principle’ (Caldwell and Welton 2016), question of whether newer-generation antidepressants improve depression, where the different compounds are ‘lumped’ into a single comparison. Objective (ii) seeks to address narrower, ‘split’, questions that investigate the effect of each compound on depression separately. Answers to both questions can be identified by setting up separate comparisons for each compound, or by subgrouping the ‘lumped’ comparison by compound ( Chapter 10, Section 10.11.2 ). Objective (iii) seeks to explore factors that explain heterogeneity among the intervention effects, or equivalently, whether the intervention effect varies by the factor. This can be examined using subgroup analysis or meta-regression ( Chapter 10, Section 10.11 ) but, in the case of intervention types, is best achieved using network meta-analysis (see Chapter 11 ).
There are various advantages and disadvantages to bear in mind when defining the PICO for the comparison and considering whether ‘lumping’ or ‘splitting’ is appropriate. Lumping allows for the investigation of factors that may explain heterogeneity. Results from these investigations may provide important leads as to whether an intervention operates differently in, for example, different populations (such as in children and adolescents in the example above). Ultimately, this type of knowledge is useful for clinical decision making. However, lumping is likely to introduce heterogeneity, which will not always be explained by a priori specified factors, and this may lead to a combined effect that is clinically difficult to interpret and implement. For example, when multiple intervention types are ‘lumped’ in one comparison (as in objective (i) above), and there is unexplained heterogeneity, the combined intervention effect would not enable a clinical decision as to which intervention should be selected. Splitting comparisons carries its own risk of there being too few studies to yield a useful synthesis. Inevitably, some degree of aggregation across the PICO elements is required for a meta-analysis to be undertaken (Caldwell and Welton 2016).
2.4 Ensuring the review addresses the right questions
Since systematic reviews are intended for use in healthcare decision making, review teams should ensure not only the application of robust methodology, but also that the review question is meaningful for healthcare decision making. Two approaches are discussed below:
Using results from existing research priority-setting exercises to define the review question.
In the absence of, or in addition to, existing research priority-setting exercises, engaging with stakeholders to define review questions and establish their relevance to policy and practice.
2.4.1 Using priority-setting exercises to define review questions
A research priority-setting exercise is a “collective activity for deciding which uncertainties are most worth trying to resolve through research; uncertainties considered may be problems to be understood or solutions to be developed or tested; across broad or narrow areas” (Sandy Oliver, referenced in Nasser 2018). Using research priority-setting exercises to define the scope of a review helps to prevent the waste of scarce resources for research by making the review more relevant to stakeholders (Chalmers et al 2014).
Research priority setting is always conducted in a specific context, setting and population with specific principles, values and preferences (which should be articulated). Different stakeholders’ interpretation of the scope and purpose of a ‘research question’ might vary, resulting in priorities that might be difficult to interpret. Researchers or review teams might find it necessary to translate the research priorities into an answerable PICO research question format, and may find it useful to recheck the translated question with the stakeholder groups to confirm that it accurately reflects their intentions.
While Cochrane Review teams are in most cases reviewing the effects of an intervention with a global scope, they may find that the priorities identified by important stakeholders (such as the World Health Organization or other organizations or individuals in a representative health system) are informative in planning the review. Review authors may find that differences between different stakeholder groups’ views on priorities and the reasons for these differences can help them to define the scope of the review. This is particularly important for making decisions about excluding specific populations or settings, or being inclusive and potentially conducting subgroup analyses.
Whenever feasible, systematic reviews should be based on priorities identified by key stakeholders such as decision makers, patients/public, and practitioners. Cochrane has developed a list of priorities for reviews in consultation with key stakeholders, which is available on the Cochrane website. Issues relating to equity (see Chapter 16 and Section 2.4.3 ) need to be taken into account when conducting and interpreting the results from priority-setting exercises. Examples of materials to support these processes are available (Viergever et al 2010, Nasser et al 2013, Tong et al 2017).
The results of research priority-setting exercises can be searched for in electronic databases and via websites of relevant organizations. Examples are: James Lind Alliance , World Health Organization, organizations of health professionals including research disciplines, and ministries of health in different countries (Viergever 2010). Examples of search strategies for identifying research priority-setting exercises are available (Bryant et al 2014, Tong et al 2015).
Other sources of questions are often found in ‘implications for future research’ sections of articles in journals and clinical practice guidelines. Some guideline developers have prioritized questions identified through the guideline development process (Sharma et al 2018), although these priorities will be influenced by the needs of health systems in which different guideline development teams are working.
2.4.2 Engaging stakeholders to help define the review questions
In the absence of a relevant research priority-setting exercise, or when a systematic review is being conducted for a very specific purpose (for example, commissioned to inform the development of a guideline), researchers should work with relevant stakeholders to define the review question. This practice is especially important when developing review questions for studying the effectiveness of health systems and policies, because of the variability between countries and regions; the significance of these differences may only become apparent through discussion with the stakeholders.
The stakeholders for a review could include consumers or patients, carers, health professionals of different kinds, policy decision makers and others ( Chapter 1, Section 1.3.1 ). Identifying the stakeholders who are critical to a particular question will depend on the question, who the answer is likely to affect, and who will be expected to implement the intervention if it is found to be effective (or to discontinue it if not).
Stakeholder engagement should, optimally, be an ongoing process throughout the life of the systematic review, from defining the question to dissemination of results (Keown et al 2008). Engaging stakeholders increases relevance, promotes mutual learning, improves uptake and decreases research waste (see Chapter 1, Section 1.3.1 and Section 1.3.2 ). However, because such engagement can be challenging and resource intensive, it may only be possible to run a one-off engagement process to define the review question. Review questions that are conceptualized and refined by multiple stakeholders can capture much of the complexity that should be addressed in a systematic review.
2.4.3 Considering issues relating to equity when defining review questions
Deciding what should be investigated, who the participants should be, and how the analysis will be carried out can be considered political activities, with the potential for increasing or decreasing inequalities in health. For example, because researchers have chosen to investigate this issue, we now know that well-intended interventions can actually widen inequalities in health outcomes. Decision makers can now take account of this knowledge when planning service provision. Authors should therefore consider the potential impact of the intervention(s) they are investigating on disadvantaged groups, and whether socio-economic inequalities in health might be affected depending on whether or how those interventions are implemented.
Health equity is the absence of avoidable and unfair differences in health (Whitehead 1992). Health inequity may be experienced across characteristics defined by PROGRESS-Plus (Place of residence, Race/ethnicity/culture/language, Occupation, Gender/sex, Religion, Education, Socio-economic status, Social capital, and other characteristics (‘Plus’) such as sexual orientation, age, and disability) (O’Neill et al 2014). Issues relating to health equity should be considered when review questions are developed ( MECIR Box 2.4.a ). Chapter 16 presents detailed guidance on this issue for review authors.
MECIR Box 2.4.a Relevant expectations for conduct of intervention reviews
Considering equity and specific populations
Where possible, reviews should include explicit descriptions of the effect of the interventions not only upon the whole population, but also on the disadvantaged, and/or the ability of the interventions to reduce socio-economic inequalities in health, and to promote use of the interventions in the community.
2.5 Methods and tools for structuring the review
It is important for authors to develop the scope of their review with care: without a clear understanding of where the review will contribute to existing knowledge – and how it will be used – it may be at risk of conceptual incoherence. It may mis-specify critical elements of how the intervention(s) interact with the context(s) within which they operate to produce specific outcomes, and become either irrelevant or possibly misleading. For example, in a systematic review about smoking cessation interventions in pregnancy, it was essential for authors to take account of the way that health service provision has changed over time. The type and intensity of ‘usual care’ in more recent evaluations was equivalent to the interventions being evaluated in older studies, and the analysis needed to take this into account. This review also found that the same intervention can have different effects in different settings depending on whether its materials are culturally appropriate in each context (Chamberlain et al 2017).
In order to protect the review against conceptual incoherence and irrelevance, review authors need to spend time at the outset developing definitions for key concepts and ensuring that they are clear about the prior assumptions on which the review depends. These prior assumptions include, for example, why particular populations should be considered inside or outside the review’s scope; how the intervention is thought to achieve its effect; and why specific outcomes are selected for evaluation. Being clear about these prior assumptions also requires review authors to consider the evidential basis for these assumptions and decide for themselves which they can place more or less reliance on. When considered as a whole, this initial conceptual and definitional work constitutes the review’s conceptual framework . Each element of the review’s PICO raises its own definitional challenges, which are discussed in detail in Chapter 3 .
In this section we consider tools that may help to define the scope of the review and the relationships between its key concepts; in particular, articulating how the intervention gives rise to the outcomes selected. In some situations, long sequences of events are expected to occur between an intervention being implemented and an outcome being observed. For example, a systematic review examining the effects of asthma education interventions in schools on children’s health and well-being needed to consider: the interplay between core intervention components and their introduction into differing school environments; different child-level effect modifiers; how the intervention then had an impact on the knowledge of the child (and their family); the child’s self-efficacy and adherence to their treatment regime; the severity of their asthma; the number of days of restricted activity; how this affected their attendance at school; and finally, the distal outcomes of education attainment and indicators of child health and well-being (Kneale et al 2015).
Several specific tools can help authors to consider issues raised when defining review questions and planning their review; these are also helpful when developing eligibility criteria and classifying included studies. These include the following.
Taxonomies: hierarchical structures that can be used to categorize (or group) related interventions, outcomes or populations.
Generic frameworks for examining and structuring the description of intervention characteristics (e.g. TIDieR for the description of interventions (Hoffmann et al 2014), iCAT_SR for describing multiple aspects of complexity in systematic reviews (Lewin et al 2017)).
Core outcome sets for identifying and defining agreed outcomes that should be measured for specific health conditions (described in more detail in Chapter 3 ).
Unlike these tools, which focus on particular aspects of a review, logic models provide a framework for planning and guiding synthesis at the review level (see Section 2.5.1 ).
2.5.1 Logic models
Logic models (sometimes referred to as conceptual frameworks or theories of change) are graphical representations of theories about how interventions work. They depict intervention components, mechanisms (pathways of action), outputs, and outcomes as sequential (although not necessarily linear) chains of events. Among systematic review authors, they were originally proposed as a useful tool when working with evaluations of complex social and population health programmes and interventions, to conceptualize the pathways through which interventions are intended to change outcomes (Anderson et al 2011).
In reviews where intervention complexity is a key consideration (see Chapter 17 ), logic models can be particularly helpful. For example, in a review of psychosocial group interventions for those with HIV, a logic model was used to show how the intervention might work (van der Heijden et al 2017). The review authors depicted proximal outcomes, such as self-esteem, but chose only to include psychological health outcomes in their review. In contrast, Bailey and colleagues included proximal outcomes in their review of computer-based interventions for sexual health promotion, using a logic model to show how outcomes were grouped (Bailey et al 2010). Finally, in a review of slum upgrading, a logic model showed the broad range of interventions and their interlinkages with health and socio-economic outcomes (Turley et al 2013), and enabled the review authors to select a specific intervention category (physical upgrading) on which to focus the review. Additional resources provide further examples of logic models and can help review authors develop and use them (Anderson et al 2011, Baxter et al 2014, Kneale et al 2015, Pfadenhauer et al 2017, Rohwer et al 2017).
Logic models can vary in their emphasis, with a distinction sometimes made between system-based and process-oriented logic models (Rehfuess et al 2018). System-based logic models have particular value in examining the complexity of the system (e.g. the geographical, epidemiological, political, socio-cultural and socio-economic features of a system), and the interactions between contextual features, participants and the intervention (see Chapter 17 ). Process-oriented logic models aim to capture the complexity of causal pathways by which the intervention leads to outcomes, and any factors that may modify intervention effects. However, this is not a crisp distinction: the two types are interrelated, with some logic models depicting elements of both systems and process models simultaneously.
The way that logic models can be represented diagrammatically (see Chapter 17 for an example) provides a valuable visual summary for readers and can be a communication tool for decision makers and practitioners. They can aid initially in the development of a shared understanding between different stakeholders of the scope of the review and its PICO, helping to support decisions taken throughout the review process, from developing the research question and setting the review parameters, to structuring and interpreting the results. They can be used in planning the PICO elements of a review as well as for determining how the synthesis will be structured (i.e. planned comparisons, including intervention and comparator groups, and any grouping of outcome and population subgroups). These models may help review authors specify the link between the intervention, proximal and distal outcomes, and mediating factors. In other words, they depict the intervention theory underpinning the synthesis plan.
Anderson and colleagues identify the main uses of logic models in systematic reviews as (Anderson et al 2011):
refining review questions;
deciding on ‘lumping’ or ‘splitting’ a review topic;
identifying intervention components;
defining and conducting the review;
identifying relevant study eligibility criteria;
guiding the literature search strategy;
explaining the rationale behind surrogate outcomes used in the review;
justifying the need for subgroup analyses (e.g. age, sex/gender, socio-economic status);
making the review relevant to policy and practice;
structuring the reporting of results;
illustrating how harms and feasibility are connected with interventions; and
interpreting results based on intervention theory and systems thinking (see Chapter 17 ).
Logic models can be useful in systematic reviews when considering whether failure to find a beneficial effect of an intervention is due to a theory failure, an implementation failure, or both (see Chapter 17 and Cargo et al 2018). Making a distinction between implementation and intervention theory can help to determine whether and how the intervention interacts with (and potentially changes) its context (see Chapter 3 and Chapter 17 for further discussion of context). This helps to elucidate situations in which variations in how the intervention is implemented have the potential to affect the integrity of the intervention and intended outcomes.
Given their potential value in conceptualizing and structuring a review, logic models are increasingly published in review protocols. Logic models may be specified a priori and remain unchanged throughout the review; it might be expected, however, that the findings of reviews produce evidence and new understandings that could be used to update the logic model in some way (Kneale et al 2015). Some reviews take a more staged approach, pre-specifying points in the review process where the model may be revised on the basis of (new) evidence (Rehfuess et al 2018) and a staged logic model can provide an efficient way to report revisions to the synthesis plan. For example, in a review of portion, package and tableware size for changing selection or consumption of food and other products, the authors presented a logic model that clearly showed changes to their original synthesis plan (Hollands et al 2015).
It is preferable to seek out existing logic models for the intervention and revise or adapt these models in line with the review focus, although this may not always be possible. More commonly, new models are developed starting with the identification of outcomes and theorizing the necessary pre-conditions to reach those outcomes. This process of theorizing and identifying the steps and necessary pre-conditions continues, working backwards from the intended outcomes, until the intervention itself is represented. As many mechanisms of action are invisible and can only be ‘known’ through theory, this process is invaluable in exposing assumptions as to how interventions are thought to work; assumptions that might then be tested in the review. Logic models can be developed with stakeholders (see Section 2.5.2 ) and it is considered good practice to obtain stakeholder input in their development.
Logic models are representations of how interventions are intended to ‘work’, but they can also provide a useful basis for thinking through the unintended consequences of interventions and identifying potential adverse effects that may need to be captured in the review (Bonell et al 2015). While logic models provide a guiding theory of how interventions are intended to work, critiques exist around their use, including their potential to oversimplify complex intervention processes (Rohwer et al 2017). Here, contributions from different stakeholders to the development of a logic model can help to articulate where complex processes may occur, to theorize unintended intervention impacts, and to represent explicitly any ambiguity within parts of the causal chain where new theory or explanation is most valuable.
2.5.2 Changing review questions
While questions should be posed in the protocol before initiating the full review, these questions should not prevent exploration of unexpected issues. Reviews are analyses of existing data that are constrained by previously chosen study populations, settings, intervention formulations, outcome measures and study designs. It is generally not possible to formulate an answerable question for a review without knowing some of the studies relevant to the question, and it may become clear that the questions a review addresses need to be modified in light of evidence accumulated in the process of conducting the review.
Although a certain fluidity and refinement of questions is to be expected in reviews as a fuller understanding of the evidence is gained, it is important to guard against bias in modifying questions. Data-driven questions can generate false conclusions based on spurious results. Any changes to the protocol that result from revising the question for the review should be documented at the beginning of the Methods section. Sensitivity analyses may be used to assess the impact of changes on the review findings (see Chapter 10, Section 10.14 ). When refining questions it is useful to ask the following questions.
What is the motivation for the refinement?
Could the refinement have been influenced by results from any of the included studies?
Does the refined question require a modification to the search strategy and/or reassessment of any decisions regarding study eligibility?
Are data collection methods appropriate to the refined question?
Does the refined question still meet the FINER criteria discussed in Section 2.1 ?
2.5.3 Building in contingencies to deal with sparse data
The ability to address the review questions will depend on the maturity and validity of the evidence base. When few studies are identified, there will be limited opportunity to address the question through an informative synthesis. In anticipation of this scenario, review authors may build contingencies into their protocol analysis plan that specify grouping (any or multiple) PICO elements at a broader level, thus potentially enabling synthesis of a larger number of studies. Broader groupings will generally address a less specific question, for example:
‘the effect of any antioxidant supplement on …’ instead of ‘the effect of vitamin C on …’;
‘the effect of sexual health promotion on biological outcomes ’ instead of ‘the effect of sexual health promotion on sexually transmitted infections ’; or
‘the effect of cognitive behavioural therapy in children and adolescents on …’ instead of ‘the effect of cognitive behavioural therapy in children on …’.
However, such broader questions may be useful for identifying important leads in areas that lack effective interventions and for guiding future research. Changes in the grouping may affect the assessment of the certainty of the evidence (see Chapter 14 ).
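A contingency of this kind can be pre-specified as an explicit mapping from narrow intervention categories to broader ones, applied only when too few studies are found at the narrow level. The sketch below is illustrative only: the category names, the threshold of three studies and the data structure are assumptions for the example, not part of any protocol or Cochrane guidance.

```python
from collections import defaultdict

# Hypothetical narrow-to-broad intervention mapping; the names are
# illustrative, echoing the antioxidant example above.
BROADER_GROUP = {
    "vitamin C": "any antioxidant supplement",
    "vitamin E": "any antioxidant supplement",
    "selenium": "any antioxidant supplement",
}

def regroup(studies, min_per_group=3):
    """Group studies by intervention, keeping narrow groups that have
    at least min_per_group studies and otherwise falling back to the
    pre-specified broader group."""
    narrow = defaultdict(list)
    for study in studies:
        narrow[study["intervention"]].append(study)
    grouped = defaultdict(list)
    for name, members in narrow.items():
        key = name if len(members) >= min_per_group else BROADER_GROUP.get(name, name)
        grouped[key].extend(members)
    return dict(grouped)
```

Because the mapping and threshold are written down before the analysis, applying the contingency remains a pre-specified decision rather than a data-driven one.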
2.5.4 Economic data
Decision makers need to consider the economic aspects of an intervention, such as whether its adoption will lead to a more efficient use of resources. Economic data such as resource use, costs or cost-effectiveness (or a combination of these) may therefore be included as outcomes in a review. It is useful to break down measures of resource use and costs to the level of specific items or categories. It is helpful to consider an international perspective in the discussion of costs. Economic issues are discussed in detail in Chapter 20 .
2.6 Chapter information
Authors: James Thomas, Dylan Kneale, Joanne E McKenzie, Sue E Brennan, Soumyadeep Bhaumik
Acknowledgements: This chapter builds on earlier versions of the Handbook . Mona Nasser, Dan Fox and Sally Crowe contributed to Section 2.4 ; Hilary J Thomson contributed to Section 2.5.1 .
Funding: JT and DK are supported by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care North Thames at Barts Health NHS Trust. JEM is supported by an Australian National Health and Medical Research Council (NHMRC) Career Development Fellowship (1143429). SEB’s position is supported by the NHMRC Cochrane Collaboration Funding Program. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, the Department of Health or the NHMRC.
2.7 References
Anderson L, Petticrew M, Rehfuess E, Armstrong R, Ueffing E, Baker P, Francis D, Tugwell P. Using logic models to capture complexity in systematic reviews. Research Synthesis Methods 2011; 2 : 33–42.
Bailey JV, Murray E, Rait G, Mercer CH, Morris RW, Peacock R, Cassell J, Nazareth I. Interactive computer-based interventions for sexual health promotion. Cochrane Database of Systematic Reviews 2010; 9 : CD006483.
Baxter SK, Blank L, Woods HB, Payne N, Rimmer M, Goyder E. Using logic model methods in systematic review synthesis: describing complex pathways in referral management interventions. BMC Medical Research Methodology 2014; 14 : 62.
Bonell C, Jamal F, Melendez-Torres GJ, Cummins S. ‘Dark logic’: theorising the harmful consequences of public health interventions. Journal of Epidemiology and Community Health 2015; 69 : 95–98.
Bryant J, Sanson-Fisher R, Walsh J, Stewart J. Health research priority setting in selected high income countries: a narrative review of methods used and recommendations for future practice. Cost Effectiveness and Resource Allocation 2014; 12 : 23.
Caldwell DM, Welton NJ. Approaches for synthesising complex mental health interventions in meta-analysis. Evidence-Based Mental Health 2016; 19 : 16–21.
Cargo M, Harris J, Pantoja T, Booth A, Harden A, Hannes K, Thomas J, Flemming K, Garside R, Noyes J. Cochrane Qualitative and Implementation Methods Group guidance series-paper 4: methods for assessing evidence on intervention implementation. Journal of Clinical Epidemiology 2018; 97 : 59–69.
Chalmers I, Bracken MB, Djulbegovic B, Garattini S, Grant J, Gülmezoglu AM, Howells DW, Ioannidis JPA, Oliver S. How to increase value and reduce waste when research priorities are set. Lancet 2014; 383 : 156–165.
Chamberlain C, O’Mara-Eves A, Porter J, Coleman T, Perlen S, Thomas J, McKenzie J. Psychosocial interventions for supporting women to stop smoking in pregnancy. Cochrane Database of Systematic Reviews 2017; 2 : CD001055.
Cooper H. The problem formulation stage. In: Cooper H, editor. Integrating Research: A Guide for Literature Reviews . Newbury Park (CA) USA: Sage Publications; 1984.
Counsell C. Formulating questions and locating primary studies for inclusion in systematic reviews. Annals of Internal Medicine 1997; 127 : 380–387.
Cummings SR, Browner WS, Hulley SB. Conceiving the research question and developing the study plan. In: Hulley SB, Cummings SR, Browner WS, editors. Designing Clinical Research: An Epidemiological Approach . 4th ed. Philadelphia (PA): Lippincott Williams & Wilkins; 2007. p. 14–22.
Glass GV. Meta-analysis at middle age: a personal history. Research Synthesis Methods 2015; 6 : 221–231.
Hedges LV. Statistical considerations. In: Cooper H, Hedges LV, editors. The Handbook of Research Synthesis . New York (NY), USA: Russell Sage Foundation; 1994.
Hetrick SE, McKenzie JE, Cox GR, Simmons MB, Merry SN. Newer generation antidepressants for depressive disorders in children and adolescents. Cochrane Database of Systematic Reviews 2012; 11 : CD004851.
Hoffmann T, Glasziou P, Boutron I. Better reporting of interventions: template for intervention description and replication (TIDieR) checklist and guide. BMJ 2014; 348: g1687.
Hollands GJ, Shemilt I, Marteau TM, Jebb SA, Lewis HB, Wei Y, Higgins JPT, Ogilvie D. Portion, package or tableware size for changing selection and consumption of food, alcohol and tobacco. Cochrane Database of Systematic Reviews 2015; 9 : CD011045.
Keown K, Van Eerd D, Irvin E. Stakeholder engagement opportunities in systematic reviews: Knowledge transfer for policy and practice. Journal of Continuing Education in the Health Professions 2008; 28 : 67–72.
Kneale D, Thomas J, Harris K. Developing and optimising the use of logic models in systematic reviews: exploring practice and good practice in the use of programme theory in reviews. PloS One 2015; 10 : e0142187.
Lewin S, Hendry M, Chandler J, Oxman AD, Michie S, Shepperd S, Reeves BC, Tugwell P, Hannes K, Rehfuess EA, Welch V, McKenzie JE, Burford B, Petkovic J, Anderson LM, Harris J, Noyes J. Assessing the complexity of interventions within systematic reviews: development, content and use of a new tool (iCAT_SR). BMC Medical Research Methodology 2017; 17 : 76.
Lorenc T, Petticrew M, Welch V, Tugwell P. What types of interventions generate inequalities? Evidence from systematic reviews. Journal of Epidemiology and Community Health 2013; 67 : 190–193.
Nasser M, Ueffing E, Welch V, Tugwell P. An equity lens can ensure an equity-oriented approach to agenda setting and priority setting of Cochrane Reviews. Journal of Clinical Epidemiology 2013; 66 : 511–521.
Nasser M. Setting priorities for conducting and updating systematic reviews [PhD Thesis]: University of Plymouth; 2018.
O’Neill J, Tabish H, Welch V, Petticrew M, Pottie K, Clarke M, Evans T, Pardo Pardo J, Waters E, White H, Tugwell P. Applying an equity lens to interventions: using PROGRESS ensures consideration of socially stratifying factors to illuminate inequities in health. Journal of Clinical Epidemiology 2014; 67 : 56–64.
Oliver S, Dickson K, Bangpan M, Newman M. Getting started with a review. In: Gough D, Oliver S, Thomas J, editors. An Introduction to Systematic Reviews . London (UK): Sage Publications Ltd.; 2017.
Petticrew M, Roberts H. Systematic Reviews in the Social Sciences: A Practical Guide . Oxford (UK): Blackwell; 2006.
Pfadenhauer L, Gerhardus A, Mozygemba K, Lysdahl KB, Booth A, Hofmann B, Wahlster P, Polus S, Burns J, Brereton L, Rehfuess E. Making sense of complexity in context and implementation: the Context and Implementation of Complex Interventions (CICI) framework. Implementation Science 2017; 12 : 21.
Rehfuess EA, Booth A, Brereton L, Burns J, Gerhardus A, Mozygemba K, Oortwijn W, Pfadenhauer LM, Tummers M, van der Wilt GJ, Rohwer A. Towards a taxonomy of logic models in systematic reviews and health technology assessments: a priori, staged, and iterative approaches. Research Synthesis Methods 2018; 9 : 13–24.
Richardson WS, Wilson MC, Nishikawa J, Hayward RS. The well-built clinical question: a key to evidence-based decisions. ACP Journal Club 1995; 123 : A12–13.
Rohwer A, Pfadenhauer L, Burns J, Brereton L, Gerhardus A, Booth A, Oortwijn W, Rehfuess E. Series: Clinical epidemiology in South Africa. Paper 3: Logic models help make sense of complexity in systematic reviews and health technology assessments. Journal of Clinical Epidemiology 2017; 83 : 37–47.
Sharma T, Choudhury M, Rejón-Parrilla JC, Jonsson P, Garner S. Using HTA and guideline development as a tool for research priority setting the NICE way: reducing research waste by identifying the right research to fund. BMJ Open 2018; 8 : e019777.
Squires J, Valentine J, Grimshaw J. Systematic reviews of complex interventions: framing the review question. Journal of Clinical Epidemiology 2013; 66 : 1215–1222.
Tong A, Chando S, Crowe S, Manns B, Winkelmayer WC, Hemmelgarn B, Craig JC. Research priority setting in kidney disease: a systematic review. American Journal of Kidney Diseases 2015; 65 : 674–683.
Tong A, Sautenet B, Chapman JR, Harper C, MacDonald P, Shackel N, Crowe S, Hanson C, Hill S, Synnot A, Craig JC. Research priority setting in organ transplantation: a systematic review. Transplant International 2017; 30 : 327–343.
Turley R, Saith R, Bhan N, Rehfuess E, Carter B. Slum upgrading strategies involving physical environment and infrastructure interventions and their effects on health and socio-economic outcomes. Cochrane Database of Systematic Reviews 2013; 1 : CD010067.
van der Heijden I, Abrahams N, Sinclair D. Psychosocial group interventions to improve psychological well-being in adults living with HIV. Cochrane Database of Systematic Reviews 2017; 3 : CD010806.
Viergever RF. Health Research Prioritization at WHO: An Overview of Methodology and High Level Analysis of WHO Led Health Research Priority Setting Exercises . Geneva (Switzerland): World Health Organization; 2010.
Viergever RF, Olifson S, Ghaffar A, Terry RF. A checklist for health research priority setting: nine common themes of good practice. Health Research Policy and Systems 2010; 8 : 36.
Whitehead M. The concepts and principles of equity and health. International Journal of Health Services 1992; 22 : 429–445.
Open access
Published: 19 August 2024
Patient reported measures of continuity of care and health outcomes: a systematic review
Patrick Burch, Alex Walter, Stuart Stewart & Peter Bower
BMC Primary Care , volume 25, article number 309 (2024)
There is a considerable amount of research showing an association between continuity of care and improved health outcomes. However, the methods used in most studies examine only the pattern of interactions between patients and clinicians through administrative measures of continuity. The patient experience of continuity can also be measured by using patient reported experience measures. Unlike administrative measures, these can allow elements of continuity such as the presence of information or how joined up care is between providers to be measured. Patient experienced continuity is a marker of healthcare quality in its own right. However, it is unclear if, like administrative measures, patient reported continuity is also linked to positive health outcomes.
Cohort and interventional studies that examined the relationship between patient reported continuity of care and a health outcome were eligible for inclusion. Medline, EMBASE, CINAHL and the Cochrane Library were searched in April 2021. Citation searching of published continuity measures was also performed. QUIP and Cochrane risk of bias tools were used to assess study quality. A box-score method was used for study synthesis.
Nineteen studies were eligible for inclusion. Fifteen studies measured continuity using a validated, multifactorial questionnaire or the continuity/co-ordination subscale of another instrument. Two studies placed patients into discrete groups of continuity based on pre-defined questions, one used a bespoke questionnaire, and one calculated an administrative measure of continuity using patient reported data. Outcome measures examined were quality of life ( n = 11), self-reported health status ( n = 8), emergency department use or hospitalisation ( n = 7), indicators of function or wellbeing ( n = 6), mortality ( n = 4) and physiological measures ( n = 2). Analysis was limited by the relatively small number of heterogeneous studies. The majority of studies showed a link between at least one measure of continuity and one health outcome.
Whilst there is emerging evidence of a link between patient reported continuity and several outcomes, the evidence is not as strong as that for administrative measures of continuity. This may be because administrative measures record something different to patient reported measures, or that studies using patient reported measures are smaller and less able to detect smaller effects. Future research should use larger sample sizes to clarify if a link does exist and what the potential mechanisms underlying such a link could be. When measuring continuity, researchers and health system administrators should carefully consider what type of continuity measure is most appropriate.
Peer Review reports
Introduction
Continuity of primary care is associated with multiple positive outcomes including reduced hospital admissions, lower costs and a reduction in mortality [ 1 , 2 , 3 ]. Providing continuity is often seen as opposed to providing rapid access to appointments [ 4 ] and many health systems have chosen to focus primary care policy on access rather than continuity [ 5 , 6 , 7 ]. Continuity has fallen in several primary care systems and this has led to calls to improve it [ 8 , 9 ]. However, it is sometimes unclear exactly what continuity is and what should be improved.
In its most basic form, continuity of care can be defined as a continuous relationship between a patient and a healthcare professional [ 10 ]. However, from the patient perspective, continuity of care can also be experienced as joined up seamless care from multiple providers [ 11 ].
One of the most commonly cited models of continuity, by Haggerty et al., defines continuity as
“ …the degree to which a series of discrete healthcare events is experienced as coherent and connected and consistent with the patient’s medical needs and personal context. Continuity of care is distinguished from other attributes of care by two core elements—care over time and the focus on individual patients” [ 11 ].
It then breaks continuity down into three parts (see Table 1 ) [ 11 ]. Other academic models of patient continuity exist, but they contain elements which are broadly analogous [ 10 , 12 , 13 , 14 ].
Continuity can be measured through administrative measures or by asking patients about their experience of continuity [ 16 ]. Administrative measures are commonly used as they allow continuity to be calculated easily for large numbers of patient consultations. Administrative measures capture one element of continuity – the frequency or pattern of professionals seen by a patient [ 16 , 17 ]. There are multiple studies and several systematic reviews showing that better health outcomes are associated with administrative measures of continuity of care [ 1 , 2 , 18 , 19 ]. One of the most recent of these reviews used a box-score method to assess the relationship between reduced mortality and continuity (i.e., counting the numbers of studies reporting significant and non-significant relationships) [ 18 ]. The review examined thirteen studies and found a positive association in nine. Administrative measures of continuity cannot capture aspects of continuity such as informational or management continuity or the nature of the relationship between the patient and clinicians. To address this, several patient-reported experience measures (PREMs) of continuity have been developed that attempt to capture the patient experience of continuity beyond the pattern in which they see particular clinicians [ 14 , 17 , 20 , 21 ]. Studies have shown a variable correlation between administrative and patient reported measures of continuity and their relationship to health outcomes [ 22 ]. Pearson correlation coefficients vary between 0.11 and 0.87 depending on what is measured and how [ 23 , 24 ]. This suggests that they are capturing different things and that both measures have their uses and drawbacks [ 23 , 25 ]. Patients may have good administrative measures of continuity but report a poor experience. Conversely, administrative measures of continuity may be poor, but a patient may report a high level of experienced continuity.
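For context, administrative measures of this kind reduce a patient's visit history to a single index. Two widely used examples are the Usual Provider of Care (UPC) index and the Bice–Boxerman Continuity of Care (COC) index; the sketch below implements their standard formulas purely as an illustration of what such measures do and do not capture. It is not code from any study in this review, and the visit data are invented.

```python
from collections import Counter

def upc(visits):
    """Usual Provider of Care index: the share of visits made to the
    most-seen provider (ranges from 1/k for k providers up to 1.0)."""
    counts = Counter(visits)
    return max(counts.values()) / len(visits)

def coc(visits):
    """Bice-Boxerman Continuity of Care index:
    (sum(n_i^2) - N) / (N * (N - 1)), where n_i is the number of visits
    to provider i and N is the total number of visits (requires N > 1).
    Equals 1.0 when every visit is to the same provider."""
    counts = Counter(visits)
    n = len(visits)
    return (sum(c * c for c in counts.values()) - n) / (n * (n - 1))

# Hypothetical visit history: four visits to one GP, one to another.
visits = ["Dr A", "Dr A", "Dr A", "Dr A", "Dr B"]
print(upc(visits))  # 0.8
print(coc(visits))  # 0.6
```

Note that both indices see only who was visited and how often; nothing in the inputs reflects whether the patient experienced the care as coherent or connected, which is exactly the gap that PREMs of continuity aim to fill.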
Patient experienced continuity and patient satisfaction with healthcare are aims in their own right in many healthcare systems [ 26 ]. Whilst this is laudable, it may be unclear to policy makers if prioritising patient-experienced continuity will improve health outcomes.
This review seeks to answer two questions.
Is patient reported continuity of care associated with positive health outcomes?
Are particular types of patient reported continuity (relational, informational or management) associated with positive health outcomes?
A review protocol was registered with PROSPERO in June 2021 (ID: CRD42021246606).
Search strategy
A structured search was undertaken using appropriate search terms on Medline, EMBASE, CINAHL and the Cochrane Library in April 2021 (see Appendix ). The searches were limited to the last 20 years. This age limitation reflects the period in which the more holistic description of continuity (as exemplified by Haggerty et al. 2003) became more prominent. In addition to database searches, existing reviews of PREMs of continuity and co-ordination were searched for appropriate measures. Citation searching of these measures was then undertaken to locate studies that used these outcome measures.
Inclusion criteria
Full text papers were reviewed if the title or abstract suggested that the paper measured (a) continuity through a PREM and (b) a health outcome. Health outcomes were defined as outcomes that measured a direct effect on patient health (e.g., health status) or patient use of emergency or inpatient care. Papers with outcomes relating to patient satisfaction or satisfaction with a particular service were excluded as were process measures (such as quality of documentation, cost to health care provider). Cohort and interventional studies were eligible for inclusion, if they reported data on the relationship between continuity and a relevant health outcome. Cross-sectional studies were excluded because of the risk of recall bias [ 27 ].
The majority of participants in a study had to be aged over 16, based in a healthcare setting and receiving healthcare from healthcare professionals (medical or non-medical). We felt that patients under 16 were unlikely to be asked to fill out continuity PREMs. Studies that used PREMs to quantitatively measure one or more elements of experienced continuity of care or coordination were eligible for inclusion [ 11 ]. Any PREMs that could map to one or more of the three key elements of Haggerty’s definition (Table 1 ) were eligible for inclusion. The types of continuity measured by each study were mapped to the Haggerty concepts of continuity by at least two reviewers independently. Our search also included patient reported measures of co-ordination, as a previous review of continuity PREMs highlighted the conceptual overlap between patient experienced continuity and some measures of patient experienced co-ordination [ 17 ]. Whilst there are different definitions of co-ordination, the concept of patient perceived co-ordination is arguably the same as management continuity [ 13 , 14 , 28 ]. Patient reported measures of care co-ordination were reviewed by two reviewers to see whether they measured the concept of management continuity. Because of the overlap between concepts of continuity and other theories (e.g., patient-centred care, quality of care), in studies where it was not clear that continuity was being measured, agreement on inclusion/exclusion, with documented reasons, was reached after discussion between three of the reviewers (PB, SS and AW). Disagreements were resolved by documented group discussion. Some PREMs measured concepts of continuity alongside other concepts, such as access. These studies were eligible for inclusion only if measurements of continuity were reported and analysed separately.
Data abstraction
All titles/abstracts were initially screened by one reviewer (PB). Twenty percent of the abstracts were independently reviewed by two other reviewers (SS and AW), blinded to the results of the initial screening. All full text reviews were done independently by two blinded reviewers. Disagreements were resolved by group discussion between PB, SS, AW and PBo. Excel was used for collation of search results, titles, and abstracts. Rayyan was used in the full text review process.
Data extraction was performed independently by two reviewers. The following data were extracted to an Excel spreadsheet: study design, setting, participant inclusion criteria, method of measurement of continuity, type of continuity measured, outcomes analysed, temporal relationship of continuity to outcomes in the study, co-variates, and quantitative data for continuity measures and outcomes. Disagreements were resolved by documented discussion or involvement of a third reviewer.
Study risk of bias assessment
Cohort studies were assessed for risk of bias at a study level by two reviewers acting independently, using the QUIPS tool [ 29 ]. Trials were assessed using the Cochrane risk of bias tool. The use of the QUIPS tool was a deviation from the review protocol, as the Newcastle-Ottawa tool named in the protocol was less suitable for the type of cohort studies returned in the search. Any disagreements in rating were resolved by documented discussion.
As outlined in our original protocol, our preferred analysis strategy was to perform meta-analysis. However, we were unable to do this as insufficient numbers of studies reported data amenable to the calculation of an effect size. Instead, we used a box-score method [ 30 ]. This involved assessing and tabulating the relationship between each continuity measure and each outcome in each study. These relationships were recorded as either positive, negative or non-significant (using a conventional p value of < 0.05 as our cut off for significance). Advantages and disadvantages of this method are explored in the discussion section. Where a study used both bivariate analysis and multivariate analysis, the results from the multivariate analysis were extracted. Results were marked as “mixed” where more than one measure for an outcome was used and the significance/direction differed between outcome measures. Sensitivity analysis of study quality and size was carried out.
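The box-score tabulation described above can be sketched as follows. The study records are invented for illustration; the classification rule (direction of the point estimate plus the p < 0.05 cut-off) follows the description in the text:

```python
def box_score(results, alpha=0.05):
    """Classify each continuity-outcome relationship as positive,
    negative or non-significant, then tally the counts."""
    tally = {"positive": 0, "negative": 0, "non-significant": 0}
    for r in results:
        if r["p"] >= alpha:
            tally["non-significant"] += 1
        elif r["estimate"] > 0:
            tally["positive"] += 1
        else:
            tally["negative"] += 1
    return tally

# Hypothetical extracted relationships (one row per continuity
# measure/outcome pair); estimates and p values are invented.
results = [
    {"study": "A", "estimate": 0.42, "p": 0.01},   # significant, positive
    {"study": "B", "estimate": -0.15, "p": 0.03},  # significant, negative
    {"study": "C", "estimate": 0.08, "p": 0.40},   # not significant
]
print(box_score(results))
# {'positive': 1, 'negative': 1, 'non-significant': 1}
```

As the discussion notes, this method counts studies rather than weighting them, so a 100-participant study and a 10,000-participant study contribute equally to the tally.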
Figure 1 shows the search results and the numbers of inclusions/exclusions. Studies were excluded for a number of reasons, including having inappropriate outcome measures [ 31 ], focusing on non-adult patient populations [ 32 ] and reporting insufficient data to examine the relationship between continuity and outcomes [ 33 ]. All studies are described in Table 2 .
Results of search strategy (NB: 18 studies provided 19 assessments)
Study settings
Studies took place in 9 different, mostly economically developed, countries. Studies were set in primary care (n = 5), hospital/specialist outpatient (n = 7), hospital in-patient (n = 5), or the general population (n = 2).
Study design and assessment of bias
All included studies, apart from one trial [ 34 ], were cohort studies. Study duration varied from 2 months to 5 years. Most studies were rated as being at low-moderate or moderate risk of bias, due to outcomes being patient reported, issues with recruitment, inadequate description of cohort populations, significant rates of attrition and/or failure to account for patients lost to follow up.
Measurement of continuity
The majority of the studies (15/19) measured continuity using a validated, multifactorial patient reported measure of continuity or using the continuity/co-ordination subscale of another validated instrument. Two studies placed patients into discrete groups of continuity based on answers to pre-defined questions (e.g., do you have a regular GP that you see? ) [ 35 , 36 ], one used a bespoke questionnaire [ 34 ], and one calculated an administrative measure of continuity (UPC – Usual Provider of Care index) using patient reported visit data collected from patient interviews [ 37 ]. Ten studies reported more than one type of patient reported continuity, four reported relational continuity, three reported overall continuity, one informational continuity and one management continuity.
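The UPC index mentioned above is conventionally defined as the share of a patient's visits made to their most frequently seen provider. A minimal sketch under that standard definition (the visit history and provider identifiers are invented):

```python
from collections import Counter

def upc_index(visits):
    """Usual Provider of Care index: proportion of visits made to the
    most frequently seen provider (1.0 = perfect continuity)."""
    if not visits:
        raise ValueError("no visits recorded")
    counts = Counter(visits)
    # most_common(1) returns [(provider, visit_count)] for the top provider
    return counts.most_common(1)[0][1] / len(visits)

# Illustrative visit history: 4 of 6 visits to the same provider.
visits = ["dr_a", "dr_a", "dr_b", "dr_a", "dr_c", "dr_a"]
print(upc_index(visits))  # 4/6 ≈ 0.667
```

The study cited above is unusual in deriving this administrative-style index from patient-reported visit data rather than from claims or appointment records.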
Study outcomes
Most of the studies reported more than one outcome measure. To enable comparison across studies we grouped the most common outcome measures together. These were quality of life ( n = 11), self-reported health status ( n = 8), emergency department use or hospitalisation ( n = 7), and mortality ( n = 4). Other outcomes reported included physiological parameters e.g., blood pressure or blood test parameters ( n = 2) [ 36 , 38 ] and other indicators of functioning or well-being ( n = 6).
Association between outcomes and continuity measures
Twelve of the nineteen studies demonstrated at least one statistically significant association between at least one patient reported measure of continuity and at least one outcome. However, ten of these studies examined more than one outcome measure. Two of these significant studies showed negative findings; better informational continuity was associated with worse self-reported disease status [ 35 ] and improved continuity was related to increased admissions and ED use [ 39 ]. Four studies demonstrated no association between measures of continuity and any health outcomes.
The four most commonly reported types of outcomes were analysed separately (Table 3 ). For all of these outcomes, a majority of studies showed no significant association with continuity or a mixed/unclear association. Sensitivity analysis of the results in Table 3 , excluding high and moderate-high risk studies, did not change this finding. Each of these outcomes was also examined in relation to the type of continuity that was measured (Table 4 ). Apart from the relationship between informational continuity and quality of life, all other combinations of continuity type/outcome had a majority of studies showing no significant association with continuity or a mixed/unclear association. However, the relationship between informational continuity and quality of life was only examined in two separate studies [ 40 , 41 ]. One of these studies contained fewer than 100 patients and was removed when sensitivity analysis of study size was carried out [ 40 ]. Sensitivity analysis of the results in Table 4 , excluding high and moderate-high risk studies, did not change the findings.
Two sensitivity analyses were carried out: (a) removing all studies with fewer than 100 participants and (b) removing those with fewer than 1000 participants. There were only five studies with at least 1000 participants. These all showed at least one positive association between continuity and a health outcome. Of note, three of these five studies examined emergency department use/readmissions, and all three found a significant positive association.
Continuity of care is a multi-dimensional concept that is often linked to positive health outcomes. There is strong evidence that administrative measures of continuity are associated with improved health outcomes including a reduction in mortality, healthcare costs and utilisation of healthcare [ 3 , 18 , 19 ]. Our interpretation of the evidence in this review is that there is an emerging link between patient reported continuity and health outcomes. Most studies in the review contained at least one significant association between continuity and a health outcome. However, when outcome measures were examined individually, the findings were less consistent.
The evidence for a link between patient reported continuity and health outcomes is not as strong as that for administrative measures. There are several possible explanations for this. The review retrieved a relatively small number of studies that examined a range of different outcomes, in different patient populations and settings, using different measures of continuity. This resulted in small numbers of studies examining the relationship of a particular measure of continuity with a particular outcome (Table 4 ). The studies in the review took place in a wide variety of country and healthcare settings, and it may be that the effects of continuity vary in different contexts. Finally, in comparison to studies of administrative measures of continuity, the studies in this review were small: the median number of participants was 486, compared to 39,249 in a recent systematic review examining administrative measures of continuity [ 18 ]. Smaller studies are less able to detect small effect sizes, and this may be the principal reason for the difference between the results of this review and previous reviews of administrative measures of continuity. When studies with fewer than 1000 participants were excluded, all remaining studies showed at least one positive finding, and there was a consistent association between continuity and reduced emergency department use/re-admissions. This suggests that a modest association between certain outcomes and patient reported continuity may be present but that, because of effect size, larger studies are needed to demonstrate it. The box-score method does not take account of the differing sizes of studies.
Continuity is not a concept that is universally agreed upon. We mapped concepts of continuity onto the commonly used Haggerty framework [ 11 ]. Apart from the use of the Nijmegen Continuity of care questionnaire in three studies [ 42 ], all studies measured continuity using different methods and concepts of continuity. We could have used other theoretical constructs of continuity for the mapping of measures. It was not possible to find the exact questions asked of patients in every study. We therefore mapped several of the continuity measures based on higher level descriptions given by the authors. The diversity of patient measures may account for some of the variability in findings between studies. However, it may be that the nature of continuity captured by patient reported measures is less closely linked to health outcomes than that captured by administrative measures. Administrative measures capture the pattern of interactions between patients and clinicians. All studies in this review (apart from Study 18) use PREMs that attempt to capture something different to the pattern in which a patient sees a clinician. Depending on the specific measure used, this includes: aspects of information transfer between services, how joined up care was between different providers and the nature of the patient-clinician relationship. PREMs can only capture what the patient perceives and remembers. The experience of continuity for the patient is important in its own right. However, it may be that the aspects of continuity that are most linked to positive health outcomes are best reflected by administrative measures. Sidaway-Lee et al. have hypothesised why relational continuity may be linked to health outcomes [ 43 ]. This includes the ability for a clinician to think more holistically and the motivation to “go the extra mile” for a patient. 
Whilst these are difficult to measure directly, it may be that administrative measures are a better proxy marker than PREMs for these aspects of continuity.
Conclusions/future work
This review shows a potential emerging relationship between patient reported continuity and health outcomes. However, the evidence for this association is currently weaker than that demonstrated in previous reviews of administrative measures of continuity.
If continuity is to be measured and improved, as is being proposed in some health systems [ 44 ], these findings have potential implications for what type of measure we should use. Measurement of health system performance often drives change [ 45 ]. Health systems may respond to calls to improve continuity differently, depending on how continuity is measured. Continuity PREMs are important, and patient experienced continuity should be a goal in its own right. However, it is the fact that continuity is linked to multiple positive health care and health system outcomes that is often given as the reason for pursuing it as a goal [ 8 , 44 , 46 ]. Whilst this review shows there is emerging evidence of a link, it is not as strong as that found in studies of administrative measures. If, as has been shown in other work, PREMs and administrative measures are looking at different things [ 23 , 24 ], we need to choose our measures of continuity carefully.
Larger studies are required to confirm the emerging link between patient experienced continuity and outcomes shown in this paper. Future studies, where possible, should collect both administrative and patient reported measures of continuity and seek to understand the relative importance of the three different aspects of continuity (relational, informational, managerial). The relationship between patient experienced continuity and outcomes is likely to vary between different groups, and future work should examine differential effects in different patient populations. There are now several validated measures of patient experienced continuity [ 17 , 20 , 21 , 42 ]. Whilst there may be an argument that more should be developed, the use of a standardised questionnaire (such as the Nijmegen questionnaire) where possible would enable closer comparison between patient experiences in different healthcare settings.
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Gray DJP, Sidaway-Lee K, White E, Thorne A, Evans PH. Continuity of care with doctors - a matter of life and death? A systematic review of continuity of care and mortality. BMJ Open. 2018;8(6):1–12.
Barker I, Steventon A, Deeny SR. Association between continuity of care in general practice and hospital admissions for ambulatory care sensitive conditions: cross sectional study of routinely collected, person level data. BMJ Online. 2017;356.
Bazemore A, Merenstein Z, Handler L, Saultz JW. The impact of interpersonal continuity of primary care on health care costs and use: a critical review. Ann Fam Med. 2023;21(3):274–9.
Palmer W, Hemmings N, Rosen R, Keeble E, Williams S, Imison C. Improving access and continuity in general practice. The Nuffield Trust; 2018 [cited 2022 Jan 15]. https://www.nuffieldtrust.org.uk/research/improving-access-and-continuity-in-general-practice
Pettigrew LM, Kumpunen S, Rosen R, Posaner R, Mays N. Lessons for ‘large-scale’ general practice provider organisations in England from other inter-organisational healthcare collaborations. Health Policy. 2019;123(1):51–61.
Glenister KM, Guymer J, Bourke L, Simmons D. Characteristics of patients who access zero, one or multiple general practices and reasons for their choices: a study in regional Australia. BMC Fam Pract. 2021;22(1):2.
Kringos D, Boerma W, Bourgueil Y, Cartier T, Dedeu T, Hasvold T, et al. The strength of primary care in Europe: an international comparative study. Br J Gen Pract. 2013;63(616):e742–50.
Salisbury H. Helen Salisbury: everyone benefits from continuity of care. BMJ. 2023;382:p1870.
Gray DP, Sidaway-Lee K, Johns C, Rickenbach M, Evans PH. Can general practice still provide meaningful continuity of care? BMJ. 2023;383:e074584.
Ladds E, Greenhalgh T. Modernising continuity: a new conceptual framework. Br J Gen Pr. 2023;73(731):246–8.
Haggerty JL, Reid, Robert, Freeman G, Starfield B, Adair CE, McKendry R. Continuity of care: a multidisciplinary review. BMJ. 2003;327(7425):1219–21.
Freeman G, Shepperd S, Robinson I, Ehrich K, Richards S, Pitman P et al. Continuity of care continuity of care report of a scoping exercise for the national co-ordinating centre for NHS service delivery and organisation R & D. 2001 [cited 2020 Oct 15]. https://njl-admin.nihr.ac.uk/document/download/2027166
Saultz JW. Defining and measuring interpersonal continuity of care. Ann Fam Med. 2003;1(3):134–43.
Uijen AA, Schers HJ, Schellevis FG, van den Bosch WJHM. How unique is continuity of care? A review of continuity and related concepts. Fam Pract. 2012;29(3):264–71.
Murphy M, Salisbury C. Relational continuity and patients’ perception of GP trust and respect: a qualitative study. Br J Gen Pr. 2020;70(698):e676–83.
Gray DP, Sidaway-Lee K, Whitaker P, Evans P. Which methods are most practicable for measuring continuity within general practices? Br J Gen Pract. 2023;73(731):279–82.
Uijen AA, Schers HJ. Which questionnaire to use when measuring continuity of care. J Clin Epidemiol. 2012;65(5):577–8.
Baker R, Bankart MJ, Freeman GK, Haggerty JL, Nockels KH. Primary medical care continuity and patient mortality. Br J Gen Pr. 2020;70(698):E600–11.
Van Walraven C, Oake N, Jennings A, Forster AJ. The association between continuity of care and outcomes: a systematic and critical review. J Eval Clin Pr. 2010;16(5):947–56.
Aller MB, Vargas I, Garcia-Subirats I, Coderch J, Colomés L, Llopart JR, et al. A tool for assessing continuity of care across care levels: an extended psychometric validation of the CCAENA questionnaire. Int J Integr Care. 2013;13(OCT/DEC):1–11.
Haggerty JL, Roberge D, Freeman GK, Beaulieu C, Bréton M. Validation of a generic measure of continuity of care: when patients encounter several clinicians. Ann Fam Med. 2012;10(5):443–51.
Bentler SE, Morgan RO, Virnig BA, Wolinsky FD, Hernandez-Boussard T. The association of longitudinal and interpersonal continuity of care with emergency department use, hospitalization, and mortality among medicare beneficiaries. PLoS ONE. 2014;9(12):1–18.
Bentler SE, Morgan RO, Virnig BA, Wolinsky FD. Do claims-based continuity of care measures reflect the patient perspective? Med Care Res Rev. 2014;71(2):156–73.
Rodriguez HP, Marshall RE, Rogers WH, Safran DG. Primary care physician visit continuity: a comparison of patient-reported and administratively derived measures. J Gen Intern Med. 2008;23(9):1499–502.
Adler R, Vasiliadis A, Bickell N. The relationship between continuity and patient satisfaction: a systematic review. Fam Pr. 2010;27(2):171–8.
Bodenheimer T, Sinsky C. From triple to quadruple aim: care of the patient requires care of the provider. Ann Fam Med. 2014;12(6):573–6.
Althubaiti A. Information bias in health research: definition, pitfalls, and adjustment methods. J Multidiscip Healthc. 2016;9:211–7.
Schultz EM, McDonald KM. What is care coordination? Int J Care Coord. 2014;17(1–2):5–24.
Hayden JA, van der Windt DA, Cartwright JL, Côté P, Bombardier C. Assessing bias in studies of prognostic factors. Ann Intern Med. 2013;158(4):280–6.
Green BF, Hall JA. Quantitative methods for literature reviews. Annu Rev Psychol. 1984;35(1):37–54.
Safran DG, Montgomery JE, Chang H, Murphy J, Rogers WH. Switching doctors: predictors of voluntary disenrollment from a primary physician’s practice. J Fam Pract. 2001;50(2):130–6.
Burns T, Catty J, Harvey K, White S, Jones IR, McLaren S, et al. Continuity of care for carers of people with severe mental illness: results of a longitudinal study. Int J Soc Psychiatry. 2013;59(7):663–70.
Engelhardt JB, Rizzo VM, Della Penna RD, Feigenbaum PA, Kirkland KA, Nicholson JS, et al. Effectiveness of care coordination and health counseling in advancing illness. Am J Manag Care. 2009;15(11):817–25.
Uijen AA, Bischoff EWMA, Schellevis FG, Bor HHJ, Van Den Bosch WJHM, Schers HJ. Continuity in different care modes and its relationship to quality of life: a randomised controlled trial in patients with COPD. Br J Gen Pr. 2012;62(599):422–8.
Humphries C, Jaganathan S, Panniyammakal J, Singh S, Dorairaj P, Price M, et al. Investigating discharge communication for chronic disease patients in three hospitals in India. PLoS ONE. 2020;15(4):1–20.
Konrad TR, Howard DL, Edwards LJ, Ivanova A, Carey TS. Physician-patient racial concordance, continuity of care, and patterns of care for hypertension. Am J Public Health. 2005;95(12):2186–90.
Van Walraven C, Taljaard M, Etchells E, Bell CM, Stiell IG, Zarnke K, et al. The independent association of provider and information continuity on outcomes after hospital discharge: implications for hospitalists. J Hosp Med. 2010;5(7):398–405.
Gulliford MC, Naithani S, Morgan M. Continuity of care and intermediate outcomes of type 2 diabetes mellitus. Fam Pr. 2007;24(3):245–51.
Kaneko M, Aoki T, Mori H, Ohta R, Matsuzawa H, Shimabukuro A, et al. Associations of patient experience in primary care with hospitalizations and emergency department visits on isolated islands: a prospective cohort study. J Rural Health. 2019;35(4):498–505.
Beesley VL, Janda M, Burmeister EA, Goldstein D, Gooden H, Merrett ND, et al. Association between pancreatic cancer patients’ perception of their care coordination and patient-reported and survival outcomes. Palliat Support Care. 2018;16(5):534–43.
Valaker I, Fridlund B, Wentzel-Larsen T, Nordrehaug JE, Rotevatn S, Råholm MB, et al. Continuity of care and its associations with self-reported health, clinical characteristics and follow-up services after percutaneous coronary intervention. BMC Health Serv Res. 2020;20(1):1–15.
Uijen AA, Schellevis FG, Van Den Bosch WJHM, Mokkink HGA, Van Weel C, Schers HJ. Nijmegen continuity questionnaire: development and testing of a questionnaire that measures continuity of care. J Clin Epidemiol. 2011;64(12):1391–9.
Sidaway-Lee K, Gray DP, Evans P, Harding A. What mechanisms could link GP relational continuity to patient outcomes? Br J Gen Pr. 2021;(June):278–81.
House of Commons Health and Social Care Committee. The future of general practice. 2022. https://publications.parliament.uk/pa/cm5803/cmselect/cmhealth/113/report.html
Close J, Byng R, Valderas JM, Britten N, Lloyd H. Quality after the QOF? Before dismantling it, we need a redefined measure of ‘quality’. Br J Gen Pract. 2018;68(672):314–5.
Gray DJP. Continuity of care in general practice. BMJ. 2017;356:j84.
Acknowledgements
Not applicable.
Patrick Burch carried this work out as part of a PhD Fellowship funded by THIS Institute.
Author information
Authors and affiliations.
Centre for Primary Care and Health Services Research, Institute of Population Health, University of Manchester, Manchester, England
Patrick Burch, Alex Walter, Stuart Stewart & Peter Bower
Contributions
PBu conceived the review and performed the searches. PBu, AW and SS performed the paper selections, reviews and data abstractions. PBo helped with the design of the review and was involved in resolving reviewer disputes. All authors contributed towards the drafting of the final manuscript.
Corresponding author
Correspondence to Patrick Burch .
Ethics declarations
Ethics approval, consent for publication, competing interests.
The authors declare no competing interests.
Additional information
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary Material 1
Supplementary Material 2
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article.
Burch, P., Walter, A., Stewart, S. et al. Patient reported measures of continuity of care and health outcomes: a systematic review. BMC Prim. Care 25 , 309 (2024). https://doi.org/10.1186/s12875-024-02545-8
Received : 27 March 2023
Accepted : 29 July 2024
Published : 19 August 2024
Background Large Vessel Occlusion (LVO) is a serious condition that causes approximately 24-46% of acute ischemic strokes (AIS). LVO strokes tend to have higher mortality rates and result in more severe long-term disabilities compared to non-LVO ischemic strokes. Early intervention with endovascular therapy (EVT) is recommended; however, EVT is limited to tertiary care hospitals with specialized facilities. Therefore, identifying patients with a high probability of LVO in prehospital settings and ensuring their rapid transfer to appropriate hospitals is crucial. While LVO diagnosis typically requires advanced imaging such as MRI or CT scans, various scoring systems based on neurological symptoms have been developed for prehospital use. Although previous systematic reviews have addressed some of these scales, recent studies have introduced new scales and additional data on their accuracy. This systematic review and meta-analysis aims to summarize the current evidence on the diagnostic accuracy of these prehospital LVO screening scales.
Methods This systematic review and meta-analysis will be conducted in accordance with the PRISMA-DTA Statement and the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. We will include observational studies and randomized controlled trials that assess the utility of LVO scales in suspected stroke patients in prehospital settings. Eligible studies must provide sufficient data to calculate sensitivity and specificity, and those lacking such data or being case reports will be excluded. The literature search will cover CENTRAL, MEDLINE, and Ichushi databases, including studies in English and Japanese. Bias will be assessed using QUADAS-2, and meta-analysis will be conducted using a random effects model, with subgroup and sensitivity analyses to explore heterogeneity.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
We will search the following databases: CENTRAL, MEDLINE, and Ichushi.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Data Availability
All data produced in the present study are available upon reasonable request to the authors.
Published on 20.8.2024 in Vol 3 (2024)
Approaches for the Use of AI in Workplace Health Promotion and Prevention: Systematic Scoping Review
Authors of this article:
Martin Lange 1 , Prof Dr ;
Alexandra Löwe 1 , MA ;
Ina Kayser 2 , Prof Dr ;
Andrea Schaller 3 , Prof Dr
1 Department of Fitness & Health, IST University of Applied Sciences, Duesseldorf, Germany
2 Department of Communication & Business, IST University of Applied Sciences, Duesseldorf, Germany
3 Institute of Sport Science, Department of Human Sciences, University of the Bundeswehr Munich, Munich, Germany
Background: Artificial intelligence (AI) is an umbrella term for various algorithms and rapidly emerging technologies with huge potential for workplace health promotion and prevention (WHPP). WHPP interventions aim to improve people’s health and well-being through behavioral and organizational measures or by minimizing the burden of workplace-related diseases and associated risk factors. While AI has been the focus of research in other health-related fields, such as public health or biomedicine, the transition of AI into WHPP research has yet to be systematically investigated.
Objective: This systematic scoping review aims to provide a comprehensive overview of the current use of AI in WHPP. The results will then be used to point to future research directions. The following research questions were derived: (1) What are the study characteristics of studies on AI algorithms and technologies in the context of WHPP? (2) What specific WHPP fields (prevention, behavioral, and organizational approaches) were addressed by the AI algorithms and technologies? (3) What kind of interventions lead to which outcomes?
Methods: A systematic scoping literature review (PRISMA-ScR [Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews]) was conducted in the 3 academic databases PubMed, Institute of Electrical and Electronics Engineers, and Association for Computing Machinery in July 2023, searching for papers published between January 2000 and December 2023. Studies needed to be (1) peer-reviewed, (2) written in English, and (3) focused on any AI-based algorithm or technology that (4) was applied in the context of WHPP or (5) an associated field. Information on study design, AI algorithms and technologies, WHPP fields, and the patient or population, intervention, comparison, and outcomes framework was extracted blindly with Rayyan and summarized.
Results: A total of 10 studies were included. Risk prevention and modeling were the most identified WHPP fields (n=6), followed by behavioral health promotion (n=4) and organizational health promotion (n=1). Further, 4 studies focused on mental health. Most AI algorithms were machine learning-based, and 3 studies used combined deep learning algorithms. AI algorithms and technologies were primarily implemented in smartphone apps (eg, in the form of a chatbot) or used the smartphone as a data source (eg, Global Positioning System). Behavioral approaches ranged from 8 to 12 weeks and were compared to control groups. Additionally, 3 studies evaluated the robustness and accuracy of an AI model or framework.
Conclusions: Although AI has attracted increasing attention in health-related research, the review reveals that AI in WHPP remains only marginally investigated. Our results indicate that AI is promising for individualization and risk prediction in WHPP, but current research does not cover the scope of WHPP. Beyond that, future research will profit from an extended range of research across all fields of WHPP, from longitudinal data, and from the use of reporting guidelines.
Artificial intelligence (AI) is a concept that dates back to the mid-1900s [ 1 ] and was first defined as “the science and engineering of making intelligent machines” [ 2 ]. Today, AI is described as a computer system’s capability to perform complex tasks that mimic human cognitive functions, such as reasoning, decision-making, or problem-solving, autonomously and adaptively [ 3 ]. However, its capabilities and underlying functions have changed significantly over the decades [ 1 , 4 ]. More recently, AI has emerged as a transformative force across various industries. Its application has shown promise in health promotion and health care [ 5 - 7 ], opening new possibilities for patient care and enhanced medical practices.
There is growing consensus in the literature that adaptivity and autonomy are the key characteristics of AI applications and technologies [ 5 ]. AI is considered an umbrella concept of emerging technologies, encompassing fundamentally distinct types such as machine learning (ML), deep learning (DL), or natural language processing (NLP) [ 4 , 8 ]. Technically, AI is an ML-based approach that simulates human minds’ cognitive and affective functions [ 3 , 8 ] and is designed to observe and react to a specific environment. In contrast to deterministic programming, such models feature many free parameters that can adapt autonomously to calibrate the model. For example, AI can be applied in repetitive tasks requiring human intelligence, such as scanning and interpreting magnetic resonance imaging, autonomous driving, or analyzing big data sets [ 9 - 11 ]. ML and DL algorithms and artificial neural networks enable a machine or system to learn from large data sets, make autonomous decisions, and improve their performance over time [ 4 ]. More narrowly, NLP allows machines to generate and understand text and spoken language in the same way humans do. It combines rule-based natural language modeling with ML and DL models to process human language in text or speech data, understand its meaning, including sentiment, and even generate human language, as it is sometimes used in chatbots or language translation [ 12 ].
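The contrast the paragraph draws between deterministic programming and a model whose free parameters calibrate themselves to data can be illustrated with a deliberately minimal, hypothetical sketch (a single learned threshold; this toy example is not drawn from any of the studies cited here):

```python
# Deterministic programming: the decision rule is hard-coded.
def fixed_rule(x):
    return 1 if x > 0.5 else 0

def train_threshold(samples, labels, steps=200, lr=0.05):
    """Calibrate a single free parameter (the threshold) from data:
    misclassifications nudge the threshold toward fewer errors."""
    t = 0.0
    for _ in range(steps):
        for x, y in zip(samples, labels):
            pred = 1 if x > t else 0
            t += lr * (pred - y)  # false positive raises t, false negative lowers it
    return t

# Toy data: the true class boundary lies somewhere between 0.3 and 0.7.
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 1, 1, 1]
t = train_threshold(xs, ys)
preds = [1 if x > t else 0 for x in xs]
```

The hard-coded rule never changes; the trained threshold settles wherever the data put the boundary, which is the adaptivity the text describes, in its simplest possible form.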
AI in Health Care and Public Health
Implementing AI algorithms and technologies in health care institutions bears enormous potential, ranging from efficient health service management, predictive medicine, patient data, and diagnostics with real-time analyses to clinical decision-making. Most studies report a broader AI architecture with a combination of algorithms rooted in ML, DL, and NLP [ 4 , 11 ]. For example, 1 AI approach evaluated support for clinical decision-making by analyzing continuous laboratory data, past clinical notes, and current physician information to synthesize significant associations [ 13 ]. AI implementation in the form of predictive modeling showed positive results by detecting irregular heartbeats through smartwatches [ 14 ], automatically identifying reports of infectious disease in the media [ 15 ], or ascertaining cardiovascular risk factors from retinal images [ 16 ]. Through systematic profiling of 4518 existing drugs against 578 cancer cell lines with an AI-based approach, a study revealed that nononcology drugs have an unexpectedly high rate of anticancer activity [ 17 ]. Another study developed and evaluated a Medical Instructed Real-Time Assistant that listens to the user’s chief complaint and predicts a specific disease [ 18 ]. Chatbots have been used to detect COVID-19 symptoms through detailed questioning [ 6 ] or to predict the risk of type II diabetes mellitus [ 19 ].
Workplace Health Promotion and Prevention
As adults spend a significant amount of time working, it is widely accepted that work and work environments have a major impact on individuals’ health. Workplace health promotion and prevention (WHPP) are important fields that “[…] improve the health and well-being of people at work […]” [ 20 ] through a combination of behavioral and organizational measures. Workplace health promotion follows a competence-oriented, salutogenetic approach to promoting the resources of an individual [ 20 ]. Prevention in the workplace focuses on minimizing the burden of workplace-related diseases and associated risk factors [ 21 , 22 ]. WHPP interventions range from behavioral measures with active participation (eg, courses or seminars) to organizational measures such as consultations, analyses, inspections, and establishing organizational structures such as a health committee [ 23 , 24 ].
With the Luxembourg declaration, WHPP has evolved into an independent discipline that differentiates from return-to-work (RTW) and occupational safety and health (OSH) measures [ 20 , 25 ]. In OSH-related disciplines, previous reviews have focused on risk assessment or detection related to physical ergonomics [ 26 ], occupational physical fatigue [ 27 ], or core body temperature [ 28 ]. Other reviews explored the evidence of AI in RTW-related areas, such as vocational rehabilitation [ 29 ] and functional capacity evaluation [ 30 ]. In health promotion in general, 1 review evaluated the use of chatbots to increase health-related behavior but did not focus on the workplace setting [ 31 ]. To the authors’ knowledge, no review has evaluated the use of AI in WHPP.
Therefore, this systematic scoping review aims to provide a comprehensive overview of the current use of AI in WHPP. The results will then be used to point to future research directions. The following research questions (RQ) were derived from these aims:
RQ1: What are the study characteristics of studies on AI algorithms and technologies in WHPP?
RQ2: What specific WHPP fields (prevention, behavioral, and organizational approaches) are addressed by the AI algorithms and technologies?
RQ3: What kind of interventions were conducted, and what outcomes were assessed?
A systematic scoping review approach [ 32 ] was selected following the extended PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews; Multimedia Appendix 1 ) [ 33 ]. We applied the 5-step framework to identify current or emerging research directions and provide an overview of research activities [ 34 ]. Additionally, the patient or population, intervention, comparison, and outcomes (PICO) framework [ 35 ] was used to specify the study’s objective, from the search string and data charting to more systematic discussion [ 36 ]. The review was registered prospectively in the Open Science Framework (OSF) on July 5, 2023. All files (protocol, search string, and search results) have been uploaded to the OSF profile and are publicly accessible [ 37 ].
Eligibility Criteria
Included studies needed to be (1) peer-reviewed, (2) written in English, and (3) focused on any AI-based algorithm or technology that (4) was applied in the context of WHPP or (5) in an associated field (workplace prevention, occupational health, and workplace health) that applies to WHPP. The types of research considered were review types (systematic, scoping, or rapid), cross-sectional studies, and longitudinal studies.
Our conceptualization of AI included the concepts of “machine learning,” “deep learning,” and “natural language processing.” Our conceptualization of “workplace health promotion and prevention” followed a broader understanding comprising the setting (eg, “work,” “workplace,” or “in or at the workplace”), the target population (eg, “working adults” or “employees”) and the outcome dimension (eg, “health” or “health behavior”). The search period was limited to studies published since January 2000 and before July 31, 2023. During the review, the search was extended to December 20, 2023.
Information Sources and Search
The systematic literature search was conducted in July 2023 in 3 databases: PubMed, IEEE Xplore, and Association for Computing Machinery. The search string included Boolean operators (“AND,” “OR,” and “NOT”) and search terms related to “artificial intelligence,” “workplace health promotion,” “health promotion,” and “workplace setting” (see supplementary files available at the OSF profile [ 37 ]). Papers were managed with the software tool Rayyan, followed by a 2-stage screening process. First, 1 reviewer (ML) removed all duplicates. Second, 2 reviewers (ML and AL) screened all titles and abstracts and read full texts against the eligibility criteria in a blinded procedure. Disagreement was resolved either by consensus of the 2 reviewers or by consultation of a third reviewer (IK).
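A search string of the kind described (blocks of synonyms joined with the Boolean operators AND, OR, and NOT) can be assembled programmatically. The term lists below are illustrative stand-ins, not the authors' actual string, which is available via their OSF profile:

```python
# Illustrative synonym blocks; the real search string differs.
ai_terms = ['"artificial intelligence"', '"machine learning"',
            '"deep learning"', '"natural language processing"']
setting_terms = ['"workplace health promotion"', '"occupational health"',
                 '"workplace setting"']
exclude_terms = ['"return to work"']

def block(terms, op="OR"):
    """Join synonyms into a parenthesized Boolean block."""
    return "(" + f" {op} ".join(terms) + ")"

# Concept blocks are OR-ed internally and AND-ed across concepts;
# NOT trims an adjacent field out of the result set.
query = block(ai_terms) + " AND " + block(setting_terms)
query += " NOT " + block(exclude_terms)
```

The OR-within, AND-between structure mirrors how database search strategies are usually built: each block captures one concept, and the conjunction narrows results to records touching all concepts.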
Data Charting and Synthesis of Results
In the first step, the study characteristics were extracted: first author (name and year), study design (eg, cross-sectional or randomized controlled trial), the primary type of AI algorithm and technology as referred to in the study (eg, AI, ML, DL, or NLP), and the frontend in which the AI-technology was implemented (eg, mobile app or web app). Second, the PICO framework [ 35 ] was applied to extract information about the target group (number of included participants/workplace context), the intervention approach, the comparison, and the reported outcomes of the study.
We used the extracted information from the study characteristics to answer RQ1 on current AI-based technologies applied in WHPP. For answering RQ2 and RQ3, we used the data extracted by the PICO framework. The information was then categorized within the results’ tables and summarized narratively.
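As a sketch of what one charting record combining the study characteristics and the PICO fields might look like (the field names follow the text above; the sample values and the class itself are invented for illustration):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ChartedStudy:
    """Hypothetical charting record for one included study."""
    author_year: str
    design: str                       # e.g. "RCT" or "cross-sectional"
    ai_type: str                      # e.g. "ML", "DL", "NLP"
    frontend: str                     # e.g. "smartphone app", "web app"
    population: str                   # target group / workplace context
    intervention: str
    comparison: Optional[str] = None  # not every study used a comparison
    outcomes: List[str] = field(default_factory=list)

# Invented example values, shaped like the entries in Tables 1 and 2.
record = ChartedStudy(
    author_year="Example et al, 2021",
    design="RCT",
    ai_type="ML",
    frontend="smartphone app",
    population="48 office workers",
    intervention="12-week chatbot coaching",
    comparison="waiting-list control",
    outcomes=["pain score", "adherence"],
)
```

Making `comparison` optional reflects the text's observation that the comparison category did not apply to every included study.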
Included Studies
The predefined search led to a total of 3317 results. The screening results revealed 478 duplicates, 712 records not meeting inclusion criteria (eg, publication type, language, or setting), 42 unique records, and 104 with missing information, leaving 1981 records for the title and abstract screening. The title and abstract screening excluded another 1761 records for not meeting inclusion criteria, leading to 220 records for full-text screening, of which one was inaccessible. After screening 219 full-text records, another 209 records were excluded. Finally, 10 studies remained in this systematic scoping review (the PRISMA-ScR flowchart is shown in Figure 1 ).
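The flow numbers reported above are internally consistent, which a few lines of arithmetic confirm:

```python
# Reproducing the PRISMA-ScR flow counts stated in the text.
total = 3317
# Pre-screen exclusions: 478 duplicates, 712 not meeting criteria,
# 42 further records, 104 with missing information.
after_prescreen = total - 478 - 712 - 42 - 104
assert after_prescreen == 1981        # records entering title/abstract screening

full_text = after_prescreen - 1761    # excluded at title/abstract stage
assert full_text == 220               # records sought for full-text screening

included = full_text - 1 - 209        # 1 inaccessible, 209 excluded at full text
assert included == 10                 # studies in the final synthesis
```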
Study Characteristics (RQ1)
The results of the study characteristics are presented in Table 1 . Regarding the study designs, 6 studies were cross-sectional studies [ 38 - 43 ], 3 were randomized controlled trials [ 44 - 46 ], and 1 was a quasi-controlled trial [ 47 ]. None of the studies explained data protection standards (security protocols, storage location or duration, or access of third parties) within the AI algorithms and technologies used. In most studies, white-collar workers were the intended target group [ 38 , 41 , 42 , 46 ], whereas, in 3 studies, white-collar and physical labor workers participated [ 40 , 45 , 47 ]. Further, 1 study evaluated AI-based technologies with physical labor workers [ 39 ], and another did not disclose any information about the type of work setting [ 44 ]. Information on sample characteristics was missing in 3 studies [ 40 , 41 , 44 ], little information was provided in 2 studies [ 38 , 44 ], and 4 studies offered sufficient information [ 39 , 42 ].
A comparison was used in different ways by 6 studies [ 40 , 42 , 44 - 47 ]. Further, 4 studies recruited a classic control group [ 39 , 44 , 46 , 47 ], 2 of which exposed the control group after a waiting period [ 44 , 46 ]. Another study compared its assessed data to external data thresholds [ 40 ], and 1 study compared assessed objective data with subjective data [ 42 ]. Regarding the outcomes, all studies reported significant results. Only 1 study reported no changes in 1 of the 3 assessed outcomes [ 47 ].
Table 1. Study characteristics (author and year; included type of AI algorithm; implemented frontend; WHPP field; study design).
Anan et al [], 2021: machine learning; smartphone app with integrated chatbot; prevention, behavioral health promotion; RCT.
Morshed et al [], 2022: machine learning; software-based sensor technology; prevention; CS.
Cui et al [], 2020: deep learning networks (recurrent neural network or long short-term memory network); N/A; prevention (risk assessment); CS.
Dijkhuis et al [], 2018: machine learning; web app; behavioral health promotion; RCT.
Hungerbuehler et al [], 2021: machine learning; Viki chatbot within a web browser interface; prevention (risk assessment); CS.
Kaiser et al [], 2021: fuzzy neural network-based fusion; smartphone app with GPS and eHealth sensor; organizational health promotion (risk assessment); CS.
Lopes et al [], 2023: natural language processing and machine learning; EMYS robot; behavioral health promotion; qCT.
Maxhuni et al [], 2021: machine learning; smartphone app; prevention (risk assessment); CS.
Piao et al [], 2020: deep learning networks, machine learning, and natural language processing (large language model); Watson conversation tool (IBM Corp) integrated into a smartphone app; behavioral health promotion; RCT.
Yan et al [], 2020: convolutional neural network; web-based app; prevention (risk assessment); CS.
a AI: artificial intelligence.
b WHPP: workplace health promotion and prevention.
c RCT: randomized controlled trial.
d CS: cross-sectional study design.
e N/A: not applicable.
f GPS: Global Positioning System.
g EMYS: emotive head system.
h qCT: quasi controlled trial.
AI Applications and Technologies in Specific WHPP Fields (RQ2)
AI algorithms and technologies were mainly used for preventive purposes in risk assessment ( Table 1 ). Furthermore, 2 studies evaluated prediction models [ 39 , 42 ]. Additionally, 3 studies [ 44 , 46 , 47 ] targeted health behavior change using 3 different approaches ranging from a web app [ 44 ] and smartphone app [ 46 ] to social robot agents [ 47 ]. Further, 1 study [ 41 ] was categorized as an organizational health promotion approach. A major target indication was mental health, which was addressed in 4 studies [ 38 , 40 , 42 , 43 ]. In contrast, 1 study dealt with musculoskeletal disorders [ 45 ] and 1 with overall physical health and work-related factors [ 39 ].
Interventions and Outcomes (RQ3)
The PICO category “intervention” did not apply to studies focusing on prevention since they did not evaluate an intervention [ 38 - 43 ]. Interventions were evaluated by 4 studies [ 44 - 47 ] with a duration of 12 weeks [ 44 - 46 ] and 8 weeks [ 47 ]. Within these 4 studies, 2 used chatbots as a primary AI application [ 45 , 46 ], 1 used a web application [ 44 ], and 1 used a social robot agent [ 47 ]. These 4 studies recruited a control group, of which 2 studies exposed the control group after a waiting period [ 44 , 46 ]. Regarding the outcomes, all studies reported significant results. The study of Lopes et al [ 47 ] reported no changes in 1 of the 3 assessed outcomes ( Table 2 ).
Table 2. Population, intervention, comparison, and outcome of the included studies.
Anan et al []
Population: IG 48 and CG 46 engineers and white-collar workers.
Intervention: AI-assisted program for MSD that selects exercises depending on participants’ chat input; 12-week intervention with individualized exercises for stretching, maintaining good posture, and mindfulness.
Comparison: CG: exercise routine of 3 minutes per day during break time; the routine consists of standard exercises for stretching, maintaining good posture, and mindfulness.
Outcome: Adherence rate: 92%; significant difference in the worst pain scores of neck or shoulder pain or stiffness and low back pain between baseline and 12 weeks (score: –1.12; 95% CI –1.53 to –0.70; P<.001); significant improvements of IG in the severity of the neck or shoulder pain or stiffness and low back pain compared to CG (OR 6.36, 95% CI 2.57-15.73; P<.001); subjective improvement in symptoms in IG at 12 weeks (score: 43; 95% CI 11.25-164.28; P<.001).
Morshed et al []
Population: 46 remote information workers.
Intervention: Development and implementation of a workplace stress sensing system for 4 weeks using passive sensors (email, calendar, app, mouse and keyboard use; facial positions and facial action units; or physiological sensors).
Comparison: Comparison of passive sensor data with self-report data (study intake, experience sampling, daily check-in, daily check-out, end-of-study expectations).
Outcome: Passive sensors detect triggers and manifestations of workplace stress effectively (eg, keyboard activity and reduced facial movement were positively correlated with stress, with coefficients of 0.05 and 0.09, respectively, both P<.05); the quality of stress models depends on the worker’s prior data and the amount of data (F1-score: 58% after 10 days; 73% after 19 days).
Cui et al []
Population: 4000 steel workers.
Intervention: Development and comparison of 2 AI-based risk prediction models (LSTM vs RNN) that predict the influence of the work environment on employees’ health.
Comparison: N/A
Outcome: Based on sociodemographic data (age, income, education, or marital status), health-related data (BMI, smoking, drinking, or blood lipids [cholesterol or triglyceride]), and work-related factors (length of service, high-temperature exposure, shift work, or noise exposure), the prediction effect of LSTM is significantly better than that of traditional RNN, with an accuracy of more than 95% (F1-score).
Dijkhuis et al []
Population: IG 24 and CG 24; population/setting not disclosed.
Intervention: Development and implementation of a prediction model that personalizes physical activity recommendations within a 12-week workplace health promotion intervention. The goals of the intervention were to increase physical activity during workdays and to improve physical and mental health and several work-related variables.
Comparison: CG: no participation in the 12-week WHP program.
Outcome: The input variables “hours of the day” and “step count” were used in the evaluated model and reached an accuracy of 90% (mean accuracy=0.93; range=0.88-0.99; mean F1-score=0.90; range=0.87-0.94). Tree algorithms and tree-based ensemble algorithms performed exceedingly well. The individualized algorithms allow for predicting physical activity during the day and provide the possibility to intervene with personalized feedback.
Hungerbuehler et al []
Population: 77 industrial, logistic, and office workers.
Intervention: Development of a chatbot system and its implementation in a workplace setting to assess employees’ mental health.
Comparison: Participation rates were compared to face-to-face collection method rates.
Outcome: The response rate was 64.2% (77/120). The majority scored in the mild range for anxiety (GAD-7: mean 6.21, SD 4.56; 50%) and depression (PHQ-9: mean 4.40, SD 5.21; 57%), the moderate range for stress (DASS-21: mean 11.09, SD 7.13; 46%), the subthreshold level for insomnia (ISI: mean 9.26, SD 5.66; 70%), the low-risk burnout category (OLBI: mean 27.68, SD 8.38; 68%), and the increased-risk category for stress (JSS: mean 32.38, SD 3.55; 69%). Chatbot-based workplace mental health assessment is highly engaging and effective among employees, with response rates comparable to face-to-face interviews.
Kaiser et al []
Population: 12 office workers.
Intervention: Evaluation of a portable health (pHealth) app to detect COVID-19 infection and trace movement to prevent further infections. Additionally, the pHealth app detects employees’ health conditions and recommends further health measures if indicated.
Comparison: N/A
Outcome: The app-integrated COVID-19 questionnaire was validated against real-time health conditions. Proximity detection, contact tracing, and health monitoring (external sensors) were confirmed by proximity testing (surf plot evaluation); the app effectively estimates COVID-19 infection risk and personal health conditions.
Lopes et al []
Population: IG 28 and CG 28 service and retail workers.
Intervention: IG interacted with a social robot agent that promotes health behavior change of participants’ choice (physical activity, nutrition, tobacco consumption, and stress and anxiety) in the workplace. After baseline assessment, social robots were used for 20-30 minutes weekly for 8 weeks. Based on the health action process approach model, the intervention focused on goal setting, monitoring behavior, elaborating action plans, and self-efficacy techniques through videos.
Comparison: CG received the same intervention measures through human agents via Teams (Microsoft Corp).
Outcome: IG improved significantly compared to CG in productivity (9041, P<.005, η2=0.26) and in well-being (4517, P<.005, η2=0.079), but not in work engagement (0.5176, P>.005). Additionally, IG improved significantly in the postintervention scores compared to CG (8997, P<.001, Wilks Λ=0.597, partial η2=0.40) despite presenteeism and regard for their level of mental well-being.
Maxhuni et al []
Population: 30 office workers.
Intervention: Measurement of smartphone data to assess employees’ stress levels. Data were assessed for 8 weeks on physical activity (accelerometer), location (GPS), social interaction (microphone, number of phone calls, or text messages), and social activity (app usage).
Comparison: Objective data were compared to subjective data (OLBI, POMS).
Outcome: A high correlation between objective smartphone data and questionnaire scores was overall significant. The accuracy of the supervised decision tree was acceptable (F1-score=67.5%). The semisupervised learning approach was somewhat better, with an F1-score of 70%. Overall, the results confirm that the prediction model is feasible for detecting perceived stress at work using smartphone-sensed data.
Piao et al []
Population: IG 57 and CG 49 office and administrative workers.
Intervention: A healthy lifestyle coaching chatbot from the KakaoTalk app (Kakao Corp) was implemented into an office work setting to promote employees’ stair-climbing habits. During the intervention, the IG received cues and intrinsic and extrinsic rewards for the entire 12 weeks.
Comparison: CG did not receive intrinsic rewards for the first 4 weeks and only received all rewards, as in IG, from the fifth to the 12th week.
Outcome: After 4 weeks, the change in SRHI scores (mean IG 13.54, SD 14.99; mean CG 6.42, SD 9.42) was significantly different between groups (P<.05). Between the fifth and 12th week, the change in SRHI scores of the intervention and control groups was comparable (mean IG 12.08, SD 10.87; mean CG 15.88, SD 13.29; P=.21). The level of physical activity showed a significant difference between the groups after 12 weeks of intervention (21.16; P=.045). Intrinsic reward significantly influenced habit formation.
Yan et al []
Population: 352 respiratory therapists in medical centers and regional hospitals.
Intervention: Building a model to develop a web-based application for classifying mental illness at the workplace. Data on emotional labor and psychological health were assessed for 4 weeks with the ELMH.
Comparison: N/A
Outcome: A model structure with 8 domains was confirmed with exploratory factor analysis, and 4 types of mental health were classified using Rasch analysis with an accuracy rate of MNSQ=0.92. An app predicting mental illness was successfully developed and demonstrated in this study.
a IG: intervention group.
b CG: control group.
c AI: artificial intelligence.
d MSD: musculoskeletal disorder.
e OR: odds ratio.
f Original P values were not reported in the original publications.
g LSTM: long short-term memory.
h RNN: recurrent neural network.
i N/A: not applicable.
j WHP: workplace health promotion.
k GAD-7: Generalized Anxiety Disorder Scale.
l PHQ-9: Patient Health Questionnaire.
m DASS-21: Depression, Anxiety, Stress Scale.
n ISI: Insomnia Severity Index.
o OLBI: Oldenburg Burnout Inventory.
p JSS: job strain survey.
q GPS: global positioning system.
r POMS: profile of mood states.
s SRHI: self-report habit index.
t ELMH: Emotional Labor and Mental Health questionnaire.
u MNSQ: mean square error.
Principal Results
This study aimed to provide an overview of the current state of AI use in WHPP. Our results underline that despite the rapid increase in AI-related studies, only a small number of studies have addressed AI apps and technologies in WHPP up to now. Risk prediction and modeling were the most identified WHPP fields, followed by behavioral health promotion approaches. AI algorithms and technologies were primarily implemented in smartphone apps (eg, in the form of a chatbot) or used the smartphone as a data source (eg, GPS). Further, our results revealed that most studies focused on validating AI algorithms and their feasibility.
Potential Approaches
The results point to the potential of AI in WHPP, with individualized, real-time data analysis and health-related information as critical elements, but current research does not yet fully realize this potential. AI-assisted chatbot apps were a primary AI technology, reaching reasonable adherence rates and offering a potential access route through various frontend solutions such as smartphones or web-based apps. Chatbots can easily individualize health-related information and recommendations with regard to the type of job, educational level, and specific language barriers. The integration of sensor technologies can increase the efficacy of individualized chatbot solutions. This could significantly advance the access to and dissemination of workplace health-related information. Chronically ill employees and other target groups can profit from context-specific health information that helps maintain or improve workability [ 48 ]. The aspect of anonymity might increase the acceptance of prevention measures for smoking cessation, alcohol, or substance abuse [ 31 , 49 ]. Due to the diversity of job activities (eg, physical labor or white-collar jobs) and workplace characteristics (eg, office, hybrid, or remote work), individualized access to health interventions can improve resource allocation as well as the density and quality of preventive health care [ 50 , 51 ]. Personalizing health-related information or feedback potentially increases workplace health-related behaviors [ 52 , 53 ]. The genuine ability of AI to analyze large amounts of data in real time can be applied to predict or detect individual or organizational health risks, for example, infections, stress symptoms, or body positions [ 54 - 59 ].
State of AI-Research in WHPP
The small number of studies on AI and WHPP compared to other sectors of work-related health (eg, OSH or RTW) or public health indicates a considerable research gap. At this point, research in other health care sectors offers far more reviews [ 7 , 60 - 62 ]. Reasons can be found in the common challenges of WHPP as a young research field, the high sensitivity of data protection regulation in the context of work, and the nonexistent legal requirements for WHPP in many countries [ 23 , 63 , 64 ]. At the same time, WHPP is often entrenched within an OSH paradigm among employers who do not prioritize WHPP [ 65 , 66 ].
As stated, the most researched WHPP fields were prevention and risk prediction, followed by behavioral approaches. Stress and mental health were the primary outcomes of 4 studies within these fields. Given the relevance of mental health, the research interest can be assessed as adequate. At the same time, musculoskeletal disorders are the leading cause of sick leave in most countries [ 67 ] and are therefore highly underrepresented in the included studies. Behavioral approaches focused on physical activity in 2 studies, and general health behavior was investigated in 1 study. Other WHPP-related behaviors such as nutrition, sleep, substance abuse (eg, nicotine), or stress management are not targeted by current research [ 24 ]. The same applies to organizational WHPP approaches, which were central to only 1 study [ 41 ]. Organizational approaches that aim to disseminate health-related information, increase work-related health literacy, or implement educational measures have not been included in current AI and WHPP research. Areas such as social inequality [ 68 ], specific target groups (eg, chronically ill employees or migrants), or health-oriented leadership were not addressed.
Most studies in our review used a cross-sectional design to gather data for the AI learning process in a time- and resource-efficient way [ 69 ]. This has 2 implications regarding the current stage of research. First, AI model life cycles need to be completed to gain high-level semantics and create a comprehensive learning basis, from data acquisition and data preparation (eg, dealing with missing data) to data conditioning and model refinement [ 70 ]. For future AI models, longitudinal data are of utmost importance, as cross-sectional data can reflect only a specific stage of that life cycle [ 70 , 71 ]. Second, longitudinal study designs are usually more cost- and resource-intensive and therefore often less prioritized. This leads not only to an imbalance of evidence on behavioral WHPP interventions but also to a lack of evidence on causal relations between AI and WHPP outcomes.
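The life cycle stages named above can be sketched as a minimal, dependency-free pipeline (our illustration under assumed stage definitions; function names and the toy data are invented, not taken from the reviewed studies):

```python
# Hedged sketch of the AI model life cycle: data acquisition, data preparation
# (eg, handling missing self-report values), data conditioning, and model
# refinement. With longitudinal data, each new measurement wave would rerun
# the cycle and extend the learning basis.
from statistics import mean

def acquire(wave):
    """Data acquisition: one measurement wave; None marks missing values."""
    return wave

def prepare(records):
    """Data preparation: mean-impute missing values per variable."""
    cols = list(zip(*records))
    means = [mean(v for v in col if v is not None) for col in cols]
    return [[v if v is not None else means[i] for i, v in enumerate(r)]
            for r in records]

def condition(records):
    """Data conditioning: min-max scale each variable to [0, 1]."""
    cols = list(zip(*records))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - lo[i]) / (hi[i] - lo[i] or 1) for i, v in enumerate(r)]
            for r in records]

def refine(model, records):
    """Model refinement: here simply re-estimate a per-variable baseline."""
    cols = list(zip(*records))
    return {"baseline": [mean(c) for c in cols],
            "n": model.get("n", 0) + len(records)}

# One pass of the cycle on a toy wave (stress score, sleep hours):
model = {}
wave = [[7, None], [5, 6.5], [None, 8.0], [9, 5.0]]
model = refine(model, condition(prepare(acquire(wave))))
```

In a real longitudinal setting, the refinement step would retrain or update an actual predictive model rather than a baseline, but the stage boundaries are the point of the sketch.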
Most studies reported using ML rather than more sophisticated DL or NLP algorithms. ML algorithms use extracted data to predict binary or multiple outcomes or classes without hidden layers. DL algorithms are characterized by hidden-layer neural networks. They can be employed for the analysis of more complex data sets, for example, for the detection of multidimensional objects in the realm of video and speech analysis [ 4 , 72 ]. The complexity of DL algorithms, in turn, ties in with the AI model life cycle, as DL algorithms require a broader database for learning. While ML approaches are found to be highly predictive and to offer more individualized interventions in a specific context, they are also prone to errors. Escorpizo et al [ 29 ] point out that in 1 study, ML classification exceeded clinicians’ decision-making [ 73 ], yet the results were later reversed when the approach was implemented with a different cohort [ 74 ]. This is of particular interest, as studies within our results relied on either a small number of participants [ 41 ], few input variables [ 44 ], or homogenous data input (eg, only self-report data) [ 40 ], causing potential ceiling effects within the AI learning progress [ 75 , 76 ]. In contrast, the benefit of longitudinal data in the context of AI reveals itself through an increase in precision: 1 study demonstrated the relevance of multiple measurements and longitudinal data, with accuracy increasing from 46% (time point 0) to 73% after 19 days of data [ 38 ]. Nevertheless, the included studies do not exploit the potential of AI demonstrated in comparable health-related fields such as OSH or RTW [ 26 - 31 ]. Some areas of AI application are not addressed at all, such as big data analysis (eg, comparison with existing data of national cohort studies) or language translation models.
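The structural distinction drawn above, shallow classifiers without hidden layers versus DL models with them, can be made concrete in a deliberately minimal, dependency-free sketch (our illustration, not code from any included study): a linear classifier provably cannot separate an XOR-style pattern, whereas a network with one hidden layer can in principle learn it.

```python
# Toy contrast between a "shallow" ML classifier (no hidden layer) and a
# minimal DL-style model (one hidden layer), trained by online gradient
# descent on XOR data, which is not linearly separable.
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def train_linear(epochs=2000, lr=0.5):
    """ML-style model: inputs map directly to the output, no hidden layer."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            g = sigmoid(w[0] * x1 + w[1] * x2 + b) - y  # log-loss gradient
            w[0] -= lr * g * x1
            w[1] -= lr * g * x2
            b -= lr * g
    return lambda x: sigmoid(w[0] * x[0] + w[1] * x[1] + b)

def train_one_hidden(h=4, epochs=5000, lr=0.5):
    """DL-style model: one hidden layer gives the capacity to learn XOR."""
    W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(h)]
    b1 = [0.0] * h
    W2 = [random.uniform(-1, 1) for _ in range(h)]
    b2 = 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            a = [sigmoid(W1[j][0] * x1 + W1[j][1] * x2 + b1[j]) for j in range(h)]
            g = sigmoid(sum(W2[j] * a[j] for j in range(h)) + b2) - y
            for j in range(h):
                gh = g * W2[j] * a[j] * (1 - a[j])  # backprop to hidden layer
                W2[j] -= lr * g * a[j]
                W1[j][0] -= lr * gh * x1
                W1[j][1] -= lr * gh * x2
                b1[j] -= lr * gh
            b2 -= lr * g
    return lambda x: sigmoid(sum(
        W2[j] * sigmoid(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j])
        for j in range(h)) + b2)

def accuracy(model):
    return sum((model(x) > 0.5) == bool(y) for x, y in data) / len(data)

linear = train_linear()
deep = train_one_hidden()
print(f"linear={accuracy(linear):.2f} deep={accuracy(deep):.2f}")
```

The added capacity comes at the cost the paragraph describes: more parameters to fit, hence the need for a broader database, and with small or homogeneous samples, a greater risk of results that do not transfer to a new cohort.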
Future Research
As pointed out, current research on AI in WHPP is limited regarding quantity, the fields of WHPP and its subdomains, and the AI algorithms applied. Future research should center around major causes of sick leave, such as musculoskeletal disorders, mental health, respiratory conditions, and influenza [ 67 ]. Behavioral WHPP interventions should extend to all areas of health-related behavior, including nutrition, sleep, substance abuse, and stress management [ 24 ]. Further, setting-specific aspects of WHPP, such as intervention content, implementation strategies, user experience, design, algorithms, and company size, need to be considered more specifically. So far, the studies have provided only moderate information on job activities or target groups. At the same time, workplaces and workers are diverse. The health of employees is influenced by numerous organizational and individual factors that must be further considered in the learning cycle of AI [ 77 - 79 ]. Regarding potential errors, existing AI algorithms must be validated with different target groups [ 59 , 80 ], emphasizing the need for longitudinal data and its impact on learning algorithms [ 81 , 82 ]. Beyond this, the technological diversity of the presented studies opens new possibilities for target group-specific or individualized interventions. Health information for chronically ill employees or migrants with different language skills, and health topics individualized for varying age groups, can be provided more effectively through AI, moving beyond a “one size fits all” paradigm [ 83 , 84 ].
Outside of the objective’s scope, we identified 2 aspects that can improve future research. First, the included studies reported overall positive results regarding feasibility, significance, or accuracy, underlining the vast potential that AI technology harbors. However, the results must be interpreted cautiously, as certain information in the primary studies was not provided, assessed, or available at the development stage of the investigated technology. For example, few studies mentioned a potential bias through the novelty effect [ 40 , 47 ] or the Hawthorne effect [ 45 , 47 , 85 ]. The novelty effect [ 86 ] applies to most of the included studies, as they did not control for participants’ experience with new technologies or their affinity for them. Second, concerns about data access, storage, or control, the ownership of AI-generated data, and its further use need to be clarified [ 87 , 88 ]. Standards should be derived and updated at appropriate intervals, especially for new AI-generated knowledge based on employees’ personal information [ 89 ]. Transparency and strict data protection regulation can increase adherence rates and reduce usage barriers [ 90 ]. In turn, we propose that future research rely on reporting guidelines [ 76 , 91 , 92 ].
Strengths and Limitations
Of note, 1 strength of our review is the exploratory nature of the RQs and the systematic search strategy in this new field. Consequently, the heterogeneity of the identified studies might be considered a limitation. The different AI applications and technologies, the types of intervention, and the variety of workplace settings limit the conclusions significantly. Beyond this, the reporting of the types of AI-based algorithms and technologies used in the studies is based on the authors’ self-reports. It is important to consider that the types of AI algorithms cannot always be differentiated with a high degree of precision.
Conclusions
Overall, this review underlines that AI in WHPP bears considerable potential but is not yet fully used. The results of our review offer a promising perspective on the predictive and personalized health paradigm shift in WHPP. Nevertheless, we conclude that current AI-related research in WHPP is still at an early stage, as it does not cover the full scope of WHPP. The most salient research gaps concern the uncovered fields of WHPP and its subdomains, the predominance of ML-based algorithms and cross-sectional data, and the weak consideration of the work context. We believe we have contributed to future WHPP research by identifying these gaps and recommending future approaches. As AI applications are gaining an increasingly important role, we are convinced that future research will profit from an extended range of research in all fields of WHPP, longitudinal data, and the use of reporting guidelines.
Acknowledgments
The design and registration of the study were handled by ML. The first draft of this paper was written by ML, AL, and AS. Data were collected by ML and AL. Analysis was done by ML, AL, and IK. Revision and review of this paper were performed by ML, AL, IK, and AS. This research received no external funding. We did not use any generative AI in this paper.
Data Availability
All data are publicly available in the OSF [ 37 ].
Conflicts of Interest
None declared.
PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) checklist.
Kaul V, Enslin S, Gross SA. History of artificial intelligence in medicine. Gastrointest Endosc. Oct 2020;92(4):807-812. [ CrossRef ] [ Medline ]
McCarthy J. Programs with common sense mechanisation of thought processes. In: Proceedings of the Symposium of the National Physics Laboratory. London, UK. Her Majesty's Stationery Office; 1959. Presented at: Proceedings of the Symposium of the National Physics Laboratory; 24th-27th November 1958:3-10; Teddington, Middlesex.
Russell SJ, Norvig P. Introduction. In: Russell SJ, Norvig P, editors. Artificial intelligence: a modern approach. Harlow. Pearson; 2022:19-54.
Helm JM, Swiergosz AM, Haeberle HS, Karnuta JM, Schaffer JL, Krebs VE, et al. Machine learning and artificial intelligence: definitions, applications, and future directions. Curr Rev Musculoskelet Med. 2020;13(1):69-76. [ FREE Full text ] [ CrossRef ] [ Medline ]
Grossberg S. A path toward explainable AI and autonomous adaptive intelligence: deep learning, adaptive resonance, and models of perception, emotion, and action. Front Neurorobot. 2020;14:36. [ FREE Full text ] [ CrossRef ] [ Medline ]
Chen J, See KC. Artificial intelligence for COVID-19: rapid review. J Med Internet Res. 2020;22(10):e21476. [ FREE Full text ] [ CrossRef ] [ Medline ]
Dong L, Yang Q, Zhang RH, Wei WB. Artificial intelligence for the detection of age-related macular degeneration in color fundus photographs: a systematic review and meta-analysis. eClinicalMedicine. 2021;35:100875. [ FREE Full text ] [ CrossRef ] [ Medline ]
Boucher P. Artificial intelligence: how does it work, why does it matter, and what can we do about it? Brussels. European Parliament; 2020. URL: https://www.europarl.europa.eu/RegData/etudes/STUD/2020/641547/EPRS_STU(2020)641547_EN.pdf [accessed 2024-07-30]
Lee S, Liu L, Radwin R, Li J. Machine learning in manufacturing ergonomics: recent advances, challenges, and opportunities. IEEE Robot Autom Lett. 2021;6(3):5745-5752. [ CrossRef ]
Secinaro S, Calandra D, Secinaro A, Muthurangu V, Biancone P. The role of artificial intelligence in healthcare: a structured literature review. BMC Med Inform Decis Mak. 2021;21(1):125. [ FREE Full text ] [ CrossRef ] [ Medline ]
Johnson KB, Wei WQ, Weeraratne D, Frisse ME, Misulis K, Rhee K, et al. Precision medicine, AI, and the future of personalized health care. Clin Transl Sci. 2021;14(1):86-93. [ FREE Full text ] [ CrossRef ] [ Medline ]
Raina V, Krishnamurthy S. Building an effective data science practice. Berkeley, CA. Apress; 2022.
Juhn Y, Liu H. Artificial intelligence approaches using natural language processing to advance EHR-based clinical research. J Allergy Clin Immunol. 2020;145(2):463-469. [ FREE Full text ] [ CrossRef ] [ Medline ]
Perez MV, Mahaffey KW, Hedlin H, Rumsfeld JS, Garcia A, Ferris T, et al. Apple Heart Study Investigators. Large-scale assessment of a smartwatch to identify atrial fibrillation. N Engl J Med. 2019;381(20):1909-1917. [ FREE Full text ] [ CrossRef ] [ Medline ]
Feldman J, Thomas-Bachli A, Forsyth J, Patel ZH, Khan K. Development of a global infectious disease activity database using natural language processing, machine learning, and human expertise. J Am Med Inform Assoc. 2019;26(11):1355-1359. [ FREE Full text ] [ CrossRef ] [ Medline ]
Poplin R, Varadarajan AV, Blumer K, Liu Y, McConnell MV, Corrado GS, et al. Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning. Nat Biomed Eng. 2018;2(3):158-164. [ CrossRef ] [ Medline ]
Corsello SM, Nagari RT, Spangler RD, Rossen J, Kocak M, Bryan JG, et al. Discovering the anti-cancer potential of non-oncology drugs by systematic viability profiling. Nat Cancer. 2020;1(2):235-248. [ FREE Full text ] [ CrossRef ] [ Medline ]
Rehman UU, Chang DJ, Jung Y, Akhtar U, Razzaq MA, Lee S. Medical instructed real-time assistant for patient with glaucoma and diabetic conditions. Appl Sci. 2020;10(7):2216. [ CrossRef ]
Jungwirth D, Haluza D. Artificial intelligence and public health: an exploratory study. Int J Environ Res Public Health. 2023;20(5):4541. [ FREE Full text ] [ CrossRef ] [ Medline ]
Luxembourg declaration on workplace health promotion in the European Union. Perugia, Italy. European Network of Workplace Health Promotion; 2018.
Pomaki G, Franche RL, Murray E, Khushrushahi N, Lampinen TM. Workplace-based work disability prevention interventions for workers with common mental health conditions: a review of the literature. J Occup Rehabil. 2012;22(2):182-195. [ CrossRef ] [ Medline ]
Gritzka S, MacIntyre TE, Dörfel D, Baker-Blanc JL, Calogiuri G. The effects of workplace nature-based interventions on the mental health and well-being of employees: a systematic review. Front Psychiatry. 2020;11:323. [ FREE Full text ] [ CrossRef ] [ Medline ]
Terry PE. Workplace health promotion is growing up but confusion remains about what constitutes a comprehensive approach. Am J Health Promot. 2019;33(6):845-849. [ CrossRef ] [ Medline ]
Rongen A, Robroek SJW, van Lenthe FJ, Burdorf A. Workplace health promotion: a meta-analysis of effectiveness. Am J Prev Med. 2013;44(4):406-415. [ CrossRef ] [ Medline ]
Technical and ethical guidelines for workers' health surveillance. Geneva. International Labor Organization; 1998.
Donisi L, Cesarelli G, Pisani N, Ponsiglione AM, Ricciardi C, Capodaglio E. Wearable sensors and artificial intelligence for physical ergonomics: a systematic review of literature. Diagnostics (Basel). 2022;12(12):3048. [ FREE Full text ] [ CrossRef ] [ Medline ]
Moshawrab M, Adda M, Bouzouane A, Ibrahim H, Raad A. Smart wearables for the detection of occupational physical fatigue: a literature review. Sensors (Basel). 2022;22(19):7472. [ FREE Full text ] [ CrossRef ] [ Medline ]
Dolson CM, Harlow ER, Phelan DM, Gabbett TJ, Gaal B, McMellen C, et al. Wearable sensor technology to predict core body temperature: a systematic review. Sensors (Basel). 2022;22(19):7639. [ FREE Full text ] [ CrossRef ] [ Medline ]
Escorpizo R, Theotokatos G, Tucker CA. A scoping review on the use of machine learning in return-to-work studies: strengths and weaknesses. J Occup Rehabil. 2024;34(1):71-86. [ CrossRef ] [ Medline ]
Aggarwal A, Tam CC, Wu D, Li X, Qiao S. Artificial intelligence-based chatbots for promoting health behavioral changes: systematic review. J Med Internet Res. 2023;25:e40789. [ FREE Full text ] [ CrossRef ] [ Medline ]
Peters MDJ, Godfrey CM, Khalil H, McInerney P, Parker D, Soares CB. Guidance for conducting systematic scoping reviews. Int J Evid Based Healthc. 2015;13(3):141-146. [ CrossRef ] [ Medline ]
Tricco AC, Lillie E, Zarin W, O'Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467-473. [ FREE Full text ] [ CrossRef ] [ Medline ]
Arksey H, O'Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. 2005;8(1):19-32. [ CrossRef ]
Huang X, Lin J, Demner-Fushman D. Evaluation of PICO as a knowledge representation for clinical questions. AMIA Annu Symp Proc. 2006;2006:359-363. [ FREE Full text ] [ Medline ]
Sager M, Pistone I. Mismatches in the production of a scoping review: highlighting the interplay of (in)formalities. J Eval Clin Pract. 2019;25(6):930-937. [ CrossRef ] [ Medline ]
Lange M, Löwe A, Kayser I, Schaller A. Approaches for the use of artificial intelligence in the field of workplace health: a systematic scoping review. OSF. 2023. URL: https://osf.io/hsu2w/ [accessed 2023-10-06]
Morshed MB, Hernandez J, McDuff D, Suh J, Howe E, Rowan K, et al. Advancing the understanding and measurement of workplace stress in remote information workers from passive sensors and behavioral data. In: 10th International Conference on Affective Computing and Intelligent Interaction (ACII). IEEE; 2022. Presented at: 2022 10th International Conference on Affective Computing and Intelligent Interaction (ACII); October 18-21, 2022:1-8; Nara, Japan. [ CrossRef ]
Cui S, Li C, Chen Z, Wang J, Yuan J. Research on risk prediction of dyslipidemia in steel workers based on recurrent neural network and LSTM neural network. IEEE Access. 2020;8:34153-34161. [ CrossRef ]
Hungerbuehler I, Daley K, Cavanagh K, Garcia Claro H, Kapps M. Chatbot-based assessment of employees' mental health: design process and pilot implementation. JMIR Form Res. 2021;5(4):e21678. [ FREE Full text ] [ CrossRef ] [ Medline ]
Kaiser MS, Mahmud M, Noor MBT, Zenia NZ, Mamun SA, Mahmud KMA, et al. iWorksafe: towards healthy workplaces during COVID-19 with an intelligent phealth app for industrial settings. IEEE Access. 2021;9:13814-13828. [ CrossRef ]
Maxhuni A, Hernandez-Leal P, Morales EF, Sucar LE, Osmani V, Mayora O. Unobtrusive stress assessment using smartphones. IEEE Trans on Mobile Comput. 2021;20(6):2313-2325. [ CrossRef ]
Yan YH, Chien TW, Yeh YT, Chou W, Hsing SC. An app for classifying personal mental illness at workplace using fit statistics and convolutional neural networks: survey-based quantitative study. JMIR mHealth uHealth. 2020;8(7):e17857. [ FREE Full text ] [ CrossRef ] [ Medline ]
Dijkhuis TB, Blaauw FJ, van Ittersum MW, Velthuijsen H, Aiello M. Personalized physical activity coaching: a machine learning approach. Sensors (Basel). 2018;18(2):623. [ FREE Full text ] [ CrossRef ] [ Medline ]
Anan T, Kajiki S, Oka H, Fujii T, Kawamata K, Mori K, et al. Effects of an artificial intelligence-assisted health program on workers with neck/shoulder pain/stiffness and low back pain: randomized controlled trial. JMIR mHealth uHealth. 2021;9(9):e27535. [ FREE Full text ] [ CrossRef ] [ Medline ]
Piao M, Ryu H, Lee H, Kim J. Use of the healthy lifestyle coaching chatbot app to promote stair-climbing habits among office workers: exploratory randomized controlled trial. JMIR mHealth uHealth. 2020;8(5):e15085. [ FREE Full text ] [ CrossRef ] [ Medline ]
Lopes SL, Ferreira AI, Prada R. The use of robots in the workplace: conclusions from a health promoting intervention using social robots. Int J Soc Robot. 2023;15:893-905. [ FREE Full text ] [ CrossRef ] [ Medline ]
Schachner T, Keller R, V Wangenheim F. Artificial intelligence-based conversational agents for chronic conditions: systematic literature review. J Med Internet Res. 2020;22(9):e20701. [ FREE Full text ] [ CrossRef ] [ Medline ]
Ogilvie L, Prescott J, Carson J. The use of chatbots as supportive agents for people seeking help with substance use disorder: a systematic review. Eur Addict Res. 2022;28(6):405-418. [ FREE Full text ] [ CrossRef ] [ Medline ]
Xiao Z, Liao QV, Zhou M, Grandison T, Li Y. Powering an AI chatbot with expert sourcing to support credible health information access. In: Proceedings of the 28th International Conference on Intelligent User Interfaces. New York, NY, United States. Association for Computing Machinery; 2023. Presented at: 28th International Conference on Intelligent User Interfaces; 27th -31st March 2023:2-18; Sydney Australia. URL: https://iui.acm.org/2023/ [ CrossRef ]
Jovanovic M, Baez M, Casati F. Chatbots as conversational healthcare services. IEEE Internet Comput. 2021;25(3):44-51. [ CrossRef ]
Moore PV. OSH and the future of work: benefits and risks of artificial intelligence tools in workplaces. In: Digital Human Modeling and Applications in Health, Safety, Ergonomics, and Risk Management : 10th International Conference, DHM 2019, Held as part of the 21st HCI International Conference, HCII 2019. Orlando, FL, USA. Cham: Springer; 2019. Presented at: HCI International; 26th-31st July 2019:292-315; Orlando, Florida, United States of America. URL: https://2019.hci.international/ [ CrossRef ]
Zhang J, Oh YJ, Lange P, Yu Z, Fukuoka Y. Artificial intelligence chatbot behavior change model for designing artificial intelligence chatbots to promote physical activity and a healthy diet: viewpoint. J Med Internet Res. 2020;22(9):e22845. [ FREE Full text ] [ CrossRef ] [ Medline ]
Conroy B, Silva I, Mehraei G, Damiano R, Gross B, Salvati E, et al. Real-time infection prediction with wearable physiological monitoring and AI to aid military workforce readiness during COVID-19. Sci Rep. 2022;12(1):3797. [ FREE Full text ] [ CrossRef ] [ Medline ]
Alberto R, Draicchio F, Varrecchia T, Silvetti A, Iavicoli S. Wearable monitoring devices for biomechanical risk assessment at work: current status and future challenges-a systematic review. Int J Environ Res Public Health. 2018;15(9):2001. [ FREE Full text ] [ CrossRef ] [ Medline ]
Saarela K, Huhta-Koivisto V, Kemell KK, Nurminen J. Work disability risk prediction using machine learning. In: Daimi K, Alsadoon A, Seabra Dos Reis S, editors. Current and Future Trends in Health and Medical Informatics. Cham. Springer Nature Switzerland; 2023:345-359.
Zawad MRS, Rony CSA, Haque MY, Banna MHA, Mahmud M, Kaiser MS. A hybrid approach for stress prediction from heart rate variability. In: Frontiers of ICT in Healthcare: Proceedings of EAIT 2022. Singapore. Springer Nature Singapore; 2023. Presented at: EAIT 2022; 30th-31st March 2022:111-121; Kolkata, India. [ CrossRef ]
Seo W, Kim N, Park C, Park SM. Deep learning approach for detecting work-related stress using multimodal signals. IEEE Sensors J. 2022;22(12):11892-11902. [ CrossRef ]
Nijhawan T, Attigeri G, Ananthakrishna T. Stress detection using natural language processing and machine learning over social interactions. J Big Data. 2022;9(1):33. [ CrossRef ]
Sarker S, Jamal L, Ahmed SF, Irtisam N. Robotics and artificial intelligence in healthcare during COVID-19 pandemic: a systematic review. Rob Auton Syst. 2021;146:103902. [ FREE Full text ] [ CrossRef ] [ Medline ]
Kumar Y, Koul A, Singla R, Ijaz MF. Artificial intelligence in disease diagnosis: a systematic literature review, synthesizing framework and future research agenda. J Ambient Intell Humaniz Comput. 2023;14(7):8459-8486. [ FREE Full text ] [ CrossRef ] [ Medline ]
Aggarwal R, Sounderajah V, Martin G, Ting DSW, Karthikesalingam A, King D, et al. Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis. NPJ Digit Med. 2021;4(1):65. [ FREE Full text ] [ CrossRef ] [ Medline ]
Faller G. Future challenges for work-related health promotion in Europe: a data-based theoretical reflection. Int J Environ Res Public Health. 2021;18(20):10996. [ FREE Full text ] [ CrossRef ] [ Medline ]
Robroek SJ, Coenen P, Oude Hengel KM. Decades of workplace health promotion research: marginal gains or a bright future ahead. Scand J Work Environ Health. 2021;47(8):561-564. [ FREE Full text ] [ CrossRef ] [ Medline ]
Pescud M, Teal R, Shilton T, Slevin T, Ledger M, Waterworth P, et al. Employers' views on the promotion of workplace health and wellbeing: a qualitative study. BMC Public Health. 2015;15:642. [ FREE Full text ] [ CrossRef ] [ Medline ]
McCoy K, Stinson K, Scott K, Tenney L, Newman LS. Health promotion in small business: a systematic review of factors influencing adoption and effectiveness of worksite wellness programs. J Occup Environ Med. 2014;56(6):579-587. [ FREE Full text ] [ CrossRef ] [ Medline ]
Work-related MSDs: prevalence, costs and demographics in the EU. European Risk Observatory Executive summary. Luxembourg. European Agency for Safety and Health at Work (EU-OSHA); 2019. URL: https://osha.europa.eu/sites/default/files/Work_related_MSDs_prevalence_costs_and_demographics_in_EU_summary.pdf [accessed 2024-07-30]
van der Put AC, Mandemakers JJ, de Wit JBF, van der Lippe T. Worksite health promotion and social inequalities in health. SSM Popul Health. 2020;10:100543. [ FREE Full text ] [ CrossRef ] [ Medline ]
Wang X, Cheng Z. Cross-sectional studies: strengths, weaknesses, and recommendations. Chest. 2020;158(1S):S65-S71. [ CrossRef ] [ Medline ]
Ng MY, Kapur S, Blizinsky KD, Hernandez-Boussard T. The AI life cycle: a holistic approach to creating ethical AI for health decisions. Nat Med. 2022;28(11):2247-2249. [ FREE Full text ] [ CrossRef ] [ Medline ]
Challen R, Denny J, Pitt M, Gompels L, Edwards T, Tsaneva-Atanasova K. Artificial intelligence, bias and clinical safety. BMJ Qual Saf. 2019;28(3):231-237. [ FREE Full text ] [ CrossRef ] [ Medline ]
Lepakshi VA. Machine learning and deep learning based AI tools for development of diagnostic tools. In: Parihar A, Khan R, Kumar A, Kaushik A, Gohel H, editors. Computational Approaches for Novel Therapeutic and Diagnostic Designing to Mitigate SARS-CoV-2 Infection. Cambridge, Massachusetts, United States. Academic Press; 2022:399-420.
Gross DP, Zhang J, Steenstra I, Barnsley S, Haws C, Amell T, et al. Development of a computer-based clinical decision support tool for selecting appropriate rehabilitation interventions for injured workers. J Occup Rehabil. 2013;23(4):597-609. [ CrossRef ] [ Medline ]
Gross DP, Steenstra IA, Shaw W, Yousefi P, Bellinger C, Zaïane O. Validity of the work assessment triage tool for selecting rehabilitation interventions for workers' compensation claimants with musculoskeletal conditions. J Occup Rehabil. 2020;30(3):318-330. [ CrossRef ] [ Medline ]
Janssen M, Brous P, Estevez E, Barbosa LS, Janowski T. Data governance: organizing data for trustworthy artificial intelligence. Gov Inf Q. 2020;37(3):101493. [ CrossRef ]
Liang W, Tadesse GA, Ho D, Fei-Fei L, Zaharia M, Zhang C, et al. Advances, challenges and opportunities in creating data for trustworthy AI. Nat Mach Intell. 2022;4(8):669-677. [ CrossRef ]
Braithwaite J, Herkes J, Ludlow K, Testa L, Lamprell G. Association between organisational and workplace cultures, and patient outcomes: systematic review. BMJ Open. 2017;7(11):e017708. [ FREE Full text ] [ CrossRef ] [ Medline ]
Shanafelt TD, Gorringe G, Menaker R, Storz KA, Reeves D, Buskirk SJ, et al. Impact of organizational leadership on physician burnout and satisfaction. Mayo Clin Proc. 2015;90(4):432-440. [ CrossRef ] [ Medline ]
Xueyun Z, Al Mamun A, Masukujjaman M, Rahman MK, Gao J, Yang Q. Modelling the significance of organizational conditions on quiet quitting intention among Gen Z workforce in an emerging economy. Sci Rep. 2023;13(1):15438. [ FREE Full text ] [ CrossRef ] [ Medline ]
Ali Shah SA, Uddin I, Aziz F, Ahmad S, Al-Khasawneh MA, Sharaf M. An enhanced deep neural network for predicting workplace absenteeism. Complexity. 2020;2020:1-12. [ CrossRef ]
Su TH, Wu CH, Kao JH. Artificial intelligence in precision medicine in hepatology. J Gastroenterol Hepatol. 2021;36(3):569-580. [ CrossRef ] [ Medline ]
Schafer KM, Kennedy G, Gallyer A, Resnik P. A direct comparison of theory-driven and machine learning prediction of suicide: a meta-analysis. PLoS One. 2021;16(4):e0249833. [ FREE Full text ] [ CrossRef ] [ Medline ]
Sallam M. ChatGPT utility in healthcare education, research, and practice: systematic review on the promising perspectives and valid concerns. Healthcare (Basel). 2023;11(6):887. [ FREE Full text ] [ CrossRef ] [ Medline ]
Purgato M, Singh R, Acarturk C, Cuijpers P. Moving beyond a 'one-size-fits-all' rationale in global mental health: prospects of a precision psychology paradigm. Epidemiol Psychiatr Sci. 2021;30:e63. [ FREE Full text ] [ CrossRef ] [ Medline ]
Becker S, Miron-Shatz T, Schumacher N, Krocza J, Diamantidis C, Albrecht UV. mHealth 2.0: experiences, possibilities, and perspectives. JMIR mHealth uHealth. 2014;2(2):e24. [ FREE Full text ] [ CrossRef ] [ Medline ]
Elston DM. The novelty effect. J Am Acad Dermatol. 2021;85(3):565-566. [ CrossRef ] [ Medline ]
Gerke S, Minssen T, Cohen G. Ethical and legal challenges of artificial intelligence-driven healthcare. Artif Intell Healthcare. 2020:295-336. [ CrossRef ]
Rodrigues R. Legal and human rights issues of AI: gaps, challenges and vulnerabilities. J Responsible Technol. 2020;4:100005. [ CrossRef ]
Andraško J, Mesarčík M, Hamuľák O. The regulatory intersections between artificial intelligence, data protection and cyber security: challenges and opportunities for the EU legal framework. AI Soc. 2021;36(2):623-636. [ CrossRef ]
Schönberger D. Artificial intelligence in healthcare: a critical analysis of the legal and ethical implications. Int J Law Inf Technol. 2019;27(2):171-203. [ CrossRef ] [ Medline ]
Fischer L, Ehrlinger L, Geist V, Ramler R, Sobiezky F, Zellinger W, et al. AI system engineering—key challenges and lessons learned. MAKE. 2021;3(1):56-83. [ CrossRef ]
Ibrahim H, Liu X, Denniston AK. Reporting guidelines for artificial intelligence in healthcare research. Clin Exp Ophthalmol. 2021;49(5):470-476. [ CrossRef ] [ Medline ]
Abbreviations
AI: artificial intelligence
DL: deep learning
ML: machine learning
NLP: natural language processing
OSF: Open Science Framework
OSH: occupational safety and health
PICO: patient or population, intervention, comparison, and outcomes
PRISMA-ScR: Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews
RQ: research question
RTW: return-to-work
WHPP: workplace health promotion and prevention
Edited by J-L Raisaro; submitted 09.10.23; peer-reviewed by M Ijaz, C Ordun; comments to author 12.12.23; revised version received 02.01.24; accepted 10.07.24; published 20.08.24.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR AI, is properly cited. The complete bibliographic information, a link to the original publication on https://www.ai.jmir.org/, as well as this copyright and license information must be included.
Exploring the factors driving AI adoption in production: a systematic literature review and future research agenda
Frank Schultmann ORCID: orcid.org/0000-0001-6405-9763 1
Our paper analyzes the current state of research on artificial intelligence (AI) adoption from a production perspective. We present a holistic view of the topic, which is necessary to gain a first understanding of AI in a production context and to build a comprehensive view of the different dimensions and factors influencing its adoption. We review the scientific literature published between 2010 and May 2024 to analyze the current state of research on AI in production. Following a systematic approach to select relevant studies, our literature review is based on a sample of articles that contribute to production-specific AI adoption. Our results reveal that the topic has been emerging within the last years and that AI adoption research in production is to date still in an early stage. We systematize and explain 35 factors with a significant role for AI adoption in production and classify the results in a framework. Based on the factor analysis, we establish a future research agenda that serves as a basis for future research and addresses open questions. Our paper provides an overview of the current state of research on the adoption of AI in a production-specific context, which forms a basis for further studies as well as a starting point for a better understanding of the implementation of AI in practice.
1 Introduction
The technological change resulting from deep digitisation and the increasing use of digital technologies has reached and transformed many sectors [ 1 ]. In manufacturing, the development of a new industrial age, characterized by extensive automation and digitisation of processes [ 2 ], is changing the sector’s ‘technological reality’ [ 3 ] by integrating a wide range of information and communication technologies (such as Industry 4.0-related technologies) into production processes [ 4 ].
Although the evolution of AI traces back to the year 1956 (as part of the Dartmouth Conference) [ 5 ], its development has progressed rapidly, especially since the 2010s [ 6 ]. Driven by improvements such as the fast and low-cost development of smart hardware, the enhancement of algorithms, and the capability to manage big data [ 7 ], an increasing number of AI applications is available for implementation today [ 8 ]. The integration of AI into production processes promises to boost the productivity, efficiency, and automation of processes [ 9 ] but is currently still in its infancy [ 10 ], and manufacturing firms still seem hesitant to adopt AI in a production context. This appears to be driven by the high complexity of AI combined with the lack of practical knowledge about its implementation in production and several other influencing factors [ 11 , 12 ].
In the literature, many contributions analyze AI from a technological perspective, mainly addressing underlying models, algorithms, and developments of AI tools. Various authors characterise both machine learning and deep learning as key technologies of AI [ 8 , 13 ], which are often applied in combination with other AI technologies, such as natural language recognition. While promising areas for AI application already exist in various domains such as marketing [ 14 ], procurement [ 15 ], supply chain management [ 16 ] or innovation management [ 17 ], the integration of AI into production processes also offers significant performance potential, particularly in the areas of maintenance [ 18 ], quality control [ 19 ] and production planning and management [ 20 ]. However, AI adoption requires important technological foundations, such as the provision of data and the necessary infrastructure, which must be ensured [ 11 , 12 , 21 ]. Although the current literature provides important insights into possible fields of application of AI in production, the question remains: To what extent are these versatile applications already in use and what is required for their successful adoption?
Besides the technology perspective of AI, a more human-oriented field of discussion is debated in the scientific literature [ 22 ]. While new technologies play an essential role in driving business growth in the digital transformation of the production industry, the increasing interaction between humans and intelligent machines (also referred to as ‘augmentation’) creates stress-related challenges [ 23 ] and impacts work [ 24 ], which in turn poses managerial challenges in organizations [ 25 , 26 ]. One of the widely discussed topics in this context is the fear of AI threatening jobs (including production jobs), triggered, for example, by the study of Frey and Osborne [ 27 ]. Another issue associated with the fear of machines replacing humans is the lack of acceptance resulting from mistrust of technologies [ 28 , 29 ]. This can also be linked to the various ethical challenges involved in working with AI [ 22 ]. This perspective, which focuses on the interplay between AI and humans [ 30 ], reveals the tension triggered by AI. Although this is discussed from different angles, the question remains how these aspects influence the adoption of AI in production.
Another thematic stream of current literature can be observed in a series of contributions on the organizational aspects of the technology. In comparison to the two research areas discussed above, the number of publications in this area seems to be smaller. This perspective focuses on issues in implementing AI, such as the importance of a profound management structure [ 31 , 32 ], leadership [ 33 ], implications for the organizational culture [ 34 ] as well as the need for digital capabilities and special organizational skills [ 33 ]. Although some studies on the general adoption of AI without a sectoral focus have already been conducted (such as by Chen, Tajdini [ 35 ] or Kinkel, Baumgartner, Cherubini [ 36 ]) and hence some initial factors influencing the adoption of AI can be derived, the contributions from this perspective are still scarce, are usually not specifically analyzed in the context of production, or lack a comprehensive view of the organization in AI adoption.
While non-industry-specific AI issues have been researched in recent years, the current literature lacks a production-specific analysis of AI adoption that provides an understanding of the possibilities and issues related to integrating AI into the production context. Moreover, the existing literature tells us little about the relevant mechanisms and factors underlying the adoption of AI in production processes, which span technical, human-centered, and organizational issues. As organizational understanding of AI in a business context is currently still in its early stages, it is difficult to find an aggregate view of the factors that can support companies in implementing AI initiatives in production [ 37 , 38 ]. Addressing this gap, we aim to systematise the current scientific knowledge on AI adoption, with a focus on production. By drawing on a systematic literature review (SLR), we examine existing studies on AI adoption in production and explore the main adoption issues covered in the analyzed articles. Building on these findings, we conduct a comprehensive analysis of the existing studies with the aim of systematically investigating the key factors influencing the adoption of AI in production. This systematic approach paves the way for the formulation of a future research agenda.
Our SLR addresses three research questions (RQs). RQ1: What are the statistical characteristics of existing research on AI adoption in production? To answer this RQ, we conduct descriptive statistics of the analyzed studies and provide information on time trends, methods used in the research, and country specifications. RQ2: What factors influence the adoption of AI in production? RQ2 specifies the adoption factors and forms the core component of our analysis. By adoption factors, we mean the factors that influence the use of AI in production (both positively and negatively) and that must therefore be analyzed and taken into account. RQ3: What research topics are of importance to advance the research field of AI adoption in production? We address this RQ by using the analyzed literature as well as the key factors of AI adoption as a starting point to derive RQs that are not addressed and thus provide an outlook on the topic.
2 Methodology
In order to create a sound information base for both policy makers and practitioners on the topic of AI adoption in production, this paper follows the systematic approach of an SLR. For many fields, including management research, an SLR is an important tool to capture the diversity of existing knowledge on a specific topic for scientific investigation [ 39 ]. The investigator often pursues multiple goals, such as capturing and assessing the existing environment and advancing the existing body of knowledge with a proprietary RQ [ 39 ] or identifying key research topics [ 40 ].
Our SLR aims to select, analyze, and synthesize findings from the existing literature on AI adoption in production published between 2010 and May 2024. In order to identify relevant data for our literature synthesis, we follow the systematic approach of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [ 41 ]. In evaluating the findings, we draw on a mixed-methods approach, combining quantitative analyses, especially of the descriptive aspects of the selected publications, with qualitative analyses aimed at evaluating and comparing the contents of the papers. Figure 1 graphically summarizes the methodological approach that guides the content of the following sub-chapters.
Methodological procedure of our SLR following PRISMA [ 41 ]
2.1 Data identification
Following the development of the specific RQs, we searched for suitable publications. To locate relevant studies, we chose to conduct a publication analysis in the databases Scopus, Web of Science and ScienceDirect, as these databases primarily contain international scientific articles and provide a broad overview of the interdisciplinary research field and its findings. To align the search with the RQs [ 42 ], we applied predefined keywords to search the titles, abstracts, and keywords of Scopus, Web of Science and ScienceDirect articles. Our research team conducted several pre-tests to determine the final search commands, ensuring that the test results were on target and that the search was efficient [ 42 ]. Using Boolean operators, we covered the three topics of AI, production, and adoption by searching for combinations of ‘Artificial Intelligence’ AND ‘production or manufacturing’ AND ‘adopt*’ in the three scientific databases. Although ‘manufacturing’ tends to stand for the whole sector and ‘production’ refers to the process, the two terms are often used to describe the same context. We follow the view of Burbidge, Falster, Riis, Svendsen [ 43 ] and use the terms synonymously in this paper, and therefore include both terms as keywords in the study location as well as in the analysis.
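The combination of keywords described above can be sketched as a search command, for example for a title-abstract-keyword field search; this is a hypothetical reconstruction for illustration, as the exact field codes and quoting of the final search commands are not reproduced here:

```python
# Hypothetical reconstruction of the Boolean search command (illustrative only;
# the authors' exact database-specific syntax is an assumption).
def build_search_query():
    ai = '"artificial intelligence"'
    context = "(production OR manufacturing)"  # terms used synonymously in this paper
    adoption = "adopt*"  # truncation covers adopt, adopts, adopted, adoption, ...
    return f"TITLE-ABS-KEY({ai} AND {context} AND {adoption})"

print(build_search_query())
```

Each database uses its own field syntax, so in practice such a query has to be adapted per database while keeping the three Boolean blocks identical.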
AI research has been credited with a resurgence since 2010 [ 6 ], which is the reason for our choice of time horizon. Due to the increase in publications within the last years, we selected articles published online from 2010 to May 8, 2024 for our analysis. As document types, we included conference papers, articles, reviews, book chapters, conference reviews as well as books, focusing exclusively on contributions in English in the final publication stage. The result of the study location is a list of 3,833 documents whose titles, abstracts, and keywords meet the search criteria and are therefore included in the next step of the analysis.
2.2 Data analysis
For these 3,833 documents, we then conducted an abstract analysis, ‘us[ing] a set of explicit selection criteria to assess the relevance of each study found to see if it actually does address the research question’ [ 42 ]. For this step, we again conducted double-blind screenings (involving a minimum of two reviewers), preceded by pilot rounds, so that all reviewers had the same understanding of the decision rules and made consistent decisions regarding inclusion for further analysis.
To ensure the focus on all three topics regarded in our research (AI, production, and adoption), we followed clearly defined inclusion and exclusion rules that all reviewers had to apply in the review process. As a first requirement for inclusion, AI must be the technology in focus that is analysed in the publication; if AI was only mentioned and not further specified, we excluded the publication. As a second requirement, we checked the papers for the context of analysis, which in our case must be production; if the core focus lies outside production, the publication was likewise excluded from further analysis. The third prerequisite for further consideration of a publication is the analysis of the adoption of a technology in the paper; if technology adoption was not addressed or adoption factors were not considered, we excluded the paper. An article was only selected for full-text analysis if, after analyzing the titles, abstracts, and keywords, a clear focus on all three research areas was visible and the inclusion criteria were met for all three contexts.
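The tripartite inclusion rule amounts to a simple conjunctive filter over the three screening criteria; the field names below are illustrative, not the reviewers' actual coding scheme:

```python
# Sketch of the tripartite inclusion analysis (hypothetical field names).
def include_for_full_text(record):
    """A paper proceeds to full-text analysis only if all three
    criteria are met after title/abstract/keyword screening."""
    return (
        record["ai_in_focus"]             # 1) AI is the technology analysed
        and record["production_context"]  # 2) core focus lies in production
        and record["adoption_addressed"]  # 3) adoption (factors) are examined
    )

screened = [
    {"ai_in_focus": True, "production_context": True, "adoption_addressed": True},
    {"ai_in_focus": True, "production_context": False, "adoption_addressed": True},
    {"ai_in_focus": False, "production_context": True, "adoption_addressed": True},
]
selected = [r for r in screened if include_for_full_text(r)]  # only the first record passes
```

Because the rule is conjunctive, failing any single criterion is sufficient for exclusion, which is what makes the double-blind screening decisions reproducible across reviewers.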
By using this tripartite inclusion analysis, we were able to analyse the publications in a structured way and, in our double-blind approach, to reduce the 3,833 selected documents to 300 articles chosen for full-text analysis. In the process of obtaining full versions of these publications, we had to exclude three papers that we could not access. For the remaining 297 articles, we obtained full access and included them in the further analysis. After a thorough examination of the full texts, we again had to exclude 249 publications because they did not meet our content-related inclusion criteria mentioned above, although the abstract analysis gave indications that they did. As a result, we finally obtained 47 selected papers on which we base the literature analysis and synthesis (see Fig. 1 ).
2.3 Descriptive analysis
Figure 2 summarises the results of the descriptive analysis of the selected literature regarding AI adoption in production that we analyse in our SLR. From Fig. 2 a), which illustrates annual publication trends (2010–2024), the increase in publications on AI adoption in production over the past five years is evident, though slightly declining after a peak in 2022. After a steady increase until 2022, for which 11 articles are included in the final analysis, 2023 features ten articles, followed by three articles for 2024 until the cut-off date in May 2024. Of the 47 papers identified through our search, the majority (n = 33) are peer-reviewed journal articles; the remainder comprises thirteen conference contributions and one book chapter (see Fig. 2 b)).
Descriptive analyses of the selected articles addressing AI adoption in production
The identified contributions reveal some additional characteristics in terms of the authors’ country base (Fig. 2 c)) and the research methods used (Fig. 2 d)). Almost four out of ten publications were written in collaboration with authors from several countries (n = 19). Six of the papers were published by authors from the United States, five from Germany and four from India. The researchers apply a wide range of research methods (see Fig. 2 d)), with qualitative methods (n = 22) being the most frequently used.
2.4 Factor analysis
In order to derive a comprehensive list of factors that influence the use of AI in production at different levels, we follow a qualitative content analysis. It is based on inductive category development, avoiding prefabricated categories in order to allow new categories to emerge based on the content at hand [ 44 , 45 ]. To do this, we first read the entire text to gain an understanding of the content and then derive codes [ 46 ] that seem to capture key ideas [ 45 ]. The codes are subsequently sorted into distinct categories, each of which is clearly defined and establishes meaningful connections between different codes. Based on an iterative process with feedback loops, the assigned categories are continuously reviewed and updated as revisions are made [ 44 ].
Various factors at different levels are of significance to AI and influence technology adoption [ 47 , 48 ]. To identify the specific factors that are of importance for AI adoption in production, we analyze the selected contributions in terms of the factors considered, compare them with each other, and consequently obtain a list of factors through a bottom-up approach. While some of the factors are based on empirical findings, others are expected factors that result from the research findings of the respective studies. Through our analysis, a list of 35 factors influencing AI adoption in production emerges; these occur with varying frequency in the studies analyzed in our SLR. Table 1 visualizes each factor in the respective contributions, sorted by frequency of occurrence.
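The bottom-up tally behind Table 1 can be illustrated as follows; the factor names and paper-to-factor assignments below are invented examples, not the actual coding data:

```python
from collections import Counter

# Invented example data: which adoption factors each analyzed paper addresses.
coded_papers = {
    "paper_A": ["skills", "availability of data", "managerial support"],
    "paper_B": ["skills", "ethical guidelines", "availability of data"],
    "paper_C": ["skills", "IT infrastructure"],
}

# Count in how many papers each factor occurs ...
frequency = Counter(factor for factors in coded_papers.values() for factor in factors)

# ... and rank the factors by frequency of occurrence, as in Table 1.
ranked = frequency.most_common()
```

The ranking makes the relative weight of each factor in the literature visible at a glance, which is the basis for the discussion of the most frequently cited factors below.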
The presence of skills is considered a particularly important factor in AI adoption in the studies analyzed (n = 35). The availability of data (n = 25) as well as the need for ethical guidelines (n = 24) are also seen as key drivers of AI adoption, as data is seen as the basis for the implementation of AI and ethical issues must be addressed in handling such an advanced technology. As such, these three factors make up the accelerants of AI adoption in production that are most frequently cited in the studies analyzed.
Also of importance are issues of managerial support (n = 22), as well as performance measures and IT infrastructure (n = 20). Some factors were mentioned but addressed by only one study each: government support, industrial sector, product complexity, batch size, and R&D intensity. These factors are often used as quantitatively measurable adoption factors, especially in empirical surveys, such as the study by Kinkel, Baumgartner, Cherubini [ 36 ].
3 Factors influencing AI adoption
The 35 factors presented in Sect. 2.4 serve as the basis for our in-depth analysis and for developing a framework of influences on AI adoption in production, which are grouped into supercategories. A supercategory describes a cluster of topics to which various factors of AI adoption in production can be assigned. We define seven supercategories that influence AI adoption in production: the internal influences of ‘business and structure’, ‘organizational effectiveness’, ‘technology and system’ and ‘data management’, as well as the external influences of the ‘regulatory environment’, ‘business environment’ and ‘economic environment’ (see Fig. 3 ). The factors that were mentioned most frequently (occurring in at least half of the papers analyzed) are marked accordingly (*) in Fig. 3 .
Framework of factors influencing AI adoption in production
3.1 Internal Environment
The internal influences on AI adoption in production refer to factors that an organization carries internally and that thus also influence adoption from within. Such factors can usually be influenced and clearly controlled by the organization itself.
3.1.1 Business and structure
The supercategory ‘business and structure’ includes the various factors and characteristics that impact a company’s performance, operations, and strategic decision-making. By considering and analyzing these business variables when implementing AI in production processes, companies can develop effective strategies to optimize their performance, increase their competitiveness, and adapt to changes in the business environment.
Quantitative performance measures for the current and potential use of AI in industrial production systems help to clarify the value and potential benefits of AI use [ 49 , 54 , 74 , 79 , 91 ]. Assessing possible risks [ 77 ] as well as the expected monetary benefits of AI (e.g. return on investment (ROI)) in production plays an important role in adoption decisions in market-oriented companies [ 57 , 58 , 63 , 65 , 78 ]. Due to financial constraints, managers behave cautiously in their investments [ 78 ], so they need to evaluate AI adoption as financially viable to want to make the investment [ 61 , 63 , 93 ] and also drive acceptance [ 60 ]. AI systems can significantly improve cost–benefit structures in manufacturing, thereby increasing the profitability of production systems [ 73 ] and making companies more resilient [ 75 ]. However, in most cases, the adoption of AI requires high investments and the allocation of resources (such as personnel or financial resources) for this purpose [ 50 , 51 , 57 , 80 , 94 ]. Consequently, a lack of budgets and high expected transition costs often hinder the implementation of smart concepts [ 56 , 62 , 67 , 82 , 84 , 92 ]. It is up to management to provide the necessary funding for AI adoption [ 53 , 59 , 79 ], which is required, for example, for skill development of employees [ 59 , 61 , 63 ], IT adaptation [ 62 , 66 ], AI development [ 74 ] or hardware deployment [ 68 ]. In their empirical study, Kinkel, Baumgartner, Cherubini [ 36 ] confirm a positive correlation between company size and the intensity of AI technology use. Large companies generally stand out with a higher propensity to adopt [ 53 ], as they have fewer difficulties than small firms regarding the availability of resources [ 69 ], such as know-how, budget [ 68 , 84 ] and general data organization [ 68 ].
Others argue that small companies tend to be more open to change and are characterized by faster decision-making processes [ 68 , 93 ]. Product complexity also influences a company’s propensity for AI. Companies that produce rather simple products are more likely to digitize, which in turn offers good starting points for AI adoption. On the other hand, complex product manufacturers (often characterized by small batch sizes) are often less able to standardize and automate [ 36 ]. The batch size a company produces has a similar influence on AI adoption. Small and medium batch sizes in particular hinder the integration of intelligent technologies, as less automation often prevails here as well. Nevertheless, even small and medium lot sizes can benefit economically from AI [ 36 ]. Since a high R&D intensity indicates a high innovation capability of a company, it is assumed to have a positive influence on AI adoption, as companies with a high R&D intensity already invest heavily in and use new innovations. This in turn indicates existing competencies, know-how and structures [ 36 ].
3.1.2 Organizational effectiveness
This supercategory focuses on the broader aspects that contribute to the effectiveness, development, and success of an organization when implementing AI in a production context. As the factors are interconnected and influence each other, decision makers should consider them carefully.
Users’ trust in AI is an essential factor to enable successful AI adoption and use in production [ 52 , 68 , 78 , 79 , 88 , 90 ]. From the users’ perspective, AI often exhibits the characteristics of a black box because its inherent processes are not fully understood [ 50 , 90 ], which can lead individuals to develop a fear of the unknown [ 71 ]. Because of this lack of understanding, successful interaction between humans and AI is not guaranteed [ 90 ], as trust is a foundation for decisions that machines are intended to make autonomously [ 52 , 91 ]. To strengthen faith in AI systems [ 76 , 80 ], AI users can be involved in AI design processes in order to understand appropriate tools [ 54 , 90 ]. In this context, trust is also discussed in close connection with transparency and regulation [ 79 ]. User resistance is considered a barrier to implementing new information technologies, as adoption requires change [ 53 , 62 , 92 ]. Ignorance, as a kind of resistance to change, is a main obstacle to successful digital transformation [ 51 , 56 , 65 ]. Some employees may resist the change brought about by AI because they fear losing their jobs [ 52 ] or have other concerns [ 78 ]. Overcoming resistance to technology adoption requires organizational change and is critical for the success of adoption [ 50 , 51 , 62 , 67 , 71 , 80 ]. Therefore, change management is important to create awareness of the importance of AI adoption and increase acceptance among the workforce [ 66 , 68 , 74 , 83 ]. Management commitment is seen as a significant driver of technology adoption [ 53 , 59 , 81 , 82 , 86 ], and a lack of commitment can negatively impact user adoption and workforce trust and lead to skepticism towards technology [ 86 ].
Top management’s understanding of and support for the benefits of the adopted technology [ 53 , 56 , 67 , 78 , 93 , 94 ] enhance AI adoption, can prioritize its implementation, and also affect the performance of the AI-enabled application [ 55 , 60 , 83 ]. Preparing, enabling, and thus empowering the workforce are considered the management’s responsibility in the adoption of digital technologies [ 59 , 75 ]. This requires intelligent leadership [ 52 ], as decision makers need to integrate their workforce into decision-making processes [ 75 ]. Guidelines can support managers by providing access to best practices that help in the adoption of AI [ 50 ]. Critical measures to manage organizational change include the empowerment of visionaries or appointed AI champions leading the change and the collaborative development of digital roadmaps [ 54 , 62 ]. To demonstrate management commitment, managers can create such a dedicated role, consisting of an individual or a small group that is actively and enthusiastically committed to AI adoption in production. This body is considered the adoption manager, point of contact and internal driver of adoption [ 62 , 74 , 80 ]. AI initiatives in production do not necessarily have to be initiated by management. Although management support is essential for successful AI adoption, employees can also actively drive integration initially and thus realize pilot projects or initial trials [ 66 , 80 ]. The development of strategies as well as roadmaps is considered another enabling and necessary factor for the adoption of AI in production [ 50 , 53 , 54 , 62 , 71 , 93 ]. While many major AI strategies already exist at country level to further promote research and development of AI [ 87 ], strategy development is also important at the firm level [ 76 , 77 , 81 ]. In this context, strategies should not be delegated top-down, but developed in a collaborative manner, i.e. by engaging the workforce [ 75 ], and be in alignment with clear visions [ 91 , 94 ]. Roadmaps are used to improve planning, support implementation, and facilitate the adoption of smart technologies in manufacturing [ 93 ] and should be integrated into both business and IT strategy [ 62 , 66 ]. In practice, clear adoption roadmaps that provide approaches on how to effectively integrate AI into existing strategies and businesses are often lacking [ 56 , 87 ]. The need for AI-related skills in organizations is a widely discussed topic in AI adoption analyses [ 79 ]. In this context, the literature points both to the need for specific skills in the development and design of AI applications [ 57 , 71 , 72 , 73 , 76 , 93 ] and to skills in using the technology [ 53 , 65 , 73 , 74 , 75 , 84 , 93 ], whose availability in the firm is not always given [ 49 ]. AI requires new digital skills [ 36 , 50 , 52 , 55 , 56 , 59 , 61 , 63 , 66 , 78 , 80 ], where e.g. advanced analytics [ 64 , 75 , 81 ], programming skills [ 68 ] and cybersecurity skills [ 78 , 93 ] gain importance. The lack of skills required for AI is seen as a major challenge of digital transformation, as a skilled workforce is considered a key resource for companies [ 51 , 54 , 56 , 60 , 62 , 67 , 69 , 70 , 82 , 93 ]. This lack of a necessary skillset hinders the adoption of AI tools in production systems [ 58 , 77 ]. Closely related to skills is the need for new training concepts, which organizations need to consider when integrating digital technologies [ 49 , 50 , 51 , 56 , 59 , 63 , 71 , 74 , 75 ]. Firms must invest in qualification in order to create the necessary competences [ 73 , 78 , 80 , 81 , 92 ]. Additionally, education must target and further develop the skills required for effectively integrating intelligent technologies into manufacturing processes [ 54 , 61 , 62 , 83 ]. In this regard, academic institutions must develop fitting curricula for data-driven manufacturing engineering [ 64 ].
Another driving factor of AI adoption is the innovation culture of an organization, which is influenced by various drivers. For example, companies that operate in an environment with high innovation rates and face intense competitive pressure are considered more likely to see smart technologies as a tool for strategic change [ 83 , 91 , 93 ]. These firms often invest in more expensive and advanced smart technologies, as the pressure and resulting competition force them to innovate [ 93 ]. Another way to approach this is that innovation capability can also be supported and complemented by AI, for example by intelligent systems supporting humans in innovation or even innovating on their own [ 52 ]. The entrepreneurial orientation of a firm is characterized in particular by innovativeness [ 66 ], productivity [ 63 ], risk-taking [ 86 ] as well as continuous improvement [ 50 ]. Such characteristics of an innovative culture are considered essential for companies to recognise dynamic changes in the market and make adoption decisions [ 51 , 71 , 81 , 84 , 86 , 94 ]. The prevalence of a digital mindset in companies is important for technology adoption, as digital transformation affects the entire organizational culture and behavior [ 59 , 80 , 92 ], and a lack of a digital culture [ 50 , 65 ] as well as a ‘passive mindset’ [ 78 ] can hinder the digital transformation of firms. Organizations need to develop a corresponding culture [ 66 , 67 , 71 ], also referred to as an ‘AI-ready culture’ [ 54 ], that promotes development and encourages people and data through the incorporation of technology [ 71 , 75 ]. With the increasing adoption of smart technologies, a ‘new digital normal’ is emerging, characterized by hybrid work models, more human–machine interactions and an increased use of digital technologies [ 75 , 83 ].
3.1.3 Technology and System
The ‘technology and system’ supercategory focuses on the broader issues related to the technology and infrastructure that support organizational operations and provide the technical foundation for AI deployment.
By IT infrastructure we refer to issues regarding the foundational systems and IT needed for AI adoption in production. Industrial firms and their IT systems must achieve a mature technological readiness in order to enable successful AI adoption [ 51 , 60 , 67 , 69 , 83 ]. A lack of appropriate IT infrastructure [ 68 , 71 , 78 , 91 ] or a low maturity of Internet of Things (IoT) technologies [ 70 ] hinders the efficient use of data in production firms [ 56 ], which is why firms must update their foundational information systems for successful AI adoption [ 53 , 54 , 62 , 66 , 72 , 75 ]. IT and data security are fundamental for AI adoption and must be provided [ 50 , 51 , 68 , 82 ]. This requires developments that can ensure security during AI implementation while complying with legal requirements [ 52 , 72 , 78 ]. Generally, security concerns are common when implementing AI innovations [ 72 , 79 , 91 , 94 ]. This fear of a lack of security can also prevent the release of (e.g. customer) data in a production environment [ 56 ]. Additionally, as industrial production systems are vulnerable to failures as well as cyberattacks, companies need to address security and cybersecurity measures [ 49 , 76 , 88 , 89 ]. Developing user-friendly AI solutions can facilitate the adoption of smart solutions by increasing user understanding and making systems easy for employees to use as well as quick to integrate [ 50 , 72 , 84 ]. When developing user-friendly solutions that satisfy user needs [ 76 ], it is particularly important to understand and integrate the user perspective in the development process [ 90 ]. If employees find technical solutions easy to use, they are more confident in their use and perceived usefulness increases [ 53 , 67 , 68 ].
The compatibility of AI with a firm and its existing systems, i.e., the extent to which AI matches existing processes, structures, and infrastructures [ 53 , 54 , 56 , 60 , 78 , 80 , 82 , 83 , 93 , 94 ], is considered an important requirement for the adoption of AI in IT systems [ 91 ]. Along with compatibility also comes connectivity, which is intended to ensure the links within the overall network and avoid silo thinking [ 59 ]. Connectivity and interoperability of AI-based processes within the company’s IT manufacturing systems must be ensured at different system levels and are considered key factors in the development of AI applications for production [ 50 , 72 , 89 ]. The design of modular AI solutions can increase system compatibility [ 84 ]. Firms deciding to adopt AI must address safety issues [ 51 , 54 , 59 , 72 , 73 , 78 ]. This includes safety in both the use and the operation of AI [ 60 , 69 ]. In order to address safety concerns regarding the integration of AI solutions in industrial systems [ 49 ], systems must ensure high reliability [ 71 ]. AI can also be integrated as a safety enabler, for example by providing technologies to monitor health and safety in the workplace to prevent fatigue and injury [ 75 ].
3.1.4 Data management
Since AI adoption in the organization is strongly data-driven, the ‘data management’ supercategory is dedicated to the comprehensive aspects related to the effective and responsible management of data within the organization.
Data privacy must be guaranteed when creating AI applications based on industrial production data [ 49 , 58 , 59 , 60 , 72 , 76 , 78 , 79 , 82 , 88 , 89 , 91 , 94 ] as ‘[M]anufacturing industries generate large volumes of unstructured and sensitive data during their daily operations’ [ 89 ]. Closely related to this is the need for anonymization and confidentiality of data [ 61 , 69 , 70 , 78 ]. The availability of large, heterogeneous data sets is essential for the digital transformation of organizations [ 52 , 59 , 78 , 80 , 88 , 89 ] and is considered one of the key drivers of AI innovation [ 62 , 68 , 72 , 86 ]. In production systems, a lack of data availability is often a barrier to AI adoption [ 58 , 70 , 77 ]. In order to enable AI to establish relationships between data, the availability of large input data is critical [ 62 , 76 , 81 ]. New AI models are trained with this data and can adapt as well as improve as they receive new data [ 59 , 62 ]. Big data can thus significantly improve the quality of AI applications [ 59 , 71 ]. As more and more data is generated in manufacturing [ 85 ], AI opens up new opportunities for companies to make use of it [ 62 ]. However, operational data are often unstructured, as they come from different sources and exist in diverse formats [ 85 , 87 ]. This challenges data processing, as data quality and origin are key factors in the management of data [ 78 , 79 , 80 , 88 , 89 , 91 ]. To make production data valuable and usable for AI, consistency of data and thus data integrity is required across manufacturing systems [ 50 , 62 , 77 , 84 ]. Another key prerequisite for AI adoption is data governance [ 56 , 59 , 67 , 68 , 71 , 78 , 88 ], which is an important asset to make use of data in production [ 50 ] and ensure the complex management of heterogeneous data sets [ 89 ].
The interoperability of data, and thus the foundation for the compatibility of AI with existing systems, i.e., the extent to which AI matches existing processes, structures, and infrastructures [ 53 , 56 , 84 , 93 ], is considered another important requirement for the adoption of AI in IT systems. Data interoperability in production systems can be hindered by missing data standards, as different machines use different formats [ 87 ]. Data processing refers to the techniques used to prepare data for analysis, which is essential to obtain consistent results from data analytics in production [ 58 , 72 , 80 , 81 , 84 ]. In this process, the numerous, heterogeneous data from different sensors are processed in such a way that they can be used for further analyses [ 87 ]. The capability of production firms to process data and information is thus important to enable AI adoption [ 77 , 86 , 93 ]. With the increasing data generation in the smart and connected factory, the strategic relevance of data analytics is gaining importance [ 55 , 69 , 78 ], as it is essential for AI systems in performing advanced data analyses [ 49 , 67 , 72 , 86 , 88 ]. Using analytics, valuable insights can be gained from the production data obtained using AI systems [ 58 , 77 , 87 ]. In order to enable the processing of big data, a solid data infrastructure is necessary [ 65 , 75 , 87 ]. Facilities must be equipped with sensors that collect data and model information, which requires investments from firms [ 72 ]. In addition, production firms must build the necessary skills, culture and capabilities for data analytics [ 54 , 75 , 87 , 93 ]. Data storage, one of the foundations and prerequisites for smart manufacturing [ 54 , 68 , 71 , 74 ], must be ensured in order to manage the large amounts of data and thus realize the adoption of intelligent technologies in production [ 50 , 59 , 72 , 78 , 84 , 87 , 88 , 89 ].
3.2 External environment
The external drivers of AI adoption in production influence the organization through conditions and events from outside the firm and are therefore difficult to control by the organization itself.
3.2.1 Regulatory environment
This supercategory captures the broader concept of establishing rules, standards, and frameworks that guide the behavior, actions, and operations of individuals, organizations, and societies when implementing AI.
AI adoption in production faces many ethical challenges [ 70 , 72 , 79 ]. AI applications must comply with organizational ethical standards and laws [ 49 , 50 , 59 , 60 , 62 , 75 ], which is why certain issues must be examined in AI adoption and AI design [ 62 , 73 , 82 , 91 ] so that fairness and justice are guaranteed [ 78 , 79 , 92 ]. Social rights, cultural values and norms must not be violated in the process [ 49 , 52 , 53 , 81 ]. In this context, the explainability and transparency of AI decisions also play an important role [ 50 , 54 , 58 , 70 , 78 , 89 ] and can address AI’s black-box character [ 90 ]. In addition, AI applications must comply with legal and regulatory requirements [ 51 , 52 , 59 , 77 , 81 , 82 , 91 ] and be developed accordingly [ 49 , 76 ] in order to make organizational processes using AI clear and effective [ 65 ]. At present, policies and regulation of AI are still in their infancy [ 49 ], and missing federal regulatory guidelines, standards and incentives hinder the adoption of AI [ 67 ]; these should be expanded in parallel with the expansion of AI technology [ 60 ]. This also includes regulations on the handling of data (e.g. anonymization of data) [ 61 , 72 ].
3.2.2 Business environment
The factors in the ‘business environment’ supercategory refer to the external conditions and influences that affect the operations, decision making, and performance of the company seeking to implement AI in a production context.
Cooperation and collaboration can influence the success of digital technology adoption [ 52 , 53 , 59 , 72 ], which is why partnerships are important for adoption [ 53 , 59 ] and can positively influence its future success [ 52 , 67 ]. Both intraorganizational and interorganizational knowledge sharing can positively influence AI adoption [ 49 ]. In collaborations, companies can use a shared knowledge base, where data and process sharing [ 51 , 59 , 94 ] as well as social support systems strengthen feedback loops between departments [ 79 , 80 ]. With regard to AI adoption in firms, vendors as well as service providers need to collaborate closely to improve the compatibility and operational capability of smart technologies across different industries [ 82 , 93 ]. Without external IT support, companies can rarely integrate AI into their production processes [ 66 ], which is why thorough support from vendors can significantly facilitate the integration of AI into existing manufacturing processes [ 80 , 91 ]. Public–private collaborations can also add value, and governments can target AI dissemination [ 60 , 74 ]. The support of the government also positively influences AI adoption; this includes investing in research projects and policies, building a regulatory setting and creating a collaborative environment [ 60 ]. Production companies are constantly exposed to changing conditions, which is why environmental dynamism is another factor influencing the adoption of AI [ 52 , 63 , 72 , 86 ]. Environmental dynamics influence the operational performance of firms and can favor an entrepreneurial orientation of firms [ 86 ]. In order to respond to these dynamics, companies need to develop certain capabilities and resources (i.e. dynamic capabilities) [ 86 ].
This requires the development of transparency, agility, as well as resilience to unpredictable changes, which was important in the case of the COVID-19 pandemic, for example, where companies had to adapt quickly to changing environments [ 75 ]. A firm’s environment (e.g. governments, partners or customers) can also pressure companies to adopt digital technologies [ 53 , 67 , 82 , 91 ]. Companies facing intense competition are considered more likely to invest in smart technologies, as rivalry pushes them to innovate and they hope to gain competitive advantages from adoption [ 36 , 66 , 82 , 93 ].
3.2.3 Economic environment
By considering both the industrial sector and the country within the supercategory ‘economic environment’, production firms can analyze the interplay between the two and understand how these drivers influence the AI adoption process within their industrial sector in a particular country.
The industrial sector of a firm influences AI adoption in production from a structural perspective, as it indicates variations in product characteristics, governmental support, the general digitalization status, the production environment as well as the use of AI technologies within the sector [ 36 ]. Another factor that influences AI adoption is the country in which a company is located. This influences not only cultural aspects, the availability of know-how and technology orientation, but also regulations, laws, standards and subsidies [ 36 ]. From another perspective, AI can also contribute to the wider socio-economic growth of economies by making new opportunities easily available and thus equipping, for example, more rural areas with advanced capabilities [ 78 ].
3.3 Future research directions
The analysis of AI adoption in production requires a comprehensive analysis of the various factors that influence the introduction of the innovation. In line with Kinkel, Baumgartner and Cherubini [ 36 ], our research also concludes that organizational factors play a particularly important role. After evaluating the individual drivers of AI adoption in production in detail in this qualitative synthesis, we draw a conclusion from the results and derive a research agenda to serve as a basis for future research. The RQs emerged from the analyzed factors and are presented in Table 2 . We developed the questions based on the literature review and identified research gaps for every factor that was most frequently mentioned. The factors analyzed and the RQs developed show that the internal environment has a strong influence on AI adoption in production and that organizational factors play a major role here.
Looking at the supercategory ‘business and structure’, performance indicators and investments are considered drivers of AI adoption in production. Indicators to measure the performance of AI innovations are necessary here so that managers can perform cost–benefit analyses and make the right decision for their company. There is a need for research to support possible calculations and show managers a comprehensive view of the costs and benefits of the technology in production. In terms of budget, it should be noted that AI adoption involves a considerable financial outlay that must be carefully weighed, and some capital must be available to carry out the necessary implementation efforts (e.g., staffing costs, machine retrofits, change management, and external IT service costs). Since AI adoption is a complex process and turnkey solutions can seldom be implemented easily and quickly, but require many changes (not only technological but also organizational), it is currently difficult to estimate the necessary budgets and thus make them available. The factors of the supercategory ‘organizational effectiveness’ especially drive AI adoption in production. Trust of the workforce is considered an important driver, which must be created in order to successfully implement AI. This requires measures that can support management in building trust. Closely related to this are the necessary change management processes that must be initiated to accompany the changes in a targeted manner. Management itself must also play a clear role in the introduction of AI and communicate its support, as this also influences adoption. The development of clear processes and measures can help here. Developing roadmaps for AI adoption can facilitate the adoption process and promote strategic integration with existing IT and business strategy. Here, best practice roadmaps and necessary action steps can be helpful for companies.
Skills are considered the most important driver for AI adoption in manufacturing. Here, there is a lack of clear approaches that support companies in identifying the range of necessary skills and, associated with this, opportunities to further develop these skills in the existing workforce. Building a culture of innovation also requires closer research that can help companies foster a conducive environment for AI adoption and the integration of other smart technologies. Steps for developing a positive mindset require further research that can provide approaches for necessary action steps and measures in creating a positive digital culture. With regard to ‘technology and system’, the factors of IT infrastructure and security in particular are driving AI adoption in production. Existing IT systems must reach a certain maturity to enable AI adoption on a technical level. This calls for clear requirements that visualize for companies which systems and standards are in place and where developments are needed. Security must be continuously ensured, for which certain standards and action catalogs must be developed. With regard to the supercategory ‘data management’, the availability of data is considered the basis for successful AI adoption, as no AI can be successfully deployed without data. In the production context in particular, this requires developments that support companies in the provision of data, which usually arises from very heterogeneous sources and comes in diverse forms. Data analytics must also be closely examined, and production companies usually need external support in doing so. The multitude of data also requires big data storage capabilities. Here, groundwork is needed to give companies an overview of the different storage options (e.g., on-premises vs. cloud-based).
In the ‘regulatory environment’, ethics in particular is considered a driver of AI adoption in production. Here, fundamental ethical factors and frameworks need to be developed that companies can use as a guideline to ensure ethical standards throughout the process. Cooperation and environmental dynamism drive the supercategory ‘business environment’. Collaborations are necessary to successfully implement AI adoption, and action is needed to create the necessary contact facilitation bodies. In a competitive environment, companies have to make quick decisions under strong pressure, which also affects AI adoption. Here, guidelines and best practice approaches can help to simplify decisions and quickly demonstrate the advantages of the solutions. There is a need for research in this context.
4 Conclusions
The use of AI technologies in production continues to gain momentum as managers hope to increase efficiency and productivity and to reduce costs [ 9 , 13 , 20 ]. Although the benefits of AI adoption speak for themselves, implementing AI is a complex decision that requires considerable knowledge, capital and change [ 95 ] and is influenced by various internal and external factors. Therefore, managers are still cautious about implementing the technology in a production context. Our SLR examines the emergent phenomenon of AI in production with the precise aim of understanding the factors influencing AI adoption and the key topics discussed in the literature when analyzing AI in a production context. For this purpose, we use the current state of research, examine the existing studies based on the methodology of a systematic literature analysis and respond to three RQs.
We answer RQ1 by closely analyzing the literature selected in our SLR to identify trends in current research on AI adoption in production. In this process, it becomes clear that the topic is gaining importance and that research has increased over the last few years. In the field of production, AI is being examined from various angles and current research addresses aspects from a business, human and technical perspective. In our response to RQ2 we synthesized the existing literature to derive 35 factors that influence AI adoption in production at different levels from inside or outside the organization. In doing so, we find that AI adoption in production poses particularly significant challenges to organizational effectiveness compared to other digital technologies and that the relevance of data management takes on a new dimension. Production companies often operate more traditionally and are sometimes rigid when it comes to change [ 96 , 97 ], which can pose organizational challenges when adopting AI. In addition, the existing machines and systems are typically rather heterogeneous and are subject to different digitalization standards, which in turn can hinder the availability of the necessary data for AI implementation [ 98 , 99 ]. We address RQ3 by deriving a research agenda, which lays a foundation for further scientific research and deepening the understanding of AI adoption in production. The results of our analysis can further help managers to better understand AI adoption and to pay attention to the different factors that influence the adoption of this complex technology.
4.1 Contributions
Our paper takes the first step towards analyzing the current state of research on AI adoption from a production perspective. We present a holistic view on the topic, which is necessary to gain a better understanding of AI in a production context and to build a comprehensive view of the different dimensions as well as the factors influencing its adoption. To the best of our knowledge, this is the first contribution that systematizes research about the adoption of AI in production. As such, it makes an important contribution to current AI and production research, which is threefold:
First, we highlight the characteristics of studies conducted in recent years on the topic of AI adoption in production, from which several features and developments can be deduced. Our results confirm the topicality of the issue and the increasing relevance of research in the field.
Having laid the foundations for understanding AI in production, we focused our research on the identification and systematization of the most relevant factors influencing AI adoption in production at different levels. This brings us to the second contribution: our comprehensive factor analysis of AI adoption in production provides a framework for further research as well as a potential basis for managers to draw upon when adopting AI. By systematizing the relevant factors influencing AI adoption in production, we derived a set of 35 researched factors associated with AI adoption in production. These factors can be clustered into two areas of analysis and seven respective supercategories. The internal environment area includes four levels of analysis: ‘business and structure’ (focusing on financial aspects and firm characteristics), ‘organizational effectiveness’ (focusing on human-centred factors), ‘technology and system’ (based on the IT infrastructure and systems) as well as ‘data management’ (including all data-related factors). Three categories are assigned to the external environment: the ‘regulatory environment’ (such as ethics and the regulatory forms), the ‘business environment’ (focused on cooperation activities and dynamics in the firm environment) and the ‘economic environment’ (related to sectoral and country specifics).
Third, the research agenda outlined in Table 2 serves as an additional outcome of the SLR, identifying key RQs in the analyzed areas that can serve as a foundation for researchers to expand the research area of AI adoption in production. These RQs relate to the most frequently cited factors analyzed in our SLR and aim to broaden the understanding of the emerging topic.
The resulting insights can serve as the basis for strategic decisions by production companies looking to integrate AI into their processes. Our findings on the factors influencing AI adoption as well as the developed research agenda enhance the practical understanding of a production-specific adoption. Hence, they can serve as the basis for strategic decisions for companies on the path to an effective AI adoption. Managers can, for example, analyse the individual factors in light of their company as well as take necessary steps to develop further aspects in a targeted manner. Researchers, on the other hand, can use the future research agenda in order to assess open RQs and can expand the state of research on AI adoption in production.
4.2 Limitations
Since a literature review must be restricted in its scope in order to make the analyses feasible, our study provides a starting point for further research. Hence, there is a need for further qualitative and quantitative empirical research on the heterogeneous nature of how firms configure their AI adoption process. Along these lines, the following aspects would be of particular interest for future research to improve and further validate the analytical power of the proposed framework.
First, the lack of research on AI adoption in production leads to a limited number of papers included in this SLR. As visualized in Fig. 2 , the number of publications related to the adoption of AI in production has been increasing since 2018 but is, to date, still at an early stage. For this reason, only 47 papers published until May 2024 addressing the production-specific adoption of AI were identified and included in our analysis for in-depth investigation. This rather small number of papers included in the full-text analysis gives a limited view on AI adoption in production but allows a more detailed analysis. As the number of publications in this research field increases, new findings might constantly be added and become relevant in the future [ 39 ]. Moreover, in order to research AI adoption from a more practical perspective and thus build up a broader, continuously updated view on AI adoption in production, future literature analyses could include other publication formats, e.g. study reports of research institutions and companies, as well as discussion papers.
Second, the scope of the application areas of AI in production has been increasing rapidly. Even though our overview of the three main areas covered in the recent literature serves as a good basis for identifying the most dominant fields for AI adoption in production, a more detailed analysis could provide a better overview of possibilities for manufacturing companies. Hence, a further systematization and evaluation of application areas for AI in production can provide managers with the information needed to decide where AI applications might be of interest for their specific company needs.
Third, the systematization of the 35 factors influencing AI adoption in production serves as a good basis for identifying relevant areas influenced by, and in turn influencing, the adoption of AI. Further analyses should be conducted in order to extend this view and refine the framework. For example, our review could be combined with explorative research methods (such as case studies in production firms) in order to add practical insights from firms adopting AI. This integration of practical experiences can also help identify and monitor more AI-specific factors by observing AI adoption processes. In enriching the factors through in-depth analyses, the identified AI adoption factors could also be examined in light of theoretical contributions such as the technology-organization-environment (TOE) framework [ 47 ] and other adoption theories.
Fourth, in order to examine the special relevance of the identified factors for the AI adoption process and thus distinguish it from the common factors influencing the adoption of more general digital technologies, there is a further need for more in-depth (ethnographic) research into their impact on adoption processes, particularly in the production context. Similarly, further research could use the framework introduced in this paper as a basis to develop new indicators and measurement concepts as well as to examine their impact on production performance using quantitative methods.
Benner MJ, Waldfogel J (2020) Changing the channel: digitization and the rise of “middle tail” strategies. Strat Mgmt J 86:1–24. https://doi.org/10.1002/smj.3130
Roblek V, Meško M, Krapež A (2016) A complex view of industry 4.0. SAGE Open. https://doi.org/10.1177/2158244016653987
Oliveira BG, Liboni LB, Cezarino LO et al (2020) Industry 4.0 in systems thinking: from a narrow to a broad spectrum. Syst Res Behav Sci 37:593–606. https://doi.org/10.1002/sres.2703
Li B, Hou B, Yu W et al (2017) Applications of artificial intelligence in intelligent manufacturing: a review. Frontiers Inf Technol Electronic Eng 18:86–96. https://doi.org/10.1631/FITEE.1601885
Dhamija P, Bag S (2020) Role of artificial intelligence in operations environment: a review and bibliometric analysis. TQM 32:869–896. https://doi.org/10.1108/TQM-10-2019-0243
Collins C, Dennehy D, Conboy K et al (2021) Artificial intelligence in information systems research: a systematic literature review and research agenda. Int J Inf Manage 60:102383. https://doi.org/10.1016/j.ijinfomgt.2021.102383
Chien C-F, Dauzère-Pérès S, Huh WT et al (2020) Artificial intelligence in manufacturing and logistics systems: algorithms, applications, and case studies. Int J Prod Res 58:2730–2731. https://doi.org/10.1080/00207543.2020.1752488
Chen H (2019) Success factors impacting artificial intelligence adoption: perspective from the telecom industry in China, Old Dominion University
Sanchez M, Exposito E, Aguilar J (2020) Autonomic computing in manufacturing process coordination in industry 4.0 context. J Industrial Inf Integr. https://doi.org/10.1016/j.jii.2020.100159
Lee J, Davari H, Singh J et al (2018) Industrial artificial intelligence for industry 4.0-based manufacturing systems. Manufacturing Letters 18:20–23. https://doi.org/10.1016/j.mfglet.2018.09.002
Heimberger H, Horvat D, Schultmann F (2023) Assessing AI-readiness in production—A conceptual approach. In: Huang C-Y, Dekkers R, Chiu SF et al. (eds) intelligent and transformative production in pandemic times. Springer, Cham, pp 249–257
Horvat D, Heimberger H (2023) AI readiness: an integrated socio-technical framework. In: Deschamps F, Pinheiro de Lima E, Da Gouvêa Costa SE et al. (eds) Proceedings of the 11th international conference on production research—Americas: ICPR Americas 2022, 1st edn. Springer Nature Switzerland; Imprint Springer, Cham, pp 548–557
Wang J, Ma Y, Zhang L et al (2018) Deep learning for smart manufacturing: methods and applications. J Manuf Syst 48:144–156. https://doi.org/10.1016/J.JMSY.2018.01.003
Davenport T, Guha A, Grewal D et al (2020) How artificial intelligence will change the future of marketing. J Acad Mark Sci 48:24–42. https://doi.org/10.1007/s11747-019-00696-0
Cui R, Li M, Zhang S (2022) AI and procurement. Manufacturing Serv Operations Manag 24:691–706. https://doi.org/10.1287/msom.2021.0989
Pournader M, Ghaderi H, Hassanzadegan A et al (2021) Artificial intelligence applications in supply chain management. Int J Prod Econ 241:108250. https://doi.org/10.1016/j.ijpe.2021.108250
Su H, Li L, Tian S et al (2024) Innovation mechanism of AI empowering manufacturing enterprises: case study of an industrial internet platform. Inf Technol Manag. https://doi.org/10.1007/s10799-024-00423-4
Venkatesh V, Raman R, Cruz-Jesus F (2024) AI and emerging technology adoption: a research agenda for operations management. Int J Prod Res 62:5367–5377. https://doi.org/10.1080/00207543.2023.2192309
Senoner J, Netland T, Feuerriegel S (2022) Using explainable artificial intelligence to improve process quality: evidence from semiconductor manufacturing. Manage Sci 68:5704–5723. https://doi.org/10.1287/mnsc.2021.4190
Fosso Wamba S, Queiroz MM, Ngai EWT et al (2024) The interplay between artificial intelligence, production systems, and operations management resilience. Int J Prod Res 62:5361–5366. https://doi.org/10.1080/00207543.2024.2321826
Uren V, Edwards JS (2023) Technology readiness and the organizational journey towards AI adoption: an empirical study. Int J Inf Manage 68:102588. https://doi.org/10.1016/j.ijinfomgt.2022.102588
Berente N, Gu B, Recker J (2021) Managing artificial intelligence special issue managing AI. MIS Quarterly 45:1433–1450
Scafà M, Papetti A, Brunzini A et al (2019) How to improve worker’s well-being and company performance: a method to identify effective corrective actions. Procedia CIRP 81:162–167. https://doi.org/10.1016/j.procir.2019.03.029
Wang H, Qiu F (2023) AI adoption and labor cost stickiness: based on natural language and machine learning. Inf Technol Manag. https://doi.org/10.1007/s10799-023-00408-9
Lindebaum D, Vesa M, den Hond F (2020) Insights from “The Machine Stops” to better understand rational assumptions in algorithmic decision making and its implications for organizations. Acad Manag Rev 45:247–263. https://doi.org/10.5465/amr.2018.0181
Baskerville RL, Myers MD, Yoo Y (2020) Digital first: the ontological reversal and new challenges for information systems research. MIS Quarterly 44:509–523
Frey CB, Osborne MA (2017) The future of employment: How susceptible are jobs to computerisation? Technol Forecast Soc Chang 114:254–280. https://doi.org/10.1016/J.TECHFORE.2016.08.019
Jarrahi MH (2018) Artificial intelligence and the future of work: human-AI symbiosis in organizational decision making. Bus Horiz 61:577–586. https://doi.org/10.1016/j.bushor.2018.03.007
Fügener A, Grahl J, Gupta A et al (2021) Will humans-in-the-loop become borgs? Merits and pitfalls of working with AI. MIS Quarterly 45:1527–1556
Klumpp M (2018) Automation and artificial intelligence in business logistics systems: human reactions and collaboration requirements. Int J Log Res Appl 21:224–242. https://doi.org/10.1080/13675567.2017.1384451
Li J, Li M, Wang X et al (2021) Strategic directions for AI: the role of CIOs and boards of directors. MIS Quarterly 45:1603–1644
Brock JK-U, von Wangenheim F (2019) Demystifying AI: What digital transformation leaders can teach you about realistic artificial intelligence. Calif Manage Rev 61:110–134. https://doi.org/10.1177/1536504219865226
Lee J, Suh T, Roy D et al (2019) Emerging technology and business model innovation: the case of artificial intelligence. JOItmC 5:44. https://doi.org/10.3390/joitmc5030044
Chen J, Tajdini S (2024) A moderated model of artificial intelligence adoption in firms and its effects on their performance. Inf Technol Manag. https://doi.org/10.1007/s10799-024-00422-5
Kinkel S, Baumgartner M, Cherubini E (2022) Prerequisites for the adoption of AI technologies in manufacturing—evidence from a worldwide sample of manufacturing companies. Technovation 110:102375. https://doi.org/10.1016/j.technovation.2021.102375
Mikalef P, Gupta M (2021) Artificial intelligence capability: Conceptualization, measurement calibration, and empirical study on its impact on organizational creativity and firm performance. Inf Manag 58:103434. https://doi.org/10.1016/j.im.2021.103434
McElheran K, Li JF, Brynjolfsson E et al (2024) AI adoption in America: Who, what, and where. Economics Manag Strategy 33:375–415. https://doi.org/10.1111/jems.12576
Tranfield D, Denyer D, Smart P (2003) Towards a methodology for developing evidence-informed management knowledge by means of systematic review. Br J Manag 14:207–222. https://doi.org/10.1111/1467-8551.00375
Cooper H, Hedges LV, Valentine JC (2009) Handbook of research synthesis and meta-analysis. Russell Sage Foundation, New York
Page MJ, McKenzie JE, Bossuyt PM et al (2021) The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 372:n71. https://doi.org/10.1136/bmj.n71
Denyer D, Tranfield D (2011) Producing a systematic review. In: Buchanan DA, Bryman A (eds) The Sage handbook of organizational research methods. Sage Publications Inc, Thousand Oaks, CA, pp 671–689
Burbidge JL, Falster P, Riis JO et al (1987) Integration in manufacturing. Comput Ind 9:297–305. https://doi.org/10.1016/0166-3615(87)90103-5
Mayring P (2000) Qualitative content analysis. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research 1(2). https://doi.org/10.17169/fqs-1.2.1089
Hsieh H-F, Shannon SE (2005) Three approaches to qualitative content analysis. Qual Health Res 15:1277–1288. https://doi.org/10.1177/1049732305276687
Miles MB, Huberman AM (2009) Qualitative data analysis: An expanded sourcebook, 2nd edn. Sage, Thousand Oaks, Calif
Tornatzky LG, Fleischer M (1990) The processes of technological innovation. Issues in organization and management series. Lexington Books, Lexington, Mass.
Alsheibani S, Cheung Y, Messom C (2018) Artificial Intelligence Adoption: AI-readiness at Firm-Level: Research-in-Progress. Twenty-Second Pacific Asia Conference on Information Systems
Akinsolu MO (2023) Applied artificial intelligence in manufacturing and industrial production systems: PEST considerations for engineering managers. IEEE Eng Manag Rev 51:52–62. https://doi.org/10.1109/EMR.2022.3209891
Bettoni A, Matteri D, Montini E et al (2021) An AI adoption model for SMEs: a conceptual framework. IFAC-PapersOnLine 54:702–708. https://doi.org/10.1016/j.ifacol.2021.08.082
Boavida N, Candeias M (2021) Recent automation trends in Portugal: implications on industrial productivity and employment in automotive sector. Societies 11:101. https://doi.org/10.3390/soc11030101
Botha AP (2019) A mind model for intelligent machine innovation using future thinking principles. Jnl of Manu Tech Mnagmnt 30:1250–1264. https://doi.org/10.1108/JMTM-01-2018-0021
Chatterjee S, Rana NP, Dwivedi YK et al (2021) Understanding AI adoption in manufacturing and production firms using an integrated TAM-TOE model. Technol Forecast Soc Chang 170:120880. https://doi.org/10.1016/j.techfore.2021.120880
Chiang LH, Braun B, Wang Z et al (2022) Towards artificial intelligence at scale in the chemical industry. AIChE J. https://doi.org/10.1002/aic.17644
Chouchene A, Carvalho A, Lima TM et al. (2020) Artificial intelligence for product quality inspection toward smart industries: quality control of vehicle Non-conformities. In: Garengo P (ed) 2020 9th International Conference on Industrial Technology and Management: ICITM 2020 February 11–13, 2020, Oxford, United Kingdom. IEEE, pp 127–131
Corti D, Masiero S, Gladysz B (2021) Impact of Industry 4.0 on Quality Management: identification of main challenges towards a Quality 4.0 approach. In: 2021 IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC). IEEE, pp 1–8
Demlehner Q, Schoemer D, Laumer S (2021) How can artificial intelligence enhance car manufacturing? A Delphi study-based identification and assessment of general use cases. Int J Inf Manage 58:102317. https://doi.org/10.1016/j.ijinfomgt.2021.102317
Dohale V, Akarte M, Gunasekaran A et al (2022) (2022) Exploring the role of artificial intelligence in building production resilience: learnings from the COVID-19 pandemic. Int J Prod Res 10(1080/00207543):2127961
Drobot AT (2020) Industrial Transformation and the Digital Revolution: A Focus on artificial intelligence, data science and data engineering. In: 2020 ITU Kaleidoscope: Industry-Driven Digital Transformation (ITU K). IEEE, pp 1–11
Ghani EK, Ariffin N, Sukmadilaga C (2022) Factors influencing artificial intelligence adoption in publicly listed manufacturing companies: a technology, organisation, and environment approach. IJAEFA 14:108–117
Hammer A, Karmakar S (2021) Automation, AI and the future of work in India. ER 43:1327–1341. https://doi.org/10.1108/ER-12-2019-0452
Hartley JL, Sawaya WJ (2019) Tortoise, not the hare: digital transformation of supply chain business processes. Bus Horiz 62:707–715. https://doi.org/10.1016/j.bushor.2019.07.006
Kyvik Nordås H, Klügl F (2021) Drivers of automation and consequences for jobs in engineering services: an agent-based modelling approach. Front Robot AI 8:637125. https://doi.org/10.3389/frobt.2021.637125
Mubarok K, Arriaga EF (2020) Building a smart and intelligent factory of the future with industry 4.0 technologies. J Phys Conf Ser. https://doi.org/10.1088/1742-6596/1569/3/032031
Muriel-Pera YdJ, Diaz-Piraquive FN, Rodriguez-Bernal LP et al. (2018) Adoption of strategies the fourth industrial revolution by micro, small and medium enterprises in bogota D.C. In: Lozano Garzón CA (ed) 2018 Congreso Internacional de Innovación y Tendencias en Ingeniería (CONIITI). IEEE, pp 1–6
Olsowski S, Schlögl S, Richter E et al. (2022) Investigating the Potential of AutoML as an Instrument for Fostering AI Adoption in SMEs. In: Uden L, Ting I-H, Feldmann B (eds) Knowledge Management in Organisations: 16th International Conference, KMO 2022, Hagen, Germany, July 11–14, 2022, Proceedings, 1st ed. 2022, vol 1593. Springer, Cham, pp 360–371
Rodríguez-Espíndola O, Chowdhury S, Dey PK et al (2022) Analysis of the adoption of emergent technologies for risk management in the era of digital manufacturing. Technol Forecast Soc Chang 178:121562. https://doi.org/10.1016/j.techfore.2022.121562
Schkarin T, Dobhan A (2022) Prerequisites for Applying Artificial Intelligence for Scheduling in Small- and Medium-sized Enterprises. In: Proceedings of the 24 th International Conference on Enterprise Information Systems. SCITEPRESS—Science and Technology Publications, pp 529–536
Sharma P, Shah J, Patel R (2022) Artificial intelligence framework for MSME sectors with focus on design and manufacturing industries. Mater Today: Proc 62:6962–6966. https://doi.org/10.1016/j.matpr.2021.12.360
Siaterlis G, Nikolakis N, Alexopoulos K et al. (2022) Adoption of AI in EU Manufacturing. Gaps and Challenges. In: Katalinic B (ed) Proceedings of the 33 rd International DAAAM Symposium 2022, vol 1. DAAAM International Vienna, pp 547–550
Tariq MU, Poulin M, Abonamah AA (2021) Achieving operational excellence through artificial intelligence: driving forces and barriers. Front Psychol 12:686624. https://doi.org/10.3389/fpsyg.2021.686624
Trakadas P, Simoens P, Gkonis P et al (2020) An artificial intelligence-based collaboration approach in industrial IoT manufacturing: key concepts. Architectural Ext Potential Applications Sens. https://doi.org/10.3390/s20195480
Vernim S, Bauer H, Rauch E et al (2022) A value sensitive design approach for designing AI-based worker assistance systems in manufacturing. Procedia Computer Sci 200:505–516. https://doi.org/10.1016/j.procs.2022.01.248
Williams G, Meisel NA, Simpson TW et al (2022) Design for artificial intelligence: proposing a conceptual framework grounded in data wrangling. J Computing Inf Sci Eng 10(1115/1):4055854
Wuest T, Romero D, Cavuoto LA et al (2020) Empowering the workforce in Post–COVID-19 smart manufacturing systems. Smart Sustain Manuf Syst 4:20200043. https://doi.org/10.1520/SSMS20200043
Javaid M, Haleem A, Singh RP (2023) A study on ChatGPT for Industry 4.0: background, potentials, challenges, and eventualities. J Economy Technol 1:127–143. https://doi.org/10.1016/j.ject.2023.08.001
Rathore AS, Nikita S, Thakur G et al (2023) Artificial intelligence and machine learning applications in biopharmaceutical manufacturing. Trends Biotechnol 41:497–510. https://doi.org/10.1016/j.tibtech.2022.08.007
Jan Z, Ahamed F, Mayer W et al (2023) Artificial intelligence for industry 4.0: systematic review of applications, challenges, and opportunities. Expert Syst Applications 216:119456
Waschull S, Emmanouilidis C (2023) Assessing human-centricity in AI enabled manufacturing systems: a socio-technical evaluation methodology. IFAC-PapersOnLine 56:1791–1796. https://doi.org/10.1016/j.ifacol.2023.10.1891
Stohr A, Ollig P, Keller R et al (2024) Generative mechanisms of AI implementation: a critical realist perspective on predictive maintenance. Inf Organ 34:100503. https://doi.org/10.1016/j.infoandorg.2024.100503
Pazhayattil AB, Konyu-Fogel G (2023) ML and AI Implementation Insights for Bio/Pharma Manufacturing. BioPharm International 36:24–29
Ronaghi MH (2023) The influence of artificial intelligence adoption on circular economy practices in manufacturing industries. Environ Dev Sustain 25:14355–14380. https://doi.org/10.1007/s10668-022-02670-3
Rath SP, Tripathy R, Jain NK (2024) Assessing the factors influencing the adoption of generative artificial intelligence (GenAI) in the manufacturing sector. In: Sharma SK, Dwivedi YK, Metri B et al (eds) Transfer, diffusion and adoption of next-generation digital technologies, vol 697. Springer Nature Switzerland, Cham
Bonnard R, Da Arantes MS, Lorbieski R et al (2021) Big data/analytics platform for Industry 4.0 implementation in advanced manufacturing context. Int J Adv Manuf Technol 117:1959–1973. https://doi.org/10.1007/s00170-021-07834-5
Confalonieri M, Barni A, Valente A et al. (2015) An AI based decision support system for preventive maintenance and production optimization in energy intensive manufacturing plants. In: 2015 IEEE international conference on engineering, technology and innovation/ international technology management conference (ICE/ITMC). IEEE, pp 1–8
Dubey R, Gunasekaran A, Childe SJ et al (2020) Big data analytics and artificial intelligence pathway to operational performance under the effects of entrepreneurial orientation and environmental dynamism: a study of manufacturing organisations. Int J Prod Econ 226:107599. https://doi.org/10.1016/j.ijpe.2019.107599
Lee J, Singh J, Azamfar M et al (2020) Industrial AI: a systematic framework for AI in industrial applications. China Mechanical Eng 31:37–48
Turner CJ, Emmanouilidis C, Tomiyama T et al (2019) Intelligent decision support for maintenance: an overview and future trends. Int J Comput Integr Manuf 32:936–959. https://doi.org/10.1080/0951192X.2019.1667033
Agostinho C, Dikopoulou Z, Lavasa E et al (2023) Explainability as the key ingredient for AI adoption in Industry 5.0 settings. Front Artif Intell. https://doi.org/10.3389/frai.2023.1264372
Csiszar A, Hein P, Wachter M et al. (2020) Towards a user-centered development process of machine learning applications for manufacturing domain experts. In: 2020 third international conference on artificial intelligence for industries (AI4I). IEEE, pp 36–39
Merhi MI (2023) Harfouche A (2023) Enablers of artificial intelligence adoption and implementation in production systems. Int J Prod Res. https://doi.org/10.1080/00207543.2023.2167014
Demlehner Q, Laumer S (2024) How the terminator might affect the car manufacturing industry: examining the role of pre-announcement bias for AI-based IS adoptions. Inf Manag 61:103881. https://doi.org/10.1016/j.im.2023.103881
Ghobakhloo M, Ching NT (2019) Adoption of digital technologies of smart manufacturing in SMEs. J Ind Inf Integr 16:100107. https://doi.org/10.1016/j.jii.2019.100107
Binsaeed RH, Yousaf Z, Grigorescu A et al (2023) Knowledge sharing key issue for digital technology and artificial intelligence adoption. Systems 11:316. https://doi.org/10.3390/systems11070316
Papadopoulos T, Sivarajah U, Spanaki K et al (2022) Editorial: artificial Intelligence (AI) and data sharing in manufacturing, production and operations management research. Int J Prod Res 60:4361–4364. https://doi.org/10.1080/00207543.2021.2010979
Chirumalla K (2021) Building digitally-enabled process innovation in the process industries: a dynamic capabilities approach. Technovation 105:102256. https://doi.org/10.1016/j.technovation.2021.102256
Fragapane G, Ivanov D, Peron M et al (2022) Increasing flexibility and productivity in Industry 4.0 production networks with autonomous mobile robots and smart intralogistics. Ann Oper Res 308:125–143. https://doi.org/10.1007/s10479-020-03526-7
Shahbazi Z, Byun Y-C (2021) Integration of Blockchain, IoT and machine learning for multistage quality control and enhancing security in smart manufacturing. Sensors (Basel). https://doi.org/10.3390/s21041467
Javaid M, Haleem A, Singh RP et al (2021) Significance of sensors for industry 4.0: roles, capabilities, and applications. Sensors Int 2:100110. https://doi.org/10.1016/j.sintl.2021.100110
Open Access funding enabled and organized by Projekt DEAL.
Author information
Authors and Affiliations
Business Unit Industrial Change and New Business Models, Competence Center Innovation and Knowledge Economy, Fraunhofer Institute for Systems and Innovation Research ISI, Breslauer Straße 48, 76139, Karlsruhe, Germany
Heidi Heimberger, Djerdj Horvat & Frank Schultmann
Karlsruhe Institute for Technology KIT, Institute for Industrial Production (IIP) - Chair of Business Administration, Production and Operations Management, Hertzstraße 16, 76187, Karlsruhe, Germany
Heidi Heimberger
Corresponding author
Correspondence to Heidi Heimberger .
Ethics declarations
Conflict of interest.
The authors report no conflict of interest.
Additional information
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .
About this article
Heimberger, H., Horvat, D. & Schultmann, F. Exploring the factors driving AI adoption in production: a systematic literature review and future research agenda. Inf Technol Manag (2024). https://doi.org/10.1007/s10799-024-00436-z
Accepted: 10 August 2024
Published: 23 August 2024
DOI: https://doi.org/10.1007/s10799-024-00436-z
Step 1. Formulate the Research Question. A systematic review is based on a pre-defined, specific research question (Cochrane Handbook, 1.1). The first step in a systematic review is to determine its focus: you should clearly frame the question(s) the review seeks to answer (Cochrane Handbook, 2.1). It may take you a while to develop a good review question; it is an important step in your review.
Formulating a research question
It is important to consider the reasons that the research question is being asked. Any research question has ideological and theoretical assumptions around the meanings and processes it is focused on. A systematic review should either specify definitions and boundaries around these elements at the outset, or be clear about which elements are ...
LibGuides: Systematic Reviews: 2. Develop a Research Question
Systematic Reviews. 2. Develop a Research Question. A well-developed and answerable question is the foundation for any systematic review. This process involves: Using the PICO framework can help team members clarify and refine the scope of their question. For example, if the population is breast cancer patients, is it all breast cancer patients ...
Step 2: Define Your Research Question
Systematic reviews require a focused research question, often developed using one of the frameworks in the box below. A well-developed research question will inform the entirety of your review process, including: The development of your inclusion and exclusion criteria. The terms used in your search strategies.
Question frameworks (e.g PICO)
Your systematic review or systematic literature review will be defined by your research question. A well formulated question will help: Frame your entire research process. Determine the scope of your review. Provide a focus for your searches. Help you identify key concepts. Guide the selection of your papers.
Systematic reviews: Formulate your question
Defining the question. Defining the research question and developing a protocol are the essential first steps in your systematic review. The success of your systematic review depends on a clear and focused question, so take the time to get it right. A framework may help you to identify the key concepts in your research question and to organise ...
Systematic Review
A systematic review is a type of review that uses repeatable methods to find, select, and synthesize all available evidence. It answers a clearly formulated research question and explicitly states the methods used to arrive at the answer. Example: Systematic review. In 2008, Dr. Robert Boyle and his colleagues published a systematic review in ...
Systematic reviews: Structure, form and content
In recent years, there has been an explosion in the number of systematic reviews conducted and published (Chalmers & Fox 2016, Fontelo & Liu 2018, Page et al 2015) - although a systematic review may be an inappropriate or unnecessary research methodology for answering many research questions. Systematic reviews can be inadvisable for a ...
Ten Steps to Conduct a Systematic Review
The systematic review process is a rigorous and methodical approach to synthesizing and evaluating existing research on a specific topic. The 10 steps we followed, from defining the research question to interpreting the results, ensured a comprehensive and unbiased review of the available literature.
How to Do a Systematic Review: A Best Practice Guide for Conducting and
Systematic reviews are characterized by a methodical and replicable methodology and presentation. They involve a comprehensive search to locate all relevant published and unpublished work on a subject; a systematic integration of search results; and a critique of the extent, nature, and quality of evidence in relation to a particular research question.
Evidence Synthesis Guide : Develop & Refine Your Research Question
Develop & Refine Your Research Question. A clear, well-defined, and answerable research question is essential for any systematic review, meta-analysis, or other form of evidence synthesis. The question must be answerable. Spend time refining your research question. PICO Worksheet.
Research question
Develop your research question. A systematic review is an in-depth attempt to answer a specific, focused question in a methodical way. Start with a clearly defined, researchable question that accurately and succinctly sums up the review's line of inquiry. A well formulated review question will help determine your inclusion and exclusion ...
Developing a Research Question
After developing the research question, it is necessary to confirm that the review has not previously been conducted (or is currently in progress). Make sure to check for both published reviews and registered protocols (to see if the review is in progress). Do a thorough search of appropriate databases; if additional help is needed, consult a ...
Formulate Question
A narrow and specific research question is required in order to conduct a systematic review. The goal of a systematic review is to provide an evidence synthesis of ALL research performed on one particular topic. Your research question should be clearly answerable from the studies included in your review. Another consideration is whether the ...
Steps of a Systematic Review
Once you've finalized a research question, you should be able to locate existing systematic reviews on or similar to your topic. Existing systematic reviews will be your clues to mine for keywords, sample searches in various databases, and will help your team finalize your review question and develop your inclusion and exclusion criteria ...
Defining your review question
Research topic vs review question. A research topic is the area of study you are researching, and the review question is the straightforward, focused question that your systematic review will attempt to answer. Developing a suitable review question from a research topic can take some time. You should: perform some scoping searches; use a framework such as PICO
Identifying Your Research Question
The difference with a systematic review research question is that you must have a clearly defined question and consider what problem you are trying to address by conducting the review. The most important point is that you focus your question and design the question so that it is answerable by the research that you will be systematically examining.
4 Developing review questions and planning the systematic review
4.1 Number of review questions. The exact number of review questions for each clinical guideline depends on the topic and the breadth of the scope (see chapter 2). However, the number of review questions must be manageable for the GDG and the National Collaborating Centre (NCC) or the NICE Internal Clinical Guidelines Programme within the agreed timescale.
Systematic Reviews: Formulating Your Research Question
evidence-based practice process. One way to streamline and improve the research process for nurses and researchers of all backgrounds is to utilize the PICO search strategy. PICO is a format for developing a good clinical research question prior to starting one's research. It is a mnemonic used to describe the four elements
1. Formulating the research question
Systematic review vs. other reviews. Systematic reviews require a narrow and specific research question. The goal of a systematic review is to provide an evidence synthesis of ALL research performed on one particular topic. So, your research question should be clearly answerable from the data you gather from the studies included in your review.
How to do a systematic review
A systematic review aims to bring evidence together to answer a pre-defined research question. This involves the identification of all primary research relevant to the defined review question, the critical appraisal of this research, and the synthesis of the findings. Systematic reviews may combine data from different.
Framing a Research Question
Resources for conducting systematic review research. [Image: PressBooks] The process for developing a research question. There are many ways of framing questions depending on the topic, discipline, or type of questions.
Identifying the research question
A systematic review question vs. a scoping review question: typically a focused research question with narrow parameters, and usually fits into the PICO question format ... (can be copied and pasted into the Embase search box then combined with the concepts of your research question): (exp review/ or (literature adj3 review$).ti,ab. or exp meta ...
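The idea of combining a pasted filter with the concepts of your own question can be sketched in code. The following is a hedged illustration only, not Embase syntax or a validated filter: a small Python helper (`build_query`, with hypothetical PICO-style terms) that ORs the synonyms inside each concept block and ANDs the blocks together, which is the usual structure of such search strings.

```python
# Sketch: assemble a boolean search string from concept blocks.
# All terms below are hypothetical examples, not a validated search filter.
def build_query(concepts):
    """OR the synonyms within each concept block, then AND the blocks together."""
    blocks = []
    for terms in concepts.values():
        # Quote each synonym and OR them inside one parenthesised block.
        blocks.append("(" + " OR ".join(f'"{t}"' for t in terms) + ")")
    return " AND ".join(blocks)

# Hypothetical PICO-style concept blocks:
pico = {
    "population": ["breast cancer", "breast neoplasms"],
    "intervention": ["exercise", "physical activity"],
    "outcome": ["fatigue", "quality of life"],
}

query = build_query(pico)
print(query)
# → ("breast cancer" OR "breast neoplasms") AND ("exercise" OR "physical activity") AND ("fatigue" OR "quality of life")
```

In practice each database has its own operators (e.g. adjacency, field codes such as .ti,ab.), so a string built this way would still need to be adapted to the target interface.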
Chapter 2: Determining the scope of the review and the questions it
2.2 Aims of reviews of interventions. Systematic reviews can address any question that can be answered by a primary research study. This Handbook focuses on a subset of all possible review questions: the impact of intervention(s) implemented within a specified human population. Even within these limits, systematic reviews examining the effects of intervention(s) can vary quite markedly in ...
Patient reported measures of continuity of care and health outcomes: a
Future research should use larger sample sizes to clarify if a link does exist and what the potential mechanisms underlying such a link could be. ... This review seeks to answer two questions. 1) ... White E, Thorne A, Evans PH. Continuity of care with doctors - a matter of life and death? A systematic review of continuity of care and mortality ...
Protocol for Systematic Review and Meta-Analysis of Prehospital Large
Methods: This systematic review and meta-analysis will be conducted in accordance with the PRISMA-DTA Statement and the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. We will include observational studies and randomized controlled trials that assess the utility of LVO scales in suspected stroke patients in prehospital ...
Approaches for the Use of AI in Workplace Health Promotion and
Objective: The systematic scoping review aims to provide a comprehensive overview of the current use of AI in WHPP. The results will then be used to point to future research directions. The following research questions were derived: (1) What are the study characteristics of studies on AI algorithms and technologies in the context of WHPP?
Exploring the factors driving AI adoption in production: a systematic
Following a systematic approach to select relevant studies, our literature review is based on a sample of articles that contribute to production-specific AI adoption. Our results reveal that the topic has been emerging within the last years and that AI adoption research in production is to date still in an early stage.