Synthesising quantitative evidence in systematic reviews of complex health interventions


  1. Julian P T Higgins1,
  2. José A López-López1,
  3. Betsy J Becker2,
  4. Sarah R Davies1,
  5. Sarah Dawson1,
  6. Jeremy M Grimshaw3,4,
  7. Luke A McGuinness1,
  8. Theresa H M Moore1,5,
  9. Eva A Rehfuess6,
  10. James Thomas7,
  11. Deborah M Caldwell1
  1. 1 Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
  2. 2 Department of Educational Psychology and Learning Systems, College of Education, Florida State University, Tallahassee, Florida, USA
  3. 3 Clinical Epidemiology Program, Ottawa Hospital Research Institute, The Ottawa Hospital, Ottawa, Ontario, Canada
  4. 4 Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada
  5. 5 NIHR Collaboration for Leadership in Applied Health Research and Care (CLAHRC) West, University Hospitals Bristol NHS Foundation Trust, Bristol, UK
  6. 6 Institute for Medical Information Processing, Biometry and Epidemiology, Pettenkofer School of Public Health, LMU Munich, Munich, Germany
  7. 7 EPPI-Centre, Department of Social Science, University College London, London, UK
  1. Correspondence to Professor Julian P T Higgins; julian.higgins{at}bristol.ac.uk

Abstract

Public health and health service interventions are typically complex: they are multifaceted, with impacts at multiple levels and on multiple stakeholders. Systematic reviews evaluating the effects of complex health interventions can be challenging to conduct. This paper is part of a special series of papers considering these challenges particularly in the context of WHO guideline development. We outline established and innovative methods for synthesising quantitative evidence within a systematic review of a complex intervention, including considerations of the complexity of the system into which the intervention is introduced. We describe methods in three broad areas: non-quantitative approaches, including tabulation, narrative and graphical approaches; standard meta-analysis methods, including meta-regression to investigate study-level moderators of effect; and advanced synthesis methods, in which models allow exploration of intervention components, investigation of both moderators and mediators, examination of mechanisms, and exploration of complexities of the system. We offer guidance on the choice of approach that might be taken by people collating evidence in support of guideline development, and emphasise that the appropriate methods will depend on the purpose of the synthesis, the similarity of the studies included in the review, the level of detail available from the studies, the nature of the results reported in the studies, the expertise of the synthesis team and the resources available.

  • meta-analysis
  • complex interventions
  • systematic reviews
  • guideline evolution

This is an open access article distributed under the terms of the Creative Commons Attribution-Non commercial IGO License (CC BY-NC 3.0 IGO), which permits use, distribution, and reproduction for non-commercial purposes in any medium, provided the original work is properly cited. In any reproduction of this article there should not be any suggestion that WHO or this article endorses any specific organisation or products. The use of the WHO logo is not permitted. This notice should be preserved along with the article's original URL.


Summary box

  • Quantitative syntheses of studies on the effects of complex health interventions face high diversity across studies and limitations in the data available.

  • Statistical and non-statistical approaches are available for tackling intervention complexity in a synthesis of quantitative data in the context of a systematic review.

  • Appropriate methods will depend on the purpose of the synthesis, the number and similarity of studies included in the review, the level of detail available from the studies, the nature of the results reported in the studies, the expertise of the synthesis team and the resources available.

  • We offer considerations for selecting methods for synthesis of quantitative data to address important types of questions about the effects of complex interventions.

Background

Public health and health service interventions are typically complex. They are usually multifaceted, with impacts at multiple levels and on multiple stakeholders. Likewise, the systems within which they are implemented may change and adapt to enhance or dampen their impact.1 Quantitative syntheses ('meta-analyses') of studies of complex interventions seek to integrate quantitative findings across multiple studies to achieve a coherent message greater than the sum of their parts. Interest is growing in how the standard systematic review and meta-analysis toolkit can be enhanced to address complexity of interventions and their impact.2 A recent report from the Agency for Healthcare Research and Quality and a series of papers in the Journal of Clinical Epidemiology provide useful background on some of the challenges.3–6

This paper is part of a series to explore the implications of complexity for systematic reviews and guideline development, commissioned by WHO.7 Clearly, and as covered by other papers in this series, guideline development encompasses the consideration of many different aspects,8 such as intervention effectiveness, economic considerations, acceptability9 or certainty of evidence,10 and requires the integration of different types of quantitative as well as qualitative evidence.11 12 This paper is specifically concerned with methods available for the synthesis of quantitative results in the context of a systematic review on the effects of a complex intervention. We aim to point those collating evidence in support of guideline development to methodological approaches that will help them integrate the quantitative evidence they identify. A summary of how these methods link to many of the types of complexity encountered is provided in table 1, based on the examples provided in a table from an earlier paper in the series.1 An annotated list of the methods we cover is provided in table 2.

Tabular array i

Quantitative synthesis possibilities to address aspects of complexity

Table 2

Quantitative graphical and synthesis approaches mentioned in the paper, with their main strengths and weaknesses in the context of complex interventions

We begin by reiterating the importance of starting with meaningful research questions and an awareness of the purpose of the synthesis and any relevant background knowledge. An important issue in systematic reviews of complex interventions is that data available for synthesis are often extremely limited, due to small numbers of relevant studies and limitations in how these studies are conducted and their results are reported. Furthermore, it is uncommon for two studies to evaluate exactly the same intervention, in part because of the interventions' inherent complexity. Thus, each study may be designed to provide information on a unique context or a novel intervention approach. Outcomes may be measured in different ways and at different time points. We therefore discuss possible approaches when data are highly limited or highly heterogeneous, including the use of graphical approaches to present very basic summary results. We then discuss statistical approaches for combining results and for understanding the implications of various kinds of complexity.

In several places we draw on an example of a review undertaken to inform a recent WHO guideline on protecting, promoting and supporting breast feeding.13 The review seeks to determine the effects of interventions to promote breast feeding delivered in five types of settings (health services, home, community, workplace, policy context or a combination of settings).8 The included interventions were predominantly multicomponent, and were implemented in complex systems across multiple contexts. The review included 195 studies, including many from low-income and middle-income countries, and concluded that interventions should be delivered in a combination of settings to achieve high breastfeeding rates.

The importance of the research question

The starting point in any synthesis of quantitative evidence is a clear purpose. The input of stakeholders is critical to ensure that questions are framed appropriately, addressing issues important to those commissioning, delivering and affected by the intervention. Detailed discussion of the development of research questions is provided in an earlier paper in the series,1 and a subsequent paper explains the importance of taking context into account.9 The first of these papers describes two possible perspectives. A complex interventions perspective emphasises the complexities involved in conceptualising, specifying and implementing the intervention per se, including the array of possibly interacting components and the behaviours required to implement it. A complex systems perspective emphasises the complexity of the systems into which the intervention is introduced, including possible interactions between the intervention and the system, interactions between individuals within the system and how the whole system responds to the intervention.

The simplest purpose of a systematic review is to determine whether a particular type of complex intervention (or class of interventions) is effective compared with a 'usual practice' alternative. The familiar PICO framework is helpful for framing the review:14 in the PICO framework, a broad research question about effectiveness is uniquely specified by describing the participants ('P', including the setting and prevailing conditions) to which the intervention is to be applied; the intervention ('I') and comparator ('C') of interest, and the outcomes ('O', including their time course) that might be impacted by the intervention. In the breastfeeding review, the main synthesis approach was to combine all available studies, irrespective of setting, and perform separate meta-analyses for different outcomes.15

More useful than a review that asks 'does a complex intervention work?' is one that determines the situations in which a complex intervention has a larger or smaller effect. Indeed, research questions targeted by syntheses in the presence of complexity often dissect one or more of the PICO elements to explore how intervention effects vary both within and across studies (ie, treating the PICO elements as 'moderators'). For example, analyses may explore variation across participants, settings and prevailing conditions (including context); or across interventions (including different intervention components that may be present or absent in different studies); or across outcomes (including different outcome measures, at different levels of the system and at different time points) on which effects of the intervention occur. In addition, there may be interest in how aspects of the underlying system or the intervention itself mediate the effects, or in the role of intermediate outcomes on the pathway from intervention to impact.16 In the breastfeeding review, interest moved from the overall effects across interventions to investigations of how effects varied by such factors as intervention delivery setting, high-income versus low-income country, and urban versus rural setting.15

The role of logic models to inform a synthesis

An earlier paper describes the benefits of using system-based logic models to characterise a priori theories about how the system operates.1 These provide a useful starting point for most syntheses since they encourage consideration of all aspects of complexity in relation to the intervention or the system (or both). They can help identify important mediators and moderators, and inform decisions about what aspects of the intervention and system need to be addressed in the synthesis. As an example, a protocol for a review of the health effects of environmental interventions to reduce the consumption of sugar-sweetened beverages included a system-based logic model, detailing how the characteristics of the beverages, and the physiological and psychological characteristics of individuals, are thought to impact on outcomes such as weight gain and cardiovascular disease.17 The logic model informs the selection of outcomes and the general plans for synthesis of the findings of included studies. However, system-based models do not ordinarily include details of how implementation of an intervention into the system is likely to affect subsequent outcomes. They therefore have a limited role in informing syntheses that seek to explain mechanisms of action.

A quantitative synthesis may draw on a specific proposed framework for how an intervention might work; these are sometimes referred to as process-orientated logic models, and may be strongly driven by qualitative research evidence.12 They represent causal processes, describing what components or aspects of an intervention are thought to impact on what behaviours and actions, and what the further consequences of these impacts are likely to be.18 They may encompass mediators of effect and moderators of effect. A synthesis may simply adopt the proposed causal model at face value and attempt to quantify the relationships described therein. Where more than one possible causal model is available, a synthesis may explore which of the models is better supported by the data, for example, by examining the evidence for specific links within the model or by identifying a statistical model that corresponds to the overall causal model.18 19

A systematic review on community-level interventions for improving access to food in low-income and middle-income countries was based on a logic model that depicts how interventions might lead to improved health status.20 The model includes direct effects, such as increased financial resources of individuals and decreased food prices; intermediate effects, such as increased quantity of food available and increase in intake; and main outcomes of interest, such as nutritional status and health indicators. The planned statistical synthesis, however, was to tackle these one at a time.

Considering the types of studies available

Studies of the effects of complex interventions may be randomised or non-randomised, and often involve clustering of participants within social or organisational units. Randomised trials, if sufficiently large, provide the most convincing evidence about the effects of interventions because randomisation should result in intervention and comparator groups with similar distributions of both observed and unobserved baseline characteristics. However, randomised trials of complex interventions may be difficult or impossible to undertake, or may be performed only in specific contexts, yielding results that are not generalisable. Non-randomised study designs include so-called 'quasi-experiments' and may be longitudinal studies, including interrupted time series and before-after studies, with or without a control group. Non-randomised studies are at greater risk of bias, sometimes substantially so, although they may be undertaken in contexts that are more relevant to decision making. Analyses of non-randomised studies often employ statistical controls for confounders to account for differences between intervention groups, and challenges are introduced when different sets of confounders are used in different studies.21 22

Randomised trials and non-randomised studies might both be included in a review, and analysts may have to decide whether to combine these in one synthesis, and whether to combine results from different types of non-randomised studies in a single analysis. Studies may differ in two ways: by answering different questions, or by answering similar questions with different risks of bias. The research questions must be sufficiently similar and the studies sufficiently free of bias for a synthesis to be meaningful. In the breastfeeding review, randomised, quasi-experimental and observational studies were combined; no evidence suggested that the effects differed across designs.15 In practice, many methodologists generally recommend against combining randomised with non-randomised studies.23

Preparing for a quantitative synthesis

Before undertaking a quantitative synthesis of complex interventions, it can be helpful to begin the synthesis non-quantitatively, looking at patterns and characteristics of the data identified. Systematic tabulation of information is recommended, and this might be informed by a prespecified logic model. The most established framework for non-quantitative synthesis is that proposed by Popay et al.24 The Cochrane Consumers and Communication group succinctly summarise the process as an 'investigation of the similarities and the differences between the findings of different studies, as well as exploration of patterns in the data'.25 Another useful framework was described by Petticrew and Roberts.26 They identify three stages in the initial narrative synthesis: (1) Organisation of studies into logical categories, the structure of which will depend on the purpose of the synthesis, possibly relating to study design, outcome or intervention types. (2) Within-study analysis, involving the description of findings within each study. (3) Cross-study synthesis, in which variations in study characteristics and potential biases are integrated and the range of effects described. Aspects of this process are likely to be implemented in any systematic review, even when a detailed quantitative synthesis is undertaken.

In some circumstances the available data are too diverse, too non-quantitative or too sparse for a quantitative synthesis to be meaningful even if it is possible. The best that can be accomplished in many reviews of complex interventions is a non-quantitative synthesis following the guidance given in the above frameworks.

Options when effect size estimates cannot be obtained or studies are too diverse to combine

Graphical approaches

Graphical displays can be very valuable to illustrate patterns in results of studies.27 We illustrate some options in figure 1. Forest plots are the standard illustration of the results of multiple studies (see figure 1, panel A), but require a similar effect size estimate from each study. For studies of complex interventions, the variety of approaches to the intervention, the context,1 evaluation approaches and reporting differences can lead to considerable variation across studies in what results are available. Some novel graphical approaches have been proposed for such situations. A recent development is the albatross plot, which plots p values against sample sizes, with approximate effect-size contours superimposed (see figure 1, panel B).28 The contours are computed from the p values and sample sizes, based on an assumption about the type of analysis that would have given rise to the p values. Although these plots are designed for situations when effect size estimates are not available, the contours can be used to infer approximate effect sizes from studies that are analysed and reported in highly diverse ways. Such an advantage may prove to be a disadvantage, however, if the contours are overinterpreted.

Figure 1


Example graphical displays of data from a review of interventions to promote breast feeding, for the outcome of continued breast feeding up to 23 months.15 Panel A: Forest plot for relative risk (RR) estimates from each study. Panel B: Albatross plot of p value against sample size (effect contours drawn for risk ratios assuming a baseline risk of 0.15; sample sizes and baseline risks extracted from the original papers by the current authors). Panel C: Harvest plot (heights reflect design: randomised trials (tall), quasi-experimental studies (medium), observational studies (short); bar shading reflects follow-up: longest follow-up (black) to shortest follow-up (light grey) or no information (white)). Panel D: Bubble plot (bubble sizes and colours reflect design: randomised trials (large, green), quasi-experimental studies (medium, red), observational studies (small, blue); precision defined as inverse of the SE of each effect estimate (derived from the CIs); categories are: "Potential Harm": RR <0.8; "No Effect": RRs between 0.8 and 1.25; "Potential Benefit": RR >1.25 and CI includes RR=1; "Benefit": RR >1.25 and CI excludes RR=1).

Harvest plots have been proposed by Ogilvie et al as a graphical extension of a vote counting approach to synthesis (see figure 1, panel C).29 However, approaches based on vote counting of statistically significant results have been criticised on the basis of their poor statistical properties, and because statistical significance is an outdated and unhelpful notion.30 The harvest plot is a matrix of small illustrations, with different outcome domains defining rows and different qualitative conclusions (negative effect, no effect, positive effect) defining columns. Each study is represented by a bar that is positioned according to its measured outcome and qualitative conclusion. Bar heights and shadings can describe features of the study, such as objectivity of the outcome measure, suitability of the study design and study quality.29 31 A similar idea to the harvest plot is the effect direction plot proposed by Thomson and Thomas.32

A device to plot the findings from a large and complex collection of evidence is a bubble plot (see figure 1, panel D). A bubble plot illustrates the direction of each finding (or whether the finding was unclear) on a horizontal scale, using a vertical scale to indicate the volume of evidence, and with bubble sizes to signal some measure of credibility of each finding. Such an approach can also describe findings of collections of studies rather than individual studies, and was used successfully, for example, to summarise findings from a review of systematic reviews of the effects of acupuncture on various indications for pain.33

Statistical methods not based on effect size estimates

We have mentioned that a frequent problem is that standard meta-analysis methods cannot be used because data are not available in a similar format from every study. In general, the core principles of meta-analysis can be applied even in this situation, as is highlighted in the Cochrane Handbook, by addressing the questions: 'What is the direction of effect?'; 'What is the size of effect?'; 'Is the effect consistent across studies?'; and 'What is the strength of evidence for the effect?'.34

Alternatives to the estimation of effect sizes could be used more often than they are in practice, allowing some basic statistical inferences despite diversely reported results. The most basic analysis is to test the overall null hypothesis of no effect in any of the studies. Such a test can be undertaken using only minimally reported information from each study. At its simplest, a binomial test can be performed using only the direction of effect observed in each study, irrespective of its CI or statistical significance.35 Where exact p values are available as well as the direction of effect, a more powerful test can be performed by combining these using, for instance, Fisher's combination of p values.36 It is important that these p values are computed appropriately, however, accounting for clustering or matching of participants within the studies. Rejecting the null model based on such tests provides no information about the magnitude of the effect, providing information only on whether at least one study shows an effect is present, and if so, its direction.37
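These two tests can be sketched in a few lines. The directions of effect and p values below are illustrative, not drawn from any review, and the sketch assumes independent studies whose p values were computed appropriately:

```python
# Sketch of two tests usable when only directions of effect and
# (optionally) exact p values are reported. All study data are invented.
from scipy import stats

# Direction of effect in 10 hypothetical studies (+1 favours intervention)
directions = [+1, +1, +1, -1, +1, +1, -1, +1, +1, +1]

# Binomial (sign) test of the null that either direction is equally likely
n_pos = sum(d > 0 for d in directions)
sign_p = stats.binomtest(n_pos, n=len(directions), p=0.5).pvalue

# Fisher's combination of p values: convert reported two-sided p values
# to one-sided using the directions, then combine under the overall null
p_two_sided = [0.04, 0.20, 0.01, 0.70, 0.03, 0.15, 0.60, 0.08, 0.02, 0.30]
p_one_sided = [p / 2 if d > 0 else 1 - p / 2
               for p, d in zip(p_two_sided, directions)]
chi2_stat, fisher_p = stats.combine_pvalues(p_one_sided, method='fisher')
```

With these invented inputs the sign test alone is inconclusive, while Fisher's method, which also uses the strength of evidence in each study, rejects the overall null.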

Standard synthesis methods

Meta-analysis for overall effect

Probably the most familiar approach to meta-analysis is that of estimating a single summary effect across similar studies. This simple approach lends itself to the use of forest plots to display the results of individual studies as well as syntheses, as illustrated for the breastfeeding studies in figure 1 (panel A). This analysis addresses the broad question of whether evidence from a collection of studies supports an impact of the complex intervention of interest, and requires that every study makes a comparison of a relevant intervention against a similar alternative. In the context of complex interventions, this is described by Caldwell and Welton as the 'lumping' approach,38 and by Guise et al as the 'holistic' approach.5 6 One key limitation of the simple approach is that it requires similar types of data from each study. A second limitation is that the meta-analysis result may have limited relevance when the studies are diverse in their characteristics. Fixed-effect models, for example, are unlikely to be appropriate for complex interventions because they ignore between-studies variability in underlying effect sizes. Results based on random-effects models will need to be interpreted by acknowledging the spread of effects across studies, for example, using prediction intervals.39
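As a minimal sketch of a random-effects summary with a prediction interval, using the DerSimonian-Laird estimator and illustrative log risk ratios (not taken from the breastfeeding review):

```python
# Sketch of a DerSimonian-Laird random-effects meta-analysis with an
# approximate 95% prediction interval. Effect estimates (log risk ratios)
# and standard errors are illustrative.
import math

log_rr = [0.05, 0.80, -0.10, 0.60, 0.25, 0.45]
se = [0.12, 0.15, 0.10, 0.20, 0.08, 0.18]

w = [1 / s**2 for s in se]                                # fixed-effect weights
fixed = sum(wi * y for wi, y in zip(w, log_rr)) / sum(w)
q = sum(wi * (y - fixed)**2 for wi, y in zip(w, log_rr))  # Cochran's Q
df = len(log_rr) - 1
c = sum(w) - sum(wi**2 for wi in w) / sum(w)
tau2 = max(0.0, (q - df) / c)                             # between-study variance

w_star = [1 / (s**2 + tau2) for s in se]                  # random-effects weights
mu = sum(wi * y for wi, y in zip(w_star, log_rr)) / sum(w_star)
se_mu = math.sqrt(1 / sum(w_star))

# Approximate 95% prediction interval for the effect in a new study
# (normal approximation; a t distribution is often recommended instead)
pi_half_width = 1.96 * math.sqrt(tau2 + se_mu**2)
pred_low, pred_high = mu - pi_half_width, mu + pi_half_width
```

The prediction interval is wider than the CI for the mean because it incorporates the between-study variance, conveying the spread of effects to be expected in new settings.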

A common problem when undertaking a simple meta-analysis is that individual studies may report many effect sizes that are correlated with each other, for example, if multiple outcomes are measured, or the same outcome variable is measured at several time points. Numerous approaches are available for dealing with such multiplicity, including multivariate meta-analysis, multilevel modelling, and strategies for selecting effect sizes.40 A very simple strategy that has been used in systematic reviews of complex interventions is to take the median effect size within each study, and to summarise these using the median of these effect sizes across studies.41
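The median-based strategy is trivial to implement; a sketch with made-up effect sizes:

```python
# Sketch of the 'median of medians' strategy for multiplicity: take the
# median effect size within each study, then the median across studies.
# The nested effect sizes are illustrative.
from statistics import median

study_effect_sizes = [            # several correlated estimates per study
    [0.30, 0.45, 0.10],           # study 1: three outcomes
    [0.20],                       # study 2: one outcome
    [0.55, 0.35],                 # study 3: two time points
    [0.05, 0.15, 0.25, 0.40],     # study 4: four outcomes
]

within_study = [median(es) for es in study_effect_sizes]
overall = median(within_study)    # summary: median of the study medians
```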

Exploring heterogeneity

Diversity in the types of participants (and contexts), interventions and outcomes is key to understanding sources of complexity.9 Many of these important sources of heterogeneity are most usefully examined, to the extent that they can reliably be understood, using standard approaches for understanding variability across studies, such as subgroup analyses and meta-regression.

A simple strategy to explore heterogeneity is to estimate the overall effect separately for different levels of a factor using subgroup analyses (referring to subgrouping studies rather than participants).42 As an example, McFadden et al conducted a systematic review and meta-analysis of 73 studies of support for healthy breastfeeding mothers with healthy term babies.43 They calculated separate average effects for interventions delivered by a health professional, a lay supporter or with mixed support, and found that the effect on cessation of exclusive breast feeding at up to 6 months was greater for lay support compared with professionals or mixed support (p=0.02). Guise et al provide several means of grouping studies according to their interventions, for example, grouping studies by key components, by function or by theory.5 6
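A subgroup analysis amounts to pooling separately within each level of a study-level factor. The sketch below uses fixed-effect pooling for brevity; the setting labels echo the breastfeeding review's categories, but the effect estimates are invented:

```python
# Sketch of a subgroup analysis: pool effect sizes separately within each
# level of a study-level factor (here, delivery setting). Data illustrative.
import math

studies = [  # (setting, log risk ratio, standard error)
    ('health services', 0.20, 0.10),
    ('health services', 0.35, 0.15),
    ('community',       0.50, 0.12),
    ('community',       0.60, 0.20),
    ('community',       0.40, 0.18),
]

pooled = {}
for setting in {s for s, _, _ in studies}:
    subset = [(y, se) for s, y, se in studies if s == setting]
    w = [1 / se**2 for _, se in subset]                   # inverse-variance weights
    est = sum(wi * y for wi, (y, _) in zip(w, subset)) / sum(w)
    pooled[setting] = (est, math.sqrt(1 / sum(w)))        # estimate and its SE
```

Comparing the subgroup estimates (formally, with a test for subgroup differences) then addresses whether the factor moderates the effect.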

Meta-regression provides a flexible generalisation of subgroup analyses, whereby study-level covariates are included in a regression model using effect size estimates as the dependent variable.44 45 Both continuous and categorical covariates can be included in such models; with a single categorical covariate, the approach is essentially equivalent to subgroup analyses. Meta-regression with continuous covariates in theory allows the extrapolation of relationships to contexts that were not examined in any of the studies, but this should generally be avoided. For instance, if the effect of an interventional approach appears to increase as the size of the group to which it is applied decreases, this does not mean that it will work even better when applied to a single individual. More generally, the mathematical form of the relationship modelled in a meta-regression requires careful selection. Most often a linear relationship is assumed, but a linear relationship does not permit step changes such as might occur if an interventional approach requires a particular level of some feature of the underlying system before it has an effect.
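Mechanically, a meta-regression is a weighted regression of effect estimates on study-level covariates. A minimal sketch, in which the covariate (hypothetical intervention contact hours) and all data are invented, and the between-study variance is fixed rather than estimated:

```python
# Sketch of a simple random-effects meta-regression: effect sizes are
# regressed on a study-level covariate with weights 1/(se^2 + tau2).
import numpy as np

log_rr = np.array([0.10, 0.25, 0.40, 0.55, 0.30, 0.60])
se = np.array([0.15, 0.12, 0.20, 0.18, 0.10, 0.22])
intensity = np.array([1, 2, 3, 4, 2, 5])   # hypothetical covariate:
                                           # intervention contact hours
tau2 = 0.02                                # assumed between-study variance

w = 1.0 / (se**2 + tau2)
X = np.column_stack([np.ones_like(intensity, dtype=float), intensity])

# Weighted least squares: beta = (X'WX)^-1 X'Wy
W = np.diag(w)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ log_rr)
intercept, slope = beta
```

In practice tau2 would be estimated jointly (eg, by restricted maximum likelihood), and the slope's CI would be computed with an adjustment such as Knapp-Hartung; dedicated packages handle both.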

Several texts provide guidance for using subgroup analysis and meta-regression in a general context45 46 and for complex interventions.3 4 47 In principle, many aspects of complexity in interventions can be addressed using these strategies, to create an understanding of the 'response surface'.48–50 However, in practice, the number of studies is often too small for reliable conclusions to be drawn. In general, subgroup analysis and meta-regression are fraught with dangers associated with having few studies, many sources of variation across study features and confounding of these features with each other as well as with other, often unobserved, variables. It is therefore important to prespecify a modest number of plausible sources of diversity so as to reduce the danger of reaching spurious conclusions based on study characteristics that correlate with the effects of the interventions but are not the cause of the variation. The ability of statistical analyses to identify true sources of heterogeneity will depend on the number of studies, the sizes of the studies and the true differences between effects in studies with different characteristics.

Synthesis methods for understanding components of the intervention

When interventions comprise distinct components, it is attractive to separate out the individual effects of these components.51 Meta-regression can be used for this, using covariates to code the presence of particular features in each intervention implementation. As an example, Blakemore et al analysed 39 intervention comparisons from 33 independent studies aiming to reduce urgent healthcare use in adults with asthma.52 Effect size estimates were coded according to components used in the interventions, and the authors found that multicomponent interventions including skills training, education and relapse prevention appeared particularly effective. In another example, of interventions to support family caregivers of people with Alzheimer's disease,53 the authors used methods for decomposing complex interventions proposed by Czaja et al,54 and created covariates that reduced the complexity of the interventions to a small number of features about the intensity of the interventions. More sophisticated models for examining components have been described by Welton et al,55 Ivers et al56 and Madan et al.57

A component-level approach may be useful when there is a need to disentangle the 'active ingredients' of an intervention, for example, when adapting an existing intervention for a new setting. However, components-based approaches require assumptions, such as whether individual components are additive or interact with each other. Furthermore, the effects of components can be difficult to estimate if they are used only in particular contexts or populations, or are strongly correlated with use of other components. An alternative approach is to treat each combination of components as a separate intervention. These separate interventions might then be compared in a single analysis using network meta-analysis. A network meta-analysis combines results from studies comparing two or more of a larger set of interventions, using indirect comparisons via common comparators to rank-order all interventions.47 58 59 As an example, Achana et al examined the effectiveness of safety interventions on the uptake of three poisoning prevention practices in households with children. Each unique combination of intervention components was defined as a separate intervention in the network.60 Network meta-analysis may also be useful when there is a need to compare multiple interventions to answer an 'in principle' question of which intervention is most effective. Consideration of the principal goals of the synthesis will help those aiming to prepare guidelines to determine which of these approaches is most appropriate to their needs.

A case study exploring components is provided in box 1, and an illustration is provided in figure 2. The component-based analysis approach can be likened to a factorial trial, in that it attempts to separate out the effects of individual components of the complex interventions, and the network meta-analysis approach can be likened to a multiarm trial approach, where each complex intervention in the set of studies is a different arm in the trial.47 Deciding between the two approaches can leave the analyst caught between the need to 'split' components to reflect complexity (and minimise heterogeneity) and to 'lump' to make an analysis feasible. Both approaches can be used to examine other features of interventions, including interventions designed for delivery at different levels. For example, a review of the effects of interventions for children exposed to domestic violence and abuse included studies of interventions targeted at children alone, parents alone, children and parents together, and parents and children separately.61 A network meta-analysis approach was taken to the synthesis, with the people targeted by the intervention used as a distinguishing feature of the interventions included in the network.

Box 1

Example of understanding components of psychosocial interventions for coronary heart disease

Welton et al reanalysed data from a Cochrane review89 of randomised controlled trials assessing the effects of psychological interventions on mortality and morbidity reduction for people with coronary heart disease.55 The Cochrane review focused on the effectiveness of any psychological intervention compared with usual care, and found evidence that psychological interventions reduced non-fatal reinfarctions and depression and anxiety symptoms. The Cochrane review authors highlighted the large heterogeneity among interventions as an important limitation of their review.

Welton et al were interested in the effects of the different intervention components. They classified interventions according to which of five key components were included: educational, behavioural, cognitive, relaxation and psychosocial support (figure 2). Their reanalysis examined the effect of each component in three different ways: (1) An additive model assuming no interactions between components. (2) A two-factor interaction model, allowing for interactions between pairs of components. (3) A network meta-analysis, defining each combination of components as a separate intervention, therefore allowing for full interaction between components. Results suggested that interventions with behavioural components were effective in reducing the odds of all-cause mortality and non-fatal myocardial infarction, and that interventions with behavioural and/or cognitive components were effective for reducing depressive symptoms.

Figure 2

Intervention components in the studies included by Welton et al (a sample of 18 from 56 active treatment arms). EDU, educational component; BEH, behavioural component; COG, cognitive component; REL, relaxation component; SUP, psychosocial support component.

A common limitation when implementing these quantitative methods in the context of complex interventions is that replication of the same intervention in two or more studies is rare. Qualitative comparative analysis (QCA) might overcome this problem, being designed to address the 'small N; many variables' problem.62 QCA involves: (1) Identifying theoretically driven thresholds for determining intervention success or failure. (2) Creating a 'truth table', which takes the form of a matrix, cross-tabulating all possible combinations of conditions (eg, participant and intervention characteristics) against each study and its associated outcomes. (3) Using Boolean algebra to eliminate redundant conditions and to identify configurations of conditions that are necessary and/or sufficient to trigger intervention success or failure. QCA can usefully complement quantitative synthesis, sometimes in the context of synthesising diverse types of evidence.
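Steps (2) and (3) can be sketched minimally as follows, using hypothetical binary conditions and outcomes; dedicated QCA software would additionally compute consistency and coverage measures and perform full Boolean minimisation:

```python
# Hypothetical QCA-style data: conditions are binary features of each
# study's intervention/context; the outcome is coded success (1) / failure (0).
studies = [
    # (skills, education, high_intensity) -> outcome
    ((1, 1, 1), 1),
    ((1, 1, 0), 1),
    ((1, 0, 1), 0),
    ((0, 1, 1), 0),
    ((1, 1, 1), 1),
]

# Step 2: build the truth table, cross-tabulating observed configurations
# of conditions against the outcomes seen in studies showing them.
table = {}
for config, outcome in studies:
    table.setdefault(config, set()).add(outcome)

# Step 3: a configuration is (observationally) sufficient for success if
# every study exhibiting it succeeded.
sufficient = [config for config, outcomes in table.items() if outcomes == {1}]
print("configurations sufficient for success:", sufficient)
```

In this toy example, Boolean minimisation of the two sufficient configurations would reduce them to the single expression "skills AND education", regardless of intensity.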

Synthesis methods for understanding mechanisms of action

An alternative purpose of a synthesis is to gain insight into the mechanisms of action behind an intervention, to inform its generalisability or applicability to a particular context. Such syntheses of quantitative data may complement syntheses of qualitative data,11 and the two forms might be integrated.12 Logic models, or theories of action, are important to motivate investigations of mechanism. The synthesis is likely to focus on intermediate outcomes reflecting intervention processes, and on mediators of effect (factors that influence how the intervention affects an outcome measure). Two possibilities for analysis are to use these intermediate measurements as predictors of main outcomes using meta-regression methods,63 or to use multivariate meta-analysis to model the intermediate and main outcomes simultaneously, exploiting and estimating the correlations between them.64 65 If the synthesis suggests that hypothesised chains of effects hold, this lends weight to the theoretical model underlying the hypothesis.

An approach to synthesis closely identified with this category of interventions is model-driven meta-analysis, in which different sources of evidence are integrated within a causal path model akin to a directed acyclic graph. A model-driven meta-analysis is an explanatory analysis.66 It attempts to go further than a standard meta-analysis or meta-regression to explore how and why an intervention works, for whom it works, and which aspects of the intervention (factors) are driving the overall effect. Such syntheses have been described in frequentist19 67–70 and Bayesian71 72 frameworks and are variously known as model-driven meta-analysis, linked meta-analysis, meta-mediation analysis and meta-analysis of structural equation models. In their simplest form, standard meta-analyses estimate a summary correlation independently for each pair of variables in the model. The approach is inherently multivariate, requiring the estimation of multiple correlations (which, if obtained from a single study, are also not independent).73–75 Each study is likely to contribute fragments of the correlation matrix. A summary correlation matrix, combined either by fixed-effect or random-effects methods, then serves as the input for subsequent analysis via a standardised regression or structural equation model.
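The first stage, pooling correlations for each variable pair across studies, can be sketched as below. The correlations, sample sizes and the simple two-path chain are invented for illustration; a full model-driven meta-analysis would fit a structural equation model to the whole pooled correlation matrix rather than multiplying two paths:

```python
import math

def pool_correlations(rs, ns):
    """Fixed-effect pooling of correlations via Fisher's z transform,
    weighting each study by n - 3 (the inverse variance of z)."""
    zs = [0.5 * math.log((1 + r) / (1 - r)) for r in rs]
    ws = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    return math.tanh(z_bar)  # back-transform to the correlation scale

# Each study contributes a fragment of the correlation matrix (invented data):
r_adh_ctrl = pool_correlations([0.40, 0.35, 0.45], [120, 80, 200])   # adherence-control
r_ctrl_cmpl = pool_correlations([-0.30, -0.25], [150, 90])           # control-complications

# Under a simple chain adherence -> control -> complications, the
# standardised indirect effect is the product of the two path coefficients.
indirect = r_adh_ctrl * r_ctrl_cmpl
print(f"pooled paths: {r_adh_ctrl:.2f}, {r_ctrl_cmpl:.2f}; indirect effect: {indirect:.3f}")
```

A random-effects version would add a between-study variance to each z before pooling; the multivariate methods cited above additionally account for the dependence among correlations from the same study.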

An example is provided in box 2. The model in figure 3 postulates that the effect of 'Dietary adherence' on 'Diabetes complications' is not direct but is mediated by 'Metabolic control'.76 The potential for model-driven meta-analysis to incorporate such indirect effects also allows for mediating effects to be explicitly tested, and in so doing allows the meta-analyst to identify and explore the mechanisms underpinning a complex intervention.77

Box 2

Example of a model-driven meta-analysis for type 2 diabetes

Brown et al present a model-driven meta-analysis of correlational research on psychological and motivational predictors of diabetes outcomes, with medication and dietary adherence factors as mediators.76 In a linked methodological paper, they present the a priori theoretical model on which their analysis is based.68 The model is simplified in figure 3, and summarised for the dietary adherence pathway only. The aim of their full analysis was to determine the predictive relationships of psychological factors and motivational factors on metabolic control and body mass index (BMI), and the role of behavioural factors as possible mediators of the associations among the psychological and motivational factors and metabolic control and BMI outcomes.

The analysis is based on a comprehensive systematic review. Due to the number of variables in their full model, 775 individual correlational or predictive studies reported across 739 research papers met the eligibility criteria. Correlations between each pair of variables in the model were summarised using an overall average correlation, and homogeneity was assessed. Multivariate analyses were used to estimate a combined correlation matrix. These results were used, in turn, to estimate path coefficients for the predictive model and their standard errors. For the simplified model illustrated here, the results suggested that coping and self-efficacy were strongly related to dietary adherence, which was strongly related to improved glycaemic control and, in turn, a reduction in diabetic complications.

Synthesis approaches for understanding complexities of the system

Syntheses may seek to address complexities of the system to understand either the impact of the system on the effects of the intervention or the effects of the intervention on the system. This may start by modelling the salient features of the system's dynamics, rather than focusing on interventions. Subgroup analysis and meta-regression are useful approaches for investigating the extent to which an intervention's effects depend on baseline features of the system, including aspects of the context. Sophisticated meta-regression models might investigate multiple baseline features, using similar approaches to the component-based meta-analyses described earlier. Specifically, aspects of context or population characteristics can be regarded as 'components' of the system into which the intervention is introduced, and similar statistical modelling strategies used to isolate effects of individual factors, or interactions between them.

When interventions act at multiple levels, it may be important to understand the effects at these different levels. Outcomes may be measured at different levels (eg, at patient, clinician and clinical practice levels) and analysed separately. Qualitative research plays a particularly important role in identifying the outcomes that should be assessed through quantitative synthesis.12 Care is needed to ensure that unit-of-analysis issues are addressed. For instance, if clinics are the unit of randomisation, then outcomes measured at the clinic level can be analysed using standard methods, whereas outcomes measured at the level of the patient within the clinic would need to account for clustering. In fact, multiple dependencies may arise in such data, when patients receive care in small groups. Detailed investigations of effects at different levels, including interactions between the levels, would lend themselves to multilevel (hierarchical) models for synthesis. Unfortunately, individual participant data at all levels of the hierarchy are needed for such analyses.
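For the clustering issue described above, a common approximation when individual participant data are unavailable is to inflate the variance (equivalently, deflate the sample size) by a design effect based on an assumed intracluster correlation coefficient. The numbers here are illustrative:

```python
# Design-effect adjustment for patient-level outcomes from a cluster
# randomised trial (illustrative values, not from any cited study).
n_patients = 400       # patients analysed
cluster_size = 20      # average patients per clinic
icc = 0.05             # assumed intracluster correlation coefficient

# Design effect: how much the clustering inflates the variance of the
# treatment effect relative to individual randomisation.
design_effect = 1 + (cluster_size - 1) * icc
effective_n = n_patients / design_effect
print(f"design effect {design_effect:.2f}, effective sample size {effective_n:.0f}")
```

The effective sample size (here roughly half the nominal one) can then be used when computing the study's weight in a standard meta-analysis; the result is sensitive to the assumed intracluster correlation, which should be varied in sensitivity analyses.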

Model-based approaches also offer possibilities for addressing complex systems; these include economic models, mathematical models and systems science methods more generally.78–80 Broadly speaking, these provide mathematical representations of logic models, and analyses may involve incorporation of empirical data (eg, from systematic reviews), computer simulation, direct computation or a mixture of these. Multiparameter evidence synthesis methods might be used.81 82 Approaches include models that represent systems (eg, systems dynamics models) and approaches that simulate individuals within the system (eg, agent-based models).79 Models can be particularly useful when empirical evidence does not address all important considerations, such as 'real-world' contexts, long-term effects, non-linear effects and complexities such as feedback loops and threshold effects. An example of a model-based approach to synthesis is provided in box 3. The challenge when adopting these approaches is often in the identification of system components, and in accurately estimating causes and effects (and uncertainties). There are few examples of the use of these analytical tools in systematic reviews, but they may be useful when the focus of analysis is on understanding the causes of complexity in a given system rather than on the impact of an intervention.

Box 3

Example of a mathematical modelling approach for a soft drinks industry levy

Briggs et al examined the potential impact of a soft drinks levy in the UK, considering possible different types of response to the levy by industry.90 Various scenarios were posited, with effects on health outcomes informed by empirical data from randomised trials and cohort studies of the association between sugar intake and body weight, diabetes and dental caries. Figure 4 provides a simple characterisation of how the empirical data were fed into the model. Inputs into the model included levels of consumption of various types of drinks (by age and sex), volume of drinks sales, and baseline levels of obesity, diabetes and dental caries (by age and sex). The authors concluded that health gains would be greatest if industry reacted by reformulating their products to include less sugar.

Considerations of bias and relevance

It is always important to consider the extent to which (1) The findings from each study have internal validity, particularly for non-randomised studies, which are typically at higher risk of bias. (2) Studies may have been conducted but not reported because of unexciting findings. (3) Each study is applicable to the purposes of the review, that is, has external validity (or 'directness', in the language of the Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group).83 At minimum, internal and external validity should be examined and reported, and the risk of publication bias assessed, and these can be achieved through the GRADE framework.10 With sufficient studies, information collected might be used in meta-regression analyses to evaluate empirically whether studies with and without specific sources of bias or indirectness differ in their results.

It may be desirable to learn about a specific setting, intervention type or outcome measure more directly than others. For example, to inform a decision for a low-income setting, emphasis should be placed on results of studies performed in low-income countries. One option is to restrict the synthesis to these studies. An alternative is to model the dependence of an intervention's effect on some feature(s) related to the income setting, and extract predictions from the model that are most relevant to the setting of interest. This latter approach makes fuller use of available data, but relies on stronger assumptions.

Often, however, the accumulated studies are too few or too disparate to draw conclusions about the impact of bias or relevance. On rare occasions, syntheses might implement formal adjustments of individual study results for likely biases. Such adjustments may be made by imposing prior distributions to describe the magnitude and direction of any biases believed to exist.84 85 The choice of a prior distribution may be informed by formal assessments of risk of bias, by expert judgement, or perhaps by empirical data from meta-epidemiological studies of biases in randomised and/or non-randomised studies.86 For example, Wolf et al implemented a prior distribution based on the findings of a meta-epidemiological study87 to adjust for lack of blinding in studies of interventions to improve the quality of point-of-use water sources in low-income and middle-income settings.88 Unfortunately, empirical evidence of bias is generally limited to clinical trials, is weak for trials of public health and social care interventions, and is largely non-existent for non-randomised studies.
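A simple form of such a bias adjustment on the log odds ratio scale, with a normal prior on the bias, subtracts the expected bias from the observed effect and adds the bias variance to the effect variance. The numbers below are invented for illustration and are not those used by Wolf et al:

```python
import math

# Observed pooled effect from (say) unblinded trials, on the log OR scale.
obs_logOR, obs_var = -0.40, 0.02

# Assumed bias prior: unblinded trials exaggerate benefit by ~10% on the
# odds ratio scale, with some uncertainty (hypothetical values).
bias_mean, bias_var = -0.10, 0.01

# Adjust: subtract the expected bias, propagate the extra uncertainty.
adj_logOR = obs_logOR - bias_mean
adj_se = math.sqrt(obs_var + bias_var)
print(f"bias-adjusted effect: {adj_logOR:.2f} (SE {adj_se:.3f})")
```

The adjusted estimate is pulled towards the null while its standard error grows, reflecting both the correction and the added uncertainty about the size of the bias.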

Conclusion

Our review of quantitative synthesis methods for evaluating the effects of complex interventions has outlined many possible approaches that might be considered by those collating evidence in support of guideline development. We have described three broad categories: (1) Non-quantitative methods, including tabulation, narrative and graphical approaches. (2) Standard meta-analysis methods, including meta-regression to investigate study-level moderators of effect. (3) More advanced synthesis methods, in which models allow exploration of intervention components, investigation of both moderators and mediators, examination of mechanisms, and exploration of complexities of the system.

The choice among these approaches will depend on the purpose of the synthesis, the similarity of the studies included in the review, the level of detail available from the studies, the nature of the results reported in the studies, the expertise of the synthesis team, and the resources available. Clearly the advanced methods require more expertise and resources than the simpler methods. Furthermore, they require a greater level of detail and typically a sizeable evidence base. We therefore expect them to be used seldom; our aim here is largely to articulate what they can achieve so that they can be adopted when they are appropriate. Notably, the choice among these approaches will also depend on the extent to which guideline developers and users at global, national or local levels understand and are willing to base their decisions on different methods. Where possible, it will thus be important to involve concerned stakeholders during the early stages of the systematic review process to ensure the relevance of its findings.

Complexity is common in the evaluation of public health interventions at individual, organisational or community levels. To help systematic review and guideline development teams decide how to address this complexity in syntheses of quantitative evidence, we summarise considerations and methods in tables 1 and 2. We close with the important remark that quantitative synthesis is not always a desirable feature of a systematic review. Whereas some sophisticated methods are available to deal with a variety of complex issues, on many occasions (perhaps even the majority in practice) the studies may be too different from each other, too weak in design or too thinly reported for statistical methods to provide insight beyond a commentary on what evidence has been identified.

Acknowledgments

The authors thank the following for helpful comments on earlier drafts of the paper: Philippa Easterbrook, Matthias Egger, Anayda Portela, Susan L Norris, Mark Petticrew.

References

