Challenges in the peer review of systematic reviews and meta-analyses

2013

http://informahealthcare.com/jmf

ISSN: 1476-7058 (print), 1476-4954 (electronic)

J Matern Fetal Neonatal Med, 2013; 26(8): 768–771 © 2013 Informa UK Ltd. DOI: 10.3109/14767058.2012.755161

Anthony M. Vintzileos¹, Jonathan Carvajal¹, and Shahidul Islam²

¹Department of Obstetrics and Gynecology, and ²Department of Health Outcomes Research, Winthrop University Hospital, Mineola, NY, USA

Abstract

Objective: To assess the role of the referees in assisting the peer review process of systematic reviews and meta-analyses.

Methods: A one-page questionnaire was mailed to 1391 referees of two journals, the American Journal of Obstetrics and Gynecology and Obstetrics and Gynecology. The referees were asked how often they verified by their own independent analysis 11 key items related to the methodology and statistical analysis of systematic reviews and meta-analyses. Response categories included “always”, “frequently” (>50% of the time), “infrequently” (≤50% of the time) and “never”. A second and a third mailing were sent to the non-respondents.

Results: 42 mailings were returned because of change of address. Of the remaining 1349 referees, 272 responded (response rate 20%). Of the 272 respondents, 159 (58%) had previously reviewed articles dealing with systematic reviews or meta-analyses. The responses varied according to the key items in the questions, but only 2%–17% of the referees reported “always” using their own independent analyses. The rates of “infrequently” or “never” responses combined ranged from 51% to 86% for the various key items.

Conclusion: The overwhelming majority of the referees do not verify, by their own independent analysis, key items related to methodology and statistical analysis of submitted systematic reviews and meta-analyses.

Keywords

Evidence-based medicine, peer-review, questionnaire, survey

History

Received 15 August 2012; Revised 19 October 2012; Accepted 6 November 2012; Published online 15 January 2013

Introduction

Systematic reviews and meta-analyses are tools used to

summarize the evidence emanating from studies on the same

topic. Approximately 1 in 10 level A recommendations in the

obstetrical bulletins are based on meta-analyses [1]. Since the

results of such systematic reviews and meta-analyses are

frequently used by physicians and policy makers to implement

health care changes, these reports should have

transparency. In order to improve the quality of the reporting,

reporting guidelines have been mandated by most journals.

Such reporting guidelines for systematic reviews and meta-

analyses include the QUOROM (QUality Of Reporting Of

Meta-analysis) [2] and the PRISMA (Preferred Reporting

Items for Systematic reviews and Meta-Analyses) [3].

Although these guidelines may increase the quality of

reporting of systematic reviews and meta-analyses, they do

not address the quality of the peer review process that

precedes publication.

The purpose of this study was to determine the manner in

which the referees peer review articles labeled as systematic

reviews and meta-analyses.

Material and methods

A packet containing a one-page questionnaire along with an

explanatory cover letter, a stamped postcard with the referee’s

ID number and a stamped return envelope was mailed to

1391 referees across the United States. The referees were

chosen from the published lists of the “editorial consultants”

of two leading journals in obstetrics and gynecology: the

American Journal of Obstetrics and Gynecology [4] and

Obstetrics and Gynecology [5]. The exact mailing addresses

of the referees were found through Google searches using the first

and last name of each referee as key words, together with the city and

state that appeared in the editorial consultants’

published lists. The names and mailing addresses of the

referees were compiled into a list and each referee was

assigned an ID number. The same ID number was placed on

the return postcard.

The cover letter to the referees stated that the purpose of

the survey was to identify the role of the referees in assisting

with the peer review process of systematic reviews and meta-

analyses and that the survey was anonymous and confidential.

In order to ensure the anonymity and confidentiality, the

referees were instructed to mail back the completed

questionnaire and the postcard with their ID number

separately. This methodology allowed for the anonymity and

confidentiality and at the same time prevented us from

sending repeat reminders to those who had already completed

and returned the questionnaire. The referees were given the

option of either responding to the questionnaire by regular

mail using the enclosed stamped envelope or by using a

website (http://www.surveymonkey.com/s/Q3FKLHD) (last

accessed 9 Jan 2012). In order to increase the response rate,

a second and third mailing were sent to non-respondents.

Address for correspondence: Dr Anthony Vintzileos, MD, Department of Obstetrics and Gynecology, Winthrop University Hospital, Mineola, USA. E-mail: [email protected]

The questionnaire included questions related to the

referees’ age, gender, medical or other degrees, number of

years of reviewing and number of reviewed papers in the prior

12 months. Those referees who had previously reviewed

systematic reviews or meta-analyses articles were asked how

often they verified by their own independent analysis 11 key

items related to methodology and statistical analysis

(Table 1). For each of the items there were four possible

responses: “always”, “frequently” (>50% of the time),

“infrequently” (≤50% of the time) and “never”. There were

no open-ended questions. The study was approved by the

Institutional Review Board of Winthrop University Hospital.

The results were reported as descriptive statistics.

Continuous data were presented as medians (range) and

categorical data were presented as number (%). In order to

determine if the responses were influenced by age, gender,

degree, years of reviewing or number of reviewed papers

during the prior year, the responses were collapsed into two

groups (“always” and “frequently” combined versus

“infrequently” and “never” combined) and a

random-intercept logistic regression model was then applied. In these models,

we used survey question as the fixed effects and the ID variable

as the random effect. The statistical analysis was performed

with SAS (SAS Institute, Cary, NC). The results were

considered statistically significant when p < 0.05.
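
The collapsing step described above can be sketched as follows. This is a minimal illustration in Python, not the authors' SAS code: the function names, the record layout and the two example referees are hypothetical. The block only prepares the binary outcome with one row per referee and key item, which is the shape of data a random-intercept logistic model (with the referee ID as the grouping variable) would typically take as input.

```python
# Hypothetical helper names and example records; not the study's raw data.
HIGH = {"always", "frequently"}      # collapsed into 1
LOW = {"infrequently", "never"}      # collapsed into 0


def dichotomize(response):
    """Return 1 for 'always'/'frequently', 0 for 'infrequently'/'never'."""
    label = response.strip().lower()
    if label in HIGH:
        return 1
    if label in LOW:
        return 0
    raise ValueError(f"unexpected response: {response!r}")


def to_long_format(answers_by_referee):
    """Flatten {referee_id: {item_number: response}} into one row per referee-item pair."""
    rows = []
    for referee_id, answers in sorted(answers_by_referee.items()):
        for item, response in sorted(answers.items()):
            rows.append(
                {"referee_id": referee_id, "item": item, "verifies": dichotomize(response)}
            )
    return rows


# Two made-up referees answering key items 1 and 11.
example = {
    101: {1: "Always", 11: "never"},
    102: {1: "infrequently", 11: "Frequently"},
}
rows = to_long_format(example)
for row in rows:
    print(row)
```

Keeping the referee ID on every row is what lets a clustered model account for the fact that the same referee answers all 11 items.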

Results

Of the 1391 initial mailings, 42 were returned because of

incorrect address. Of the remaining 1349 referees, 272

responded (response rate 20%). The majority of responses

(237 or 87%) were received via regular mail; the remaining

responses (35 or 13%) were received electronically. Of the

272 respondents, 113 (42%) had not reviewed any systematic

reviews or meta-analyses, whereas 159 (58%) had previously

reviewed such articles. The demographic characteristics of the

responders are shown in Table 2. Among those who

responded via regular mail, there was wide geographic

representation from 40 states across the United States

(Northeast 11%; Mid-Atlantic 18%; Midwest East North

Central 17%; West North Central 6%; South Atlantic 13%;

East South Central 9%; West South Central 6%; West

Mountain 9%; and West Pacific 11%). The frequency with

which the referees verified each of the 11 key items by their own

independent analysis is shown in Table 1.

Depending on the item, only 2%–17% of the referees reported “always”

using their own independent analyses. In contrast, the rates of

“infrequently” or “never” responses combined

ranged from 51% to 86%. The regression analysis showed that

the responses were not influenced by age, gender, degree,

years of reviewing, number of reviewed papers during the

prior year or response media (regular versus electronic)

(Table 3).
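
The headline proportions reported above follow from simple arithmetic on the counts given in the text. A short Python check is shown below; the variable names and the rounding helper are chosen here for illustration only.

```python
# Counts taken from the Results text; variable names are illustrative.
mailed = 1391
undeliverable = 42
responded = 272
responded_by_mail = 237
responded_online = 35
had_reviewed = 159


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Referees who could actually have received the questionnaire.
eligible = mailed - undeliverable

print("eligible referees:", eligible)
print("response rate (%):", pct(responded, eligible))
print("responses by regular mail (%):", pct(responded_by_mail, responded))
print("responses via website (%):", pct(responded_online, responded))
print("had reviewed such articles (%):", pct(had_reviewed, responded))
print("had not reviewed such articles (%):", pct(responded - had_reviewed, responded))
```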

Discussion

It is a commonly held belief that systematic reviews and

meta-analyses are valuable tools to summarize evidence

relating to the efficacy and safety of health care interventions.

However, systematic reviews and meta-analyses often lack

clarity, resulting in poor reporting [3]. This has led to the

creation of reporting guidelines for authors, such as

QUOROM and PRISMA [2,3], which have been adopted by

several peer-reviewed journals in order to increase clarity of

reporting. Although these guidelines improve communication

between author(s) and reader(s), they were not developed to

improve the quality of the peer review process. There are

virtually no data regarding the quality of the peer review by

the referees in general and how often they verify by their own

independent analysis key items of the methodology and

statistical analysis. The quality of the peer review process is

extremely important when it comes to reviewing systematic

reviews and meta-analyses because the results of such articles

are frequently used by physicians to treat individual patients

and by policy makers to make health care changes in entire

populations.

Table 1. Referees’ responses on how frequently they use their own independent analyses in verifying key items in methodology and statistical analysis.

| Key item | Always | Frequently (>50% of the time) | Infrequently (≤50% of the time) | Never |
|---|---|---|---|---|
| Verify the initial set of articles by using the specified key words and search engine(s) (i.e. PubMed) that the authors used | 19 (12%) | 26 (16%) | 58 (37%) | 56 (35%) |
| Verify all additional articles, identified by the authors, by using other sources (i.e. abstracts, conference proceedings, etc.) | 7 (4%) | 25 (16%) | 60 (38%) | 67 (42%) |
| Verify the articles which needed to be removed because they were duplicates | 17 (11%) | 15 (9%) | 51 (32%) | 76 (48%) |
| Verify that the authors’ exclusion criteria were applied appropriately for each excluded study | 20 (13%) | 28 (18%) | 54 (34%) | 57 (36%) |
| Verify that the selection criteria were applied appropriately by reviewing all full-text articles which were included in the final analysis | 18 (11%) | 26 (16%) | 53 (34%) | 62 (39%) |
| Verify the appropriateness of studies included in the qualitative synthesis (systematic reviews) | 26 (16%) | 53 (33%) | 44 (28%) | 36 (23%) |
| Verify the appropriateness of studies included in the quantitative synthesis (meta-analysis) | 27 (17%) | 49 (31%) | 40 (25%) | 43 (27%) |
| Verify heterogeneity of the included studies | 17 (11%) | 33 (21%) | 47 (29%) | 62 (39%) |
| Verify publication bias of the included studies | 21 (13%) | 32 (20%) | 41 (26%) | 65 (41%) |
| Verify that the data synthesis from the included individual studies was appropriate | 23 (14%) | 33 (21%) | 44 (28%) | 59 (37%) |
| Verify that the statistical analyses were appropriate by using your own software or statistical package | 4 (2%) | 19 (12%) | 30 (19%) | 106 (67%) |

In general, referees do not undergo any special training on

how to conduct their peer reviews. Since there is no formal

training, most referees “learn on the job”. To assist young

referees, we recently published guidelines on how to peer

review an original research paper [6]. In our guidelines, we

emphasized that a referee should fulfill certain prerequisites

including the willingness to do a thorough review and

availability of time, so that a constructive review can be

accomplished within a reasonable period of time. These two

prerequisites are not difficult to fulfill when referees are asked

by journal editors to review an original research paper

involving a single study.

In reviewing systematic reviews or meta-analyses, a

plethora of studies have to be reviewed by the referee in

order to verify the appropriateness of the studies that were

included (or excluded) from the qualitative or quantitative

synthesis. In addition, the referee should make sure that

heterogeneity and publication bias were assessed and that the statistical

analyses were done correctly. Such a task would require an

enormous amount of time and it would most likely be

disruptive to the referee’s daily work and life. Thus, the

expectation that the referee will repeat the analysis is not

reasonable. Although our questionnaire did not directly

address why the referees do not verify the

methodology and the statistical analysis of systematic

review or meta-analysis articles, it is logical to assume that

the main reason is time constraints. Other possible reasons

could be a lack of statistical knowledge or unawareness of

such an expectation.

It has long been recognized that there are serious

discrepancies between the results of published meta-analyses

and subsequent large randomized-controlled trials on the same

topics [7,8]. Despite the introduction of reporting guidelines

for systematic review and meta-analysis articles since 1999,

the contradictions between meta-analyses and randomized-

controlled trials continue to exist [9,10]. Although there are

several possible explanations for these discrepancies, one of

the possibilities may be differences in the quality or reliability

of the peer review process.

There is a fundamental difference between a single study

and a systematic review or meta-analysis with respect to the

complexity of the methodology. In peer-reviewing

single-study articles, verification of the methodology and, consequently,

of its conduct and statistical analysis is not feasible

because the raw data are not available to the referee; however,

the chance of error(s) is much lower than in systematic

reviews or meta-analyses. In contrast, in systematic reviews

and meta-analyses, the methodology is more complex relative

to single studies and therefore more prone to errors.

Therefore, verification of the methodology (and subsequent

statistical analysis) is of paramount importance and indeed

feasible since all the necessary data are available to the

referees. The problem is that in order to accomplish a review

of systematic review or meta-analysis papers within a

reasonable period of time, most referees accept the entire

methodology and statistical analysis, exactly as proposed by

the author(s) of the paper without verification despite the

availability of the actual data. Thus, any errors in the

qualitative or quantitative synthesis of the studies by the

authors will escape detection. Errors in the qualitative or

quantitative synthesis by the author(s) may be unintentional or

due to personal bias. Unchecked inappropriate manipulation

or inclusion (or exclusion) of studies has raised concerns

about the political uses of meta-analysis, a phenomenon

which has been characterized as the “tyranny of meta-analysis”

[11]. Yet, it is impossible to expect the referee to uncover such

inappropriate manipulations or biases without dedicating an

enormous amount of time. Thus, guidelines for the peer

review of systematic reviews and meta-analyses are required

to avoid the publication of undetected bias because of

suboptimal peer review. In order for the systematic reviews,

meta-analyses or any other qualitative or quantitative

syntheses to be reliable, the review of the evidence should

not be done by the journal’s individual referees but perhaps by

a dedicated group of experts. One example is the Cochrane

Collaboration’s contributors group, which is a mix of

volunteers and paid staff dealing with the methodology of

systematic reviews. Many of these dedicated members are

world leaders in their field of medicine, health policy,

research methodology or consumer advocacy, and are working

in some of the world’s best academic and medical

institutions. Such groups consisting of expert individuals may

have all the time available for a thorough qualitative and

quantitative review and analysis.

Table 2. Characteristics of the responders (n = 159).

| Characteristic | n (%) |
|---|---|
| Male | 129 (81%) |
| Female | 30 (19%) |
| MD degree only | 133 (84%) |
| MD and additional degree | 26 (16%) |
| Number of papers reviewed the prior year: 11 or more | 68 (43%) |
| Number of papers reviewed the prior year: 6–10 | 60 (38%) |
| Number of papers reviewed the prior year: 5 or less | 31 (19%) |
| Age (median, range) | 53 (34–80) |
| Number of years reviewing papers (median, range) | 17 (3–50) |

Table 3. Logistic regression analysis for clustered data: effects of referee characteristics on the survey responses.

| Parameter | Estimate | Standard error | Odds ratio (95% C.I.) | p Value |
|---|---|---|---|---|
| Intercept | 0.7271 | 1.5284 | – | 0.6343 |
| Referee characteristics | | | | |
| Age | −0.0071 | 0.0220 | 0.99 (0.95, 1.04) | 0.7479 |
| Gender (female versus male) | −0.3085 | 0.3328 | 0.73 (0.38, 1.41) | 0.3539 |
| Degree (MD versus MD + other degrees) | −0.3215 | 0.2945 | 0.73 (0.41, 1.29) | 0.2750 |
| Number of years of reviewing | −0.0006 | 0.0244 | 1.00 (0.95, 1.05) | 0.9806 |
| Number of papers reviewed prior year (>11 versus <5) | −0.2643 | 0.3496 | 0.77 (0.39, 1.52) | 0.4495 |
| Number of papers reviewed prior year (6–10 versus <5) | −0.0666 | 0.3849 | 0.94 (0.44, 1.99) | 0.8627 |
| Response media (web versus hard copy) | −0.3479 | 0.3516 | 0.71 (0.35, 1.41) | 0.3224 |
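
The odds ratios and confidence intervals in Table 3 appear consistent with the usual Wald construction, in which the estimate and its standard error on the log-odds scale are exponentiated. The sketch below assumes that construction and a critical value of 1.96; the paper does not state how the intervals were computed, and the function name is illustrative.

```python
import math


def odds_ratio_with_ci(estimate, std_error, z=1.96):
    """Exponentiate a log-odds estimate and its Wald interval, rounded to 2 decimals.

    Assumes a normal approximation on the log-odds scale (z = 1.96 for 95%).
    """
    point = math.exp(estimate)
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return tuple(round(value, 2) for value in (point, lower, upper))


# Estimates and standard errors as listed in Table 3.
table3_rows = {
    "Gender (female versus male)": (-0.3085, 0.3328),
    "Degree (MD versus MD + other degrees)": (-0.3215, 0.2945),
    "Response media (web versus hard copy)": (-0.3479, 0.3516),
}

for label, (estimate, std_error) in table3_rows.items():
    print(label, odds_ratio_with_ci(estimate, std_error))
```

Every interval in Table 3 includes 1, which matches the reported finding that none of the referee characteristics was significantly associated with the responses.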

One of the important observations in this study is the

relatively low response rate (20%) despite a robust methodology

using well-established methods to maintain confidentiality

and anonymity [12]. Low response rates usually raise

questions regarding the reliability of the survey results.

However, the effect of non-response depends on the extent

and direction to which those not responding are biased. In our

study, a possible bias is that those reviewers who do not use

their own independent analyses may be embarrassed to

respond. However, this bias would suggest that in real life, the

percentage of referees who do not use their own

independent analyses may be even greater than in our study.

In addition, the response rate after the third mailing was much

lower (approximately 50% lower) than after the second mailing;

thus, further mailings would not have increased the number of

respondents or altered the results considerably. Another possible

reason for the low response rate may be the small number of

systematic reviews and meta-analyses, as compared to

original articles, published by the two journals. For all the

aforementioned reasons, we believe that the results of this

survey are valid given the wide geographic representation of

the respondents.

Another possible limitation of the study is that the survey

did not explore how much knowledge or experience the

referees themselves had in conducting systematic reviews and

meta-analyses. This can be a crucial issue; however, it is

likely that Editors choose referees who already have published

systematic reviews or meta-analyses. Another limitation is

that our survey of the referees did not address the possibility

that some of the results of the published systematic reviews

and meta-analyses may have been verified by the statistical

consultants of the journals.

Our survey included only referees from two major

obstetrics and gynecology journals. However, we have no

reason to believe that the results will be different for referees

of other medical journals or specialties. In our view, the

results of this study highlight the need for academic units,

investigators and journal editors to develop guidelines for

the peer review of systematic reviews and meta-analyses.

These guidelines can be provided to the referees at the time of

invitation. One suggestion may be that referees should be

guided by the Editors to specify, in checklist form, all those

items which they are expected to verify by their own

independent analysis when reviewing systematic reviews

and meta-analyses. Another possibility is to have the authors

of systematic reviews, upon request by the referee, submit a

copy of the abstracted data, a copy of the articles as well as

data obtained from other sources, so that the referee does not

have to spend a lot of time searching the literature and

abstracting data. Editors should also try, whenever possible, to

employ experienced referees for these types of publications.

This will hopefully lead to better quality of medical

information and ultimately patient care.

Declaration of interest

None of the authors have a conflict of interest.

References

1. Chauhan SP, Berghella V, Sanderson M, et al. Randomized clinical trials behind level A recommendations in obstetric practice bulletins; compliance with CONSORT statement. Am J Perinatol 2009;26:69–80.

2. Moher D, Cook DJ, Eastwood S, et al. Improving the quality of reports of meta-analyses of randomized controlled trials: the QUOROM statement. Quality of reporting of meta-analyses. Lancet 1999;354:1896–900.

3. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. J Clin Epidemiol 2009;62:e1–34.

4. Consultants for the American Journal of Obstetrics and Gynecology, 1 January 2005–31 December 2005. Am J Obstet Gynecol 2006;194:e49–e63.

5. Editorial consultants. Obstet Gynecol 2009;114:1165–457.

6. Vintzileos AM, Ananth CV. The art of peer-reviewing an original research paper. J Ultrasound Med 2010;29:513–18.

7. Villar J, Carroli G, Belizan JM. Predictive ability of meta-analyses of randomized controlled trials. Lancet 1995;345:772–6.

8. LeLorier J, Gregoire G, Benhaddad A, et al. Discrepancies between meta-analyses and subsequent large randomized, controlled trials. N Engl J Med 1997;337:536–42.

9. Vas J, Aranda JM, Nishishinya B, et al. Correction of nonvertex presentation with moxibustion: a systematic review and metaanalysis. Am J Obstet Gynecol 2009;201:241–59.

10. Guittier MJ, Pichon M, Dong H, et al. Moxibustion for breech version: a randomized controlled trial. Obstet Gynecol 2009;114:1034–40.

11. Klein MC. The tyranny of meta-analysis and the misuse of randomized controlled trials in maternity care. Birth 2012;39:80–2.

12. Fowler FJ. Survey research methods. Applied social research methods series. 4th ed. Thousand Oaks, CA: Sage Publications, Inc; 2009:49–67.
