Journal of Clinical Epidemiology 62 (2009) 506–510
REVIEW ARTICLES
Choice of data extraction tools for systematic reviews depends on resources and review complexity
Mohamed B. Elamin a, David N. Flynn a, Dirk Bassler b, Matthias Briel c,d, Pablo Alonso-Coello e,f, Paul Jack Karanicolas c, Gordon H. Guyatt c, German Malaga g, Toshiaki A. Furukawa h, Regina Kunz d, Holger Schünemann i, Mohammad Hassan Murad a, Corrado Barbui j, Andrea Cipriani j, Victor M. Montori a,*
a Knowledge and Encounter Research Unit, College of Medicine, Mayo Clinic, Rochester, MN, USA
b Department of Neonatology, University Children's Hospital, Tübingen, Germany
c Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada
d Basel Institute for Clinical Epidemiology, University Hospital Basel, Basel, Switzerland
e Department of Clinical Epidemiology and Public Health, Iberoamerican Cochrane Centre, Hospital de Sant Pau, Barcelona, Spain
f CIBER de Epidemiología y Salud Pública (CIBERESP), Spain
g Facultad de Medicina, Universidad Peruana Cayetano Heredia, Lima, Peru
h Department of Psychiatry and Cognitive-Behavioral Medicine, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
i Department of Epidemiology, Italian National Cancer Institute Regina Elena, Rome, Italy
j Section of Psychiatry and Clinical Psychology, Department of Medicine and Public Health, University of Verona, Verona, Italy
Accepted 1 October 2008
Abstract
Objective: To assist investigators planning, coordinating, and conducting systematic reviews in the selection of data-extraction tools.
Study Design and Setting: We constructed an initial table listing available data-collection tools and reflecting our experience with these tools and their performance. An international group of experts iteratively reviewed the table and reflected on the performance of the tools until no new insights emerged and consensus was reached.
Results: Several tools are available to manage data in systematic reviews, including paper and pencil, spreadsheets, web-based surveys, electronic databases, and web-based specialized software. Each tool offers benefits and drawbacks: specialized web-based software is well suited in most ways, but is associated with higher setup costs. Other approaches vary in their setup costs and difficulty, training requirements, portability and accessibility, versatility, progress tracking, and the ability to manage, present, store, and retrieve data.
Conclusion: Available funding, number and location of reviewers, data needs, and the complexity of the project should govern the selection of a data-extraction tool when conducting systematic reviews. © 2009 Elsevier Inc. All rights reserved.
1. Introduction
Systematic reviews allow for the rigorous identification, selection, and summary of research evidence to answer a focused question [1,2]. Increasingly, clinicians, patients, researchers, and policy makers rely on high-quality systematic reviews for decision making. A key step in the review process is the structured extraction of data from eligible primary studies, a procedure that, typically, pairs of reviewers conduct in duplicate and independently. In any systematic review, a crucial task for the project's leader, or, in a better-funded situation, a dedicated systematic review coordinator, is managing the data. Several data-extraction tools may assist reviewers in conducting systematic reviews. Selecting the optimal tool requires balancing the start-up and maintenance effort and costs against the functionality needed for often complex projects.

* Corresponding author. Knowledge and Encounter Research Unit, College of Medicine, Mayo Clinic, W18A, 200 First Street SW, Rochester, MN 55905, USA. Tel.: +507-284-2617; fax: +507-284-5745. E-mail address: [email protected] (V.M. Montori).
0895-4356/08/$ – see front matter © 2009 Elsevier Inc. All rights reserved.
doi:10.1016/j.jclinepi.2008.10.016
In this report, we collate the collective experience of several systematic reviewers from different countries to offer a reference guide for investigators planning, coordinating, and conducting systematic reviews (henceforth, systematic reviewers) who are choosing and designing data-extraction procedures. To our knowledge, this is the first assessment of data-extraction tools for conducting systematic reviews.
What is new?
- Several tools are available to manage data in systematic reviews.
- Each data-extraction tool has its arguments for and against.
- Cost and number of reviewers govern the tool selection to a large extent.
- We expect that a high degree of integration between the electronic versions of the study reports and the data-extraction forms may reduce the amount of data input to a greater extent than possible today and may facilitate collaboration, error reduction, and quality control.
2. Methods
We drafted a preliminary table capturing the experience of two systematic review coordinators (M.B.E. and D.N.F.) that outlined the most common data-extraction tools used in our systematic review group, the Knowledge and Encounter Research (KER) Unit at Mayo Clinic in Rochester, Minnesota [3]. The tools used for data extraction were paper and pencil and their equivalent electronic word processing software, spreadsheets, local relational databases, web-based forms, and electronic web-based databases. In addition, through discussion within our group, we generated a list of issues as a basis for comparing these tools. The list of issues was based on personal experience with the tools identified and was synthesized to meet the general needs of systematic reviewers.
We circulated this first draft to an international group of expert reviewers, identified by snowball sampling (i.e., one reviewer suggesting others) [4]. Iteratively, each expert reviewed the tools and provided feedback regarding the issues we identified or contributed other insights. We continued this process until new participants were unable to contribute additional tools or issues and consensus emerged.
3. Results
Table 1 describes the collected tools and characteristics that resulted from our exploration. Additionally, Table 2 describes the different ways to distribute the selected tool to reviewers, a key consideration in selecting extraction tools.
3.1. Paper and pencil (and their electronic derivatives)
Although it is our impression that reviewers do not often use paper and pencil as the only approach to extract and manage data, and we strongly discourage exclusive use of paper-and-pencil forms, many reviewers prefer to use paper forms before they enter the data into an electronic tool. Paper forms carry a relatively low cost of implementation and use. They are well suited for small local projects (few primary studies, few reviewers, and all reviewers at the same location). However, inaccurate interpretation of handwritten data or comments, input errors, and the need to transfer data to other electronic platforms for computer-assisted statistical analyses represent key drawbacks. Some groups that also use other approaches still find paper-and-pencil forms useful to mock up and pilot data-extraction items and procedures.

Because systematic reviewers usually use word processing software (e.g., Microsoft Word; Microsoft Corporation, http://www.microsoft.com) to produce these paper-and-pencil forms, these electronic files (shared through e-mail, for instance) represent the logical extension of this approach into the electronic environment. Also, HTML or PDF forms may include functionality (using macros or a "submit" button) to send data back by e-mail after the reviewer completes the form. Review coordinators can then download the data from these forms into spreadsheet or database software. This step avoids the need for transcription from paper to analysis software but still requires the reviewer (or software) to verify the data for accuracy and completeness and to compare and resolve discrepancies between reviewers.
3.2. Spreadsheet and database software
Spreadsheet software, whether standard (e.g., Microsoft Excel; Microsoft Corporation) or open access (e.g., Google Docs, http://www.google.com), is a common and relatively inexpensive solution to the problem of collecting data in a format that is ready for summary and analyses. Recent versions of this software allow reviewers to collaborate on the same document online simultaneously; new spreadsheet software that resides entirely online also enables simultaneous input (e.g., http://docs.google.com). Spreadsheet software functionality includes drop-down menus and range checks that may prevent data entry errors and limit free-text typing. These tools, however, become cumbersome and prone to wrong-cell errors when data forms span many rows or columns. They are also burdensome for the consolidation, identification, tracking, and resolution of disagreements in the data-extraction worksheets from reviewer pairs.
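The consolidation work described above can be sketched in code. A minimal, hypothetical example (study labels and field names are invented for illustration, not taken from the article) that flags every cell where two reviewers' duplicate extractions differ:

```python
# Sketch: identify disagreements between two reviewers' duplicate
# extractions, each held as {study_id: {field: value}} rows such as
# those exported from a spreadsheet. All names here are hypothetical.

def find_disagreements(extraction_a, extraction_b):
    """Return (study_id, field, value_a, value_b) for every cell
    where the two reviewers' entries differ."""
    conflicts = []
    for study_id, row_a in extraction_a.items():
        row_b = extraction_b.get(study_id, {})
        for field, value_a in row_a.items():
            value_b = row_b.get(field)
            if value_a != value_b:
                conflicts.append((study_id, field, value_a, value_b))
    return conflicts

reviewer_1 = {"Study A": {"n_randomized": 120, "blinded": "yes"}}
reviewer_2 = {"Study A": {"n_randomized": 120, "blinded": "no"}}

print(find_disagreements(reviewer_1, reviewer_2))
# [('Study A', 'blinded', 'yes', 'no')]
```

In practice, each flagged tuple would be routed to the reviewer pair (or an arbiter) for resolution, which is exactly the step that becomes burdensome when done by eye across large worksheets.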
Relational databases (e.g., Microsoft Access; Microsoft Corporation) are particularly suited for the collection of data from primary studies that correspond to different domains (study publication details, methodological quality, and characteristics of the patients, interventions, and outcomes). These programs allow for the simultaneous contribution of reviewers into the database. Attractive and well-designed data-extraction interfaces can facilitate the task. Database software allows for the production of tailored tables and reports that facilitate analyses. The main drawback is the need for expertise in setting up relational databases and in designing appropriate queries and useful reports.
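The domain split described above maps naturally onto relational tables. A minimal sketch using SQLite; the schema and values are hypothetical, not from the article:

```python
import sqlite3

# Sketch: extraction data split across two domains, one table for
# study-level details and one for per-outcome results, joined by a
# tailored query/report. Table names and values are hypothetical.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE study (
                   study_id INTEGER PRIMARY KEY,
                   citation TEXT,
                   quality_score INTEGER)""")
con.execute("""CREATE TABLE outcome (
                   study_id INTEGER REFERENCES study(study_id),
                   name TEXT,
                   effect REAL)""")
con.execute("INSERT INTO study VALUES (1, 'Trial A', 4)")
con.execute("INSERT INTO outcome VALUES (1, 'mortality', 0.82)")

# The kind of tailored report the text credits database software with:
rows = con.execute("""SELECT s.citation, o.name, o.effect
                      FROM study s JOIN outcome o USING (study_id)""").fetchall()
print(rows)  # [('Trial A', 'mortality', 0.82)]
```

Keeping one row per outcome, rather than widening a single flat sheet, is what lets such a design handle studies reporting different numbers of outcomes without restructuring the form.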
Consider a relatively small and simple review. The search for potentially eligible studies that match the protocol's
Table 2
Methods used to distribute different tools

Tools                      | Mail | Local network | E-mail | Online access
Paper and pencil (a)       |  •   |       •       |   •    |      •
Spreadsheet                |      |       •       |   •    |      •
Database software          |      |       •       |   •    |      •
E-mail-based forms         |      |               |   •    |
Web-based surveys          |      |               |        |      •
Web-based review software  |      |               |        |      •

(a) This tool could be modified into an electronic version (word document), which facilitates online distribution and sharing.
Table 1
Comparison between different data-collection tools, including RevMan

[The table rates paper and pencil, e-mail-based forms, spreadsheet software, Cochrane's RevMan, database software, web-based surveys, and web-based specialized applications (h) on: setup cost (a) (lowest/highest); project setup difficulty (b); versatility (c); training requirement (d); portability/accessibility; ability to manage data (e); ability to track progress (f); ability to present data (g); and ability to store and retrieve data (each rated easiest/toughest). The individual ratings are not legible in this copy.]

(a) The cost may include software, computers, Internet access, and others. Personnel time not included.
(b) Design the tool to be most compatible with the project to facilitate data management, i.e., set up the data-extraction form for different levels.
(c) Refers to the ability to modify project requirements after initial execution.
(d) The time needed for reviewers to get familiar with the tool and use it efficiently to input data.
(e) Refers to compiling all data entered by reviewers after finishing data extraction and conflict/disagreement resolution.
(f) To track reviewers' progress. This process is enhanced by sharing the tool (spreadsheet software, database software) over the World Wide Web (www) as shown in the table.
(g) Refers to the ability to present data in different formats and to query the database to generate reports for different users (e.g., quantify and resolve disagreements, create letters for original authors to confirm data extracted).
(h) Web-based software applications specifically designed to manage systematic reviews.
eligibility criteria yielded 10 studies. The eligibility and extraction forms are simple and straightforward and will not require extensive data entry, and only two reviewers, who work in the same facility, are involved in the review. In this case, the reviewer decided to create a spreadsheet software file that captured the data items in columns and arranged one study per row (project setup) while keeping costs to a minimum (setup cost). A paper form could have sufficed, but the reviewers preferred avoiding data entry from the paper forms into the analysis software (data management) and also wanted the ease of modifying the extraction forms down the road (versatility). The electronic versions, distributed by means of e-mail, for instance, may represent a better fit and may be more practical when reviewers are geographically far apart (portability and accessibility).
3.2.1. Review software (RevMan) from the Cochrane Collaboration

Although extraction and storage of data is not the primary function of RevMan (Review Manager [RevMan; computer program], version 5.0; Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration, 2008), its widespread distribution among reviewers justifies an appraisal of its strengths and weaknesses for data extraction and management (Table 1). RevMan offers screens for descriptive information on population, interventions, and outcomes, and for methodological issues (risk of bias), which go into tables about the included studies, and screens for results, which will form the forest plot with the analysis. The range of information that can be entered is limited, with many fields presented by default. All data are used in the report, with no obvious functionality to collect data that may support the review (e.g., to get a complete picture of the primary studies) but that would not enter the final review analyses and report. Rather, all entries appear in the final review. The instructions on data collection in the most recent Cochrane Handbook advise the use of tools presented in Table 1 [5]. Specific strategies vary among review groups within the Cochrane Collaboration. Some provide standardized data-extraction sheets in paper-and-pencil form (or the electronic equivalent). Others leave the approach up to the individual review team, which might opt to enter the requested data directly into the software. In any case, RevMan lacks an option to import
information from an electronic database; rather, it requires manual entry of data or "cut and paste" from an appropriately formatted (i.e., spreadsheet format) report.
3.3. Web-based forms and survey software
Many inexpensive web-based survey services exist (e.g., Survey Monkey [6]) that allow individuals with minimal computer literacy to create attractive web-available surveys and repurpose them for data extraction. Data captured through these surveys reside on the company's server and can be retrieved and formatted for use in spreadsheet or database software. Reviewers can tailor a wide variety of question formats (e.g., drop-down boxes, multiple choice questions) to the task, with some of these limiting typing and input errors through selection of pre-entered answers or range checks. This tool shares the downsides of other approaches that require the consolidation, tracking, and resolution of discrepancies between reviewers using database or spreadsheet forms.
3.4. Web-based specialized systematic review applications
Reviewers conducting numerous and large systematic reviews are using dedicated online software services that enable them to manage the systematic review effort, including data extraction (e.g., TrialStat SRS [7] and the EPPI (Evidence for Policy and Practice Information and Co-ordinating) Centre Reviewer [8]). These specialized software applications allow reviewers to upload references from different reference managers (e.g., EndNote, ProCite, Reference Manager), design screening and data-extraction forms, divide workload across reviewers and track their progress in real time, identify disagreements between reviewers (and calculate agreement statistics), and generate custom reports. The available services still require downloading the extracted data into spreadsheet or statistical analysis software (for which these tools generate reports in the necessary formats). Careful design of the web data-collection interface minimizes, but does not eliminate, further formatting to ready the data for analyses, a problem shared by all available tools. These services are costly (e.g., the cost per review could range from 2,000 to 25,000 US dollars).
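As one illustration of the agreement statistics such services report, Cohen's kappa for two reviewers' categorical decisions can be computed directly. A minimal sketch; the screening decisions below are invented for illustration:

```python
# Sketch: Cohen's kappa, a chance-corrected agreement statistic for
# two raters' categorical decisions (e.g., include/exclude at
# screening). Example data are hypothetical.

def cohens_kappa(labels_a, labels_b):
    """(observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n)
        for c in set(labels_a) | set(labels_b)
    )
    return (observed - expected) / (1 - expected)

a = ["include", "include", "exclude", "exclude", "include", "exclude"]
b = ["include", "exclude", "exclude", "exclude", "include", "exclude"]
print(round(cohens_kappa(a, b), 2))  # 0.67
```

Here the reviewers agree on 5 of 6 decisions (83%), but chance agreement given their marginal rates is 50%, so kappa is (0.83 − 0.50) / (1 − 0.50) ≈ 0.67.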
Consider a complex and large systematic review. The search initially yielded 2,500 studies, and the selection and extraction forms were complex. Four pairs of reviewers are involved, some of them in different countries. The project has adequate funding.
The reviewer in this situation decided to use a web-based systematic review service because this facilitated all steps of the review, saved valuable time, and likely decreased errors. The review coordinator had to first train the reviewers in how to use the web-based service (training requirement). Nevertheless, the review coordinator kept this complex project on time and within the allotted budget by monitoring the progress of each of the reviewer pairs (progress tracking). The coordinator also had the collected data readily available at any given time (presenting data) and was able to accommodate protocol updates (versatility). On completion of the review, the data were safely stored for future needs (storing and retrieving data).
4. Discussion
The data-extraction step in a systematic review requires extensive planning and piloting. This planning includes not only the design of the data-extraction procedures, but also the selection of the optimal tool and piloting of its performance in conducting the specific review. The volume, nature, and complexity of the data; the number, location, and computer literacy and access (both to a computer and to the Internet) of the reviewers; and the available funding for the project play a major role in the selection among the available tools. Table 1 presents some issues across which the available tools can be compared.
The Cochrane Handbook refers to electronic and paper solutions for data collection when conducting Cochrane reviews [5]. Using the current RevMan interface as an extraction tool is cumbersome. Although RevMan facilitates the process of writing and organizing a systematic review in a uniform framework and provides the additional advantage of statistical analysis, it does not adequately address most of the domains compared in Table 1. This article elaborates on the Collaboration's guidance by offering an in-depth description of the tools available to date to reviewers facing a broad range of task complexity, number and location of reviewers, and budgets. Our desire would be that, in future developments, RevMan would incorporate software functionality that addresses, using open access software, all domains discussed in Table 1.
Our analyses may reflect a particular approach to data extraction that may not apply to other systematic reviewers, but that we think reflects the state of the art: duplicate data extraction by pairs of reviewers working independently; detection, quantification, and resolution of inconsistencies; and arbitration of disagreements. None of the tools reviewed here facilitates, to a great extent, the process of blinded data extraction (e.g., collecting data on the methodological quality of the studies while blinded to the study results), nor are they particularly adequate for tracking data origin (from the published paper vs. from a direct query to the authors). Our presentation will necessarily become outdated as more web-based and open access software with sufficient functionality to allow for online collaboration becomes available. For example, we expect that a high degree of integration between the electronic versions of the study reports and the data-extraction forms may reduce the amount of data input to a greater extent than possible today and may facilitate collaboration, error reduction, and quality control.
This overview represents the views of the researchers who author this report, and thus is colored by their specific
experiences with the tools mentioned as well as their styles and preferences. We await input from other groups who may have used other technologies to great advantage or who may have additional arguments for and against the tools presented here that we may have overlooked.
5. Conclusion
No single data-extraction method is best for all systematic reviews in all circumstances. Before selecting a tool, reviewers must bear in mind the volume, nature, and complexity of the data; the number, location, and computer literacy and access (both to a computer and to the Internet) of the reviewers; and the available funding for the project. Planning and careful selection and piloting of the appropriate data-extraction tools are paramount to efficiently yielding data for a high-quality systematic review.
Acknowledgment/funding
Matthias Briel is supported by the Swiss National Foundation (PASMA-112951/1). Holger Schünemann is supported by a "The human factor, mobility and Marie Curie Actions Scientist Reintegration" European Commission grant: IGR 42192. Regina Kunz is supported by santésuisse and the Bangerter-Rhyner-Foundation.
References

[1] Antman EM, Lau J, Kupelnick B, Mosteller F, Chalmers TC. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts. Treatments for myocardial infarction. JAMA 1992;268:240–8.
[2] Oxman AD, Guyatt GH. The science of reviewing research. Ann N Y Acad Sci 1993;703:125–33; discussion 133–4.
[3] Knowledge & Encounter Research Unit. Available at: http://mayoresearch.mayo.edu/mayo/research/ker_unit/index.cfm. Accessed August 1, 2008.
[4] Goodman LA. Snowball sampling. Ann Math Statist 1961;32:148–70.
[5] Higgins JPT, Deeks JJ. Chapter 7: selecting studies and collecting data. In: Higgins JPT, Green S, editors. Cochrane handbook for systematic reviews of interventions version 5.0.0 (updated February 2008). The Cochrane Collaboration, 2008. Available at: www.cochrane-handbook.org. Accessed August 1, 2008.
[6] Survey Monkey. Available at: http://www.surveymonkey.com/. Accessed August 1, 2008.
[7] TrialStat. Available at: https://www.clinical-analytics.com. Accessed August 1, 2008.
[8] The Evidence for Policy and Practice Information and Co-ordinating Centre Reviewer. Available at: http://eppi.ioe.ac.uk/cms/Default.aspx?tabid=184. Accessed August 1, 2008.