Open access: a researcher’s perspective
Antonio Gasparrini
London School of Hygiene and Tropical Medicine, London, UK
Open Research and Data
Open Access Week
22 October 2012 - Birkbeck College, London
My background
Graduated in biology in Italy, then 4 years working as an epidemiologist in a cancer research center in Florence
MSc + postgraduate school (still in Italy) + PhD (in UK) in medical statistics
Worked at LSHTM for the last 5 years, mainly in statistical methodology and software development
My MRC fellowship
Awarded a Research Methodology fellowship from MRC (Dec 2011 – Nov 2014)
Project builds on my previous research
Success of this project critical for next funding application
Need to comply with the MRC regulations on open access
My budget for open access costs: 6000£ in total
Outline
Some points:
My perspective: as scientist and junior academic
Publishing: steps and costs
My publications as a case study
Open research: beyond publications
The scientist’s perspective
I favour a system which:
guarantees high-quality research
allows the independent assessment of research findings
ensures the dissemination of such findings
The academic’s perspective
I favour a system which:
covers the costs of my research
delivers a fast and effective peer-review process
provides tools for disseminating my work
Publishing a research paper: steps
Literature review
Drafting the manuscript
Choice of the journal and submission
Review and acceptance
Copyright agreement
Open access fee
Publication
Actors: the researcher, the institution, the research community, the funder, the journals/publishers
An efficient and fair system?
A first article
Published online in Statistics in Medicine (2012):
The choice of the journal
Copyright transferred
Open access fee ∼2250£
Impact factor 1.99
Submitted and published versions
Research Article
Statistics in Medicine
Received XXXX
(www.interscience.wiley.com) DOI: 10.1002/sim.0000
Multivariate meta-analysis for non-linear and other multi-parameter associations
A. Gasparrini a∗†, B. Armstrong b, M. G. Kenward a
In this paper we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure-response relationship between temperature and non-accidental mortality using time series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2011 John Wiley & Sons, Ltd.
Keywords: meta-analysis; multivariate analysis; multivariate meta-analysis; non-linear; splines
1. Introduction
Meta-analysis is a standard, well-grounded statistical procedure for combining the evidence from independent studies that address the same research hypothesis [1]. This methodology was developed originally for pooling the results from published observational or experimental studies, for which individual data were not available. Recently, meta-analysis has been described more broadly as a research synthesis method, with the aim of estimating an average association across studies and to explore the degree and sources of heterogeneity [2]. The analytical approach adopted in this context may be described as a two-stage hierarchical procedure: in the first stage, study-specific estimates of the association of interest are derived from individual data, controlling for individual-level covariates; in the second stage, these estimates are combined across studies, optionally exploring the association with study-level predictors. The two-stage approach, a specific form of individual patient data (IPD) meta-analysis, has been shown to be a flexible and computationally efficient method [3], and has been adopted in different contexts: to pool estimates from multiple randomized controlled trials [4]; to combine results from survival models on time-to-event data in multi-centre cohorts [5]; and to synthesize associations from Poisson time series models in multi-city analyses [6].
The common approach to two-stage meta-analysis consists of summarizing the association in a single parameter estimate from the first stage, optionally controlling for individual-level confounders. This procedure allows standard meta-analytic techniques to be applied. However, complex associations, such as non-linear exposure-responses, are usually described with functions defined by multiple parameters, and require more sophisticated meta-analytical approaches, capable of handling the multivariate nature of the summary estimates. Multivariate meta-analysis, a method originally
a Department of Medical Statistics, London School of Hygiene and Tropical Medicine
b Department of Social and Environmental Health Research, London School of Hygiene and Tropical Medicine
∗ Correspondence to: Antonio Gasparrini, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK.
† E-mail: [email protected]
Contract/grant sponsor: Medical Research Council (UK), grants G0701030 and G1002296
Statist. Med. 2011, 00 1–18 Copyright © 2011 John Wiley & Sons, Ltd.
Prepared using simauth.cls [Version: 2010/03/10 v3.00]
Research Article
Received 9 August 2011, Accepted 11 May 2012 Published online in Wiley Online Library
(wileyonlinelibrary.com) DOI: 10.1002/sim.5471
Multivariate meta-analysis for non-linear and other multi-parameter associations
A. Gasparrini a*†, B. Armstrong b and M. G. Kenward a
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd.
Keywords: meta-analysis; multivariate analysis; multivariate meta-analysis; non-linear; splines
1. Introduction
Meta-analysis is a standard, well-grounded statistical procedure for combining the evidence from independent studies that address the same research hypothesis [1]. This methodology was developed originally for pooling the results from published observational or experimental studies, for which individual data were not available. Recently, meta-analysis has been described more broadly as a research synthesis method, with the aim of estimating an average association across studies and to explore the degree and sources of heterogeneity [2]. The analytical approach adopted in this context may be described as a two-stage hierarchical procedure: in the first stage, study-specific estimates of the association of interest are derived from individual data, controlling for individual-level covariates; in the second stage, these estimates are combined across studies, optionally exploring the association with study-level predictors. The two-stage approach, a specific form of individual patient data (IPD) meta-analysis, has been shown to be a flexible and computationally efficient method [3] and has been adopted in different contexts: to pool estimates from multiple randomized controlled trials [4]; to combine results from survival models on time-to-event data in multi-centre cohorts [5]; and to synthesize associations from Poisson time-series models in multi-city analyses [6].
The common approach to two-stage meta-analysis consists of summarizing the association in a single parameter estimate from the first stage, optionally controlling for individual-level confounders. This procedure allows standard meta-analytic techniques to be applied. However, complex associations, such as non-linear exposure–responses, are usually described with functions defined by multiple parameters and require more sophisticated meta-analytical approaches capable of handling the multivariate nature of the summary estimates. Multivariate meta-analysis, a method originally developed to pool multiple
a Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, U.K.
b Department of Social and Environmental Health Research, London School of Hygiene and Tropical Medicine, London, U.K.
* Correspondence to: Antonio Gasparrini, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, U.K.
†E-mail: [email protected]
Copyright © 2012 John Wiley & Sons, Ltd. Statist. Med. 2012
A second article
First submitted to Biostatistics:
Copyright transferred
Open access fee ∼2250£
Impact factor 2.145
Rejected, re-submitted to BMC Med Res Method:
Copyright retained
Open access fee ∼1475£ (∼1255£ with LSHTM discount)
Impact factor 2.67
A third article
Published in Journal of Statistical Software (2011):
Not automatically indexed in PubMed
Included ’manually’ through PubMed Central
Copyright retained
Open access fee: 0£
Impact factor 4.01
Open research: beyond publishing
Open data: research data collected with public funding available to other researchers
Open source and free software
Reproducible research: open and thorough assessment of research findings
A similar case
Statistical software is mainly based on commercial programs (e.g. Stata, SAS, SPSS)
Substantial fees to be paid by research institutions
However, implementation of novel methodologies provided by researchers
Same story: researchers working (for free) for third parties...
An alternative model
An example: the R software
A project entirely based on a community of users and developers
Comparison with commercial programs
Model also applicable to publishing
The third article again
The manuscript is freely available on the journal’s website and other repositories
The code for the analysis is included as supplementary material
The software is implemented and fully documented in a free statistical package
The data are stored online and freely available through the software
All of this at no cost
The internet era
Different approach to search and dissemination: what role for journals?
Drop in editorial and publication costs: do we really need publishers?
Role of funders, institutions and research community is critical
Why so late?!
The open access era
Important changes: Wellcome and RCUK policies
Limitations of the Finch Report
Alternative models already available
Changes require a different approach from researchers