
ESRC UK Centre for Evidence Based Policy and Practice: Working Paper 7

Evidence based policy and the quality of evidence:

Rethinking peer review

Lesley Grayson

ESRC UK Centre for Evidence Based Policy and Practice

Queen Mary University of London

Email [email protected]

© March 2002: ESRC UK Centre for Evidence Based Policy and Practice

Lesley Grayson is Research Fellow (Information Access) at the ESRC UK Centre for Evidence Based Policy and Practice.


Abstract

Learning from past experience through the analysis of documented research is a key element of evidence based policy and practice (EBPP), and has prompted considerable debate about evidential quality. Much of this has focused on the relative merits of particular research methodologies and the rigour with which they are applied. There has been less concern about the effectiveness of the principal quality assurance mechanism applied to documented research – publication peer review. The example of evidence based medicine suggests that the advent of EBPP in social fields is likely to focus greater attention on this issue. This paper draws on the biomedical experience to identify peer review problems and possible solutions. It also argues that an effective publication peer review system is important not just as a means of assuring the quality of the published research record, but as a protective shield for social scientists as they engage with the potentially dangerous world of EBPP.

Key words: social sciences; medicine; research literature; peer review

Acknowledgements

The author would like to thank the following people for their comments on this Working Paper: Alan Gomersall, Bill Solesbury, Fiona Steele and Ken Young.

The purpose of the Working Paper series of the ESRC UK Centre for Evidence Based Policy and Practice is the early dissemination of outputs from Centre research and other activities. Some titles may subsequently appear in peer reviewed journals or other publications. In all cases, the views expressed are those of the author(s) and do not necessarily represent those of the ESRC.


Evidence based policy and the quality of evidence: rethinking peer review

Introduction

Learning from past experience is one of the central tenets of evidence based policy and practice (EBPP). It is encapsulated in the practice of reviewing and synthesising existing information – especially that derived from formal research – to identify useful knowledge. Particular emphasis is placed on the need to be systematic and objective in gathering evidence, extracting the most reputable results, and crystallising the lessons they provide. Accordingly, much effort is being expended in developing methods of literature searching, review and synthesis to ensure that evidence can be comprehensively identified and effectively analysed.

In the process there has been a great deal of debate on evidential quality, most of it related to the relative merits and demerits of particular research methodologies and the rigour with which they are applied. However, little or no attention has been paid to another quality issue which may be of equal importance to EBPP: the effectiveness of the quality assurance mechanisms applied to the raw material of systematic review, research manuscripts. Prime among these is publication peer review which is routinely seen as a guarantee of research quality. Peer reviewed literature is preferred by systematic reviewers and by many users of research, including policy makers, but is the confidence placed in it fully justified?

The experience of the medical community suggests that it is not. The spotlight directed at the medical literature by the advent of evidence based medicine has revealed serious deficiencies in the quality of the scientific record, and much of the blame has been attributed to the nature and operation of peer review. The rise of EBPP in social fields may well have a similar outcome, and the core of this paper summarises the medical experience to illustrate the problems and some possible ways forward. If successful, they will help to ensure that a high quality, peer reviewed research article is exactly what it claims to be.

However, effective quality control of the research literature is important not just as a means of assuring evidential quality. It could also play a vital role in protecting the social science research community as it enters unfamiliar EBPP territory. In these early days, closer engagement with the ‘real world’ of policy making and practice seems to offer all sorts of exciting opportunities, but experience from the natural sciences suggests that it also lays researchers open to unaccustomed challenges and dangers. Before moving on to the nuts and bolts of peer review, it is as well to explore what these might be.

Different realities

As the first Working Paper in this series pointed out, research results are not the only form of evidence that matters in the drive for EBPP1. There are many different kinds of evidence, and the ‘objective’ results of research are not always the most relevant to a given policy or practice issue. This has become painfully clear in recent years in respect of scientific evidence about risk which has, on occasion, proved disastrously inadequate as a basis for policy making and problem solving. Failed attempts to use such evidence to try and allay fears about BSE or persuade the public to accept GM crops are among the more obvious examples.

It is not that the scientific evidence about health or environmental risk is wrong or that the public is ignorant, although both may sometimes be the case. The main difficulty is that this kind of evidence simply doesn’t reflect the way the vast majority of people perceive and experience the world. This may be particularly true of ‘hard’ science which is often counter-intuitive2, but a similar phenomenon can also be observed in the context of research about social risks such as becoming a victim of crime3.

The figures may state unambiguously that older people are at the lowest risk of violent attack, or that children are at far greater risk of abuse in their own homes than at the hands of strangers. And yet many pensioners are too frightened to go out at night, and the lives of numberless children are circumscribed by their parents’ fear of ‘stranger danger’. Researchers who deal in ‘hard facts’ perhaps understandably tend to characterise this as irrational behaviour, to be addressed by better education or dissemination of evidence. But this is to miss the point. The realities of the lives led by ‘ordinary’ individuals are far more complex and subtle than the results of the most sophisticated analysis of people in the aggregate, encompassing moral, physical and emotional dimensions that mere figures can never measure. Fear of crime seems anything but irrational to the frail old lady contemplating the incivility of 21st century street life, or to the mother for whom no risk to her child is too small to ignore.

And for those who retain a sneaking belief in what the science policy community euphemistically calls the ‘deficit model’ of public understanding, it is worth remembering that we are all lay people in large parts of our lives. The researcher who knows all there is to know about the dangers of child abuse is no more likely to take an ‘objective’ view of the dangers of GM foods than anyone else of similar educational level and cast of mind.

Playing a dangerous game

Thus, in many situations, the development of effective evidence based policy or practice cannot rely solely on the unvarnished findings of research. Indeed, such an approach can be positively counter-productive given the gulf that exists between the ‘scientific’ and the ‘common sense’ view of the world, and the ever present risk that apparent certainties will be over-turned by events or the advance of knowledge. In the science policy field, the dangers are only too apparent in the decline of public confidence and trust not just in particular policies but in the science – and the scientists – that underpin them4. The ‘scientification of politics’, which bears witness to the presumed power of objective evidence to resolve society’s problems, paradoxically threatens to destroy the esteem in which that evidence is held by simultaneously promoting (or appearing to promote) the ‘politicisation of science’5. The threat both to science policy making and to science itself is such that restoring the public’s faith – or overcoming its sense of betrayal – is now a major concern of both government and the scientific community.


Most social scientists have yet to face this kind of onslaught, and some have done sterling work in warning their colleagues in the natural sciences of the pitfalls of assuming that what is self-evident to them will necessarily convince the rest of the world6. However, they may also need to keep the dangers in mind on their own account as they bask in the spotlight of government and research council enthusiasm for EBPP. The evidence they provide may, superficially, be nearer to the ‘common sense’ view of the world and thus more accessible to lay people who, as already noted, include not just the general public but most politicians as well as professionals outside their own areas of expertise. But, like the mysterious findings of the geneticists and the biotechnologists, it will also be viewed through a complex moral, emotional and political prism and could be found equally wanting as a guide to problem solving or innovation in the real world. The high ground of ‘objectivity’ occupied by research evidence can all too often turn out to be of little strategic value.

Social scientists are, of course, not unused to this outcome and there is a substantial literature on the many reasons why their efforts have so often had such an apparently limited impact. Marginalisation is perhaps their most usual fate7 but there are occasions on which researchers have come under direct attack for providing evidence that conflicts with a moral and/or political consensus. A striking recent example is the meta-analysis by Rind et al of research on the psychological correlates of child sexual abuse which challenged the assumption that such abuse is, by definition, harmful and traumatic8. Their study was condemned in Congress and by sections of the US media, disavowed by the American Psychological Association (which had published it), and referred to the American Association for the Advancement of Science (AAAS) for re-review.

It is not beyond the bounds of possibility that other researchers may find themselves in a similar position as social science evidence attains a higher public profile. While the EBPP movement offers greatly increased opportunities for new primary research, programme evaluation, systematic review, meta-analysis and the rest, the ‘social scientification of politics’ is not without risk and social scientists would be wise to anticipate problems. If possible, they should develop strategies for preventing or heading them off, and these could operate at many stages of the research cycle – for example, involving ‘ordinary’ people (as well as policy makers and practitioners) in setting the research agenda, designing and carrying out research, implementing and evaluating the results, and disseminating the wider lessons that have been learned. There have been initiatives by social scientists in all these areas, and the work already done on the problems faced by natural science should also be of help.

However, there will inevitably be occasions on which social scientists come under fire because their ‘realities’ differ from those of wider society, or particularly influential sections within it. At times like these, they need to be able to defend their corner.


Quality as a defence

The experience of Rind et al offers a clue to one potential defence strategy. They observe that their critics fell ‘notably silent’ after the AAAS judged both their methods and analysis to be without fault, and expressed ‘grave concerns’ about the way their study had been politicised and mis-represented. Researchers cannot avoid moral and political criticism of the evidence they bring to bear on society’s problems, but their ability to argue their case will be greatly strengthened if they can demonstrate the unimpeachable standing of their work within its own terms of reference.

Unfortunately, this is not always the case. Social scientists have long been criticised by elements of the natural science world for their supposed lack of methodological rigour, and the rise of EBPP has generated a fierce debate within social science itself about what constitutes ‘good’ evidence9. This has sometimes been characterised, rather crudely, as ‘RCTs vs. the rest’ but, in reality, a range of methodological approaches is accepted as both valid and appropriate to the goal of generating a cumulative body of knowledge. While the merits of a more ‘scientific’ approach to the study of social phenomena will undoubtedly continue to exercise minds, methodological pluralism remains the norm (indeed, the ‘softer’, qualitative methods used within social science are gaining increasing support within more ‘scientific’ areas like public health research10). Some judgements about the quality of social science research may well suggest that a different, more ‘scientific’, approach would have been preferable, but most will be concerned with how well the chosen methodology has been applied. All too often, the results can be less than flattering.

Education is one area in which the quality of research has been found wanting, most notably by David Hargreaves whose 1996 Teacher Training Agency (TTA) lecture triggered a vigorous exchange of professional views along ‘RCTs vs. the rest’ lines11. It also prompted the Department for Education and Employment to commission a review of published educational research from James Tooley and Doug Darby12. This included a detailed analysis of 41 articles published in four major UK education research journals. Most of the research reported was qualitative or non-empirical – and thus not of the type that Hargreaves and others of like mind feel to be the most valuable – but within its own terms of reference it also left much to be desired. Twenty-six of the 41 articles were judged to reflect poor research practice for reasons including:

• Methodological inadequacies, especially sampling bias and lack of triangulation. In addition, methodologies were rarely reported.

• Partisanship in the conduct and presentation of research.

• The introduction in non-empirical research of contentious propositions without acknowledgement of their controversy, and sometimes without logical coherence. There was also a tendency towards the ‘adulation’ of ‘great thinkers’.


The last bastion

How, one wonders, could this come about? These articles were published in prestigious peer reviewed journals, and thus subjected to one of the two principal methods of quality assurance that operate within academic research. Peer review is traditionally assumed to act alongside the verification of experimental results to produce a self-correcting system in which the inevitable human failures of researchers are detected and put right.

In reality, peer review stands alone in many areas of research where experimentation, let alone its verification, is inappropriate, impractical, or inadvisable on ethical grounds. For many social scientists, this still accounts for a good part of what they do. Even in the natural sciences, the exact replication of experiments may be relatively rare except where the results of original research are particularly momentous or anomalous. Repeating someone else’s work is, unsurprisingly, not an especially popular area of activity for ambitious scientists whose publication prospects, and thus career development, are heavily dependent on producing significant new results.

Peer review operates at both ends of the academic research process: in the selection of projects for funding, and in the quality control of the journal articles that document their findings. The rest of this paper is concerned with the latter process which traditionally acts as the last bastion against mediocrity, bias, error and deception – both conscious and unconscious – in research. A manuscript submitted for publication in a peer reviewed journal is referred to selected specialists in the field who offer advice on whether it should be published and what, if any, changes need to be made to bring it up to standard. It will not appear unless the changes are made to the satisfaction of both the editor and referees.

An effective system of journal peer review is of crucial importance to that part of the EBPP enterprise dedicated to identifying lessons from the existing knowledge base. Those involved in systematic review, whether narrative or quantitative, may well have a good general understanding of the subject areas with which they are dealing but will inevitably lack detailed knowledge of the many individual studies that need to be assessed. To at least some degree, they will need to take on trust that the studies identified as relevant to their purpose are also fit for their purpose as a result of careful peer review – preferably including the data and other material so often excluded from the final published versions for reasons of space.

In the social sciences this assurance is by no means always available. Systematic reviewers have first to contend with the fact that a smaller proportion of potentially relevant studies appears in peer reviewed sources than is the case with, say, medical research. Someone reviewing the literature on, for example, home help services for the elderly might reasonably expect to find, in addition to peer reviewed journal papers:


• Articles in practitioner journals.

• Grey literature including reports from local authorities, charities or community groups.

• Conference papers.

• Books, either as a primary publication medium or as an added-value secondary means of disseminating conference papers or journal articles.

• Government publications, for example a commissioned evaluation of a particular policy programme.

There may well be high quality information to be gleaned from sources like these, but their application of formal peer review is patchy to say the least. Systematic reviewers from an academic background may be tempted to downgrade them in favour of the peer reviewed journal literature which is, after all, rather easier to identify even with all the deficiencies of social science bibliographical databases. However, even here – as Tooley and Darby found with their sample of education research articles – quality may not be assured.

The medical condition

Advocates of EBPP in the social policy field are fond of using evidence based medicine as a model of what things ought to be like13 and there is, indeed, a lot to be learned about both the benefits and the pitfalls. Among the latter are the dangers of assuming that peer review guarantees quality, an issue that has greatly exercised the editors of the world’s leading medical journals over the last 15 to 20 years. It is no accident that this period coincides with the rise of evidence based medicine and the spotlight it has thrown on the quality of the biomedical literature. In 1986 Drummond Rennie, deputy editor of the Journal of the American Medical Association (JAMA) wrote:

There seems to be no study too fragmented, no hypothesis too trivial, no literature citation too biased or too egotistical, no design too warped, no methodology too bungled, no presentation of results too inaccurate, too obscure, and too contradictory, no analysis too self-serving, no argument too circular, no conclusions too trifling or too unjustified, and no grammar and syntax too offensive for a paper to end up in print14.

It is likely that much the same (perhaps even more) could be said of the peer reviewed social science journal literature, and for the same reason: the inexorable pressure to publish. It is arguable that researchers no longer publish predominantly to disseminate new knowledge or ideas – this function is increasingly carried out informally, either electronically or through face-to-face contact at conferences, or through working and discussion papers. Peer reviewed publication is now largely about establishing academic status, embellishing CVs and boosting the research ratings of employing departments. As more and more papers appear, the competition for inclusion in ‘high impact’ journals becomes ever more intense, fuelling yet more publishing because a key criterion used by peer reviewers is likely to be the author’s previous publication record. More and more journal titles appear, while the editorial and peer review systems of the most prestigious are in danger of burial under an avalanche of manuscripts.


Rennie’s memorable editorial signalled JAMA’s intention to launch a series of international congresses on peer review in biomedical publishing, the latest taking place in September 2001 in Barcelona15. From the start, the British Medical Journal (BMJ) was closely involved, reflecting the concern of its own editor at the time, Stephen Lock, about the deficiencies of the system16. One of the driving forces behind the congresses was the need for more reputable evidence about the operation of peer review and how it might be improved. A cross-disciplinary annotated bibliography on peer review compiled in the early 1990s by Bruce Speck included 780 entries, 643 relating to the publication system and many of them to do with biomedicine17. A subsequent analysis of ‘assertions’ contained in the annotations found that less than 40% were supported by empirical evidence and, in the case of recommendations about how to improve the peer review system, the figure fell to only 8%18.

Since the first congress in 1990, a research literature has grown up which looks at both the problems and the possible solutions in a more systematic way, although a Cochrane review presented at the fourth congress in 2001 argues that there is still only ‘a small and scattered amount of empirical evidence supporting the use of editorial peer review as a mechanism to ensure quality of biomedical research’19. However, this study is strictly limited to prospective or retrospective studies with two or more comparison groups generated by random or other ‘appropriate methods’. Another contribution to the fourth congress examines the literature more broadly, taking into account editorial comment and reviews as well as more methodologically rigorous research20. The Deputy Director of the Library of Health Sciences at the University of Illinois, Chicago, has also recently published a detailed and systematic study of research in this area21.

Briefly, the research literature on biomedical peer review suggests that it is, among other things:

1. Slow

This is an inevitable consequence of the sheer weight of paper landing on the desks of editors and generally unpaid reviewers, and of low levels of inter-reviewer agreement on whether manuscripts should be accepted22. Delays can be exacerbated by the need for revisions and re-review of papers that are basically of good quality but prepared under pressure and submitted in sub-standard form. The gap between submission of a manuscript and eventual publication can stretch to years in some core journals.

2. Expensive

Peer review consumes a vast amount of academic as well as editorial time, generating significant costs23. One study presented to the second congress argues that it is relatively rare for a manuscript rejected by one core journal to appear in another of the same standing – suggesting that peer review is in fact quite effective at weeding out sub-standard material24. However, unless withdrawn from publication altogether, articles that are rejected by their authors’ journal of first choice could occupy the time of editors and reviewers in several alternative core journals, and subsequently in ‘second tier’ publications, before they finally appear in print.


3. Prone to bias

Publication bias in favour of papers that report statistically significant results is a well recognised phenomenon within the biomedical literature25 although, to some extent, this might be less the result of intellectual prejudice on the part of editors and peer reviewers than a practical, if crude, response to the huge over-supply of manuscripts. Indeed the main problem may be that researchers who come up with non-significant results choose (or are pressured by sponsors) not to submit their findings for publication, rather than that editors actively discriminate against them. Whatever the causes, publication bias has serious implications for the overall balance of the scientific record and, of course, for those who conduct systematic reviews and meta-analyses.

Other forms of bias unrelated to the statistical significance of research results have also been alleged including bias on the grounds of age, gender, nationality, language, geographical location or institution. However, they are difficult to substantiate: ‘it is hard to show whether the basis of reviewers’ recommendations derives from what we might term “good biases” such as importance, originality, good design, and good reporting or not’26. For example, an apparent bias in favour of older authors may simply reflect the fact that they are more experienced and perform better research.

Perhaps the most fundamental charge of bias laid against peer review in biomedicine (and other disciplines27) is that of conservatism, or discrimination against innovatory, heretical or dissenting opinions. This is in line with a Kuhnian view of scientific progress in which knowledge is shaped in line with conventional wisdom until the discrepancies between ‘normal science’ and observed reality become too glaring to ignore28. While most authors will probably be engaged in ‘normal science’ rather than challenging conventional wisdom, there are those who may lose out because they are ahead of their time or trying to break the mould.

4. Open to abuse

Unconscious bias may sometimes shade into conscious abuse of the peer review system although proving it may be almost impossible29, and attempts can all too easily look like special pleading or witch hunting30. Theft of information or ideas by peer reviewers has been alleged, and anecdotal evidence abounds of the suppression of new ideas and the work of rivals (actual or potential), or the promotion of favoured colleagues and protégés. Equally, it should be said, authors are also sometimes accused of stealing the ideas of reviewers by incorporating their comments into manuscripts without acknowledgement.

Reviewer abuses are possible in part because, in many cases, the pool of potential reviewers for a given article is relatively small. Science is now so specialised that there may be only a few people able to give a truly informed opinion on a piece of research, and the granting of anonymity to authors is unlikely to blind reviewers to their identity. The same may well be true of the social sciences, at least in the UK. The ‘villages’ occupied by experts in particular subject specialisms are small, and everyone knows everyone else.

Editors can also abuse the peer review system. A presentation to the fourth congress on biomedical peer review argues that there may be an ‘unhealthy propensity’ towards the publication of articles that generate higher impact factors, as measured by the all-important Science Citation Index31. In other words, articles on ‘hot topics’ are more likely to be published than research of equal (or greater) merit on currently less fashionable subjects.

A recent example concerning the very hot topic of human cloning illustrates the point. An article by researchers from the US firm Advanced Cell Technology was published in the peer reviewed electronic journal e-biomed: The Journal of Regenerative Medicine in November 2001, announcing the cloning of the first ever human embryo32. It received major press coverage but aroused considerable scientific doubts, not least in the mind of a pioneer of human stem cell research who was a member of the journal’s editorial board alongside two of the paper’s authors. He subsequently resigned, alleging that the editor refused to tell him who had reviewed the paper. The truth may never be known, but there is at least a suspicion that the editor was less interested in publishing unimpeachable research than in boosting the impact factor of a relatively new journal.

5. Sometimes incompetent

The acknowledged poor quality of a significant proportion of the published medical literature certainly suggests incompetence on the part of peer reviewers. The highly specialised nature of modern science may be one explanation: ‘the Achilles’ heel of peer review is that referees usually know less than the author about the work under examination’33. However, sloppy practice is also likely to play a part. In one study, researchers at the BMJ took a paper about to be published in the journal, inserted eight deliberate ‘areas of weakness’, and sent it to 420 potential reviewers of whom 221 responded. The median number of errors detected was two, no-one found more than five, and 16% failed to spot any34.

There is some evidence to suggest that relative youth and a training in epidemiology and statistics are associated with better quality peer review performance in biomedicine, but most other reviewer characteristics seem to play no part except, paradoxically, membership of a journal editorial board which is associated with poorer performance35. In this context, it is worth noting that a majority of journals use editorial board members as reviewers.

6. Unable to detect fraud

The inability of peer review to guarantee the detection of even gross scientific frauds is particularly worrying to the medical community because it suggests that many more subtle – and potentially dangerous – forms of deception are passing unnoticed. Moreover, the ‘temptation to behave dishonestly is surely far greater now, when all too often the main reason for a piece of research seems to be to lengthen a researcher’s curriculum vitae’36. Although editors and reviewers have been successful in unmasking some of the cheats, there have been spectacular failures, for example the case of surgeon Malcolm Pearce in the mid-1990s.

Pearce claimed in the British Journal of Obstetrics and Gynaecology in 1994 that he had successfully transferred an embryo from a patient’s Fallopian tube to her uterus, and that a healthy baby was duly born. This was rapidly revealed to be a complete fabrication when colleagues at St George’s Hospital in London claimed to know nothing of any such operation. Pearce was duly sacked and struck off the medical register. The article had successfully negotiated clinical and statistical peer review, and ‘it never occurred to the referees that the whole thing might be a lie’. To add yet more layers of embarrassment, the editor of the journal was Pearce’s departmental head and President of the Royal College of Obstetricians and Gynaecologists, and had allowed his name to be appended to the article as an honorary co-author: ‘I rubber stamped this paper out of politeness and because he asked me to as head of the department’37.

Honorary authorship is yet another important strand in the debate about peer review and the quality of the biomedical literature, and it is not unknown in other disciplines, including the social sciences. Junior researchers may want a well known name alongside them to increase the chances of publication, while senior researchers have been known to impose themselves on the title pages of promising work by juniors in order to further inflate their own or their departments’ reputations. The latter practice has recently been aired in the pages of Nature38, with the subsequent debate migrating to The Times where it will no doubt do little to improve the public reputation of academia. Whatever the justification for honorary authorship, the end result is to make the task of judging the true quality of a piece of research even more difficult than it already is.

The medical prescription

All of this suggests that the peer review patient is in desperate straits, beset on all sides by insidious infections – some deliberately induced – that threaten to carry it off. There are some who argue that it should be put out of its misery and that advanced technologies offer a more effective approach in the form of electronic publishing and open peer commentary. The Los Alamos Physics Archive is usually quoted as evidence that this could be a viable alternative. However, there are few who would go the whole way and abolish peer review; even Stevan Harnad, one of the foremost proponents of electronic publishing and coiner of the term ‘scholarly skywriting’, has serious reservations39. He argues that the reason Los Alamos works so well is that virtually every pre-print lodged in the Archive is ultimately destined for a peer reviewed journal and is thus already of relatively good quality. Without this ‘invisible constraint’, it would inevitably degenerate into little more than a ‘free-for-all chat group’.

This certainly seems to be the view of most of the biomedical community who agree that ‘crude and understudied’ as it is, peer review is nonetheless ‘indispensable’40. It also has the firm support of ministers who routinely respond to allegations that the integrity of the scientific record is under threat by insisting baldly that ‘peer review works’. Both scientists and politicians may be driven to this conclusion less by real conviction than by fear of the implications should it turn out not to be true41. However, the former are at least well aware that, indispensable as it may be, the peer review system is in urgent need of remedial treatment.

1. Defining terms and setting standards
As a first step, it would help to know exactly what the term ‘peer review’ means. Schotland et al, in a presentation to the fourth international congress on biomedical peer review, note that policy makers tend to prefer evidence from peer reviewed sources42. No doubt this is primarily because they want the assurance of a quality label, although one suspects that they also value that label as a potential get-out clause in the event of things going wrong. Unfortunately, identifying what constitutes peer review for any given journal is not all that easy. Schotland et al looked at a cross-section of journals publishing articles on a scientific issue of interest to policy makers – passive smoking – and examined their printed or internet-based statements on peer review practices. These were categorised according to whether they were: a) extensive (an explicit statement of policy, including full details of how it operates); b) basic (an explicit statement only); or c) hints (no explicit statement, although peer review is implied).

Of the 278 journals identified, only 67% had a published peer review policy. Of these, 28% were extensive, 40% were basic and 32% offered hints (although not part of this study, the electronic journal noted above – e-biomed: The Journal of Regenerative Medicine – clearly falls into the last category). Of the whole sample, only 19% gave full and explicit information about their peer review policy and how it operated. Needless to say, the main conclusion of the study was that standard reporting requirements in this area should be adopted.
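
The relationship between these percentages can be checked with a little arithmetic. The short Python sketch below is illustrative only: it uses the rounded figures quoted above, so the journal counts it derives are approximations, and the variable and function names are hypothetical rather than anything taken from the study itself.

```python
# Rounded figures as quoted in the text; the derived counts are
# approximations because only percentages are reported.
total_journals = 278
share_with_policy = 0.67          # journals with a published peer review policy
share_extensive_of_policy = 0.28  # of those, policies classed as 'extensive'


def share_of_whole(share_with_policy: float, share_within_group: float) -> float:
    """Convert a share of a sub-group into a share of the whole sample."""
    return share_with_policy * share_within_group


extensive_share = share_of_whole(share_with_policy, share_extensive_of_policy)
approx_with_policy = round(total_journals * share_with_policy)
approx_extensive = round(total_journals * extensive_share)

print(f"Extensive policies as a share of all journals: {extensive_share:.0%}")
print(f"Approximate number of journals with any policy: {approx_with_policy}")
print(f"Approximate number with an extensive policy: {approx_extensive}")
```

On these assumptions, 28% of 67% comes to roughly 19% of the whole sample, which agrees with the figure reported for journals giving full and explicit information.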

The UK’s Committee on Publication Ethics (COPE) agrees. Established in 1997 by the key medical journals in response to the persistent failure of government and/or the scientific establishment to take action on scientific misconduct, it published a comprehensive set of Guidelines on Good Publication Practice in 1999 which include the recommendation that ‘journals should publish accurate descriptions of their peer review, selection and appeals procedures’43. So far, the Guidelines have been endorsed and re-published by 28 journals – a good start but a drop in the ocean of medical publishing as a whole.

Another initiative designed to identify practice and develop standards has been launched by the Association of Learned and Professional Society Publishers, the European Association of Science Editors, and the Academy of Learned Societies in the Social Sciences. In October 2000 a questionnaire about peer review was posted on the ALPSP website and attracted 200 responses, mainly from editors, executive editors or administrators responsible for peer review. The bulk, 127, were from the life sciences, medicine and veterinary medicine44. The results from this admittedly unscientific sample of scholarly publishing show, for example, that:

• The vast majority of peer reviewers are members of the journal’s editorial board or other contacts of the editor, with editors themselves frequently reviewing commissioned articles and review articles.

• Author concealment is practised by 40% of the respondents, and reviewer concealment by 88%.

• Nearly 80% of journals provide peer reviewers with a checklist covering, in addition to quality assessment, items such as illustrations, tables, references, statistics, language and ethical issues. ‘Relevance’, ‘importance’, ‘originality’ and ‘methodology employed’ seem to be given lesser priority although these could be subsumed under the general heading of quality assessment. General guidance on issues such as confidentiality, the avoidance of personal bias, and the provision of clear reasons for comments is also common.

• Only 15% of the sample provide feedback to reviewers on the quality of their reviews and 34% provide no feedback at all, not even on the publication outcome. Only just over half the respondents keep any kind of records about peer review quality although nearly 70% monitor peer reviewer response times.

• In nearly a third of cases, reviewers get no acknowledgement in the journal, let alone payment or other rewards in kind such as free offprints or a subscription. Perhaps not surprisingly, few keep to the 3-4 week deadlines generally imposed by editors.

A working group has been set up to create a set of best practice guidelines on the basis of this survey evidence. The report as published does not distinguish between disciplines, and gives no specific information on the 37 social science and education journals that participated in the survey. In one sense this is a pity, and further study of peer review practices in these disciplines would be both instructive and interesting. However, if ALPSP is correct and ‘there are many common elements’ between disciplines, at least some of what has been revealed about biomedical peer review will have relevance for the social sciences.

2. Tackling bias and incompetence
Accusations of reviewer bias – conscious or unconscious – have perhaps the greatest potential to damage the reputation of peer review as a quality assurance mechanism, and methods of tackling it in biomedicine were an early subject of methodologically rigorous research. Randomised studies have been presented to all four biomedical peer review congresses which attempt to test whether peer review works more fairly and effectively in the presence or in the absence of reviewer and/or author anonymity. Both outcomes are, of course, plausible. There have also been studies evaluating attempts to counter peer reviewer incompetence, including training and feedback provision, and the impact of editorial and peer review processes on readability. The results of all this are, so far, not particularly encouraging. The Cochrane systematic review presented at the fourth congress45 concludes, among other things, that:

• Reviewer concealment and/or author concealment have little effect on the quality of peer review.
This may seem a surprising finding but perhaps needs to be considered, at least in part, in the light of another study presented at this congress. This argues that author concealment often fails to work effectively in that highly suggestive clues still remain in many papers, allowing editors or reviewers to identify the authors of a significant proportion of apparently blinded manuscripts46. These include the citation of an author’s initials in the text, referencing work ‘in press’ or identifying references as the author’s earlier work, and revealing the identity of the author’s institutions in the figures. Even if the reviewer cannot detect another member of his or her ‘village’ by the style or content of a manuscript, there are often plenty of other clues to reveal who the author is.

If this is generally the case – and there have been other studies which have come to the same conclusion – it follows that author concealment has never, in fact, been properly tested. This conclusion has led at least one core journal to maintain its support for anonymity on the grounds that the strong a priori case in its favour has yet to be disproved47. The ALPSP survey above confirms that many other journals agree, especially in respect of reviewer anonymity.

• There is no evidence that the training of peer reviewers improves the quality of their performance.
Indeed, basic feedback to poorly performing peer reviewers may make things worse rather than better, no doubt because criticism without effective support serves merely to demoralise48. Unfortunately, providing more extensive feedback (for instance, enclosing an example of a highly rated review with the scored copy of the reviewer’s own inadequate effort) still appears to have no effect. Nor does more sophisticated training provision, according to one recent study, even though peer reviewers themselves expected it to be beneficial49. Nonetheless, there is at least one initiative in progress to develop ‘an expanded electronic textbook of peer review’ as a training manual for novice peer reviewers and a reference resource for those who are more experienced50.

• There is some limited evidence that editorial and peer review processes improve the quality of reporting and make papers more readable, but this is of limited generalisability.
Nor, of course, does it mean the same as improving the quality of the research. More conclusive evidence on this issue would clearly be of value in the context of the significant time and effort invested in bringing papers up to scratch; one line-by-line analysis of 22 papers published in the Annals of Internal Medicine reports a total of 1,358 substantive revisions by its editors, peer reviewers, statisticians and technical editors51. This may still not be enough according to another study which compares the quality of review articles published in five peer reviewed journals (including the Annals) and five ‘throwaway’ or non-peer reviewed practitioner journals52. It finds that the latter are lower in methodological quality but ‘better at communicating their messages’, and that peer reviewed articles are ‘judged to be significantly less relevant to clinical practice’.

3. Opening up the ‘black box’
Some journals are making changes to their peer review systems on grounds other than those provided by research evidence. Thus the BMJ is running an experiment in open online peer review which allows readers to comment on a selected pre-review manuscript, and posts their views on its website53. At the same time, the manuscript is peer reviewed in the ordinary way, with the reviewers’ comments and those of the editorial department also posted on the website. The process is not without danger to the authors who may find, if the BMJ rejects the manuscript (as is quite probable given the huge number of submissions it receives), that other journals will not consider it on the grounds that it has already been published. Thus, ‘at this early stage’, it will be restricted to authors who make a positive choice to participate and will not be applied to papers ‘that have substantial and direct implications for health care’.

To Richard Smith, BMJ editor, the ultimate argument for open review is ethical – ‘for putting authors and reviewers in equal positions and for increasing accountability’54. The decision to open up the ‘black box’ may well take research findings about the technical inadequacies of the peer review process into account but it is primarily based on principles of fairness and openness. And at a time when the reputation of biomedical research is being questioned, there may also be useful ‘political’ capital to be gained in demonstrating very publicly that the editorial and peer review system has nothing to hide.

Unfortunately, the swingeing criticisms that rained down on the authors of the first manuscript to undergo this ordeal seem to have stopped the experiment in its tracks. Their contribution on The metamorphosis of biomedical journals remains the only one available for comment on the BMJ website although the editorial committee made its decision in January 1999, taking into account ‘the large number of on-line peer review opinions’ it received55. The topic was ‘interesting and thought provoking’ but the article was ‘poorly written and badly presented’, its arguments ‘poorly referenced and largely unsubstantiated’, and it lacked the all-important methods section explaining how the evidence was gathered.

Other biomedical journals listed by the ESPERE (Electronic Submission and Peer Review) project take a rather more merciful line56. For example, eMJA (the electronic version of the Medical Journal of Australia) is experimenting with peer review as a password-controlled online discussion via the internet. Only if the manuscript is accepted for publication is it submitted to open review, and then only for a period of four weeks before the final version appears in print. This two stage approach capitalises on the much desired streamlining benefits to be gained from using the internet57, and potentially improves the quality of peer review by facilitating communication between editors, authors, reviewers and – ultimately – the wider scientific community. In the future, one might see this wider peer review process involving the consumers as well as producers of research, an approach that the Cochrane Collaboration clearly finds valuable in its world of secondary systematic review of the literature58.

Perhaps most importantly, the two stage process saves at least some of the blushes of authors whose grosser inadequacies will be revealed only to a limited online audience in the first instance. Despite their professed enthusiasm for the electronic route as a means of submitting manuscripts, the glare of completely open internet-based peer review may be more than most can bear. And unless the scientific community feels comfortable with changes to the traditional system, they are unlikely to succeed, however ethically desirable some editors may feel them to be. Stevan Harnad – whose online journal Psycoloquy puts into practice his concept of ‘scholarly skywriting’ – is reported as saying that ‘the best authors are still afraid to submit to Psycoloquy’ although it has now been running since 1990 and is indexed by PsycINFO and the Institute for Scientific Information59. Clearly, there is some way to go to devise a model of electronic peer review that satisfies all the parties involved.

4. Coming clean
In their drive to ensure the integrity and quality of the research record, medical journal editors have not restricted themselves solely to the reform of peer review. The best statistical or clinical peer reviewers in the world (who may well include those who judged Malcolm Pearce’s infamous manuscript) can find it impossible to detect deliberate deception or bias when, as is probable, the author has gone to great lengths to massage or fabricate plausible data and interpretation. In the not so distant past, scientists who engaged in such behaviour were routinely described as either mad or bad, the latter group including those with a financial interest in a particular research outcome. The increasingly complex inter-relationships between academia and industry, fuelled to a considerable degree by government policies in the advanced economies, led most core medical journals to introduce rules on the declaration of financial and other competing interests some years ago. These are spreading to other areas of scientific publishing including, in August 2001, Nature60.

But now there is a third category of deceiver: the more or less innocent victims of sponsors who usurp control over the research process and ultimate publication to ensure that the results are favourable to their cause. A BMJ editorial in September 2001 expressed the increasing concern of many medical journals that the authors of some manuscripts have very little say, if any, in the design, analysis and interpretation of the research in which they are involved61. In an attempt to counter this new threat to the integrity of the scientific record, all members of the International Committee of Medical Journal Editors now require contributing authors to disclose details of their own and funders’ roles in the research they report. They will be asked to sign a statement to the effect that they accept full responsibility for the conduct of the study, had full access to the data, and controlled the decision to publish. If they cannot give these assurances, their manuscripts – presumably regardless of apparent quality – will be rejected.

Only time will tell whether these new rules will have the desired effect, but recent evidence on existing conflict of interest policies raises some doubts62. An analysis of 1,396 highly rated scientific and biomedical journals found that just under 16% had such policies in 1997. However, in that year, less than 1% of articles in these journals contained any disclosure of personal financial interest, and nearly 60% of the journals contained no such disclosure. This suggests either that there is a low rate of financial interest in research on the part of authors, or that there is poor compliance with (or enforcement of) the rules. The analysts suspect the latter.

Changing the culture

The number of rigorously conducted studies of the peer review process is still small, in part because it is not particularly easy to gain funding for this kind of research within biomedicine where most of the activity is taking place. But is this money well spent? Is the peer review research community, again in Kuhnian terms, stuck in a traditional paradigm and trying to shore up a system that is approaching the end of its useful life? One recent contribution to the debate from the Cochrane stable does point out that, to date, researchers have failed to carry out any rigorous comparisons of peer review and other possible methods of selecting manuscripts and applying quality control63. This may be because researchers literally cannot conceive of alternatives, being themselves part of the system that has been governed and sustained by peer review for centuries. However, it is equally plausible that their insider knowledge tells them that the conditions for a paradigm shift are still some way off. Certainly, commitment to the principle of peer review seems extremely strong, and efforts may be better spent for the time being in trying to find ways of improving rather than replacing it.

J Scott Armstrong of the Wharton School, University of Pennsylvania, is one researcher to take this pragmatic approach, based on analysis of the empirical evidence (much of it, inevitably, biomedical) and his own long experience as an author, peer reviewer and journal editor in the field of marketing64. He strongly supports the concept of peer review but argues that its practical operation in his own discipline inhibits the publication of replicable, valid, useful and ‘surprising’ (or innovative) research – not a very comfortable finding for a social science community in the new world of EBPP. In his most recent paper65 he remarks that ‘usefulness is not common in academic research’ and places much of the blame on peer review, arguing that in order to get into print under the current system, ‘researchers should not’:

• examine important problems

• challenge existing beliefs

• obtain surprising results

• use simple methods

• provide disclosure, or

• write clearly

Moreover, some of the measures designed to improve the fairness and objectivity of peer review can make matters worse. For example, double-blinding inhibits informal communication between authors, reviewers and other contacts which can resolve misunderstandings, allow viewpoints to be discussed and – potentially – improve manuscripts. The imposition of strict rules of confidentiality on peer reviewers can have the same effect although, in the case of the natural sciences, there may sometimes be legitimate reasons of commercial sensitivity for a ban on wider discussion.

Armstrong’s proposals for improvement (not all of which, he admits, are strictly evidence based) encompass researchers, journals, business schools, funding agencies and professional bodies. The primary responsibility for producing useful and important work lies with researchers themselves – ‘if they produce nothing of value, all else is meaningless’. Tips include the ‘press release test’: assume that the proposed study is as successful as it could be and try writing an interesting press release explaining why the findings are important. If this is impossible, consider abandoning the project. For those that pass the test, researchers should routinely seek informal peer review before they submit a manuscript, canvassing opinion from as many relevant experts as possible in order to maximise the chances of delivering both useful and important findings to the editor’s desk.

However, expecting researchers to carry the entire burden of improving the quality of research is unrealistic. The vast majority are, understandably, primarily concerned with career advancement and few will resist the sometimes perverse incentives provided by the peer review system. This itself has to change, and Armstrong’s overall aim is to promote a shift in its culture. Rather than acting as a secret screening process, delivering judgements on whether manuscripts should be accepted or rejected, it should be perceived and used as a process for improving manuscripts and informing readers. Armstrong, unsurprisingly, is a keen advocate of electronic publication and open (including post-publication) peer review.

His ‘most important recommendation’ is that social science journal editors should publish everything they receive, albeit ‘only those sections of papers that cover what the authors discovered, how they discovered it, and why it is important’. Data and a full account of research methods would be posted on the internet. The editor could attach ratings to papers as a guide to readers, and reviews would be published alongside (in abbreviated form in the printed version, and in full on the internet). Authors whose papers were poorly reviewed could withdraw them, and Armstrong expects that ‘the number of useless publications would decrease if researchers received no credit for simply publishing, and if their work were subject to open peer review’.

This approach is clearly not compatible with traditional print publication although some social science journals do adopt part of the recipe by publishing ‘responses’ either immediately following articles, or in subsequent issues of the journal. However, these are responses to articles that have already successfully negotiated peer review, and – as in medicine – it is by no means certain whether social scientists would be willing to submit themselves to the ordeal of completely open peer review.

Other Armstrong recommendations are less radical and more compatible with purely print publication. For example:

• The decision about whether papers should be accepted or rejected should be entirely that of the editor, and reviewers should not be asked for publication recommendations. This sends a signal that the primary purpose of peer review is critical assessment and the suggestion of substantive improvements where these are needed.

• If journals insist on asking reviewers to make recommendations about publication, structured rating sheets can help focus their attention on the usefulness, importance or innovatory nature of research results. For example, in the case of methodological articles they might be prompted to reject manuscripts that simply present a new method (a common occurrence in the social science literature) and accept only those which compare the method with alternative ways of tackling a problem and provide empirical evidence of why, and in what circumstances, it is superior.

• In order to aid replication and allow judgements about the conditions under which the findings hold, all authors should be required to disclose full details of methods, data and sources of funding.

Getting back to basics

Armstrong, and many of those researching peer review in the biomedical field, concentrate on systems issues but others have argued that the focus of research needs to move even further back, into the minds of peer reviewers themselves. For example, Australian researcher Malcolm Atkinson suggests that the scientific community has failed to discover much that is really useful about the complexities of peer review through deduction from ‘painstaking statistical surveys’66. There have certainly been plenty of these; the analysis of the annotated bibliography by Speck, noted earlier, identified survey evidence as the source of 239 ‘assertions’ as against only 55 from case histories and 12 from experiments67. There have been more experiments in recent years, but Atkinson favours inductive reasoning based on the detailed examination of single events. This approach may seem less objective but could enable ‘more searching questions’ to be asked about peer review which, like all processes of human judgement, is ‘a complex and chancy business subject to all sorts of distractions’.

Jerome Kassirer and Edward Campion, who judged peer review to be ‘crude but indispensable’68, are blunter, remarking (admittedly now some seven years ago) that ‘studies of peer review and debate about it have focused on everything but its most important aspect – the cognitive task of the reviewer in assessing a manuscript’. In their view, the fundamental research question is: how do reviewers think and make decisions? Until we have more evidence about what peer review is, it will be impossible to design effective ways of improving its practice and, in turn, the quality of the published research record.

The closer study of peer review as a complex cognitive and social phenomenon is something which the social science research community might consider undertaking. And at the more practical level, some of Armstrong’s suggestions for social science journals could also be worth the kind of rigorous examination that has been undertaken in biomedicine. For example, as editor of the International Journal of Forecasting, he has the ‘impression’ that asking reviewers to refrain from making publication recommendations leads them to make more substantive suggestions for improving papers. It could be instructive to investigate this impression more thoroughly, or to examine whether structured reviewer guidance really can lead to the publication of more useful, important and innovatory research. Such projects might fit well with a recent Cochrane recommendation that the peer review research community needs to shift its attention away from the quality of peer review itself and towards its effects on the ‘usefulness, relevance, methodological soundness, ethical soundness, completeness and accuracy of published research reports’69.

Social scientists beware!

The case for social scientists to become more aware of, and to address, issues of publication peer review has been greatly strengthened by the advent of EBPP. Effective peer review is needed if evidence is to be quality assured and fit for its purpose, but more is involved than this. It can also provide a shield against moral, emotional or political attack from the wider world by helping to ensure that research evidence can be vigorously defended within its own terms of reference. And in the process, it may help to protect the producers of that evidence from some of the risks inherent in a closer relationship with its users.

The dangers of the explicitly evidence based world are already apparent in the natural, especially life, sciences where linking research more closely to innovation and the solution of society’s problems has led to widespread accusations that science has become politicised. The sense of betrayal among a public traditionally used to thinking of science as the disinterested pursuit of truth has been heightened by the increasing involvement of external funders whose own agendas sometimes conflict with those of objective research. Social science is now moving further into this world, with government as the dominant consumer and provider of funding. Some observers already detect ‘a blurring, or even erasure, of the distinction between research contracted by government departments and that funded through the universities and by ESRC’70, and worry that social scientists will be forced into a narrow ‘what works’ mould that could prejudice innovatory, blue skies thinking.

The risks to the wider public reputation of social science do not seem to have been much considered, perhaps because they seem less acute than in the life sciences. It is unlikely that many ordinary people see social science research as the disinterested pursuit of truth, so there is less of a pedestal from which to fall. However, the issues with which social scientists deal – crime, education, race, social exclusion and so on – are just as productive of strong feeling as the new genetics, so the landing could be equally painful. And to an outsider, the pedestal might already look a little shaky.

EBPP has been promulgated by government as a partnership with the social science community but, even in the rhetoric of ministers, it is not a genuine partnership of equals. Increasing numbers of social scientists are engaged on commissioned research for government departments, and the possibility of influence or interference should not be discounted. This may be extremely subtle – a word here, a hint there that this is the favoured line – and difficult to resist or even recognise by researchers who pride themselves on having a ‘special relationship’ with a particular department. Just as pharmaceutical companies have, on occasion, been willing to exert pressure on academic researchers in the interests of competitive advantage, so too may government departments on what seem to them equally justifiable grounds. Even if this is not so, the fact that it might be could be enough to trigger disquiet in a population whose capacity for cynicism about the doings of both government and researchers has increased markedly in the last few years.

If social scientists engaged in EBPP activities are to minimise the risk of becoming politicised, and thus tainted, in the eyes of the wider world, their intellectual independence and probity need to be demonstrated beyond doubt. Distinguishing between high and low quality research is doubtless a difficult process, but it is not uniquely difficult even though the agonising of the biomedical research community about peer review may give the impression that it is71. Clear statements about what peer review is intended to do, transparent procedures, a requirement that decisions must be justified, monitoring of outcomes to identify possible bias – none of these seems intrinsically impossible. Nor is the full disclosure of methods, data and sources of funding to provide the peer reviewer with essential background and contextual information. All it needs is the recognition that a problem exists, and the will to do something about it.

References

1. Solesbury, W (2001) Evidence based policy: whence it came and where it’s going ESRC UK Centre for Evidence Based Policy and Practice: London. 10pp (Working Paper 1). Available via: http://www.evidencenetwork.org

2. Wolpert, L (1992) The unnatural nature of science Faber and Faber: London. 191pp

3. Kemshall, H (1997) Sleep safely: crime risks may be smaller than you think Social Policy and Administration Sep 31(3) pp247-59

4. House of Lords Select Committee on Science and Technology (2000) Science and society: 3rd report, session 1999-2000 Stationery Office: London. 2 vols. HL 1999-2000 38/38-I. Available via: http://www.publications.parliament.uk/pa/ld199900/ldselect/ldsctech/38/3801.htm

5. For discussion of this paradox, see: Mentzel, M; Rutgers, M (eds) (1999) Special issue on scientific expertise and political accountability Science and Public Policy Jun 26(3) Complete issue

6. For example: Grove-White, R et al (1997) Uncertain world: genetically modified organisms, food and public attitudes in Britain Sussex University (formerly Lancaster University) ESRC Centre for the Study of Environmental Change: Brighton. 64pp [At the height of the GM furore in late 1999 the then Chief Scientific Adviser, Sir Robert May, reportedly said that he wished he had read this study because it was ‘remarkably prescient’.]

7. For example: Reuter, P (2001) Editorial: why does research have so little impacton American drug policy? Addiction 96(3) pp73-76

8. Rind, B et al (2001) The condemned meta-analysis on child sexual abuse: goodscience and long-overdue skepticism Skeptical Inquirer Jul/Aug 25(4) pp68-72

9. For a thought-provoking contribution, see Oakley, A (2000) Experiments inknowing: gender and method in the social sciences Polity Press: Cambridge.402pp

10. Green, J; Tones, K (1999) Towards a secure evidence base for health promotionJournal of Public Health Medicine Jun 21(2) pp133-39

11. Key early contributions to the debate included:Hammersley, M (1997) Educational research and teaching: a response to DavidHargreaves’ TTA lecture British Educational Research Journal 23(2) pp141-61;followed by:Hargreaves, D H (1997) In defence of research for evidence-based teaching: arejoinder to Martyn Hammersley British Educational Research Journal 23(4)pp405-19

12. Tooley, J; Darby, D (1998) Education research: a critique: a survey ofpublished educational research OFSTED, Alexandra House, 33 Kingsway,London WC2B 6SE. 82pp

13. For example: Smith, A F M (1996) Mad cows and ecstasy: chance and choice inan evidence-based society Journal of the Royal Statistical Society A 159(3)pp367-83

14. Rennie, D (1986) Guarding the guardians: a conference on editorial peer reviewJournal of the American Medical Association 7 Nov 256(27) pp2391-92

15. Abstracts and articles from the first three congresses appear in Journal of the American Medical Association, 9 Mar 1990 263(10) pp1317-1441; and ‘theme issues’ published on 13 Jul 1994 272(2), and 15 Jul 1998 280(3)

16. Lock, S (1985) A difficult balance: editorial peer review in medicine Nuffield Provincial Hospitals Trust: London. 172pp (reprinted in 1991 by BMJ Publishing Group)

17. Speck, Bruce W (1993) Publication peer review: an annotated bibliography Greenwood Press: Westport, CT. 277pp

18. Stamps, A E (1997) Advances in peer review research: an introduction Science and Engineering Ethics Jan 3(1) pp3-10

19. Alderson, P A; Davidoff, F; Jefferson, T et al (2001) Editorial peer review for improving the quality of reports of biomedical studies: a Cochrane Review [c]

20. Curran, C F (2001) Journal manuscript peer review: research and controversy [c]

21. Weller, A C (2001) Editorial peer review: its strengths and weaknesses Information Today: Medford, NJ, Jun. 360pp [Based on a systematic review of all studies identified from searches of 19 major databases]

22. Cicchetti, D V (1997) Referees, editors, and publication practices: improving the reliability and usefulness of the peer review system Science and Engineering Ethics Jan 3(1) pp51-62 [Rejection rates of 66% are typical of biomedicine but not of the physical sciences where 66% acceptance is the norm]

23. Donovan, B (1998) The truth about peer review Learned Publishing Jul 11(3) pp179-84 [the costs presented here range from £60 to £400 per accepted article, although the sample is small and unscientific]. Also available via: http://www.bodley.ox.ac.uk/icsu/donovanppr.htm

24. Abby, M; Massey, M D; Galandiuk, S et al (1994) Peer review is an effective screening process to evaluate medical manuscripts Journal of the American Medical Association 13 Jul 272(2) pp105-07 [a]

25. Song, F; Eastwood, A; Gilbody, S et al (2000) Publication and related biases National Co-ordinating Centre for Health Technology Assessment: Southampton. 115pp (Health Technology Assessment Vol. 4 No. 10). Available via: http://www.hta.nhsweb.nhs.uk/htapubs.htm

26. van Rooyen, S (2001) The evaluation of peer-review quality Learned Publishing Apr 14(2) pp85-91

27. For example, see: Fölster, S (1995) The perils of peer review in economics and other sciences Journal of Evolutionary Economics 5(1) pp43-57

28. Kuhn, T S (1970) The structure of scientific revolutions: 2nd enlarged edition University of Chicago Press: Chicago, IL. 210pp (International Encyclopaedia of Unified Science, Vol. 2 No. 2)

29. For some cases of alleged suppression of dissent, see the site maintained by Brian Martin (Fund for Intellectual Dissent, Box U129 Wollongong University, Wollongong NSW 2500, Australia) at: http://www.uow.edu.au/arts/sts/bmartin/

30. For example: Moran, G (1998) Silencing scientists and scholars in other fields: power, paradigm controls, peer review and other scholarly communications Ablex Publishing Corporation: Greenwich, CT. 187pp [The somewhat polemical tone of this book probably derives from Moran’s experience, when living as an ‘independent scholar’ in Italy, of trying to challenge the provenance of a well known Renaissance painting. For more on this, see Moran, G (1998) Peer review and academic paradigms: referees and information ethics Journal of Information Ethics Fall 7(2) pp19-29]

31. Hart, M; Healy, E S (2001) Influence of impact factors on scientific publication [c]

32. Press notice on the human cloning research at http://www.advancedcell.com/pr_11-25-2001.html with links to the research paper. Brief details about concerns at the results at http://www.chemind.org/news/issue23/story1.html and http://www.chemind.org/news/issue24/story5.html

33. Henderson, A (2001) Undermining peer review Society Jan/Feb 38(2) pp47-54

34. Godlee, F; Gale, C R; Martyn, C N (1998) Effect on the quality of peer review of blinding reviewers and asking them to sign their reports: a randomized controlled trial Journal of the American Medical Association 15 Jul 280(3) pp237-40 [b]

35. Black, N; van Rooyen, S; Godlee, F et al (1998) What makes a good reviewer and a good review for a general medical journal? Journal of the American Medical Association 15 Jul 280(3) pp231-33 [b]

36. Altman, D G (1994) Editorial: the scandal of poor medical research British Medical Journal 29 Jan 308(6924) pp283-84. Available at: http://www.bmj.com/cgi/content/full/308/6924/283

37. For discussion of scientific fraud, including issues relating to the publication system, see: Grayson, L (1995) Scientific deception: an overview and guide to the literature of misconduct and fraud in scientific research British Library: London. 107pp; and Grayson, L (1997) Scientific deception: an update British Library: London. 72pp

38. Harnad, S (2000) The invisible hand of peer review Exploit Interactive Apr (5) 6pp. Available at: http://www.exploit-lib.org/issue5/peer-review/

39. Lawrence, P A (2002) Rank injustice: the misallocation of credit is endemic in science Nature 21 Feb 415(6874) pp835-36 + reader responses, p819

40. Kassirer, J P; Campion, E W (1994) Peer review: crude and understudied, but indispensable Journal of the American Medical Association 13 Jul 272(2) pp96-97 [a]

41. See Ref. 33

42. Schotland, M; Van Scoyoc, E; Bero, L (2001) Published peer review policies: determining journal peer review status from a non-expert perspective [c]

43. The Guidelines were published as part of COPE’s 1999 annual report which is available via: http://www.publicationethics.org.uk

44. Association of Learned and Professional Society Publishers et al (2000) ALPSP/EASE peer review survey ALPSP, South House, The Street, Worthing, West Sussex B13 2UU. 15pp. Available via: http://www.alpsp.org.uk/pub4.htm

45. See Ref. 19

46. Olmsted, W W; Proto, A V; Katz, D S (2001) Unblinding by authors: incidence and nature at two medical journals with double blind peer review policies [c]

47. Davidoff, F (1998) Editorial: masking, blinding, and peer review: the blind leading the blinded Annals of Internal Medicine 1 Jan 128(1) pp66-68. Available at: http://www.acponline.org/journals/annals/01jan98/masking.htm

48. Callahan, M L; Knopp, R K; Gallagher, E J (2001) Effect of written feedback by editors on quality of reviews: two randomized trials [c]

49. Callahan, M L; Schriger, D L (2001) Effect of voluntary attendance at a formal training session on subsequent performance of journal peer reviewers [c]

50. Cooper, R J; Callahan, M L; Schriger, D L (2001) Structured training resources for scientific peer reviewers [c]

51. Purcell, G P; Donovan, S L; Davidoff, F (2001) Changes in manuscripts and quality: the contribution of peer review [c]

52. Rochon, P A; Bero, L A; Bay, A M et al (2001) Comparing the quality of review articles published in peer-reviewed and throwaway articles [c]

53. Brief details at: http://www.bmj.com/cgi/shtml/misc/peer/index.shtml

54. Smith, R (1997) Editorial: peer review: reform or revolution? British Medical Journal 27 Sep 315(7111) pp759-60. Available at: http://www.bmj.com/cgi/content/full/315/7111/759

55. Report from BMJ’s full editorial committee British Medical Journal 19 Jan 1999 317(7162) Available at: http://www.bmj.com/cgi/content/full/317/7162/0/z/DC2

56. The ESPERE project is investigating the technical and cultural issues of peer review in an electronic environment, focusing initially on biomedicine. Its website is at http://www.espere.org/online.htm

57. Wood, D J (1998) Peer review and the Web: the implications of electronic peer review for biomedical authors, referees and learned society publishers Journal of Documentation Mar 54(2) pp173-97

58. Middleton, P; Clarke, M; Alderson, P et al (2001) Peer review: are there other ways? [c]

59. Quoted in: Weller, A C (2000) Emerging models of editorial peer review in the electronic publication environment. Paper given at the 8th International Conference on Medical Librarianship, London, 2-5 Jul 2000. 6pp. Available at: http://www.icml.org/Sunday/publishing/weller.htm

60. Campbell, P (2001) Introducing a new policy for authors of research papers in Nature journals Nature, 23 Aug 412(6849) p751

61. Smith, R (2001) Maintaining the integrity of the scientific record British Medical Journal, 15 Sep 323(7313) p588. Available at: http://www.bmj.com/cgi/content/full/323/7313/588

62. Krimsky, S; Rothenburg, L S (2001) Conflict of interest policies in scientific and medical journals: editorial practices and author disclosures Science and Engineering Ethics Apr 7(2) pp205-18

63. Jefferson, T; Wager, E; Davidoff, F (2001) Measuring the quality of editorial peer review [c]

64. Armstrong, J S (1997) Peer review for journals: evidence on quality control, fairness, and innovation Science and Engineering Ethics Jan 3(1) pp63-84. Also available at: http://knowledge.wharton.upenn.edu/PDFs/648.pdf

65. Armstrong, J S (2001) Discovery and communication of important marketing findings: evidence and proposals Wharton School, University of Pennsylvania: Philadelphia, PA, Oct. 31pp. Available at: http://www-marketing.wharton.upenn.edu/ideas/pdf/01-011.pdf

66. Atkinson, M (2001) ‘Peer review’ culture Science and Engineering Ethics Apr 7(2) pp193-204

67. See Ref. 18

68. See Ref. 40

69. See Ref. 58

70. Hammersley, M (2000) The sky is never blue for modernisers: the threat posed by David Blunkett’s offer of ‘partnership’ to social sciences Research Intelligence (72) pp12-14. Available via: http://www.bera.ac.uk/ri/no72/ri72hammers.html Blunkett’s offer, made in a speech on ‘Influence or irrelevance’ to the ESRC in February 2000, is available via: http://www.bera.ac.uk/ri/no71/index.html

71. Solesbury, W (1999) Worlds apart Research Fortnight 10 Mar p3 [letter]

[a, b] Full text available via: http://www.ama-assn.org/public/peer/peerhome.htm

[c] Presentation at the 4th International Congress on Peer Review in the Biomedical Sciences, Barcelona, September 2001. Abstract available via: http://www.ama-assn.org/public/peer/peerhome.htm

All web addresses correct as of 27 February 2002.
