
Strategic assessment of research performance indicators

    - an ARC Linkage Project

    PROJECT PERSONNEL

    Researchers

• Ms Linda Butler, REPP, RSSS, Australian National University
• Dr Grit Laudel, REPP, RSSS, Australian National University
• Dr Claire Donovan, REPP, RSSS, Australian National University
• Prof Frank Jackson, Philosophy Program, RSSS, Australian National University
• Prof David Siddle, DVC Research, University of Queensland
• Mr Ian Lucas, Research Policy Unit, Higher Education Group, DEST

    Research Assistants

• Ms Anne Hill, REPP, RSSS, Australian National University
• Dr Kumara Henadeerage, REPP, RSSS, Australian National University

THE INCREASING IMPORTANCE OF QUANTITATIVE INDICATORS OF RESEARCH PERFORMANCE


In most OECD countries, increasing emphasis is being placed on greater public accountability, with a need to demonstrate the effectiveness and efficiency of government-supported research. A workshop held by the OECD in 1997 characterised recent evaluation of basic research as a rapid growth industry (OECD 1997). This new demand for research evaluation cannot be fully catered for by traditional peer review, which has only a finite capacity: researchers can devote only a limited proportion of their time to peer evaluation before their own work begins to suffer. As a result, there has been increased use of quantitative performance indicators, which have the added advantage of being more cost efficient. Peer review at the institutional or system level is very expensive; it has been estimated that the British Research Assessment Exercise (RAE) expends 10% of available research funding on the process (Schnitzer & Kazemzadeh 1995).

Australia is no exception to this changing policy environment. The first major study on performance measures for universities reported little systematic use of research indicators at the level of department or institution (Bourke 1985). By 1991, the use of a wide range of performance indicators was being proposed, including the establishment of a research publications collection (Linke et al. 1991). The introduction of the Research Quantum (RQ) in the early 1990s saw the first distribution of research funds to universities based on a formula encapsulating a number of performance measures (graduate student numbers or completion rates, research income, and publications). The RQ was subsequently replaced by two new funding schemes, the Institutional Grants Scheme (IGS) and the Research Training Scheme (RTS). Both use a formula comprising the same three elements, though the weighting for each element varies between schemes. In 1996, a study undertaken on behalf of the National Board of Employment, Education and Training reported that most universities used some research performance measures as a basis for the distribution of a portion of their research funds within the institution.
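To make the mechanics of such formulas concrete, the following sketch apportions a funding pool using the three elements just described. The weights, pool size and institutional shares are illustrative assumptions only, not the actual RQ, IGS or RTS parameters.

    # A composite index built from the three formula elements described above.
    # All weights and figures are hypothetical; the real schemes' weightings
    # differ and have changed over time.
    def composite_index(income_share, publications_share, completions_share,
                        w_income=0.6, w_publications=0.1, w_completions=0.3):
        """Combine an institution's shares of sector-wide research income,
        weighted publications and research-student completions."""
        return (w_income * income_share
                + w_publications * publications_share
                + w_completions * completions_share)

    pool = 100_000_000  # hypothetical funding pool (AUD)
    shares = {
        # institution: (income share, publications share, completions share)
        "University A": (0.30, 0.20, 0.25),
        "University B": (0.10, 0.15, 0.12),
    }
    index = {u: composite_index(*s) for u, s in shares.items()}
    total = sum(index.values())
    for u, i in index.items():
        print(f"{u}: ${pool * i / total:,.0f}")

Because the index is a weighted sum of shares, shifting weight between elements redistributes funds between institutions without any change in their underlying performance, which is one reason the choice of weights is itself contested.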

The application of quantitative performance indicators, be it in funding formulas or elsewhere, is not without problems. An indication of this can be seen in the public discussions and opposing views on what is actually measured by performance indicators and how they affect the research enterprise. For example, the publications element in the RQ/IGS/RTS formulas attracted a considerable amount of attention in submissions to recent government enquiries (White Paper 1999, Senate 2001, Batterham 2000). A proposal to remove the publications component from the formula was vigorously opposed by those institutions which believed they would be financially disadvantaged. Another source of tension within the sector is the internal use of variations of the RQ formula by many institutions to distribute research funds to faculties or even to individuals (Anderson et al. 1996, Marginson & Considine 2000). This has occurred in spite of the fact that the formula was never designed for intra-institutional funding allocations (Strand 1998).


It is essential that, where the deployment of performance measures in the higher education sector has a substantive impact, the choice of measures rests on a sound knowledge base covering their validity, fairness, transparency, independence, cost, and the impact they will have on the research enterprise. By assessing an extensive range of performance measures, this study will provide Australian research management and research policy makers with rigorous information on which to base informed judgements on the utilisation of quantitative indicators of research performance.

CRITICAL ASSESSMENT OF PERFORMANCE INDICATORS

The assessment of research performance on the basis of quantitative indicators must meet a number of requirements, of which three are paramount.

Firstly, it must accurately measure the characteristics which have been selected as the basis for the assessment. Secondly, it must be just: it must not create disadvantages for institutions, or for researchers working in specific fields, for reasons which cannot be influenced by them. And finally, its effects on research must be consistent with policy objectives. It may be impossible for every performance measure to satisfy each requirement, and trade-offs will frequently have to be made (Sizer 1998).

Quantitative indicators of research performance are used in two different contexts. They are increasingly applied systematically to rank the performance of institutions, groups or individuals, or to feed into formulas used to distribute research funds. Despite the rapid development of this use of quantitative measures, no thorough critical assessment of the indicators used has been undertaken in Australia or elsewhere (OECD 1997). Criticisms of the RQ by Anderson et al. (1996) and Marginson & Considine (2000) remain on a descriptive level and are not backed up by empirical research.

A recent investigation of Australia's scientific output gives rise to concerns about the continued use of formulas in their existing forms (Butler 2001, 2002). It documents a significant increase in the country's journal output, accompanied by a worrying decrease in the relative international impact of these publications as measured by citations. The timing of this productivity increase in relation to the introduction of funding formulas suggests a causal relationship. This interpretation is further supported by micro-sociological studies of researchers' adaptive behaviour (Knorr-Cetina 1981), anecdotal evidence from an Australian study (Marginson & Considine 2000), and the results of an extensive survey of Australian academics (Taylor 2001a, b). However, none of these studies provides conclusive evidence of the RQ's impact on research practice. Moreover, the indicators applied in this formula have not been investigated separately and can therefore not be assessed.

Quantitative indicators are also being applied in ad hoc evaluations. In this context, science studies have focused on the application of bibliometric indicators for the evaluation of research performance.


A number of problems have been identified where such evaluations rest on a single indicator. Using data from the Institute for Scientific Information (ISI), sociologists of science have demonstrated that publication counts are a poor measure of quality. Contrasting publication and citation counts in an analysis of American physicists, Cole and Cole (1967) classified scientists into four different types according to their publication practices. They found that publication counts tend to overrate mass producers (high publication counts, low citation counts) while underrating perfectionists (low publication counts, high citation counts). Later analyses also showed that it is misleading to apply simple publication counts across fields, because publication practices (and thus average numbers of publications per year) are field-specific (Moed et al. 1985a).
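As a minimal sketch of this style of analysis, the following classifies researchers into the four types by comparing each against median publication and citation counts. The thresholds, labels and data are illustrative, not a reconstruction of Cole and Cole's exact method.

    from statistics import median

    def classify(counts):
        """Assign each researcher a publication/citation type via median splits."""
        pub_med = median(c["pubs"] for c in counts.values())
        cit_med = median(c["cites"] for c in counts.values())
        types = {}
        for name, c in counts.items():
            high_pubs = c["pubs"] >= pub_med
            high_cites = c["cites"] >= cit_med
            if high_pubs and high_cites:
                types[name] = "prolific"
            elif high_pubs:
                types[name] = "mass producer"   # overrated by publication counts
            elif high_cites:
                types[name] = "perfectionist"   # underrated by publication counts
            else:
                types[name] = "silent"
        return types

    print(classify({"A": {"pubs": 40, "cites": 30},
                    "B": {"pubs": 35, "cites": 900},
                    "C": {"pubs": 5, "cites": 850},
                    "D": {"pubs": 4, "cites": 20}}))

A simple publication count would rank researcher A above researcher C, even though C's work is far more cited; this is the distortion the typology exposes.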

The use of ISI's journal impact factors for evaluation has been heavily criticised: the measure is easy to use, but the calculation method employed has serious flaws (e.g. Glänzel and Moed 2002). Recent work in scientometrics has focused on the application of advanced bibliometric indicators, such as citation counts that have been normalised by field, the journals used, or other characteristics (e.g. Schubert 1988). However, the validity of these indicators has only been assessed against the peer review process of evaluation (e.g. van Raan 1996, Rinia et al. 1998). No comparative analysis of the application of those indicators in repeated evaluations or in funding mechanisms has been undertaken. The one exception is the recent development of a funding allocation model at the Delft University of Technology by researchers from the Centre for Science and Technology Studies (CWTS) at the University of Leiden (van Leeuwen and Moed 2002).
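For illustration, the sketch below computes one simple indicator of this kind: each paper's citation count is divided by a world baseline for its field, so that a score of 1.0 means at world average for the field. The baseline figures are invented for the example; in practice they are derived from the full citation database.

    # Hypothetical world-average citation rates per field (illustrative only).
    field_baseline = {"physics": 12.4, "history": 1.8}

    papers = [
        {"field": "physics", "cites": 25},
        {"field": "physics", "cites": 4},
        {"field": "history", "cites": 3},
    ]

    # Mean of per-paper relative citation rates; 1.0 = world average for field.
    rci = sum(p["cites"] / field_baseline[p["field"]]
              for p in papers) / len(papers)
    print(f"Field-normalised citation impact: {rci:.2f}")

Note that the history paper, with only 3 citations, scores above world average once field practices are taken into account, whereas a raw citation count would rank it last.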


The conduct of research is a complex activity. No single measure will provide an adequate assessment of its performance, and it is necessary to use a suite of indicators. The choice of performance measures sends a powerful message about what is and is not considered an important outcome of that research (Weiler 2000). It is essential that all measures used to distribute scarce research funds, or to determine the relative standing of institutions or researchers, are critically analysed and the full ramifications of their deployment understood.

    AIMS OF THE PROJECT

The project aims to provide a knowledge base that supports informed decisions on the use of quantitative indicators of research performance. This knowledge base will consist of:

• a comprehensive range of quantifiable bibliometric performance indicators that could be used to assess research performance, including all those currently used in the Australian higher education sector, and those identified in an extensive literature review;
• an assessment of each indicator in terms of its validity, fairness, transparency, independence, cost effectiveness and behavioural impact (these characteristics are covered later in greater detail);


• an evaluation of the likely effects on institutions, organisational units and individuals of using different performance measures; and

• an assessment of whether the application of field-specific weights to performance measures can provide a solution to inequalities that can arise from field-specific characteristics.

The project's primary focus will be the range of bibliometric indicators that can be used for assessing the written output of research. However, other input and output measures will be included in the analysis.

The study will not attempt to identify a single best-practice list of performance measures for use in the higher education sector. A basic premise of the project is that the ideal measures to apply in any context will vary according to institutional settings, management priorities, and the basic purpose of the exercise in which they are being deployed. For example, measures to be used in a formula to distribute funds between institutions may have little overlap with measures aimed at identifying the leading researchers in a university. However, in achieving its aims the project will provide analysts with rigorous data on which to make informed judgements on the employment of performance indicators in a variety of common situations.

    SIGNIFICANCE AND INNOVATION

The use of performance indicators sends a powerful message to those being evaluated. Implicit in the choice of measures is a statement of what those utilising them consider to be most important. Participants at a 1997 OECD workshop noted that, in spite of the increasing emphasis on measuring research performance, the effectiveness of the various approaches to the evaluation of research has not been critically assessed (OECD 1997).

This study aims to close the gap in our knowledge base on performance indicators by undertaking an extensive empirical analysis to critically assess the measures commonly used. It will also undertake a systematic investigation of alternative performance measures, and will assess measures used in other higher education systems. The database constructed in the course of the study, drawing information from a wide range of sources and covering an extensive variety of research characteristics, will be the most comprehensive information bank yet assembled on these issues.

On completion of this study, it will be possible, for the first time, to compare a wide range of performance measures (in terms of validity, fairness, transparency, impact on research, cost, and behavioural impact) when deciding on the most appropriate indicators to introduce in a specific context. Until now, policy debate has foundered on a lack of firm empirical data to guide decision making. When the adoption of a modified version of England's Research Assessment Exercise (RAE) was mooted in 1997, debate on the proposal stalled over institutions' inability to calculate its resource implications (Bourke 1997).


The study will enable research managers and governments to make informed judgements on the deployment of performance indicators. Any projected changes in government policy, or in the administrative practices of research institutions, can then be informed by an extensive knowledge base on a wide range of possible indicators. That knowledge can be used to judge the robustness and likely impact of using performance measures in the specific context proposed. The study will also provide a shared information base for dialogue with government over higher education research policy, and enable a greater understanding of the implications of proposed management strategies.

    IDENTIFYING PERFORMANCE MEASURES

The first step in this study will identify possible performance measures applicable in assessing research. Three different strategies will be used to obtain a comprehensive overview of such indicators:

Strategy 1

All quantitative performance measures currently in use in the Australian higher education sector will be identified. This will be accomplished by extensively utilising internet resources to obtain institutional research management policies and procedures. In addition, follow-up interviews will be undertaken with research managers, either by phone or in person. The information obtained in this phase of the study will be used to select the institutions to be covered in more detailed case studies.


Strategy 2

A comprehensive search to identify additional bibliometric performance indicators will be undertaken. We will canvass the literature extensively for all measures that could conceivably be used to assess research performance. An important part of this work will be to determine whether bibliometric indicators developed by CWTS for ad hoc evaluations can be applied in a more systematic manner in the Australian context. CWTS has undertaken several studies focusing on the evaluation of research using a number of complex bibliometric techniques. While most have not focused on higher education systems as a whole, the measures they encompassed may be highly relevant to this study. The centre's work has covered the problematic question of indicators that can be applied to the humanities and social sciences (Nederhof and van Raan 1989). CWTS has also validated a number of indicators against more traditional peer review judgements (Nederhof and van Raan 1987, van Raan 1996, Rinia et al. 1998), and has undertaken several evaluations of university research performance (Moed et al. 1985b, Moed and van Raan 1988).

Strategy 3

Other quantitative performance indicators will be identified. A number of studies have sought the opinion of academics on the most appropriate indicators by which to judge their research performance. A report commissioned in the early 1990s by the National Board of Employment, Education and Training (NBEET) detailed the responses of nearly 4000 Australian academics to a survey on research performance indicators (NBEET 1993). In addition to the standard performance indicators (publication counts, research student numbers and completion rates, and levels of external research earnings), this study identified keynote addresses and prizes as important indicators of research performance.

In England, a working group set up by the Higher Education Funding Council for England (HEFCE) to consider the role of quality assurance and evaluation identified several measures that universities were putting forward as indicators of quality: patents, innovations and spin-off developments, consultancies, industry links, awards, prizes and fellowships, journal editorships and editorial board membership, visiting positions elsewhere, and professional body activities (HEFCE 2000).

Other studies have looked at performance measures for specific fields of research. These include a detailed study of performance indicators relevant to the creative arts (Strand 1998), and an analysis of book reviews as a measure of research performance in the history of medicine (Lewison 2001). Other less common elements used include institutional rankings, survey responses from graduate students, and numbers of patents (Husso et al. 2000, Shale 1999).

ESTABLISHING AN EXPERIMENTAL DATABASE


A central element of the proposed study will be the establishment of an experimental database, which will be used to test the efficacy of proposed performance measures identified by our research. This database will consist of:

• the Research Data Collection returns (from universities);

• a list of all Australian publications indexed by ISI in its three main indices (from REPP);

• annual staffing figures (from DEST); and
• data on additional performance measures (generated in the study).

The various components of the experimental database will be classified by field of research. This will enable us to test the use of various measures at both the sectoral and institutional level, and to identify the effects of field-specific characteristics.

In order to provide an additional test of the robustness of any proposed measure, the project will draw on international data from CWTS. CWTS maintains a database of all ISI-indexed publications and, in addition, has institutional data equivalent to REPP's Australian data for a number of European higher education systems. Research activities are global: most performance measures that can be applied in the Australian context should also be applicable in other systems, and testing the measures in an additional setting will strengthen their assessment.
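As a sketch of how these components might be linked for analysis, the following schema joins publication, return and staffing records on institution, field and year. All table and column names are hypothetical illustrations of the linkage, not the project's actual design.

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.executescript("""
    CREATE TABLE publications (   -- ISI-indexed papers (from REPP)
        institution TEXT, field TEXT, year INTEGER, citations INTEGER);
    CREATE TABLE rdc_returns (    -- Research Data Collection (from universities)
        institution TEXT, field TEXT, year INTEGER,
        research_income REAL, weighted_publications REAL);
    CREATE TABLE staffing (       -- annual staffing figures (from DEST)
        institution TEXT, field TEXT, year INTEGER, fte REAL);
    """)

    # Example measure: publications per staff FTE, by institution and field.
    rows = con.execute("""
        SELECT p.institution, p.field, p.year,
               COUNT(*) * 1.0 / s.fte AS pubs_per_fte
        FROM publications p
        JOIN staffing s USING (institution, field, year)
        GROUP BY p.institution, p.field, p.year, s.fte
    """).fetchall()

Classifying every record by field of research is what allows size-adjusted, field-aware measures such as the one above to be computed at both sectoral and institutional levels.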

    ASSESSING THE PERFORMANCE MEASURES


Each performance measure will initially be assessed on the ease with which the necessary data can be accessed and/or compiled. The robustness of those measures that are deemed practical will then be assessed in relation to:

• Validity: the extent to which it is an effective surrogate for the characteristics it purports to measure;
• Reliability: the consistency of different evaluation measures in ranking research performance;
• Fairness: the degree to which it accommodates field-specific and institution-specific characteristics;
• Transparency: the extent to which the data used can be independently verified;
• Independence: the extent to which the measure is resistant to manipulation by researchers or institutions;
• Cost effectiveness: how complex it is to obtain the required data, and the expected compliance cost to institutions and government; and
• Behavioural impact: the likely effect it will have on the practice of research units and individual researchers, and whether that impact is in line with desired policy outcomes.

In order to fully assess the impact of indicators used in an institutional setting (rather than in the sector as a whole), case studies of selected Australian universities will be conducted, using a representative sample of institutions. Senior research managers from several universities have already signalled their interest in participating in this phase of the project. Institutions will be chosen to reflect a range of research management strategies and research cultures, and will include both research-intensive universities and regional universities with a small research base. These case studies will be discussed with the universities' senior research managers. In order to obtain information about likely impacts on research practices, the project will draw on the social studies of science literature and on the experience and opinions of the wider research community.

The project will identify performance measures suitable for deployment in the Australian higher education system, and will identify the range of contexts in which their use is appropriate. This will involve providing, for each possible measure, a comprehensive list of benefits and shortcomings. The project will also investigate the use of differential weighting systems to handle the major variations that occur in the practices of researchers in different fields, as sketched below.
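As an illustration of how differential weighting might operate, the sketch below weights outputs both by type and by field so that, for example, a book in a book-oriented discipline is not undervalued against journal articles. None of the weights are values proposed by the project; they only show the mechanics.

    # Illustrative weights only: output types and fields are scaled so that
    # counts become comparable across disciplines with different practices.
    type_weight = {"book": 5.0, "article": 1.0, "conference_paper": 0.5}
    field_weight = {"chemistry": 1.0, "history": 2.5}

    def weighted_output(outputs):
        """Field- and type-weighted publication count for a research unit."""
        return sum(type_weight[o["type"]] * field_weight[o["field"]]
                   for o in outputs)

    history_unit = [{"type": "book", "field": "history"},
                    {"type": "article", "field": "history"}]
    print(weighted_output(history_unit))  # 5.0*2.5 + 1.0*2.5 = 15.0

Whether such weights are defensible, and how sensitive funding outcomes are to them, is precisely the kind of question the assessment criteria above are designed to answer.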

    COMMUNICATION OF RESULTS

The results of the research undertaken in this project will be disseminated through several communication channels to all interested parties: peers in the research field, participants in the project, relevant government agencies, and senior research administrators.

Communication with peers

The major findings of the project will be published in international journals focusing on research policy, the sociology of science, and bibliometrics. They will also form the basis for presentations at international conferences and seminars in these and related fields.

    Communication with policy analysts

The results of the study will be published in a detailed report describing the assessment of all measures analysed. The identity of individual institutions will be protected to ensure the focus of discussion is on the performance measures themselves, not the relative standing of institutions. The aim of the report is to raise issues and provide detailed data to inform further discussions and policy decisions.

    Communication with senior research managers and policy makers

Both intermediate and final results of the case studies will be discussed with senior research managers of the universities involved. Prior to the completion of the final report, a workshop of senior research managers and policy makers will be held to discuss the findings of the study and seek feedback.

    REFERENCES

Anderson, Don, Richard Johnson and Bruce Milligan, 1996: Performance-based Funding of Universities. Commissioned Report No. 51, National Board of Employment, Education and Training, Canberra.

Batterham, Robin, 2000: The Chance to Change: Final Report by the Chief Scientist. Canberra: Department of Industry, Science and Resources.

Bourke, Paul, 1997: Evaluating University Research: The British Research Assessment Exercise and Australian Practice. National Board of Employment, Education and Training Commissioned Report No. 56, Australian Government Publishing Service, Canberra.

Butler, Linda, 2001: Monitoring Australia's Scientific Research: Partial Indicators of Australia's Research Performance. Canberra: Australian Academy of Science.

Butler, Linda, 2002: Explaining Australia's increased share of ISI publications: the effects of a funding formula based on publication counts, Research Policy, 32(1), 143-155.

Cole, Stephen and Jonathan Cole, 1967: Scientific Output and Recognition: A Study in the Operation of the Reward System in Science, American Sociological Review, 32(3), 377-390.

Glänzel, Wolfgang and Henk Moed, 2002: Journal impact measures in bibliometric research, Scientometrics, 53(2), 171-193.

HEFCE, 2000 (Higher Education Funding Council for England): HEFCE Fundamental Review of Research Policy and Funding: Sub-group to Consider the Role of Quality Assurance and Evaluation. www.hefce.ac.uk/Research/review/sub/qaa.pdf

Husso, Kai, Karjalainen, Sakari and Parkkari, Tuomas, 2000: The State and Quality of Scientific Research in Finland. www.aka.fi/users/33/1979.cfm


Knorr-Cetina, Karin D., 1981: The Manufacture of Knowledge: An Essay on the Constructivist and Contextual Nature of Science. Oxford: Pergamon Press.

Lewison, Grant, 2001: Evaluation of books as research outputs in history, Research Evaluation, 10(2), 89-95.

Linke, Russell et al., 1991: Performance Indicators in Higher Education: Report of a Trial Evaluation Study Commissioned by the Commonwealth Department of Employment, Education and Training. Volume 1, Commonwealth Department of Employment, Education and Training.

Marginson, Simon and Mark Considine, 2000: The Enterprise University: Power, Governance and Reinvention in Australia. Cambridge: Cambridge University Press.

Moed, H.F., W.J.M. Burger, J.G. Frankfort and A.F.J. van Raan, 1985a: The application of bibliometric indicators: important field- and time-dependent factors to be considered, Scientometrics, 8, 177-203.

Moed, H.F., W.J.M. Burger, J.G. Frankfort and A.F.J. van Raan, 1985b: The use of bibliometric data for the measurement of university research performance, Research Policy, 14, 131-149.

Moed, Henk F. and Anthony F.J. van Raan, 1988: Indicators of research performance: applications in university research policy, in Anthony F.J. van Raan (ed.), Handbook of Quantitative Studies of Science and Technology, Elsevier, Netherlands.

NBEET, 1993 (National Board of Employment, Education and Training): Research Performance Indicators Survey. Commissioned Report No. 21, Canberra.

Nederhof, Anton J. and Anthony F.J. van Raan, 1987: Peer Review and Bibliometric Indicators of Scientific Performance: A Comparison of Cum Laude Doctorates with Ordinary Doctorates in Physics, Scientometrics, 11(5-6), 333-350.

Nederhof, Anton J. and Anthony F.J. van Raan, 1989: A Validation Study of Bibliometric Indicators: The Comparative Performance of Cum Laude Doctorates in Chemistry, Scientometrics, 17(5-6), 427-435.

OECD, 1997: The Evaluation of Scientific Research: Selected Experiences, OCDE/GD(97)194, Paris.

Rinia, E.J., van Leeuwen, Th.N., van Vuren, H.G. and van Raan, A.F.J., 1998: Comparative Analysis of a Set of Bibliometric Indicators and Central Peer Review Criteria: Evaluation of Condensed Matter Physics in the Netherlands, Research Policy, 27(1), 95-107.

Schnitzer, Klaus and Kazemzadeh, Foad, 1995: Formelgebundene Finanzzuweisung des Staates an die Hochschulen: Erfahrungen aus dem europäischen Ausland, HIS-Kurzinformationen A 11/1995, Hannover.


Senate Employment, Workplace Relations, Small Business and Education References Committee, 2001: Universities in Crisis, Commonwealth of Australia. www.aph.gov.au/senate/committee/eet_ctte/public uni/report

Shale, Doug, 1999: Alberta's Performance-based Funding Mechanism and the Alberta Universities. University of Calgary: www.uquebec.ca/conf-quebec/actes/s12.pdf

Sizer, John, 1998: The politics of performance assessment: lessons for higher education? A comment, Studies in Higher Education, 13(1), 101-103.

Strand, Dennis, 1998: Research in the Creative Arts. No. 98/6, Evaluations and Investigations Program, Department of Employment, Education, Training and Youth Affairs, Canberra.

Taylor, Jeannette, 2001a: Improving performance indicators in higher education: the academics' perspective, Journal of Further and Higher Education, 25(3), 379-393.

Taylor, Jeannette, 2001b: The impact of performance indicators on the work of university academics: evidence from Australian universities, Higher Education Quarterly, 55(1), 42-61.

van Leeuwen, Thed N. and Henk Moed, 2002: Development and application of journal impact measures in the Dutch science system, Scientometrics, 53(2), 249-266.

van Raan, Anthony F.J., 1996: Advanced bibliometric methods as quantitative core of peer review based evaluation and foresight exercises, Scientometrics, 36(3), 397-420.

Weiler, Hans N., 2000: States, Markets and University Funding: new paradigms for the reform of higher education in Europe, Compare, 30(3), 333-339.

White Paper, 1999: Knowledge and Innovation: A Policy Statement on Research and Research Training. Department of Education, Training and Youth Affairs. www.detya.gov.au/archive/highered/whitepaper