The SF-36 health survey questionnaire—a tool for economists

HEALTH ECONOMICS, VOL. 2: 213-215 (1993)

ECONOMIC EVALUATION

The purpose of the next two articles is to look at the RAND version of the SF-36 and to demonstrate the potential usefulness of that system for health economics.

The editors are grateful to John Brazier for writing an economic consideration and to Ron Hays and his colleagues at RAND for submitting their version of the 36 item health survey.

THE SF-36 HEALTH SURVEY QUESTIONNAIRE-A TOOL FOR ECONOMISTS

JOHN BRAZIER Medical Care Research Unit, University of Sheffield Medical School, Sheffield, UK

The short form 36 (SF-36) health survey instrument is a self-administered general health questionnaire, which generates a profile of scores across eight dimensions of health. It has become one of the most widely applied measures of health in the USA and is increasingly being used in the UK.2,3,4 It is also being translated and tested for use in at least ten other countries in a project supported by two major pharmaceutical companies. In the near future results from many clinical trials will include this instrument, and their results will be used to market health care products to purchasers worldwide. Economists undertaking or appraising health care evaluations will be unable to ignore this new instrument for long.

VERSIONS OF THE INSTRUMENT

The paper by Hays and colleagues in this issue of Health Economics describes two versions of the ‘SF-36’: the Rand 36 item health survey and the MOS 36 item short form health survey. The versions differ slightly in their scoring systems for two of the dimensions, but this has had little effect on the ranks or scores produced. A third version provided by Interstudy in Minneapolis differed on another dimension (having a six rather than five point scale on social functioning), but this has now been withdrawn. Hays et al. recognise the importance of using standardised questionnaires for promoting comparability, and it is unfortunate that there are several similar versions available in the USA. In the UK, a network of users based at the King’s Fund has agreed to use a single anglicised version of the SF-36, with six minor changes to the wording. The scoring system of the UK SF-36 is based on the Interstudy version, which uses the same scoring as the MOS SF-36, but applies the Rand scoring system for the pain dimension.

There are no restrictions on the use of the Rand version. Use of the MOS SF-36 requires registration with the MOS Trust and, in the UK, researchers should also register with the Outcomes Clearing House, which is keeping track of its proliferating use (see details at the end of this article). In other countries, translation and testing of the SF-36 are at varying stages of development.

EXPERIENCE WITH THE INSTRUMENT

The popularity of the 36-item health survey with researchers has arisen from its ease of administration (since it is self-completed and rarely takes more than ten minutes), its acceptability (achieving response rates in excess of 70% in UK general population studies) and its psychometric performance. It has been found to achieve satisfactory levels of internal consistency, re-test reliability (at least in a general population) and ‘construct’ validity on a general population.2,3

Address for correspondence: John Brazier, Medical Care Research Unit, University of Sheffield Medical School, Beech Road, Sheffield S10 2RX.

1057-9230/93/030213-03$06.50 © 1993 by John Wiley & Sons, Ltd.

Compared with the Nottingham Health Profile, another general health profile measure, and the Euroqol, it was found to be more sensitive at detecting lower levels of perceived ill-health.2,6

For patients suffering common clinical conditions, it has been found to generate profiles of health which differ in predictable ways from those of a general population. However, there is no published evidence on the responsiveness of the SF-36 to changes in health status over time.

USE IN ECONOMIC EVALUATION

The main concern for readers of this journal is whether this new instrument could be used in economic evaluation. Profile general health measures have been used in economic evaluations in the past, but their limitations are well known. The SF-36 could be used in a study to assess efficiency where one alternative health care programme dominates another in terms of all SF-36 dimensions (assuming its domains cover all the relevant health consequences of the intervention), survival, non-health outcomes (where relevant) and costs. But where a tradeoff exists between any of these the SF-36 cannot assist decision makers, since the scores it generates are not in units comparable either between its own dimensions, or with other consequences.
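The dominance condition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Outcome` class, the `dominates` function and all the figures are hypothetical, higher SF-36 scores are taken to mean better health, and non-health outcomes are left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict

# The eight SF-36 dimensions; higher scores are assumed to mean better health.
DIMENSIONS = [
    "physical_functioning", "role_physical", "bodily_pain", "general_health",
    "vitality", "social_functioning", "role_emotional", "mental_health",
]


@dataclass
class Outcome:
    """Hypothetical summary of one programme (all values are made up)."""
    sf36: Dict[str, float]   # mean score per dimension
    survival_years: float    # mean survival
    cost: float              # mean cost per patient


def dominates(a: Outcome, b: Outcome) -> bool:
    """True if a is at least as good as b on every criterion and better on one."""
    pairs = [(a.sf36[d], b.sf36[d]) for d in DIMENSIONS]
    pairs.append((a.survival_years, b.survival_years))
    pairs.append((-a.cost, -b.cost))  # negate cost so that larger is better
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def profile(base: float, **overrides: float) -> Dict[str, float]:
    """Build a profile with the same score everywhere, except for overrides."""
    scores = {d: base for d in DIMENSIONS}
    scores.update(overrides)
    return scores


old = Outcome(profile(60.0), survival_years=5.0, cost=10000.0)
new = Outcome(profile(65.0), survival_years=5.5, cost=9000.0)
# Better than 'old' everywhere except bodily pain: a tradeoff, so no dominance.
mixed = Outcome(profile(65.0, bodily_pain=50.0), survival_years=5.5, cost=9000.0)

print(dominates(new, old))    # True
print(dominates(mixed, old))  # False
print(dominates(old, mixed))  # False
```

In the `mixed` case neither programme dominates, which is exactly the situation in which the profile scores alone cannot guide a decision.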

Secondly, the psychometric tests of validity which have been applied to this questionnaire are not sufficient to satisfy the requirements of an economic measure. The scoring system of the SF-36 has not been compared with revealed preferences, nor with the stated preferences of patients or other population groups. The items and their domains have evolved through a long process, starting with a review of the literature, psychometric testing, and probably an element of researcher intuition. However, as Williams commented in an earlier issue of this journal, ‘At no point in the entire exercise are the values of patients used to establish weights on outcomes’.

DERIVING A SINGLE INDEX

The key issue is whether a single index measure can be derived from the SF-36, for use in economic evaluation. One approach is simply to combine the dimension scores into a single index using an arbitrary set of weights. As part of an evaluation of heart transplantation, researchers at Brunel University attempted to aggregate the Nottingham Health Profile (NHP) using three weighting schemes.7 The results were tested for their sensitivity to the different weighting schemes. However, this was a very limited sensitivity analysis and other plausible values may have altered the results. To be of use in cost-utility analysis, quality of life must be combined with quantity, but it was not clear how such an overall score for the NHP should be combined with survival. In their conclusion, the authors therefore advocated a more formal process using patient preferences. The use of arbitrary weights on the SF-36 should be resisted.
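The risk with arbitrary weights can be illustrated with a small Python sketch. Everything here is hypothetical: the three dimensions, the two programme profiles and the three weighting schemes are invented for illustration and are not taken from the Brunel study or from the SF-36 scoring manual.

```python
from typing import Dict


def weighted_index(profile: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted mean of dimension scores under one (arbitrary) weighting scheme."""
    total = sum(weights.values())
    return sum(weights[d] * profile[d] for d in weights) / total


# Hypothetical mean dimension scores for two programmes (higher = better).
programme_a = {"physical": 80.0, "mental": 50.0, "pain": 60.0}
programme_b = {"physical": 60.0, "mental": 75.0, "pain": 60.0}

# Three arbitrary weighting schemes, none derived from patient preferences.
schemes = {
    "equal": {"physical": 1.0, "mental": 1.0, "pain": 1.0},
    "physical_heavy": {"physical": 2.0, "mental": 1.0, "pain": 1.0},
    "mental_heavy": {"physical": 1.0, "mental": 2.0, "pain": 1.0},
}

preferred = {}
for name, weights in schemes.items():
    index_a = weighted_index(programme_a, weights)
    index_b = weighted_index(programme_b, weights)
    preferred[name] = "A" if index_a > index_b else "B"
    print(f"{name}: A={index_a:.2f}, B={index_b:.2f}, preferred={preferred[name]}")
```

With these made-up numbers the preferred programme switches from B to A when the physical dimension is given double weight, so the conclusion depends on a choice that has no basis in anyone's preferences.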

The designers of the SF-36 never intended it to be used to derive a single measure of health. The 36-item health survey is both larger in size and more complex than generic QALY instruments such as the Euroqol or Torrance's Health Utilities Index.10 These scales not only have fewer items than the SF-36, but have attribute levels which are ordinal. This latter feature permits the application of conjoint analysis (sometimes referred to as multi-attribute utility theory), a technique which avoids the need to value every potential health state. This is an important feature, since even a comparatively small multi-attribute scale such as the Euroqol has 214 potential health states, and experience suggests that responders can only value a fairly limited number of states, typically between 9 and 16 at one interview. However, items in the SF-36 questionnaire do not have an unambiguous ranking. For example in the mental health dimension, it is not clear whether feeling ‘downhearted and blue’ (or ‘low’ in the UK version) is better or worse than being ‘a very nervous person’ for most of the time. There is a further layer of complexity since each item has between 2 and 6 responses. The result is that there are potentially millions of health states defined by the SF-36.

One way forward which we are exploring at the University of Sheffield is both to reduce the size and to simplify the structure of the SF-36 questionnaire into a set of attributes, each with ordinal levels. The aim is to achieve a multi-attribute scale which can then be valued from a sample using a range of techniques. An alternative approach is to administer to patients the SF-36 alongside single index measures such as a time tradeoff question or visual analogue scale, and then estimate weights to a set of SF-36 responses.
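To see why reducing and simplifying the questionnaire matters, the size of the valuation problem can be bounded with simple arithmetic. The sketch below assumes only what is stated above (36 items, each with between 2 and 6 responses) and treats every combination of item responses as a distinct state; the reduced scale of six attributes with three levels each is purely hypothetical.

```python
import math
from typing import Sequence

N_ITEMS = 36  # items in the SF-36


def n_states(levels: Sequence[int]) -> int:
    """Number of distinct states when each attribute has the given number of levels."""
    return math.prod(levels)


# Bounds that follow from "each item has between 2 and 6 responses".
lower_bound = n_states([2] * N_ITEMS)
upper_bound = n_states([6] * N_ITEMS)

# A hypothetical reduced scale: six attributes, three ordinal levels each.
reduced = n_states([3] * 6)

print(f"lower bound for the full questionnaire: {lower_bound:,}")
print(f"upper bound has {len(str(upper_bound))} digits")
print(f"hypothetical reduced scale: {reduced} states")
```

Even the lower bound is far beyond anything that could be valued directly when a respondent can handle only 9 to 16 states per interview, whereas a reduced scale of the hypothetical size shown is within reach of sampling and modelling.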

In the health economics literature, there is growing criticism of basing economic measures of value or utility on generic multi-attribute utility

In examining the SF-36 or other instruments it is important to address the conceptual concerns. Alternative methods of valuing the benefits of health care, such as Healthy Years Equivalents and willingness to pay, have been advocated but have been applied in only very limited ways.13,14 Whether the SF-36, Euroqol or these other techniques will be able in practice to generate more relevant and valid information for promoting efficient and equitable decision-making is as yet unproven. The best research strategy at the moment must be to pursue a range of approaches. Whether the widely applied SF-36 can be used in economic evaluation remains an important research question.

Further information about the UK SF-36 can be obtained from: Robert Hall, Outcomes Clearing House, The Nuffield Institute for Health Services Research, 71-75 Clarendon Road, Leeds, LS2 9PL.

REFERENCES

1. Ware, J. E. and Sherbourne, C. D. The SF-36 short-form health status survey 1. Conceptual framework and item selection. Medical Care, 1992; 30: 473-483.

2. Brazier, J. E., Harper, R., Jones, N. M. B., O’Cathain, A., Thomas, K. J., Usherwood, T. and Westlake, L. Validating the SF-36 health survey questionnaire: new outcome measure for primary care. British Medical Journal, 1992; 305: 160-164.

3. Jenkinson, C., Coulter, A. and Wright, L. Short form 36 (SF 36) health survey questionnaire: normative data for adults of working age. British Medical Journal, 1993; 306: 1437-1440.

4. Garratt, A. M., Ruta, D. A., Abdalla, M. I., Buckingham, J. K. and Russell, I. T. The SF-36 health survey questionnaire: an outcome measure suitable for routine use within the NHS. British Medical Journal, 1993; 306: 1440-1444.

5. Aaronson, N. K., Acquadro, C., Alonso, J., Apolone, G., Bucquet, D., Bullinger, M., Bungay, K., Fukuhara, S., Gandek, B., Keller, S., Razavi, D., Sanson-Fisher, R., Sullivan, M., Wood-Dauphinee, S., Wagner, A. and Ware, J. E. Jr. International quality of life assessment (IQOLA) project. Quality of Life Research, 1992; 1: 349-351.

6. Brazier, J. E., Jones, N. M. B. and Kind, P. Testing the validity of the Euroqol and comparing it with the SF-36 health survey questionnaire. Quality of Life Research, 1993; (in press).

7. O’Brien, B. J., Buxton, M. J. and Ferguson, B. A. Measuring the effectiveness of heart transplant programmes: quality of life data and their relationship to survival analysis. Journal of Chronic Diseases, 1987; 40(1): 137s-153s.

8. Stewart, A. L. and Ware, J. E. Jr. (eds) Measuring functioning and well-being. The Medical Outcomes Study approach. Duke University Press, Durham, N. Carolina. 1992.

9. Williams, A. Review of: Stewart, A. L. and Ware, J. E. (eds) Measuring functioning and well-being. The Medical Outcomes Study approach. Health Economics, 1992; 1(4): 255-258.

10. Torrance, G. W., Zang, Y., Feeny, D., Furlong, W. and Barr, R. Multi-attribute preference functions for a comprehensive health status classification system. CHEPA Working Paper 92-18, McMaster University, Ontario. 1992.

11. Kroes, E. Q. and Sheldon, R. J. Stated preference methods: an introduction. Journal of Transport Economics and Policy, 1988; 22: 11-25.

12. Loomes, G. and McKenzie, L. The use of QALYs in health care decision making. Social Science and Medicine, 1989; 28(4): 299-308.

13. Mehrez, A. and Gafni, A. Quality-adjusted life years, utility theory, and healthy years equivalents. Medical Decision Making, 1989; 9: 132-149.

14. Donaldson, C. Theory and practice of willingness to pay for health care. Health Economics Research Unit, Discussion Paper 01/93, University of Aberdeen. 1993.
