
RACE 622: Study Designs & Measurements in Clinical Epidemiology

Dr. Pawin Numthavaj

Semester 1, Academic Year 2017

Doctor of Philosophy Program in Clinical Epidemiology, Master of Science Program in Medical Epidemiology Section for Clinical Epidemiology & Biostatistics

Faculty of Medicine Ramathibodi Hospital, Mahidol University

CONTENTS

Overview of Measurement in Epidemiology

Measurement error

The Quality of a Measurement

Assessing Validity

Reliability

Measures of occurrence

Measures of association

Measures of Public Health Impact


OBJECTIVES

1) Able to describe common classifications of measurements in epidemiology

2) Able to discuss the definitions, classifications, and significance of measurement error in epidemiologic studies

3) Able to describe the quality of a measurement in terms of validity and reliability

4) Able to calculate and interpret common measurements in epidemiology

REFERENCES

1. Stevens SS. On the theory of scales of measurement. Science 1946; 103: 677–680.

2. Luce RD. Simultaneous conjoint measurement: a new type of fundamental measurement. Journal of Mathematical Psychology 1964; 1: 1–27.

3. Michell J. Measurement scales and statistics: a clash of paradigms. Psychological Bulletin 1986; 100: 398–407.

4. Rothman K. Modern epidemiology. Section 1, Chapter 3: Measures of Occurrence. 3rd ed. Lippincott Williams & Wilkins. 2008; 32-50.

5. Rothman K. Modern epidemiology. Section 1, Chapter 4: Measures of Effect and Measures of Association. 3rd ed. Lippincott Williams & Wilkins. 2008; 32-50.

6. Fletcher RH. Clinical Epidemiology the Essentials 5th Edition: Chapter 3 Frequency. Lippincott Williams & Wilkins. 2014; 18-31.

7. Fletcher RH. Clinical Epidemiology the Essentials 5th Edition: Chapter 4 Risk: Basic principles. Lippincott Williams & Wilkins. 2014; 51-60.


SUGGESTED READING

Appendix I: McDowell I. On the Classification of Population Health Measurements. Am Jour Pub Health 2004; 94: 388-393.

Appendix II: Enarson DA, Kennedy SM, Miller DL. Measurement in epidemiology. Int J Tuberc Lung Dis. 2004; 8(10): 1269-73.


Measurement in Epidemiology

Data are the cornerstone of epidemiological research; careful measuring, recording, and handling of information are therefore crucial to the research process. The nature of the data required must be precisely defined, and their collection requires reliable tools and instruments. Because epidemiology is mostly about identifying associations between exposures and outcomes, both the exposures and the outcomes must first be measured quantitatively. Rates of occurrence of events are then computed; these are called “measures of occurrence.” Once these are measured, the association between exposures and outcomes is evaluated by calculating “measures of association or effect.” Finally, the impact of removing an exposure on the outcome is evaluated by computing “measures of potential impact.” In general, measures of disease occurrence are needed to generate measures of association, and both are needed to obtain measures of impact. These measures overlap to some extent, and the terminology is poorly standardized.

Figure 1: Measurement in epidemiology in overview


Measurement error

In epidemiology, measurement error is usually classified into random error (chance) and systematic error (bias). Random error, the random fluctuation in our measurements, is present every time a measurement is made, and its influence may reflect an inadequate sample size. Systematic error, a limit on the accuracy or precision of our measurements that operates in a definable way, usually results from inappropriate study design. Bias in measuring the effect of an exposure on the risk of a disease can stem from imperfect sampling, imperfect information gathering, and the effects of risk factors that are correlated with the aspect of disease under study.

Information bias arises from the inability to correctly quantify an individual's exposure of interest. This is a major threat to the validity of epidemiologic research. Imprecision in the measurement of any exposure has a number of important consequences in epidemiology. In particular, the combination of mis-measurement and confounding can bias the estimated impact both of true risk factors and of correlated factors that do not alter risk. These biases are not, in general, eliminated by standard procedures for statistical control.


The Quality of a Measurement

Validity and Reliability

Consider the example of arrow shooting. The consistency (or reliability) of a measurement is represented by how close successive arrows fall to each other, wherever they land on the target. Validity is represented by the aim of the shooting: how close, on average, the shots come to the center of the target, as shown in Figure 2.

The core idea in validity concerns the meaning, or interpretation, of a measurement obtained with an instrument or tool. Validity is often defined as the extent to which a test measures that which it is intended to measure. This conception of validity, which reflects the idea of agreement with a criterion, is commonly used in epidemiology and underlies the notions of sensitivity and specificity. However, this approach has limitations, and validity can instead be defined in terms of the interpretations that can appropriately be placed on a measurement. This shift in definition is significant, in that validity is no longer a property of the measurement, but rather of the interpretation placed on the results.

Figure 2: Reliability and validity illustrated with four targets (panels A to D), arranged by reliability (high or low) and validity (high or low)
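The arrow-shooting analogy can also be expressed numerically. The following is a minimal sketch, assuming that validity is summarized by the distance between the target and the average landing point (bias) and reliability by the scatter of the shots around their own average (spread). The function name `bias_and_spread` and the shot coordinates are hypothetical and are not taken from Figure 2.

```python
import math


def bias_and_spread(shots, target=(0.0, 0.0)):
    """Summarize (x, y) arrow positions as (bias, spread).

    bias:   distance from the target to the average landing point
            (a small bias corresponds to high validity).
    spread: root-mean-square distance of the shots from their own
            average position (a small spread corresponds to high reliability).
    """
    n = len(shots)
    mean_x = sum(x for x, _ in shots) / n
    mean_y = sum(y for _, y in shots) / n
    bias = math.hypot(mean_x - target[0], mean_y - target[1])
    spread = math.sqrt(
        sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in shots) / n
    )
    return bias, spread


# Hypothetical shot patterns in arbitrary units (illustration only).
patterns = {
    "tight, on target": [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)],
    "tight, off target": [(3.1, 3.0), (2.9, 3.0), (3.0, 3.1), (3.0, 2.9)],
    "scattered, centered": [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)],
}

for label, shots in patterns.items():
    bias, spread = bias_and_spread(shots)
    print(f"{label}: bias={bias:.2f}, spread={spread:.2f}")
```

In this sketch, a tight cluster far from the center has a small spread but a large bias, which corresponds to a measurement that is reliable but not valid.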


Assessing Validity

Assessing validity may be problematic when there is no specific target of reference, namely a reference standard or “gold standard.” Quality of life or health status is a good example. Tools that measure quality of life must rely on other types of validity: content validity, construct validity, and criterion validity.

Content validity has two components. Face validity is the subjective judgment about whether a measurement makes sense intuitively, that is, whether it is reasonable. Sampling validity refers to whether the measurement incorporates most of the aspects of the phenomenon under study; for example, a valid measure of quality of life should include questions on social, physical, emotional, and intellectual functioning.

Construct validity refers to how well the measurement conforms to theoretical concepts (constructs) concerning the entity under study. For example, if a particular trait is theoretically believed to differ between two groups of individuals, a measurement with construct validity should show this difference.

Criterion-related validity is the degree to which the measurement correlates with an external criterion of the phenomenon under investigation. A variation is predictive validity, the ability of the measurement to predict the future occurrence of that criterion; for example, an investigator could validate a measure of depression by examining its ability to predict suicide.


The general approach to validating an abstract measurement is to begin by

searching the literature and consulting with experts in an effort to find a suitable instrument

that has already been validated. If one can be found, its use may bypass the need to

study the validity of the measurement approach. This strategy also has the advantage of

making the results of the new study comparable with earlier work in the area and may

simplify and strengthen the process of applying for grants and publishing the results. The

disadvantage, however, is that the instrument taken off the shelf may be outdated or not

appropriate for the research question.


Reliability

The repeatability of a measurement refers to the extent to which repeated extraction and analysis of the marker yield closely similar values, time and time again.

Reliability is a more restrictive criterion of measurability: it is the repeatability of a measure net of any repetition of errors of measurement.

There are various measures of internal consistency that estimate the reliability of a scale or test. The most common is “Cronbach's alpha.” This is a measure of how homogeneous the scale items are, that is, to what extent they measure the same thing. If a scale is composed of several different subscales, each measuring different things, Cronbach's alpha should be computed for each subscale separately rather than for the whole scale. Cronbach's alpha gives a lower bound for reliability. If it is high for the whole scale, the scale is reliable (repeatable, highly correlated with the “true,” but unknown, scores). If alpha is low for the whole scale, then either the scale is unreliable or it measures several different things.
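As an illustration of internal consistency, the following is a minimal sketch of Cronbach's alpha using its standard formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response data are hypothetical, and sample variances are assumed throughout.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a scale.

    responses: list of rows, one row per respondent, each row holding that
    respondent's scores on the k items of the scale.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Transpose rows (respondents) into columns (items).
    items = list(zip(*responses))
    item_variances = [statistics.variance(item) for item in items]
    total_variance = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores of 5 respondents on a 3-item scale (illustration only).
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]

print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```

For a scale with several subscales, the same function would be applied to the columns of each subscale separately.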


Measures of occurrence

The occurrence of disease is a fundamental element of epidemiology, because epidemiology is the study of the distribution and determinants of disease frequency in human populations. Moreover, given the broad scope of epidemiology today, we may define it as the study of the distribution of health-related states and events in populations, not only disease. Epidemiologists must be able to measure the frequency of disease occurrence, which involves both the mathematical quantities used to express frequency and the epidemiologic measures of disease frequency.

Measurement of disease frequency in mathematical quantity

Common frequency measures are ratios, proportions, and rates. A simple algorithm for distinguishing rates, proportions, and ratios is shown in Figure 3.

Figure 3: Algorithm for distinguishing rates, proportions, and ratios


Ratio

A ratio is the relative magnitude of two quantities or a comparison of any two values. It is calculated by dividing one interval- or ratio-scale variable by the other. The numerator and denominator need not be related. Therefore, one could compare the number of flowers with the number of hospitalizations.

Method for calculating a ratio

$$\text{Ratio} = \frac{\text{Number or rate of events, items, persons, etc. in one group}}{\text{Number or rate of events, items, persons, etc. in another group}}$$

Example I

Between 2000 and 2002, as part of the Ramathibodi kidney transplant registry, 240 recipients aged 18–77 years were enrolled in a follow-up study. At the time of enrollment, each study participant was classified by gender and by type of donor: living related donor (LRKT) or deceased donor (DDKT). During 2010–2012, enrollees were documented either to have graft failure or to have a functioning graft. The results are summarized as follows.

| Group      | At enrollment | Graft failure at follow-up |
| LRKT men   | 80            | 3                          |
| LRKT women | 60            | 2                          |
| DDKT men   | 55            | 4                          |
| DDKT women | 45            | 3                          |

Calculate the ratio of LRKT to DDKT men:

Ratio = 80 / 55 = 1.45:1


Proportion

A proportion is the comparison of a part to the whole. It is a type of ratio in which the numerator is included in the denominator. You might use a proportion to describe what fraction of clinic patients were diagnosed with DM, or what percentage of the population is older than 75 years of age. A proportion may be expressed as a decimal, a fraction, or a percentage.

Method for calculating a proportion

$$\text{Proportion} = \frac{\text{Number of persons or events with a particular characteristic}}{\text{Total number of persons or events, of which the numerator is a subset}} \times 10^n$$

For a proportion, $10^n$ is usually 100 (that is, n = 2), so the result is expressed as a percentage.

Examples II

A. Calculate the proportion of men at enrollment who had LRKT.

Numerator = 80 LRKT men

Denominator = total number of men = 80 + 55 = 135

Proportion = (80 / 135) x 100 = 59.3%

B. Calculate the proportion of graft failures that occurred among men.

Numerator = graft failures in men = 3 in LRKT men + 4 in DDKT men = 7

Denominator = all graft failures = 7 in men + 5 in women = 12

Notice that the numerator (7 graft failures in men) is a subset of the denominator.

Proportion = (7 / 12) x 100 = 58.3%
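The calculations in Example I and Examples II can be reproduced with a short script. This is a minimal sketch that takes its counts from the Example I table; the function names `ratio` and `proportion` are hypothetical helpers introduced only for illustration.

```python
def ratio(count_a, count_b):
    """Relative magnitude of two quantities (they need not be related)."""
    return count_a / count_b


def proportion(part, whole, multiplier=100):
    """Part-to-whole comparison; the part must be a subset of the whole."""
    if part > whole:
        raise ValueError("numerator must be included in the denominator")
    return part / whole * multiplier


# Counts from the Example I table.
enrolled = {"LRKT men": 80, "LRKT women": 60, "DDKT men": 55, "DDKT women": 45}
failures = {"LRKT men": 3, "LRKT women": 2, "DDKT men": 4, "DDKT women": 3}

# Example I: ratio of LRKT men to DDKT men.
lrkt_to_ddkt_men = ratio(enrolled["LRKT men"], enrolled["DDKT men"])

# Examples II A: proportion of men who had LRKT.
men_total = enrolled["LRKT men"] + enrolled["DDKT men"]
lrkt_among_men = proportion(enrolled["LRKT men"], men_total)

# Examples II B: proportion of all graft failures that occurred in men.
failures_men = failures["LRKT men"] + failures["DDKT men"]
failures_all = sum(failures.values())
failures_in_men = proportion(failures_men, failures_all)

print(f"Ratio of LRKT to DDKT men: {lrkt_to_ddkt_men:.2f}:1")
print(f"Proportion of men with LRKT: {lrkt_among_men:.1f}%")
print(f"Proportion of graft failures in men: {failures_in_men:.1f}%")
```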


Rate

A rate is a measure of the frequency with which an event occurs in a defined population over a specified period of time. Because rates put disease frequency in the perspective of the size of the population, rates are particularly useful for comparing disease frequency in different locations, at different times, or among different groups of persons with potentially different sized populations; that is, a rate is a measure of risk.

Table 1: Epidemiologic measures categorized as ratio, proportion and rate

Incidence

Incidence refers to the occurrence of new cases of disease or injury in a population over

a specified period of time. Two types of incidence are commonly used — incidence

proportion and incidence rate.


Incidence proportion or cumulative incidence

Incidence proportion is the proportion of an initially disease-free population that develops disease, becomes injured, or dies during a specified (usually limited) period of time. Synonyms include attack rate, risk, probability of getting disease, and cumulative incidence. Incidence proportion is a proportion because the persons in the numerator, those who develop disease, are all included in the denominator (the entire population).

Method for calculating incidence proportion (risk)

$$\text{Incidence proportion} = \frac{\text{Number of new cases of disease or injury during specified period}}{\text{Size of population at start of period}} \times 10^n$$

Examples III

A. In the study in Example I, 3 of the 80 LRKT men had graft failure during the 10-year follow-up period. Calculate the risk of graft failure for these men.

Numerator = 3 graft failures among the LRKT men

Denominator = 80 LRKT men

10^n = 10^2 = 100

Risk = (3 / 80) x 100 = 3.75%

B. In an outbreak of gastroenteritis among attendees of a meeting, 99 persons ate Som tum (green papaya salad), 30 of whom developed gastroenteritis. Calculate the risk of illness among persons who ate Som tum.

Numerator = 30 persons who ate Som tum and developed gastroenteritis

Denominator = 99 persons who ate Som tum

10^n = 10^2 = 100

Risk = “food-specific attack rate” = (30 / 99) x 100 = 0.303 x 100 = 30.3%
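The two risk calculations in Examples III follow the same pattern, which can be sketched as a small function. The name `incidence_proportion` is a hypothetical helper; the counts are those given in the examples.

```python
def incidence_proportion(new_cases, population_at_start, multiplier=100):
    """Risk: new cases divided by the population at risk at the start of the period."""
    if new_cases > population_at_start:
        raise ValueError("new cases cannot exceed the population at risk")
    return new_cases / population_at_start * multiplier


# Examples III A: graft failure among the 80 LRKT men.
risk_lrkt_men = incidence_proportion(3, 80)

# Examples III B: gastroenteritis among the 99 persons who ate Som tum.
attack_rate_som_tum = incidence_proportion(30, 99)

print(f"Risk of graft failure in LRKT men: {risk_lrkt_men:.2f}%")
print(f"Food-specific attack rate for Som tum: {attack_rate_som_tum:.1f}%")
```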


Incidence rate or incidence density

Incidence rate is a measure of incidence that incorporates time directly into the denominator. Synonyms include incidence density and person-time rate. A person-time rate is generally calculated from a long-term cohort follow-up study, in which enrollees are followed over time and the occurrence of new cases of disease is documented. As with the incidence proportion, the numerator of the incidence rate is the number of new cases identified during the period of observation. However, the denominator differs: it is the sum, over all persons, of the time each person was observed. This denominator represents the total time the population was at risk of, and being observed for, disease. The incidence rate is therefore the ratio of the number of cases to the total time the population is at risk of disease. An incidence rate describes how quickly disease occurs in a population.

Method for calculating incidence rate

$$\text{Incidence rate} = \frac{\text{Number of new cases of disease or injury during specified period}}{\text{Time each person was observed, totaled for all persons}}$$

In a long-term follow-up study of morbidity, each study participant may be followed for several years. One person followed for 5 years without developing disease is said to contribute 5 person-years of follow-up. However, long-term cohort studies of this type are not very common, so epidemiologists more commonly calculate incidence rates from a numerator of cases observed or reported and a denominator based on the mid-year population. This type of incidence rate turns out to be comparable to a person-time rate.
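To make the person-time denominator concrete, the following is a minimal sketch that totals follow-up time across participants and divides the number of new cases by it. The follow-up records and the function name `incidence_rate` are hypothetical; each person is assumed to stop contributing time once disease develops or observation ends.

```python
def incidence_rate(new_cases, person_time, multiplier=100):
    """New cases per `multiplier` units of person-time at risk."""
    return new_cases / person_time * multiplier


# Hypothetical follow-up records: (years observed while at risk, developed disease?)
follow_up = [
    (5.0, False),
    (2.5, True),
    (4.0, False),
    (1.0, True),
    (5.0, False),
]

person_years = sum(years for years, _ in follow_up)
new_cases = sum(1 for _, diseased in follow_up if diseased)
rate = incidence_rate(new_cases, person_years)

print(f"{new_cases} cases over {person_years} person-years")
print(f"Incidence rate: {rate:.1f} per 100 person-years")
```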


Examples IV

A. Hypothetical cohort of 12 initially disease-free subjects followed over a 5-year period from 1990 to 1995.

B. In 2003, 44,232 new cases of acquired immunodeficiency syndrome (AIDS) were reported in the United States. The estimated mid-year population of the U.S. in 2003 was approximately 290,809,777. Calculate the incidence rate of AIDS in 2003.

Numerator = 44,232 new cases of AIDS

Denominator = 290,809,777 estimated mid-year population

10^n = 100,000

Incidence rate = (44,232 / 290,809,777) x 100,000 = 15.21 new cases of AIDS per 100,000 population


Prevalence

Prevalence, which is sometimes referred to as prevalence rate, is the proportion of persons in a population who have a particular disease or attribute at a specified point in time or over a specified period of time. Prevalence differs from incidence in that prevalence includes all cases, both new and preexisting, in the population at the specified time, whereas incidence is limited to new cases only.

Point prevalence refers to prevalence measured at a particular point in time. It is the proportion of persons with a particular disease or attribute on a particular date.

Period prevalence refers to prevalence measured over an interval of time. It is the proportion of persons with a particular disease or attribute at any time during the interval.

Method for calculating prevalence of disease

$$\text{Prevalence} = \frac{\text{All new and pre-existing cases during a given time period}}{\text{Population during the same time period}} \times 10^n$$

Prevalence and incidence are frequently confused. Prevalence refers to the proportion of persons who have a condition at or during a particular time period, whereas incidence refers to the proportion or rate of persons who develop a condition during a particular time period. Prevalence and incidence are therefore similar, but prevalence includes new and pre-existing cases whereas incidence includes new cases only. The key difference is in their numerators.

Numerator of incidence = new cases that occurred during a given time period

Numerator of prevalence = all cases present during a given time period


Prevalence is based on both incidence and duration of illness, as shown in Figure 4. High prevalence of a disease within a population might reflect high incidence, prolonged survival without cure, or both. Conversely, low prevalence might indicate low incidence, a rapidly fatal process, or rapid recovery.

Prevalence rather than incidence is often measured for chronic diseases such as hypertension or chronic kidney disease, which have a long duration and dates of onset that are difficult to pinpoint.
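The dependence of prevalence on incidence and duration is often summarized by a simple relationship. The expression below is a sketch that holds only under the simplifying assumption of a steady state (stable incidence rate and stable average duration of disease); the symbols $P$ (prevalence as a proportion), $I$ (incidence rate), and $\bar{D}$ (average duration of disease) are introduced here for illustration and do not appear in the original text.

$$\frac{P}{1 - P} = I \times \bar{D}$$

When prevalence is low, $1 - P$ is close to 1, and the relationship simplifies to

$$P \approx I \times \bar{D}$$

For instance, under these assumptions a disease with an incidence rate of 2 per 1,000 person-years and an average duration of 5 years would have a prevalence of roughly 10 per 1,000 persons.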

Figure 4: Relationship between prevalence and incidence

Example V

In a survey of 1,150 women who gave birth in Ramathibodi hospital in 2000, a total of 468 reported taking a multivitamin at least 4 times a week during the month before becoming pregnant. Calculate the prevalence of frequent multivitamin use in this group.

Numerator = 468 multivitamin users

Denominator = 1,150 women

Prevalence = (468 / 1,150) x 100 = 0.407 x 100 = 40.7%


Mortality Frequency Measures

A mortality rate is a measure of the frequency of occurrence of death in a defined population during a specified interval. Morbidity and mortality measures are often the same mathematically; it’s just a matter of what you choose to measure, illness or death. When mortality rates are based on vital statistics (e.g., counts of death certificates), the denominator most commonly used is the size of the population at the middle of the time period. Table 2 summarizes the formulas of frequently used mortality measures.

Table 2: Frequently Used Measures of Mortality


Age-adjusted mortality rates

Mortality rates can be used to compare the rates in one area with the rates in another area, or to compare rates over time. However, because mortality rates increase with age, a higher mortality rate in one population than in another might simply reflect the fact that the first population is older than the second. To eliminate the distortion caused by different underlying age distributions, statistical techniques are used to adjust or standardize the rates among the populations to be compared. These techniques take a weighted average of the age-specific mortality rates and so eliminate the effect of different age distributions among the populations. Mortality rates computed with these techniques are called age-adjusted or age-standardized mortality rates.
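One such technique is direct standardization, in which the age-specific rates of each population are weighted by the age distribution of a common standard population. The following is a minimal sketch under that assumption; the two populations, the standard population, and the function names are all hypothetical and chosen so that the age-specific rates are identical while the age structures differ.

```python
def rate_per_1000(deaths, population):
    """Deaths per 1,000 population."""
    return deaths / population * 1000


def directly_standardized_rate(deaths_by_age, population_by_age, standard_population):
    """Weighted average of age-specific rates, weighted by the standard population."""
    total_standard = sum(standard_population.values())
    weighted_sum = sum(
        rate_per_1000(deaths_by_age[age], population_by_age[age])
        * standard_population[age]
        for age in standard_population
    )
    return weighted_sum / total_standard


# Hypothetical data: same age-specific rates, different age structures.
young_population = {"0-39": 6000, "40-64": 3000, "65+": 1000}
young_deaths = {"0-39": 6, "40-64": 15, "65+": 40}
old_population = {"0-39": 3000, "40-64": 3000, "65+": 4000}
old_deaths = {"0-39": 3, "40-64": 15, "65+": 160}
standard = {"0-39": 5000, "40-64": 3000, "65+": 2000}

crude_young = rate_per_1000(sum(young_deaths.values()), sum(young_population.values()))
crude_old = rate_per_1000(sum(old_deaths.values()), sum(old_population.values()))
adjusted_young = directly_standardized_rate(young_deaths, young_population, standard)
adjusted_old = directly_standardized_rate(old_deaths, old_population, standard)

print(f"Crude rates per 1,000: {crude_young:.1f} vs {crude_old:.1f}")
print(f"Age-adjusted rates per 1,000: {adjusted_young:.1f} vs {adjusted_old:.1f}")
```

In this illustration the crude rates differ only because the second population is older; once both are weighted by the same standard population, the adjusted rates coincide.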

Death-to-case ratio

The death-to-case ratio is the number of deaths attributed to a particular disease during a specified time period divided by the number of new cases of that disease identified during the same time period. The death-to-case ratio is a ratio but not necessarily a proportion, because some of the deaths that are counted in the numerator might have occurred among persons who developed disease in an earlier period, and are therefore not counted in the denominator.

Method for calculating death-to-case ratio

$$\text{Death-to-case ratio} = \frac{\text{Number of deaths attributed to a particular disease during specified period}}{\text{Number of new cases of the disease identified during the specified period}}$$


Case-fatality rate

The case-fatality rate is the proportion (not a true rate) of persons with a particular condition (cases) who die from that condition. It is a measure of the severity of the condition. Because the case-fatality rate is a proportion, the numerator is restricted to deaths among people included in the denominator. The time periods for the numerator and the denominator do not need to be the same; the denominator could be cases of HIV/AIDS diagnosed during the calendar year 1990, and the numerator, deaths among those diagnosed with HIV in 1990, could be from 1990 to the present.

Method for calculating case-fatality rate

$$\text{Case-fatality rate} = \frac{\text{Number of cause-specific deaths among the incident cases}}{\text{Number of incident cases}} \times 10^n$$

The concepts behind the case-fatality rate and the death-to-case ratio are similar, but the formulations are different. The death-to-case ratio is simply the number of cause-specific deaths that occurred during a specified time divided by the number of new cases of that disease that occurred during the same time. The deaths included in the numerator of the death-to-case ratio are not restricted to the new cases in the denominator. In fact, for many diseases, the deaths are among persons whose onset of disease was years earlier. In contrast, in the case-fatality rate, the deaths included in the numerator are restricted to the cases in the denominator.
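The difference between the two formulations can be seen with a small numerical sketch. All counts below are hypothetical, as are the function names: 200 new cases are assumed to be diagnosed in one year, with 30 cause-specific deaths in that year, most of them among persons diagnosed in earlier years.

```python
def death_to_case_ratio(deaths_in_period, new_cases_in_period, multiplier=100):
    """Deaths in a period per `multiplier` new cases identified in the same period.

    The deaths need not have occurred among the new cases.
    """
    return deaths_in_period / new_cases_in_period * multiplier


def case_fatality_rate(deaths_among_cases, cases, multiplier=100):
    """Proportion of a defined group of cases who die from the condition."""
    if deaths_among_cases > cases:
        raise ValueError("deaths must occur among the cases in the denominator")
    return deaths_among_cases / cases * multiplier


# Hypothetical counts for one calendar year.
new_cases = 200
deaths_this_year_among_new_cases = 12
deaths_this_year_among_earlier_cases = 18
later_deaths_among_new_cases = 8  # deaths in subsequent years, same cohort

dcr = death_to_case_ratio(
    deaths_this_year_among_new_cases + deaths_this_year_among_earlier_cases,
    new_cases,
)
cfr = case_fatality_rate(
    deaths_this_year_among_new_cases + later_deaths_among_new_cases,
    new_cases,
)

print(f"Death-to-case ratio: {dcr:.1f} per 100 new cases")
print(f"Case-fatality rate: {cfr:.1f}%")
```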


Measures of association

The key to epidemiologic analysis is comparison. The term association was coined by statisticians for the situation in which two variables are observed to be related in some way. A measure of association quantifies the relationship between exposure and disease between two groups. Exposure is used to mean not only exposure to foods, mosquitoes, or a toxic substance, but also characteristics of persons (for example, age, race, sex), biologic characteristics (immune status), acquired characteristics (marital status), activities (occupation, physical activities), or conditions under which they live (socioeconomic status or access to medical care). Examples of measures of association include the risk ratio (relative risk), rate ratio, and odds ratio.

Risk ratio

Definition of risk ratio

A risk ratio (RR), also called relative risk, compares the risk of a health event (disease, injury, risk factor, or death) among one group with the risk among another group. It does so by dividing the risk (incidence proportion, attack rate) in group 1 by the risk (incidence proportion, attack rate) in group 2. The two groups are typically differentiated by such demographic factors as sex (e.g., males versus females) or by exposure to a suspected risk factor (e.g., DM or no DM).

Method for calculating risk ratio

The formula for risk ratio (RR) is:

$$\text{Risk ratio} = \frac{\text{Risk of disease (incidence proportion) in group of primary interest}}{\text{Risk of disease (incidence proportion) in comparison group}}$$


A risk ratio of 1.0 indicates identical risk in the two groups, which means no effect. A risk ratio greater than 1.0 indicates an increased risk for the exposed group. A risk ratio less than 1.0 indicates a decreased risk for the exposed group, suggesting that exposure may protect against disease occurrence.

Rate ratio

A rate ratio compares the incidence rates, person-time rates, or mortality rates of two groups. As with the risk ratio, the two groups are typically differentiated by demographic factors or by exposure to a suspected causative agent.

$$\text{Rate ratio} = \frac{\text{Rate for group of primary interest}}{\text{Rate for comparison group}}$$

The interpretation of the value of a rate ratio is similar to that of the risk ratio.

Odds ratio

An odds ratio (OR) is another measure of association that quantifies the relationship between an exposure with two categories and a health outcome. Referring to the four cells in Table 3.15, the odds ratio is calculated as

$$\text{Odds ratio} = \frac{a/b}{c/d} = \frac{ad}{bc}$$

where

a = number of persons exposed and with disease

b = number of persons exposed but without disease

c = number of persons unexposed but with disease

d = number of persons unexposed and without disease

a+c = total number of persons with disease (case-patients)

b+d = total number of persons without disease (controls)


The odds ratio is sometimes called the cross-product ratio because the numerator is based on multiplying the value in cell “a” times the value in cell “d,” whereas the denominator is the product of cell “b” and cell “c.”

Example VI

A. Exposure and Disease in a Hypothetical Population of 10,000 Persons

|           | Disease | No disease | Total  | Risk |
| Exposed   | 100     | 1,900      | 2,000  | 5.0% |
| Unexposed | 80      | 7,920      | 8,000  | 1.0% |
| Total     | 180     | 9,820      | 10,000 |      |

Calculate the risk and odds ratios.

1. Risk ratio: 5.0 / 1.0 = 5.0

2. Odds ratio: (100 x 7,920) / (1,900 x 80) = 5.2

Notice that the odds ratio of 5.2 is close to the risk ratio of 5.0. That is one of the attractive features of the odds ratio: when the health outcome is uncommon, the odds ratio provides a reasonable approximation of the risk ratio. Another attractive feature is that the odds ratio can be calculated with data from a case-control study, whereas neither a risk ratio nor a rate ratio can be calculated from such data.
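The two calculations in Example VI can be checked from the four cell counts used in the odds ratio calculation (a = 100, b = 1,900, c = 80, d = 7,920). This is a minimal sketch; the function names `risk_ratio` and `odds_ratio` are hypothetical helpers, and the cell labels follow the definitions of a, b, c, and d given earlier.

```python
def risk_ratio(a, b, c, d):
    """Risk in the exposed (a / (a + b)) divided by risk in the unexposed (c / (c + d))."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


def odds_ratio(a, b, c, d):
    """Cross-product ratio ad / bc."""
    return (a * d) / (b * c)


# Cell counts from Example VI.
a, b, c, d = 100, 1900, 80, 7920

rr = risk_ratio(a, b, c, d)
odds = odds_ratio(a, b, c, d)

print(f"Risk ratio: {rr:.1f}")
print(f"Odds ratio: {odds:.1f}")
```

Because the outcome is uncommon in both groups (5% and 1%), the odds ratio stays close to the risk ratio; with a more common outcome the two would diverge.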


Measures of Public Health Impact

A measure of public health impact is used to place the association between an exposure and an outcome into a meaningful public health context. Whereas a measure of association quantifies the relationship between exposure and disease, and thus begins to provide insight into causal relationships, measures of public health impact reflect the burden that an exposure contributes to the frequency of disease in the population. Commonly used measures of public health impact are the attributable risk and the attributable proportion.

The attributable risk

The attributable risk reflects the quantifiable disease burden in an exposed group that is attributable to the exposure. It answers the question of how much risk can be attributed to the exposure, that is, the excess risk due to the exposure. It is calculated as the risk difference (RD): the risk in the exposed group minus the risk in the unexposed group.


The attributable proportion

The attributable proportion, also known as the attributable risk percent, is a measure of the public health impact of a causative factor. The calculation of this measure assumes that the occurrence of disease in the unexposed group represents the baseline or expected risk for that disease. It further assumes that if the risk of disease in the exposed group is higher than the risk in the unexposed group, the difference can be attributed to the exposure. Thus, the attributable proportion is the proportion of disease in the exposed group that is attributable to the exposure. It represents the expected reduction in disease if the exposure could be removed (or had never existed).

Appropriate use of the attributable proportion depends on a single risk factor being responsible for a condition. When multiple risk factors may interact (e.g., physical activity and age or health status), this measure may not be appropriate.

Method for calculating attributable proportion

Attributable proportion is calculated as follows:

$$\text{Attributable proportion} = \frac{\text{Risk for exposed group} - \text{Risk for unexposed group}}{\text{Risk for exposed group}} \times 100\%$$
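As a worked illustration, the attributable risk and attributable proportion can be computed from the risks in Example VI (5.0% in the exposed group and 1.0% in the unexposed group). This is a minimal sketch; the function names are hypothetical, and the result is meaningful only under the assumptions stated above.

```python
def attributable_risk(risk_exposed, risk_unexposed):
    """Risk difference: excess risk in the exposed group."""
    return risk_exposed - risk_unexposed


def attributable_proportion(risk_exposed, risk_unexposed):
    """Share of the exposed group's risk attributable to exposure, as a percentage."""
    return (risk_exposed - risk_unexposed) / risk_exposed * 100


# Risks from Example VI, expressed per 100 persons.
risk_exposed = 5.0
risk_unexposed = 1.0

ar = attributable_risk(risk_exposed, risk_unexposed)
ap = attributable_proportion(risk_exposed, risk_unexposed)

print(f"Attributable risk: {ar:.1f} per 100 exposed persons")
print(f"Attributable proportion: {ap:.0f}%")
```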

Summary

Frequency measures include ratios, proportions, and rates. Ratios and proportions are

useful for describing the characteristics of populations. Proportions and rates are used

for quantifying morbidity and mortality. These measures allow epidemiologists to infer risk

among different groups, detect groups at high risk, and develop hypotheses about causes

— that is, why these groups might be at increased risk. The two primary measures of

morbidity are incidence and prevalence.


A variety of mortality rates describe deaths among specific groups, particularly by age or

sex or by cause. The hallmark of epidemiologic analysis is comparison, such as

comparison of observed amount of disease in a population with the expected amount of

disease. The comparisons can be quantified by using such measures of association as

risk ratios, rate ratios, and odds ratios. These measures provide evidence regarding

causal relationships between exposures and disease. Measures of public health impact

place the association between an exposure and a disease in a public health context. Two

such measures are the attributable risk and attributable proportion.
