Scales of Measurement

Data comes in various sizes and shapes, and it is important to know about these so that the proper analysis can be used on the data. There are usually four scales of measurement that must be considered:

1. Nominal Data
   - classification data, e.g., m/f
   - no ordering, e.g., it makes no sense to state that M > F
   - arbitrary labels, e.g., m/f, 0/1, etc.
2. Ordinal Data
   - ordered, but differences between values are not important
   - e.g., political parties on a left-to-right spectrum given labels 0, 1, 2
   - e.g., Likert scales: rank your degree of satisfaction on a scale of 1..5
   - e.g., restaurant ratings
3. Interval Data
   - ordered, constant scale, but no natural zero
   - differences make sense, but ratios do not (e.g., 30-20 = 20-10, but 20 degrees is not twice as hot as 10 degrees!)
   - e.g., temperature (C, F), dates
4. Ratio Data
   - ordered, constant scale, natural zero
   - e.g., height, weight, age, length

Some computer packages (e.g., JMP) use these scales of measurement to make decisions about the type of analyses that should be performed. Also, some packages make no distinction between interval and ratio data, calling them both continuous. However, this is, technically, not quite correct.

Only certain operations can be performed on certain scales of measurement. The following list summarizes which operations are legitimate for each scale. Note that you can always apply operations from a 'lesser scale' to any particular data, e.g., you may apply nominal, ordinal, or interval operations to an interval scaled datum.

Nominal Scale. You are only allowed to examine whether a nominal scale datum is equal to some particular value, or to count the number of occurrences of each value. For example, gender is a nominal scale variable. You can examine whether the gender of a person is F, or count the number of males in a sample. A researcher might wish to compare essay grades between male and female students. Tabulations would be compiled using the categories "male" and "female." Sex would be a nominal variable.

Note that the categories themselves are not quantified. Maleness and femaleness are not numerical in nature; rather, the frequencies of each category result in data that is quantified -- 11 males and 9 females. Gender, handedness, favorite color, and religion are examples of nominal variables.

Nominal
Nominal scales are the lowest scales of measurement. Numbers are assigned to categories as "names". Which number is assigned to which category is completely arbitrary. Therefore, the only number property of the nominal scale of measurement is identity. The number gives us the identity of the category assigned. The only mathematical operation we can perform with nominal data is to count.

Classifying people according to gender is a common application of a nominal scale. In the example below, the number "1" is assigned to "male" and the number "2" is assigned to "female". We can just as easily assign the number "1" to "female" and "2" to "male". The purpose of the number is merely to name the characteristic or give it "identity".

As we can see from the graphs, changing the number assigned to "male" and "female" does not have any impact on the data -- we still have the same number of men and women in the data set.
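The relabeling argument can also be checked in a few lines of code. The sketch below is only an illustration: the sample of 11 males and 9 females reuses the counts mentioned earlier, and the variable and function names are made up for this example.

```python
from collections import Counter

# Hypothetical sample: 11 males and 9 females, coded 1 = male, 2 = female.
codes_a = [1] * 11 + [2] * 9
labels_a = {1: "male", 2: "female"}

# The same people with the arbitrary codes swapped: 1 = female, 2 = male.
codes_b = [2] * 11 + [1] * 9
labels_b = {1: "female", 2: "male"}


def count_by_category(codes, labels):
    """Count how often each category occurs, reported by name, not by code."""
    return Counter(labels[code] for code in codes)


counts_a = count_by_category(codes_a, labels_a)
counts_b = count_by_category(codes_b, labels_b)

print(counts_a)
print(counts_a == counts_b)  # the counts do not depend on the coding
```

Counting is the only operation used here. Averaging the codes would give a different answer under each coding, which is why it is not a meaningful operation for nominal data.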

Additional examples of everyday nominal scales are zip codes and area of the country.

A typical example of a nominal variable in psychology is diagnosis. The diagnostic system used by clinical psychologists and psychiatrists assigns numbers to different diagnostic categories (Diagnostic and Statistical Manual of Mental Disorders - IV (DSM-IV), American Psychiatric Association, 1994). These numbers are merely arbitrary "codes" for easy record keeping for hospitals and insurance companies. Let's look at three diagnoses -- Schizophrenia - Disorganized Type (295.1); Major Depressive Disorder - Recurrent (296.3); and Obsessive-Compulsive Disorder (300.3). The graph below shows the prevalence of these disorders in a hospital population.

Suppose the hospital administrator decided to change the numbers assigned to each diagnosis. Let's see what happens to the distribution.

The categories appear at different places on the x-axis but the prevalence data are the same. Only the label changes.

Other examples of nominal scales in psychological research: behavior codes in naturalistic observations, drug type, brain regions.

Ordinal Scale. You are also allowed to examine whether an ordinal scale datum is less than or greater than another value. Hence, you can 'rank' ordinal data, but you cannot 'quantify' differences between two ordinal values. For example, political party is an ordinal datum with the NDP to the left of the Conservative Party, but you can't quantify the difference. Another example is preference scores, e.g., ratings of eating establishments where 10=good and 1=poor, but the difference between an establishment with a 10 ranking and one with an 8 ranking can't be quantified. They indicate only that one data point is ranked higher or lower than another (Runyon, 1991).

For instance, a researcher might want to analyze the letter grades given on student essays. An A would be ranked higher than a B, and a B higher than a C. However, the difference between these data points, the precise distance between an A and a B, is not defined. Letter grades are an example of an ordinal variable.

A researcher wishing to measure consumers' satisfaction with their microwave ovens might ask them to specify their feelings as either "very dissatisfied", "somewhat dissatisfied", "somewhat satisfied", or "very satisfied". The items in this scale are ordered, ranging from least to most satisfied. This is what distinguishes ordinal from nominal scales. Whereas nominal scales don't allow comparisons in the degree to which two subjects possess the dependent variable, just this kind of comparison is possible with ordinal scales. For example, our satisfaction ordering makes it meaningful to assert that one person is more satisfied than another with their microwave oven. Such an assertion reflects the first person's use of a verbal label that comes later in the list than the label chosen by the second person.

On the other hand, ordinal scales fail to capture important information that will be present in the other scales we examine. In particular, the difference between two levels of an ordinal scale cannot be assumed to be the same as the difference between two other levels. In our satisfaction scale, for example, the difference between the responses "very dissatisfied" and "somewhat dissatisfied" cannot be compared to the difference between "somewhat dissatisfied" and "somewhat satisfied". Nothing in our measurement procedure allows us to determine whether the two differences reflect the same difference in psychological satisfaction. Statisticians express this point by saying that the differences between adjacent scale values do not necessarily represent equal intervals on the underlying scale giving rise to the measurements. (In our case, the underlying scale is the true feeling of satisfaction, which we are trying to measure.)

What if the researcher had measured satisfaction by asking consumers to indicate their level of satisfaction by choosing a number from one to four? Would the difference between the responses of one and two necessarily reflect the same difference in satisfaction as the difference between the responses two and three? The answer is no. Changing the response format to numbers does not change the meaning of the scale. We still are in no position to assert that the mental step from 1 to 2 (for example) is the same as the mental step from 3 to 4.

Ordinal scales have the property of magnitude as well as identity. The numbers represent a quality being measured (identity) and can tell us whether a case has more or less of the quality measured than another case (magnitude). The distance between scale points is not equal. Ranked preferences are presented as an example of ordinal scales encountered in everyday life. We also address the concept of unequal distance between scale points.

Ranked Preferences
We are often interested in preferences for different tastes, especially if we are planning a party.
Let's say that we asked the three students pictured below to rank their preferences for four different sodas. We usually rank our strongest preference as "1". With four sodas, our lowest preference would be "4". For each soda, we assign a rank that tells us the order (magnitude) of the preference for that particular soda (identity). The number simply tells us that we prefer one soda over another, not "how much" more we prefer the soda.

Because of the property of magnitude (or order), the numbers are no longer considered arbitrary as they are in nominal scales. If you asked students their preferences because you wanted to serve what they like best at a party, you would serve our first student Pepsi, our second student Sprite, and our third student Surge.

Let's change the numbers assigned to "Pepsi" and "Coke" for our first student.

Changing the numbers changes the meaning of the preferences. You would now serve our first student Coke and not Pepsi.

Distance Between Scale Points
We assume that the intervals between scale points on ordinal scales are unequal. Thus, the "distance" between a rank of "1" and "2" is not necessarily the same as the "distance" between ranks of "3" and "4".

Let's say our first student likes Pepsi the best but also has a strong liking for Coke, which she rated as "2". She thinks Sprite is OK but prefers cola drinks. She really does not like Surge at all. In this case the preference "distance" between "3" and "4" is much greater than the preference "distance" between ranks "1" and "2" even though the numerical distance between them is the same. This concept of unequal psychological distance is pictured below.

Other examples of everyday ordinal scales: socioeconomic status, class rank, letter grade.

We will look at two different forms of ordinal scales:
- ranked preferences
- assigned ranks
We will end with a discussion of the interval or distance between scale points.

Ranked Preferences
Physiological psychologists are often interested in preferences for different tastes. Let's say that you were asked to taste five different foods and rank your preference in order. The foods are sweet, salty, bitter, sour, and fatty. We usually rank our strongest preference as "1". With five foods, our lowest preference would be "5". These ranks have the property of identity because they tell us which food, and magnitude because they place the preference in order. They do not tell us "how much" more, just more or less.

Because of the property of magnitude (or order), the numbers are no longer considered arbitrary. The numbers reflect a characteristic of the person -- taste preference. Let's change the numbers assigned to "Sweet" and "Fatty" and "Sour" for our first student.

Changing the numbers changes the meaning -- the numbers now indicate a very different type of person -- rather than having a "sweet tooth", we would think of this student as having an unusual taste preference.

Assigned Ranks
Psychologists are often asked to rank the performance of group members. Typically, we assign ranks in order to select a smaller subset or to show an individual's relative placement in a larger group. For example, a graduate program has 10 applicants who qualify for a scholarship but can only fund 3 applicants. Faculty members are asked to rank the students using a set of criteria in order to select the scholarship winners from the applicant pool.

Many colleges use class rank as an admissions criterion. Here, the student's academic performance is given a relative place in a larger group. Assigned ranks are classified as ordinal scales because they have the properties of identity and magnitude. We know that one person has been identified as having more skill than another, but we do not know exactly how much more skill.

Listed below are class ranks assigned on the basis of high school grade point average (GPA).

The student with the 3.86 GPA is assigned a rank of 6 which means that 5 students have a higher GPA and the remaining students have a lower GPA.

Let's see what happens if we switch the numbers around.

With this change, the number "6" no longer has the same meaning; we no longer know the relative class standing of any student in this group.

The order of numbers must be constant in either ascending or descending order.

Distance Between Scale Points
We assume that the intervals between scale points on ordinal scales are unequal. Thus, the "distance" between a rank of "1" and "2" is not necessarily the same as the "distance" between ranks of "3" and "4".

Let's say that for the ranked preferences, our second student likes fatty tastes the best but also has a strong liking for salty foods, which he rated as "2". He likes sweet foods (rated "3") but has a stronger preference for foods such as French fries and potato chips that are both fatty and salty. He does not like sour foods (rated "4") and really hates bitter tastes (rated "5"). In this case the preference "distance" between "3" and "4" is much greater than the preference "distance" between ranks "1" and "2". The psychological distance represented by the interval between numbers is not equal.

Let's look at the same issue for assigned ranks. Even though there is a one-point distance between ranks 2 and 3 and between ranks 5 and 6, the differences in GPAs are not equal. GPAs that are .02 points apart separate ranks 2 and 3, and GPAs that are .04 points apart separate ranks 5 and 6. Thus, we know that student 2 is one rank higher than student 3 and student 5 is one rank higher than student 6, but the rank does not tell us how much higher they are in their GPA.
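The assigned-ranks point can be sketched in code. The GPA values below are hypothetical: they were chosen only so that the student with a 3.86 GPA holds rank 6 and the gaps between ranks match the .02 and .04 differences described above. The names `gpas`, `rank_of`, and `gpa_gap` are made up for this illustration.

```python
# Hypothetical GPAs, chosen only to reproduce the gaps described in the text.
gpas = [3.98, 3.94, 3.92, 3.91, 3.90, 3.86, 3.80, 3.75]

# Rank 1 goes to the highest GPA (this sample has no ties).
ordered = sorted(gpas, reverse=True)
rank_of = {gpa: position for position, gpa in enumerate(ordered, start=1)}


def gpa_gap(higher_rank, lower_rank):
    """GPA difference between the students holding two ranks."""
    return round(ordered[higher_rank - 1] - ordered[lower_rank - 1], 2)


print(rank_of[3.86])  # 6
print(gpa_gap(2, 3))  # 0.02
print(gpa_gap(5, 6))  # 0.04
```

Both pairs of students are exactly one rank apart, yet the underlying GPA gaps differ by a factor of two. The ranks preserve order only.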

Interval Scale. You are also allowed to quantify the difference between two interval scale values, but there is no natural zero. For example, temperature scales are interval data: 25C is warmer than 20C, and a 5C difference has some physical meaning. Note that 0C is arbitrary, so it does not make sense to say that 20C is twice as hot as 10C.

A researcher might analyze the actual percentage scores of the essays, assuming that percentage scores are given by the instructor. A score of 98 (A) ranks higher than a score of 87 (B), which ranks higher than a score of 72 (C). Not only is the order of these three data points known, but so is the exact distance between them -- 11 percentage points between the first two, 15 percentage points between the second two, and 26 percentage points between the first and last data points.

Interval
Interval scales have the properties of:
- identity
- magnitude
- equal distance
The equal distance between scale points allows us to know how many units greater than, or less than, one case is from another on the measured characteristic. So, we can always be confident that the meaning of the distance between 25 and 35 is the same as the distance between 65 and 75. Interval scales DO NOT have a true zero point; the number "0" is arbitrary.

A good example of an interval scale is the measurement of temperature on Fahrenheit or Celsius scales. The units on a thermometer represent equal volumes of mercury between each interval on the scale. The thermometer identifies for us how many units of mercury correspond to the temperature measured.

We know that 60 is hotter than 30 and that there is the same 10-degree difference in temperature between 20 and 30 as between 50 and 60. Zero degrees on either scale is an arbitrary number and not a "true" zero. The zero point does not indicate an absence of temperature; it is an arbitrary point on the scale.
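One way to see why differences are meaningful on an interval scale while ratios are not is to express the same temperatures on both scales. The sketch below uses the standard Celsius-to-Fahrenheit conversion; the specific temperatures and variable names are chosen only for illustration.

```python
import math


def c_to_f(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32


low, mid, high = 10.0, 20.0, 30.0

# Equal differences stay equal on either scale.
equal_steps_c = math.isclose(high - mid, mid - low)
equal_steps_f = math.isclose(c_to_f(high) - c_to_f(mid), c_to_f(mid) - c_to_f(low))

# Ratios depend on where the arbitrary zero sits.
ratio_c = mid / low                  # 20C / 10C
ratio_f = c_to_f(mid) / c_to_f(low)  # 68F / 50F

print(equal_steps_c, equal_steps_f)
print(ratio_c, ratio_f)
```

The two steps of 10 degrees Celsius are equal on both scales, but the apparent "twice as hot" ratio of 2 in Celsius becomes 1.36 in Fahrenheit for the same two temperatures. A statement that changes with the choice of scale carries no physical meaning.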

Other examples of everyday interval scales: age (0 is culturally determined), SAT scores.

Many of our standardized tests in psychology use interval scales. An IQ (Intelligence Quotient) score from a standardized test of intelligence is a good example of an interval scale score.

IQ scores are derived from a lengthy testing process that requires the participant to complete a number of cognitive tasks. Each task is scored and the set of scores is converted into an overall standardized IQ score. IQ scores are created so that a score of 100 represents the average IQ of the population and the standard deviation (or average variability) of scores is 15. A distribution of IQ scores is presented below.

If one student receives an IQ score of 84 and another student receives an IQ score of 116, we can count on the units having the same meaning in order to make our interpretation. The first student would be 16 points below the mean, which would indicate a below-average potential for educational pursuits. The second student would be 16 points above the mean, which would indicate an above-average potential for educational activities.

There is no zero point for IQ. We do not think of a person as having no intelligence (although we may be tempted to make that evaluation upon occasion). The same holds for standardized scales of personality or other psychological attributes -- a zero point is an arbitrary point on a scale and does not indicate the absence of a quality or characteristic.

You may need to read a test manual or detailed description of scoring procedures to determine whether a standardized test is measured on an interval scale.

The interval scale of measurement only permits mathematical operations of addition and subtraction. We can combine amounts or remove amounts. We can discuss an amount that is more than, less than, or equal to another amount. But, we cannot make statements that involve multiplication or division. When measuring IQ, we can say that a person with an IQ score of 110 is 40 points higher than someone with an IQ score of 70, but we would never say that an IQ of 120 means that someone is twice as intelligent as someone with an IQ of 60. A true zero point is required to make valid statements about mathematical operations of multiplication or division of numbers on a scale.

Other examples of interval scales in psychological research: Most standardized tests of achievement such as SAT, ACT, MCAT.

Ratio Scale. You are also allowed to take ratios among ratio scaled variables. Physical measurements of height, weight, and length are typically ratio variables. It is now meaningful to say that 10 m is twice as long as 5 m. This ratio holds true regardless of the units in which the object is measured (e.g., meters or yards). This is because there is a natural zero.

Counts of behavior are another example. Let's count how many times children whisper to one another on the bus.

Observe Pair #1 -- they whisper 1 time. Observe Pair #2 -- they whisper 6 times. Observe Pair #3 -- they whisper 11 times.

If in a selected interval, we never observed two children whisper, we have confidence that the "0" point represents an absence of that particular behavior.

The equal intervals and true zero point allow us to know that Pair #2 whispered 6 times as often as Pair #1.

Now, let's compare this count of the number of behaviors to our interval scale. If we were measuring IQ, we would never say that an IQ of 120 means that someone is twice as intelligent as someone with an IQ of 60. We would never say that someone had no IQ. Yet, we can confidently discuss how many more times a particular behavior occurred. This is the advantage of ratio scales. Without a true zero point (such as for IQ or personality tests), we cannot do multiplication or division and thus cannot talk about twice as many or half as much of a characteristic. It is the true zero point in ratio scales that allows us to multiply and divide.

Other examples of ratio scales in psychological research: height, weight, volume, latency.

Likert-type Ratings
Likert-type ratings are used in surveys where we are asked to rate how much we agree or disagree with a statement. Some psychologists classify Likert-type ratings as ordinal scales; others consider them to be roughly interval or approximately equal interval. Likert-type ratings have the properties of identity and order. They have the property of identity because they let us know whether we agree or disagree. They also have the property of order because each number represents a rating that is more or less than the others. Psychologists disagree as to whether the interval between scale points is equal or not equal.

Listed below is an item from a survey about relationships. We are asked to rate how much the item describes how we feel in our important relationships.

"It is easy for me to become emotionally close to others. I am comfortable depending on them and having them depend on me. I don't worry about being alone or having others not accept me."

This person chose a number that is almost the highest possible. We would consider this a high ranking or "strong" agreement with the statement.

Let's see what happens if we switch the numbers around.

On this scale, the number "6" no longer has the meaning of strongly agreeing with the statement. In fact, it does not have a clear meaning at all.

The order of numbers must be constant in either ascending or descending order.

Psychologists who classify Likert-type ratings as ordinal consider the distance between scale points to be unequal. In this view, the psychological distance between "1" (Not at All Like Me) and "4" (Somewhat Like Me) is not the same as between "4" (Somewhat Like Me) and "7" (Very Much Like Me), even though there is a difference of 3 points between each rating. Research suggests that our willingness to endorse positive statements is different from our willingness to endorse negative statements. We are also less likely to endorse statements at the extreme ends (e.g., "1" and "7") than at a more "moderate" level (e.g., "2" and "6"). Thus, the psychological distance represented by the interval between numbers is not equal.

Psychologists who classify Likert-type ratings as roughly interval consider the distance between scale points to be approximately equal interval. They argue that when scales are carefully constructed to have a sufficient number of scale points and appropriate labels, we can assume that the psychological distance between scale points is equal. But, because we can never be absolutely certain that the scale points represent equal psychological distance, we call the scales roughly or approximately interval to identify that they are treated as such.

So, do Likert-type ratings have an ordinal or approximately interval scale of measurement? There is no one correct answer. This is up to you, as a researcher and data analyst, to decide.

Now, we will add another twist to the discussion. Many psychologists create measures by adding up the individual Likert-type ratings or calculating an average rating across items. As some psychologists consider Likert ratings to be ordinal and others classify them as approximately interval, what is the scale of measurement of scores created from a set of individual Likert-type ratings? Most psychologists consider such scores to be roughly or approximately equal interval. The term "approximately interval" denotes that the scales are not interval but are treated as such in data analysis.

Suppose that a self-esteem scale asks people to rate how much they agree or disagree with a series of items that ask about positive or negative personal qualities. Scoring instructions state that responses should be averaged across all items.

1. The Likert-type ratings for each item can have a score of 1, 2, 3, or 4. A distribution of scores is presented below.
2. Once the items are averaged, the overall mean self-esteem score can have values such as 1.00, 1.50, 1.75, 2.00, 2.50, 2.75, 3.25, 3.50, 3.75, 4.00. The distribution of the average across items is presented below.
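The effect of averaging can be illustrated by enumerating every possible response pattern. The sketch below assumes a four-item scale, which is a guess made only for illustration (the text does not state the number of items); with four items rated 1 to 4, the mean moves in steps of 0.25.

```python
from itertools import product

ITEM_VALUES = (1, 2, 3, 4)  # possible ratings for a single item
N_ITEMS = 4                 # assumed number of items, for illustration only

# Enumerate every response pattern and collect the distinct mean scores.
possible_means = sorted(
    {sum(pattern) / N_ITEMS for pattern in product(ITEM_VALUES, repeat=N_ITEMS)}
)

print(len(ITEM_VALUES))     # distinct values for one item: 4
print(len(possible_means))  # distinct values for the mean: 13
print(possible_means)
```

A single item can take only four values, while the averaged score can take thirteen under this assumption, and more with additional items.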

Creating an average across multiple items that measure the same underlying construct (self-esteem) is thought to increase the reliability and validity of the scale. The numeric average of these items gives us a total score with a much wider range of possible values than just 1 to 4.

Similarly, a sum or total of item responses will give a broader range of scores than the individual Likert-type ratings. For this reason, many psychologists and statisticians treat scales created in this fashion as interval scales even if they view the individual Likert-type ratings as ordinal scales. We call these scales "approximately interval" or "roughly interval" to indicate that, strictly speaking, they are not interval but are being treated as such in data analysis. Some scientists group together approximately interval, interval, and ratio scales into a larger category of "scale" or "score" data.

Why does the scale of measurement matter?
The scale of measurement of our variables determines the mathematical operations that are permitted for those variables. In turn, these mathematical operations determine which statistics can be applied to the data.

The chart below lists the scales of measurement that we have reviewed in this exercise and the types of statistics that can be applied to variables created using these scales of measurement.

Keep in mind that many psychologists make no distinction between approximately interval, interval, and ratio data. This is a matter of preference rather than being right or wrong.
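The cumulative rule described earlier (each scale permits its own operations plus those of every lesser scale) can be written down as a small lookup. This is only a sketch: the operation names and the functions `permitted_operations` and `is_permitted` are labels invented for this example, not part of any statistics package.

```python
# Scales ordered from "lowest" to "highest".
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# Operations that each scale adds on top of the scales below it.
NEW_OPERATIONS = {
    "nominal": {"equality", "counting"},
    "ordinal": {"ordering"},
    "interval": {"difference"},
    "ratio": {"ratio"},
}


def permitted_operations(scale):
    """Return all operations allowed on a scale, including lesser scales."""
    level = SCALES.index(scale)
    operations = set()
    for lower_scale in SCALES[: level + 1]:
        operations |= NEW_OPERATIONS[lower_scale]
    return operations


def is_permitted(operation, scale):
    """Check whether an operation is legitimate for data on the given scale."""
    return operation in permitted_operations(scale)


print(sorted(permitted_operations("interval")))
print(is_permitted("ratio", "interval"))   # False: no natural zero
print(is_permitted("ordering", "ratio"))   # True: inherited from ordinal
```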

Frequently Asked Questions (FAQ)

1. What is a natural zero?
Some scales of measurement have a natural zero and some do not. For example, height and weight have a natural 0 at no height or no weight. Consequently, it makes sense to say that 2m is twice as large as 1m. Both of these variables are ratio scale. On the other hand, year and temperature (C) do not have a natural zero. The year 0 is arbitrary, and it is not sensible to say that the year 2000 is twice as old as the year 1000. Similarly, 0C is arbitrary (why pick the freezing point of water?), and it again does not make sense to say that 20C is twice as hot as 10C. Both of these variables are interval scale.
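A quick numerical check of the natural-zero idea: changing units on a ratio scale leaves ratios alone, while moving an arbitrary zero changes them. In the sketch below the meters-to-yards factor is the standard one; the 500-year calendar offset is purely hypothetical and stands in for any other choice of "year 0".

```python
import math

METERS_PER_YARD = 0.9144  # exact definition of the international yard


def meters_to_yards(meters):
    """Change of units on a ratio scale: multiply by a constant, zero stays zero."""
    return meters / METERS_PER_YARD


def shift_origin(value, offset):
    """Re-express an interval-scale value relative to a different arbitrary zero."""
    return value + offset


# Length has a natural zero, so the ratio survives a change of units.
length_ratio_m = 2.0 / 1.0
length_ratio_yd = meters_to_yards(2.0) / meters_to_yards(1.0)

# Calendar years have an arbitrary zero, so the ratio depends on the calendar.
year_ratio = 2000 / 1000
year_ratio_shifted = shift_origin(2000, 500) / shift_origin(1000, 500)

print(math.isclose(length_ratio_m, length_ratio_yd))
print(year_ratio, year_ratio_shifted)
```

The length ratio is 2 in both units, while the "twice as old" ratio of the two years drops to 2500/1500 under the shifted calendar.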
