Chapter 5
Measuring Variables and Sampling
Today: Begin Exam 2 material (Chapters 5, 6, 4)
  Scales of measurement
  Psychometric properties
    Reliability
    Validity
Tuesday: Finish Chapter 5; discuss Exam 1
Roadmap
We have:
  A research question
  An idea for a research design
  A hypothesis
But how do we measure what we’re interested in?
Zoom out: where are we?
We study variables and need to measure them accurately
4 scales of measurement:
  Nominal
  Ordinal
  Interval
  Ratio
Scales of Measurement
Symbols classify or categorize into GROUPS or TYPES
Name, Categorize, Classify
Caution: numbers used to indicate group are labels only; they carry no quantity or order
Examples: gender, marital status, experimental condition
Nominal Scale
A rank-order scale of measurement
Examples: order of finish, letter grade in class, social class (low, medium, high)
Allows you to determine which person is higher or lower, but not how much higher or lower
Can’t directly compare the size of the differences between ranks
Ordinal Scale
Rank ordering PLUS equal intervals of distance between adjacent numbers
Examples: Celsius and Fahrenheit temperature, IQ scores, year
Now you can compare differences between scores
Equal distances but no absolute zero point, so ratios are not meaningful
Interval Scale
Rank ordering, equal intervals PLUS an absolute zero point
Absolute zero = absence of the variable
Examples: Kelvin temperature, income, weight, height, response time
Ratio Scale
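A small sketch of why ratios only make sense on a ratio scale, using the Celsius and Kelvin examples from the slides above. The function name and the two temperatures are illustrative assumptions, not part of the lecture.

```python
def celsius_to_kelvin(celsius):
    """Convert an interval-scale temperature (Celsius) to a ratio-scale one (Kelvin)."""
    return celsius + 273.15

# Two hypothetical temperatures, chosen only for illustration.
c_low, c_high = 10.0, 20.0

# Differences are meaningful on an interval scale: the gap is the same in both units.
diff_celsius = c_high - c_low
diff_kelvin = celsius_to_kelvin(c_high) - celsius_to_kelvin(c_low)

# Ratios are not meaningful in Celsius because 0 degrees C is not "no temperature".
naive_ratio = c_high / c_low                                           # 2.0, misleading
kelvin_ratio = celsius_to_kelvin(c_high) / celsius_to_kelvin(c_low)   # about 1.035

print(f"Difference: {diff_celsius:.1f} C, {diff_kelvin:.1f} K")
print(f"Celsius ratio: {naive_ratio:.3f}, Kelvin ratio: {kelvin_ratio:.3f}")
```

20 °C is not "twice as hot" as 10 °C; only the Kelvin ratio, computed from an absolute zero point, can be interpreted that way.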
Reliability: consistency/stability of scores
Validity: Are you measuring what you are trying to measure?
Ideally, we want:
  Measures that are reliable
  Inferences that are valid
Reliability is necessary but not sufficient for validity
Psychometric properties
Think about a Target
4 primary types:
  Test-retest reliability
  Equivalent-forms reliability
  Internal consistency reliability
  Interrater reliability
Indicate level of reliability with a reliability coefficient
  A correlation; should be positive and strong (> .70)
Measuring Reliability
Refers to consistency over time
Same measure administered twice (with a time interval between)
Test-Retest
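One way to see a reliability coefficient in practice: a minimal sketch that correlates scores from two administrations of the same measure. The scores are made-up data for six hypothetical participants, and `pearson_r` is a hand-written helper rather than anything named in the lecture.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores: same people, same measure, two points in time.
time1 = [12, 15, 9, 20, 17, 11]
time2 = [13, 14, 10, 19, 18, 10]

r = pearson_r(time1, time2)
print(f"Test-retest reliability coefficient: r = {r:.2f}")
print("Meets the > .70 guideline" if r > 0.70 else "Below the > .70 guideline")
```

With these invented scores the coefficient comes out near .96, comfortably above the > .70 guideline.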
Equivalent forms: two versions of the same measure
Administer both to the same group of people
Problem: hard to develop equivalent measures
Examples: SAT, GRE
Equivalent-Forms Reliability
Consistency with which test items measure a single construct
More items increase reliability, but we use as few items as possible
Why?
Internal Consistency
I feel sad
I feel down
I feel depressed
I feel miserable
I feel awful
Example: Internal Consistency
I feel hungry
I feel happy
I have green eyes
Big Bird is scary
I like turtles
http://www.youtube.com/watch?v=CMNry4PE93Y
Example: Internal Consistency
Measured using coefficient alpha (α), a.k.a. Cronbach’s alpha
Should be .70 or higher
High values mean the items are measuring the same construct
If your scale measures more than one thing, each construct gets its own coefficient α
Internal Consistency
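A hedged sketch of how coefficient alpha can be computed, using the standard formula α = k/(k−1) × (1 − Σ item variances / variance of total scores). The response matrix is invented (five hypothetical respondents, four items on a 1 to 5 scale), and `cronbach_alpha` is an illustrative helper name.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # transpose: one tuple of scores per item
    item_variances = [variance(scores) for scores in items]
    total_scores = [sum(row) for row in rows]
    total_variance = variance(total_scores)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: rows are respondents, columns are items.
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [1, 2, 1, 1],
    [3, 3, 4, 3],
]

alpha = cronbach_alpha(responses)
print(f"Coefficient alpha = {alpha:.2f}")
```

Because these invented items rise and fall together across respondents, alpha is high (about .97). If a scale covers more than one construct, the same function would be run separately on each construct's items.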
Interrater reliability: consistency of ratings made by different judges
Examples: GRE writing section, expressive writing studies
Correlation between ratings should be strong and positive
Interrater Reliability
Percentage of times different observers (raters) agree
Easy to calculate and understand
Interobserver Agreement
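The percentage described above is simple enough to sketch directly. The two lists of codes are invented ratings from two hypothetical observers watching the same ten events.

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of observations on which two raters gave the same code."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("Both raters must rate the same number of observations")
    agreements = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100 * agreements / len(ratings_a)

# Hypothetical codes from two observers for the same ten observations.
rater_a = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes", "yes", "no"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes", "yes"]

agreement = percent_agreement(rater_a, rater_b)
print(f"Interobserver agreement: {agreement:.0f}%")  # raters match on 8 of 10
```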
Accuracy of inferences or interpretations made on the basis of scores
Examples: measuring schizophrenia, or love
  We can’t directly observe it!
  It’s the accuracy of the interpretation from the test
Validity
Construct
Operationalization
Important to consider:
  Does your operationalization truly reflect what you’re measuring?
Validation
  Never-ending process
Validity
Content validity: judgment of the degree to which items adequately represent a construct’s domain
  Do items appear to represent the thing you’re trying to measure? (face validity)
  Does your measure exclude any important parts of what you’re trying to measure?
  Does your test measure something besides what you wanted? (i.e., include irrelevant items)
Obtaining Validity: Based on Content
Some constructs are multidimensional and need measures that address all dimensions
Homogeneity: degree to which a set of items measures a single construct
  Item-to-total correlation
  Coefficient alpha
Obtaining Validity: Based on Internal Structure
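A rough sketch of an item-to-total check. It uses the "corrected" variant, an assumption on my part, in which each item is correlated with the sum of the other items so that an item is not correlated with itself. The data are invented, with the fourth item deliberately unrelated to the rest.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def item_rest_correlations(rows):
    """Correlate each item with the total of the remaining items."""
    k = len(rows[0])
    results = []
    for i in range(k):
        item_scores = [row[i] for row in rows]
        rest_totals = [sum(row) - row[i] for row in rows]
        results.append(pearson_r(item_scores, rest_totals))
    return results

# Hypothetical responses: rows are respondents, columns are items.
responses = [
    [4, 5, 4, 2],
    [2, 2, 3, 5],
    [5, 5, 5, 3],
    [1, 2, 1, 3],
    [3, 3, 4, 1],
]

correlations = item_rest_correlations(responses)
for number, r in enumerate(correlations, start=1):
    print(f"Item {number}: r with rest of scale = {r:.2f}")
```

In this made-up data the first three items correlate strongly with the rest of the scale while the fourth does not, which would flag it as a candidate for removal or for belonging to a different dimension.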
Criterion-related validity: degree to which scores predict or relate to an already established test
Two types of criterion validity:
  Predictive: using your measure to predict future performance
  Concurrent: using your measure to predict current performance on the same construct, or a related one
Obtaining Validity: Based on Relations to Other Variables
Convergent validity: relationship between your measure and other measures of that same construct
Discriminant validity: evidence that scores from your measure are NOT similar to scores of tests on different constructs
Obtaining Validity: Based on Relations to Other Variables
Reliability and validity info apply to the measure of interest in the reported sample
  Situation-specific, not broad
Standardized tests: norming group
  If you want to use a test with a group not represented in the norming group, be cautious
Report R & V for your own sample, and be wary of articles that make blanket statements about a measure’s R & V
Appropriate Use of Reliability and Validity Info