Session 5. Designing SoTL Studies: Validity and Practicality in Quantitative Research Studies
Virginia Commonwealth University
ADLT 673 – Teaching as Scholarship in Medical Education
Kelly Lockeman, PhD
“Everything that can be counted does not necessarily count; everything that counts cannot necessarily be counted.”
- Albert Einstein
“Measure what is measurable, and make measurable what is not so.”
- Galileo Galilei
Quantitative Research
Experimental:
• Single subject
• True experimental
• Quasi-experimental
Non-Experimental:
• Descriptive
• Comparative
• Correlational
• Ex post facto
• Causal-comparative
Do you have a good hypothesis?
• Is it stated in declarative form?
• Is it consistent with known facts, previous research, and theory?
• Does it state the expected relationship between two or more variables?
• Is it testable?
• Is it clear?
• Is it concise?
Variables: Building Blocks for Research
Variable: A concept or characteristic that can take on different values or be divided into categories.
[Diagram: three variables labeled A, B, and C]
Defining Variables
Conceptual Definitions: use words and concepts to describe the variable.
Operational Definitions: indicate how the concept is measured or manipulated.
• How would you define each of these variables conceptually?
• How would you operationalize them in a research study?
Gender, Race, Social Class, Leadership Style, Achievement, Efficacy, Depression
Part I: Validity
Testing Your Hypothesis
Independent Variable (IV): The predictor or cause. In experimental studies, the researcher controls or manipulates the independent variable (the treatment or intervention).
Dependent Variable (DV): The outcome or effect that the researcher measures (e.g., knowledge, skills, or attitudes).
[Diagram: variables A and B labeled Independent; variable C labeled Dependent]
Questions of Validity
Possible problems related to causality:
• The assessment was not measured well.
• The intervention was not manipulated well.
• Something other than the intervention caused change in the assessment.
Sample Hypothesis: The intervention (IV) improves students’ performance on the assessment (DV).
Construct vs. Internal Validity
Construct Validity:
• Am I measuring what I think I am measuring?
• Am I implementing what I think I am implementing?
Internal Validity:
• Did the treatment cause the outcome?
Characteristics of Validity
• It is the inference that is valid or invalid, not the measure.
• An instrument can be valid for one use but not another.
• Validity is a matter of degree.
• Validity involves an overall evaluative judgment based on evidence.
Evaluating Validity
• A study does not have absolute validity or absolutely no validity.
• The level of validity relates to the confidence in the conclusions.
• Construct and internal validity are measured on a continuum.
• Construct validity does not imply internal validity (and vice versa).
• When a hypothesis is supported, it does not necessarily mean that the study has either construct or internal validity.
What is meant by “construct”?
• A concept, model, or schematic idea
• A construct is the global notion of the measure, such as:
▪ Student motivation
▪ Intelligence
▪ Student learning
▪ Student anxiety
• The specific method of measuring a construct is called the operational definition.
• For any construct, researchers can choose many possible operational definitions.
What is a proxy measure?
Example:
• What is “productivity”? (Operational definition)
• How do we measure productivity? (Proxy measures)
Common measures of productivity:
▪ Work output
▪ Time and face time at work
▪ Absences
Common data collection methods:
▪ Observation
▪ Record review
▪ Self report
Proxy: approximates the real thing
Optimize Construct Validity
Measure learning directly (clear operational definitions; learning is not the same as enjoyment or perceived learning).
Measure student learning through student learning objectives (ensure these are aligned with assessments).
Use established scales to measure student attitudes and personality (don’t reinvent the wheel; consult Tests in Print).
Optimize Construct Validity, cont.
Know how to score the measure (establish this before data collection; know what is reasonable; use rubrics, rater training, and inter-rater reliability [IRR] checks).
Determine whether to use graded or ungraded measures (pros and cons of both).
Minimize participant and researcher expectancies.
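The scoring advice on this slide mentions rubrics, training, and IRR (inter-rater reliability). One common IRR statistic for two raters assigning categorical rubric scores is Cohen's kappa. The sketch below is a minimal illustration under that assumption; the function name and the rater scores are invented for the example and are not from the session.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items with categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items that received the same score from both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's marginal category counts.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical category for every item
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical rubric scores (1 to 3) from two trained raters on ten essays.
scores_a = [1, 2, 3, 3, 2, 1, 2, 3, 1, 2]
scores_b = [1, 2, 3, 2, 2, 1, 3, 3, 1, 2]
kappa = cohens_kappa(scores_a, scores_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so it is usually lower than the simple percentage of matching scores. What counts as acceptable agreement is a judgment call that should be settled before data collection.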
Optimize Construct Validity, cont.
Determine whether to use multiple operational definitions (can use multiple measures).
Use a retention measure to investigate long-term effects (but treat long-term results with caution, because other influences may occur in the interval).
Good Differences between Conditions Improve Construct Validity
• The treatment (intervention) needs to be manipulated well to ensure construct validity.
• The only difference between conditions should be the treatment.
• Other variables that are different between conditions are confounds.
• To determine construct validity, treatments need specific operational definitions.
• Anything that can affect the results and cause a difference between students in treatment and control conditions needs to be documented.
Potential problems in using different sections of a class
• Construct validity of the treatment is questionable in any design that compares one section of a class with another.
• Classes are a social space, and the students and instructors are interdependent.
• Students can ask different questions.
• The class may have a different “tone”.
• Splitting a class into two groups can minimize this concern; if students in a split class can be randomly assigned to a condition, internal validity will increase.
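Random assignment within a split class can be done with a seeded shuffle, so the allocation is both unbiased and reproducible. The sketch below is one minimal way to do it; the function name, roster, condition labels, and seed are all hypothetical.

```python
import random


def randomly_assign(students, conditions=("control", "treatment"), seed=None):
    """Shuffle the roster, then deal students into conditions in turn.

    Group sizes differ by at most one student.
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(students)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, student in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(student)
    return groups


# Hypothetical roster of 11 students in a single class section.
roster = [f"student_{i:02d}" for i in range(1, 12)]
groups = randomly_assign(roster, seed=673)
for condition, members in groups.items():
    print(condition, len(members))
```

Recording the seed alongside the study materials lets the assignment be audited later. Assignment by the instructor's choice or by student preference would reintroduce selection bias.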
Different Types of Comparison in Research Design
Between Participants
• How it works: Students in one condition are compared to students in another condition (control vs. treatment; multiple treatments are possible).
• Strengths: No carryover effects from multiple treatments; no instrumentation or testing effects from multiple assessments.
• Weaknesses: Selection bias without random assignment; many differences if groups are separate (e.g., two separate classes); lower statistical power.
• Improve internal validity by: Random assignment; adding covariates.
Within Participants: Multiple Treatments
• How it works: All students are in both control and treatment conditions.
• Strengths: No selection bias; greater statistical power.
• Weaknesses: Instrumentation and testing effects; carryover effects.
• Improve internal validity by: Counterbalancing.
Within Participants: Multiple Measures
• How it works: Students receive both a pre-test (control) and a post-test (treatment).
• Strengths: No selection bias; greater statistical power.
• Weaknesses: Instrumentation and testing effects; other confounds that occur between assessments.
• Improve internal validity by: Increasing the number of assessments; adding a separate no-treatment control condition; using alternative measures for assessment.
External Validity
• Can the sample used in the study generalize to other groups or populations?
• Generally, it is impossible in classroom studies to get a sample that will generalize to all students.
• The researcher should report demographic characteristics.
• How realistic is the situation? In a classroom, if the treatment works, external validity is higher.
Part II: Practicality
Common Problems in SoTL Research
• Trying to measure everything
• Small number of students = low statistical power
• Only a single class; limits type of design
• Difficulties in random assignment
• Difficulties in determining whether the treatment is potent enough to have an effect (relates to power)
• Conducting an ethical study in a classroom or training situation
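The link between class size, treatment potency, and statistical power can be made concrete with a rough sample-size estimate. The sketch below uses the standard normal-approximation formula for comparing two independent group means, n per group ≈ 2((z for 1 − α/2) + (z for power))² / d², where d is the expected standardized effect size. The function name and default values are assumptions chosen for illustration, and an exact t-based calculation gives slightly larger numbers.

```python
import math
from statistics import NormalDist


def students_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate students needed per group for a two-sided, two-group comparison."""
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Conventional "small", "medium", and "large" standardized effects.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {students_per_group(d)} students per group")
```

Under these assumptions a medium effect (d = 0.5) needs about 63 students per group, more than many single classes can supply. That is why the weaker the treatment, the harder it is to detect in a small class.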
Simple Correlation
Don’t Use:
• Want to make statement about causality
• Have low number of students
Use:
• Have single group of students that cannot be divided
• Have only one session in which to collect data
Additional Options:
• Correlate many variables at the same time
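A simple correlation between two measures collected in one session can be computed directly with Pearson's r. The sketch below is a minimal illustration; the variable names and data (weekly study hours and quiz scores) are invented for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2).")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("Correlation is undefined when a variable has no variation.")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data from one class session: weekly study hours and quiz scores.
study_hours = [2, 4, 5, 7, 8, 10]
quiz_scores = [61, 70, 68, 80, 79, 90]
r = pearson_r(study_hours, quiz_scores)
print(f"r = {r:.2f}")
```

Even a strong r describes association only, which is why this design is a poor fit when the goal is a causal statement.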
One-Group, Post-Test Only
Don’t Use:
• Want to make statement about causality
• Want to make comparison to another group
Use:
• Desired focus is on describing treatment and not assessment
• Cannot have pre-test or control group
• Have single group of students that cannot be divided
Two-Group, Post-Test Only
Don’t Use:
• Have low number of students
• Groups are very different
• Have different assessments for each condition
Use:
• Concerned about carryover effects
• Concerned about testing and instrumentation effects
• Have multiple groups
• Have only one session to collect data
Additional Options:
• Use random assignment to improve internal validity
• Add post-test to assess long-term change
• Add additional conditions
• Use covariates to improve internal validity and power
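In a two-group, post-test only design, the gap between conditions is often summarized as a standardized mean difference. The sketch below computes Cohen's d with a pooled standard deviation, which assumes roughly equal variances in the two groups. The function name and scores are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(treatment_scores, control_scores):
    """Standardized mean difference between two independent groups (pooled SD)."""
    n_t, n_c = len(treatment_scores), len(control_scores)
    if n_t < 2 or n_c < 2:
        raise ValueError("Each group needs at least two scores.")
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_variance = (
        (n_t - 1) * variance(treatment_scores) + (n_c - 1) * variance(control_scores)
    ) / (n_t + n_c - 2)
    if pooled_variance == 0:
        raise ValueError("Effect size is undefined when scores do not vary.")
    return (mean(treatment_scores) - mean(control_scores)) / math.sqrt(pooled_variance)


# Hypothetical post-test scores for two small groups.
control = [70, 72, 74, 76, 78]
treatment = [75, 77, 79, 81, 83]
d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

An effect size describes how large the difference is, not why it occurred. Without random assignment, pre-existing group differences remain a rival explanation.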
One Group, Pre-test, Post-test
Don’t Use:
• Items other than treatment occur between assessments
• First assessment affects second
• Students likely to change between assessments with no treatment
Use:
• Have low number of students
• Have single group that cannot be divided
• Cannot have control condition
Additional Options:
• Add post-test to assess long-term change
• Use alternative measures to minimize testing and instrumentation effects
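With one group measured before and after the treatment, the usual summary is the mean gain and a paired t statistic computed on each student's difference score. The sketch below shows that calculation with the standard library only; the function name and scores are invented, and a p-value would require a t distribution from a statistics package.

```python
import math
from statistics import mean, stdev


def paired_gain_summary(pre_scores, post_scores):
    """Mean gain and paired t statistic for matched pre-test and post-test scores."""
    if len(pre_scores) != len(post_scores) or len(pre_scores) < 2:
        raise ValueError("Need matched pre and post scores for at least two students.")
    gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
    sd_gain = stdev(gains)
    if sd_gain == 0:
        raise ValueError("All gains are identical; the t statistic is undefined.")
    # t = mean gain divided by the standard error of the mean gain.
    t_statistic = mean(gains) / (sd_gain / math.sqrt(len(gains)))
    return {"mean_gain": mean(gains), "t": t_statistic, "df": len(gains) - 1}


# Hypothetical scores for five students, listed in the same order in both lists.
pre = [60, 65, 70, 55, 80]
post = [68, 70, 75, 63, 82]
summary = paired_gain_summary(pre, post)
print(summary)
```

A large t shows that scores changed, not that the treatment caused the change. Testing effects and anything else that happened between the two assessments remain alternative explanations in this design.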
Two-Group, Pre-test/Post-test
Don’t Use:
• Have single group of students that cannot be divided
Use:
• Have multiple groups
Additional Options:
• Use random assignment to improve internal validity
• Add post-test to assess long-term change
• Use alternative measures to minimize testing and instrumentation effects
• Add additional conditions
• Use covariates to improve internal validity and power
Within Participants Design
Don’t Use:
• Early treatments affect later treatments
• Early assessments affect later assessments
Use:
• Have low number of students
• Have single group that cannot be divided
Additional Options:
• Add additional treatments
• Counterbalance conditions to improve internal validity
• Include pre-test to assess students before any treatment
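Counterbalancing means that different students receive the treatments in different orders, so that order effects are spread evenly across conditions. The sketch below shows full counterbalancing, where every possible order is used about equally often. The function name, treatment labels, roster, and seed are hypothetical, and full counterbalancing is only practical for a small number of treatments.

```python
import itertools
import random


def counterbalance(students, treatments, seed=None):
    """Assign each student a treatment order, cycling through all possible orders."""
    orders = list(itertools.permutations(treatments))  # every possible sequence
    rng = random.Random(seed)
    shuffled = list(students)
    rng.shuffle(shuffled)  # randomize which students receive which order
    return {student: orders[i % len(orders)] for i, student in enumerate(shuffled)}


# Hypothetical example: six students, two teaching methods.
students = ["Ana", "Ben", "Chi", "Dev", "Eli", "Fay"]
schedule = counterbalance(students, ("lecture", "case_study"), seed=5)
for student in students:
    print(student, "->", " then ".join(schedule[student]))
```

With two treatments there are two orders, so half of the students start with each one. With four treatments there are already 24 orders, at which point a partial scheme such as a Latin square is usually preferred.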
Crossover Design
Don’t Use:
• First assessment, by itself, affects second
• Have single group of students that cannot be divided
Use:
• Have low number of students
• Have multiple groups
Additional Options:
• Include pre-test to assess before treatment
• Add post-test to examine long-term change
• Use random assignment to improve internal validity
• Use alternative measures to minimize testing and instrumentation effects
Interrupted Time-Series Design
Don’t Use:
• Have only one session to collect data
• Early assessments affect later assessments
Use:
• Have low number of students
• Have single group that cannot be divided
• Want to determine long-term effects
Additional Options:
• Add control condition to improve internal validity
• Add additional treatment condition, with treatment at different time to improve internal validity
More Complex Designs
• Use multiple treatments to investigate interactions (Interactions)
• Use moderators to determine when treatment has effect (concept of aptitude-treatment interaction, ATI)
• Use mediators to investigate how treatment has effect (Mixed Method?)
Remember!
• Each design has advantages and disadvantages.
• Often, there is no clear right way, although some designs will be better than others.
• There is no single ideal study that eliminates all potential problems and all alternative hypotheses.
• One study cannot answer all of your questions!