Assessment: The “Hidden Variable” of Achievement
A Field-tested Learning Assessment Guide (FLAG)
for STEM Instructors
Michael Zeilik, University of New Mexico
(www.flaguide.org)
Pop Quiz!
In terms of learning gains, which instructor attribute enhances achievement the most?
A. Teaching experience
B. Clarity of presentations
C. Energy and enthusiasm
D. Deep knowledge of subject
E. None of the above
What Works? 20th Century Gain Results
Active Learning, Mastery Learning (SD ≥ 0.5; PSI, discussion, debates, games, role playing, controversy)
Cooperative Learning (SD ≥ 0.5; cognitive and affective; a century of research, all disciplines)
One-on-One Tutoring (SD = 2, with trained tutors)
What Does Not Work? (Small gains, SD < 0.3)
EVERYTHING ELSE!
Lectures (“standard model”), reinforce memorization!
Unstructured discussion, supervised independent study, autonomous small groups, self study
Audio-tutorials, programmed instruction, computer-based instruction, instructional television, Web-based instruction
What Works in Context: Physics & Astronomy
Disciplinary education research uses “tools of the trade” to conduct experiments
Create a robust, mostly empirical knowledge base about learning in physics & astronomy
Probe initial state (prior knowledge), final state (learning outcomes), and student thinking
Measurement: Assessment!
Galileo Galilei:
“Measure what is measurable, and make measurable what is not so.”
But: “Measure what you value, and value what you measure.” (M. Zeilik)
FLAG Features
Assessment Basics “What is this assessment business all about?”
Making Goals “What do you want to measure?”
Classroom Assessment Techniques (CATs) “How do you measure it?”
Searchable Database—Toolbox “What are good tools for measurement?”
All peer reviewed and evidence based
FLAG CATs
Attitudinal Surveys: E. Seymour, E. Lewis
Concept Tests: A. Ellis
Concept Maps: M. Zeilik
Conceptual Diagnostic Tests: M. Zeilik
Interviews: M. Smith, S. A. Southerland
Performance Assessments: T. Slater
Portfolios: T. Slater
Scoring Rubrics: D. Ebert-May
Student Assessment of Learning Gains: E. Seymour
Weekly Reports: E. Etkina
Mathematical Thinking: M. Swan, J. Ridgway
Multiple-Choice Tests: J. Parkes
Minute Papers: M. Zeilik
Minute Paper
Take a few minutes at the end of class and ask for a written response to:
“What was the most important concept you learned in class?”
“What important question remains unanswered?”
“What was the muddiest point of this class?”
Few Minute Paper: Teams reach consensus, submit written report
Analysis: Sort into themes (cards)
Weekly Report: Extended minute paper
Concept Tests
Quick feedback on conceptual (not factual) understanding
Instructor gives conceptual question with choices (common “misconceptions”)
After a minute, whole class responds (hands, flash cards, class polling system)
Instructor assesses responses: If most are incorrect, students pair up to discuss (peer teaching)
Class responds again to gauge mastery; instructor adapts in real time
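The respond, assess, and discuss cycle of a concept test can be sketched in a few lines of Python. This is only an illustration: the function name `needs_peer_discussion`, the 50% cutoff used for "most incorrect", and the vote counts are assumptions, not part of FLAG or of any polling system.

```python
from collections import Counter


def needs_peer_discussion(responses, correct_choice, threshold=0.5):
    """Return (fraction_correct, discuss) for one round of class responses.

    `threshold` is an assumed cutoff: peer discussion is triggered when
    fewer than this fraction of the class picks the correct choice.
    """
    if not responses:
        raise ValueError("no responses collected")
    counts = Counter(responses)
    fraction_correct = counts[correct_choice] / len(responses)
    return fraction_correct, fraction_correct < threshold


# Hypothetical first-round votes from a class of 20 (not real data).
first_round = ["c"] * 13 + ["b"] * 4 + ["a"] * 2 + ["d"] * 1
fraction, discuss = needs_peer_discussion(first_round, correct_choice="b")
print(f"{fraction:.0%} correct; pair up to discuss: {discuss}")
```

Running the same function on the second round of responses gives the instructor a quick check on whether the discussion moved the class toward mastery.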
[Figure: Earth and an asteroid]
The magnitude of the force exerted by the asteroid on the Earth is
a) larger than the magnitude of the force exerted by the Earth on the asteroid
b) the same as the magnitude of the force exerted by the Earth on the asteroid
c) smaller than the magnitude of the force exerted by the Earth on the asteroid
d) zero (the asteroid exerts no force on the Earth).
Results: Midwestern Universities (Dostal)
(P111, conceptual; P221, calc; P112, algebra)

Response        P111   P221   P112
Larger than       2%     2%     5%
Same as          13%     9%    24%
Smaller than     62%    84%    67%
Zero             23%     5%     5%
N                101     91    101
Student Assessments
Attitude Surveys: Perceptions about course and discipline; they seem easy but are not! Best based on a robust, field-tested model
Minute Paper: “Given limited resources, what one change would you make to improve this course?”; sort by themes
Student Assessment of Learning Gains (SALG): Probes learning gains that students perceive; avoids performance critiques; easily customized (15 min); available online
[Figure: UNM P102 SALG, Spring 2003 (Mean = 3.38). Bar chart of student ratings (scale 1 - 5; error bars are standard deviations) for Instructor talk, Discussions, Team work, Activities, Coop quizzes, CT/CRS, and Videos. Bar labels as extracted: 3.68, 3.52, 3.64, 3.71, 4.15, 3.75, 3.99.]
[Figure: P102 UNM Spring 2003, Gender Matters (Sometimes!). Bar chart of scores (1 = very poor, 5 = very good) for Female, Male, and All on Math Self-Aptitude, Science Self-Aptitude, Last math, Pretest confidence, and Posttest confidence. Significance labels as extracted: NS, NS, p < 0.002, p < 0.0001, p < 0.002.]
Cooperative Quiz Gains by Gender
[Figure: P102 UNM Spring 2003 Quizzes. Chart of <g> Female, Effect size (F), <g> Male, and Effect size (M) for Practice, Quiz #1, Quiz #2, Quiz #3, and Mean; vertical axis from 0.00 to 1.00.]
Does it stick? (Coop quizzes/Test)
[Figure: A101 UNM Fall 2000, Identical Items (p = 0.96). Chart of percent correct (Pre %, Post %, Test #1) for items Q1-12, Q2-4, Q2-9, Q2-8, Q2-7, Q3-6, and Q3-7.]
Attitude Results: Intro Astro & Physics

Measure (50% = neutral)   Pretest (%)   Posttest (%)   Gain and Effect Size
UNM A101 (n = 311)        62 ± 10       63             Not significant
CMU A111 (n = 224)        57            55             Not significant
UNM A271 (n = 28)         78            82             Not significant
UNM P102 (n = 31)         60            60             Not significant
Conceptual Diagnostic Tests
Ideally research-based on “misconceptions” revealed by student “think aloud” interviews
Measures pre/post conceptual gains as a summative assessment
Force Concept Inventory (FCI; 1985); Astronomy Diagnostic Test (ADT; 1999) version 2; national baselines; large data sample (about 5,000)
Follow protocol!
[Figure: ADT 2, UNM Fall 2000 vs. National. Bar chart of Gain (National) and Gain (UNM) for Females, Males, and All; vertical axis from 0 to 30. Bar labels as extracted: 20, 25, 22 and 30, 26, 28.]
ES (F) = 0.84 => 80% of postscores above mean of prescores
ES (M) = 0.53 => 70% of postscores above mean of prescores
(Pre = 5346; Post = 3842)
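The two effect-size statements above are consistent with a common reading of an effect size: if pre and post scores are assumed to be roughly normal with similar spread, the fraction of postscores above the prescore mean is the standard normal CDF evaluated at the effect size. The sketch below checks that reading; the normality assumption and the function name `fraction_above_pre_mean` are introduced here for illustration and are not stated on the slide.

```python
from statistics import NormalDist


def fraction_above_pre_mean(effect_size):
    """Fraction of postscores above the prescore mean.

    Assumes (for illustration) normal score distributions with equal
    standard deviations, so the answer is the standard normal CDF at
    the effect size.
    """
    return NormalDist().cdf(effect_size)


for label, es in [("F", 0.84), ("M", 0.53)]:
    print(f"ES ({label}) = {es}: {fraction_above_pre_mean(es):.0%} above prescore mean")
# ES (F) = 0.84: 80% above prescore mean
# ES (M) = 0.53: 70% above prescore mean
```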
Pre/post: ADT National Project/UNM
<g> = (post% - pre%) / (100% - pre%)
[Figure: Bar chart of pre-course (N = 5346) and post-course (N = 3842) scores (%), with UNM bars labeled; standard errors plotted. Bar labels as extracted: 34, 56, 32.4, 47.3, 41.5, 27, 53.7, 38.]
Overall: <g> = 0.22
Women: <g> = 0.20
Men: <g> = 0.25
UNM: <g> = 0.31
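The normalized gain <g> defined on this slide is the fraction of the possible improvement that was actually achieved. A minimal Python sketch of the formula follows; the function name `normalized_gain` and the input scores are hypothetical, chosen for illustration rather than taken from the chart.

```python
def normalized_gain(pre_percent, post_percent):
    """Normalized gain <g> = (post% - pre%) / (100% - pre%).

    Both scores are percentages on a 0 to 100 scale. A prescore of 100%
    leaves no room for gain, so it is rejected.
    """
    if not 0 <= pre_percent < 100:
        raise ValueError("pre_percent must be in [0, 100)")
    if not 0 <= post_percent <= 100:
        raise ValueError("post_percent must be in [0, 100]")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


# Hypothetical class: 40% before instruction, 70% after.
print(round(normalized_gain(40.0, 70.0), 2))  # 0.5
```

Because <g> divides by the room left to improve, a class that starts at 40% and one that starts at 70% can be compared on the same footing.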
[Figure: P102 UNM Spring 2003 Pre/Post. Bar chart by gender (Female, Male) of Pre% FCI, Post% FCI, <g> FCI, Pre (FCI + GRC), Post (FCI + GRC), <g> FCI+GRC, and <g> GRC; vertical axis: Percentage, 0% to 70%.]
Classroom Assessment: Good News!
Well-done formative assessment results in a pre/post gain of about 0.5 standard deviation
(70% rather than 50% on a “standardized” test)