How good is a robot tutor? The effectiveness of Excel as a teaching
resource multiplier in teaching statistics
Dave Nunez, Colin Tredoux, Susan Malcolm-Smith
ACSENT Lab, University of Cape Town
Jacob Jaftha
Dept. of Mathematics and Applied Mathematics, University of Cape Town
Context
- UCT Psychology has an extensive statistics teaching programme (1st year to honours)
  - Research focus makes this an imperative
  - By honours, students are expected to apply stats to a significant individual research project
- A mixed group of students entering
  - All have high-school maths, or have completed/are concurrently completing a year-long numeracy course
  - Stats is largely disliked, and provokes significant anxiety
Context
- Large classes, few tutors
  - Typically 40:1 student:tutor ratio
  - Excel-based tutorials developed to counter this (lab facilities can cope with numbers): “tutor in a can”; “tutorbot”; “tutortron-2000”
- Positive student feedback from Excel tuts
  - Liked that they could take them home
  - Seemed to compensate for poor lecture attendance
  - BUT: very little interaction between teachers & students (how were explanations/queries handled? Was it necessary?)
The Excel-based tutorials used
- In development since 2003
  - Almost all technical glitches resolved
- Contain text, exercises and evaluation
  - Text supplements textbook (text and images); also includes animations & simulations
  - Teaches concepts and tools
  - Provides exercises which are immediately scored (feedback given for each question)
  - Each tut ends with a mini-test which must be submitted [electronically]
  - Each tut takes 120-150 minutes to complete
The Excel-based tutorials used
- The tutorials aim to be more than simple exercises
  - Embed some teaching by interaction & feedback
- Raises the issue: can interactive, discovery-based learning surpass student-tutor interaction for learning statistics?
  - Some topics are well suited to discovery (sampling distribution of the mean)
  - Some topics are poorly suited to it (probability)
  - Do the Excel tutorials lead to skill transfer?
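To illustrate why the sampling distribution of the mean lends itself to discovery, here is a minimal Python sketch (not the Excel tutorial itself) of the kind of simulation a student might run. The function name, the population values (mean 50, SD 10) and the sample sizes are illustrative assumptions, not taken from the tutorials.

```python
import random
import statistics


def sample_means(population_mean, population_sd, sample_size, n_samples, seed=0):
    """Draw n_samples samples from a normal population; return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(population_mean, population_sd) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


# Hypothetical population: mean 50, SD 10 (illustrative values only).
for n in (4, 25, 100):
    means = sample_means(50, 10, sample_size=n, n_samples=2000)
    observed_se = statistics.stdev(means)  # spread of the simulated sample means
    theoretical_se = 10 / n ** 0.5         # sigma / sqrt(n)
    print(f"n={n:3d}  SD of sample means={observed_se:.2f}  sigma/sqrt(n)={theoretical_se:.2f}")
```

Changing the sample size and watching the spread of the sample means shrink is the sort of pattern a student can find without being told the formula first.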
Methods used in the past
- Pre-test/post-test
  - Without a control, cannot show the tutorial is the cause (even a bad tut teaches something)
- Voluntary assignment
  - No control for motivation variables
  - No control for repetition
- Performance often measured by means of psychological variables
  - Confidence, mastery, conceptual learning
  - No absolute task-based criteria
Deficits in past methods
- Poor controls
  - No proper control within subjects (natural learning)
  - No proper control across conditions (subjects self-assign to conditions)
  - These are often related to ethical concerns
- Measures are generally poor
  - Single measure of complex, time-dependent phenomenon
  - No criterion-based assessment (i.e. low ecological validity of findings)
Research questions
- Do Excel-based tutorials (EBTs) compare in performance (marks scored) to pen-and-paper tutorials (PnPs)?
- Is there a difference in terms of psychological variables (mastery, confidence) between EBTs and PnPs?
Strengths of the current study
- Two-group quasi-experiment
  - Pseudo-random assignment of students to Excel/pen-and-paper tutorials
  - Strong control/similarity of tutorials (we think)
- Semester-long, continuous assessment
  - Standard test after each tutorial (criterion and psychological measures)
  - Final exam at the end of the semester
Sample
- The 2007 PSY2006F class
  - Statistics lecture each Friday; one stats tut a week
  - 172 students (only Humanities students)
  - Almost all have been through 3 tutorials in PSY1001W on using Excel for stats
  - 2007 cohort not significantly different from other years
  - Not told about the study; simply told strange tutorial structure was due to logistical reasons
Materials
- PnP tuts are ‘traditional’ as done in the dept. before advent of Excel tuts
  - Published in a textbook (we partly wrote) in 2001
  - Choose tutors who excel (!) at statistics
  - They lead students (groups of 30-40) through worksheets and explain problems and theory as they go along
  - Students are given 2 hour classroom sessions to complete tuts (mostly don’t finish)
  - Students are required to submit the completed worksheet a week after the classroom session
Materials
- Excel tuts (latest versions)
  - Developed by us (2003-2007)
  - 1 senior tutor in the lab for stats queries, junior tutors for technical problems
  - Students are given 2 hour lab sessions (groups of 30-40) to complete tuts (mostly don’t finish)
  - Students are required to submit the completed Excel worksheet a week after the lab session
Design
- Control for individual variation and cross-group effects
  - Each student does 4 EBTs, 4 PnPs (8 topics in the course)
  - Two ‘streams’: EPEPEPEP, PEPEPEPE
  - Within-subjects design, and cross-group comparison
  - The non-statistics marks in the course (research methods, psychometrics & qualitative methods) can be used to validate (traditionally high R² between them)
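The two-stream design can be sketched in a few lines of Python. This is only an illustration of the alternating schedule described on the slide. The names `stream_schedule`, `stream_1` and `stream_2` are hypothetical, and the slide does not say which stream corresponds to which group.

```python
TOPICS = range(1, 9)  # 8 topics in the course


def stream_schedule(first_format):
    """Return {topic: format}, alternating 'E' (Excel) and 'P' (pen-and-paper)."""
    other = "P" if first_format == "E" else "E"
    return {topic: (first_format if topic % 2 == 1 else other) for topic in TOPICS}


stream_1 = stream_schedule("E")  # EPEPEPEP
stream_2 = stream_schedule("P")  # PEPEPEPE

# Each student meets both formats four times (within-subjects design) ...
for stream in (stream_1, stream_2):
    assert list(stream.values()).count("E") == 4
    assert list(stream.values()).count("P") == 4

# ... and every topic is taught in both formats across the two streams
# (cross-group comparison).
assert all(stream_1[t] != stream_2[t] for t in TOPICS)

print("".join(stream_1.values()), "".join(stream_2.values()))
```

Because the two streams are mirror images, any topic can be compared across formats between groups, while each student also serves as their own control across topics.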
Measures
- Exam at the end
  - 2 hour practical exam (given data, problem solving – no concepts)
  - Do each exam section in the same technology form as the tuts were done in
Measures
- Monday assessments
  - Each tut has a set of MCQ items
  - 6 MCQ items: 3 concepts, 3 calculations; one each easy, moderate, hard
  - 5 Likert items about confidence with the material, usefulness of tut, degree of understanding, how much extra help is needed
Measures (3)
Distribution X is normally distributed; distribution Y has a standard normal distribution. Which of the following statements MUST BE FALSE?
a) The mean of distribution X is 2
b) The standard deviation of distribution Y is 1
c) Distribution Y must always give the same proportion of high scores as low scores when sampled randomly
d) Distribution X never gives scores lower than distribution Y when sampled randomly.
Two students, Able and Baker, want to get into the honours class, but they have taken different third year subjects. Able did the PSY300X course (which had a mean mark of 53% and a standard deviation of 11%) and he got a mark of 80%. Baker on the other hand did the PSY300Y course (mean mark of 57% and a standard deviation of 7.5%), and got a mark of 77%. If honours places are awarded to students who stand out the most in their courses, which one of the students should get into honours and why?
a) Able should get in, because he scored 27% above the course average
b) Baker should get in, because he scored 20% above the course average
c) Able should get in, because he scored proportionately higher above the course average
d) Baker should get in, because he scored proportionately higher above the course average
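The arithmetic behind the Able/Baker item can be worked through with standard scores. The sketch below assumes that "stand out the most" is read as the largest z-score, (mark - course mean) / course SD. The helper name `z_score` is illustrative.

```python
def z_score(mark, course_mean, course_sd):
    """Distance of a mark from the course mean, in standard deviation units."""
    return (mark - course_mean) / course_sd


able = z_score(80, 53, 11)     # 27 / 11
baker = z_score(77, 57, 7.5)   # 20 / 7.5

print(f"Able:  z = {able:.2f}")   # Able:  z = 2.45
print(f"Baker: z = {baker:.2f}")  # Baker: z = 2.67
```

Under this reading Baker's mark lies further above his course mean in standard deviation units, even though Able's raw margin (27 percentage points) is larger than Baker's (20).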
Validation (N=170)
                 Comp.   Paper
Quant. methods   0.15    0.33
Psychometrics    0.35    0.40
Qual. methods    0.25    0.36
Validation
[Figure: LS means of exam_quant, exam_psychom and exam_qual by GROUP (A, B). Wilks lambda = .99259, F(3, 165) = .41061, p = .74559. Vertical bars denote 0.95 confidence intervals.]
Attitude results
[Figure: R1*GROUP LS means of positive attitude (0-5) at att1, att3, att4, att6, att7 and att8, for the Paper first and Comp. first groups. Current effect: F(5, 370) = 4.7192, p = .00034. Vertical bars denote 0.95 confidence intervals.]
Preference effects
[Figure: eval_1 LS means of score for computer questions, by stated preference (Classroom tuts, Both were helpful, Lab tuts). Current effect: F(2, 150) = 1.1240, p = .32769. Vertical bars denote 0.95 confidence intervals.]
Preference effects
[Figure: eval_1 LS means of score for paper-based questions, by stated preference (Classroom tuts, Both were helpful, Lab tuts). Current effect: F(2, 151) = 1.5940, p = .20651. Vertical bars denote 0.95 confidence intervals.]
Monday assessments
[Figure: R1*GROUP LS means of test score (out of 6) for Topics 1, 3, 4, 6, 7 and 8, for the Computer first and Paper first groups. Current effect: F(5, 455) = 1.5736, p = .16607. Vertical bars denote 0.95 confidence intervals.]
Exam results
[Figure: R1*GROUP LS means of exam mark (0-1) for Q1std to Q8std, for GROUP A and GROUP B. Current effect: F(7, 1169) = 7.3499, p = .00000. Vertical bars denote 0.95 confidence intervals.]
Testing effects
[Figure: R1*GROUP LS means of improvement from tut to exam (q1-t1, q3-t3, q4-t4, q6-t6, q7-t7, q8-t8), for the Paper first and Comp first groups. Current effect: F(5, 840) = 1.0067, p = .41255. Vertical bars denote 0.95 confidence intervals.]
What the data shows
- The EBTs can function as a robot tutor
  - With small tutor team, marks at least as good as traditional tutorials, better in a few topics for some students
- Student preference/attitude is not associated with performance
  - Lack of significant findings
  - No patterned differences
What the data shows
- EBTs can show an advantage
  - At exam time rather than test time
  - May indicate poor test or that EBTs need repetition to take effect
  - It is a weak effect: does not generalize to the entire class easily (group B only)
What the data DOES NOT show
- Excel-based statistics teaching is better
  - Content is confounded with form
  - Tutor ability is confounded with form
- Students enjoy/get confidence from the EBTs
  - Only differences show the opposite
- Students can leverage existing computer skills for learning statistics
  - Skills were pre-existing and not manipulated