36
Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction Institute Carnegie Mellon University

Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Embed Size (px)

Citation preview

Page 1: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Learning Factors Analysis – A General Method for Cognitive Model Evaluation and

Improvement

Hao Cen, Kenneth Koedinger, Brian Junker

Human-Computer Interaction Institute

Carnegie Mellon University

Page 2: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Learning curve analysis by hand & eye … Steps in programming problems where the function

(“method”) has two parameters (Corbett, Anderson, O’Brien, 1995)

Page 3: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Can learning curve analysis be automated? Learning curve analysis

Identify blips by hand & eye Manually create a new model Qualitative judgment

Need to automatically: Identify blips by system Propose alternative cognitive models Evaluate each model quantitatively

Page 4: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Overview

A Geometry Cognitive Model and Log Data Learning Factors Analysis algorithm Experiments and Results

Page 5: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Domain of current study

15 skills

1. Circle-area

2. Circle-circumference

3. Circle-diameter

4. Circle-radius

5. Compose-by-addition

6. Compose-by-multiplication

7. Parallelogram-area

8. Parallelogram-side

9. Pentagon-area

10. Pentagon-side

11. Trapezoid-area

12. Trapezoid-base

13. Trapezoid-height

14. Triangle-area

15. Triangle-side

Domain of study: the area unit of the geometry tutorCognitive model:

Page 6: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Log Data -- Skills in the Base Model

Student Step Skill Opportunity

A p1s1 Circle-area 1

A p2s1 Circle-area 2

A p2s2 Rectangle-area 1

A p2s3 Compose-by-addition 1

A p3s1 Circle-area 3

Page 7: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Overview

Cognitive Models & Cognitive Tutors Literature Reviews on Model Improvement A Geometry Cognitive Model and Log Data Learning Factors Analysis algorithm Experiments and Results

Page 8: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Learning Factors Analysis

Statistics

Combinatorial SearchDifficulty Factors

a set of factors that make a problem-solving step more difficult for a student

Logistic regression, model scoring to fit statistical models to student log data

A* search algorithm with “smart” operators for proposing new cognitive models based on the factors

Page 9: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

The Statistical Model

Probability of getting a step correct (p) is proportional to:- if student i performed this step = Xi,

add overall “smarts” of that student = i

- if skill j is needed for this step = Yj, add easiness of that skill = j

add product of number of opportunities to learn = Tj & amount gained for each opportunity = j

( ) jjjjjiipp TYYX ∑ ∑∑ ++=− γβα1ln p

Use logistic regression because response is discrete (correct or not) Probability (p) is transformed by “log odds” “stretched out” with “s curve” to not bump up against 0 or 1

(Related to “Item Response Theory”, behind standardized tests …)

Page 10: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Difficulty Factors

Difficulty Factors -- a property of the problem that causes student difficulties Like first vs. second parameter in LISP example above

Four factors in this study Embed: alone, embed Backward: forward, backward Repeat: initial, repeat FigurePart: area, area-difference, area-combination, diameter, circumference,

radius, side, segment, base, height, apothem

Embed factor: Whether figure is embedded in another figure or by itself (alone)Example for skill Circle Area:

Q: Given AB = 2, find circle area in the context of the problem goal to calculate the shaded area

A B

A B

Page 11: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Combinatorial Search

Goal: Do model selection within the logistic regression model space Steps:1. Start from an initial “node” in search graph2. Iteratively create new child nodes by splitting a model using covariates or

“factors”3. Employ a heuristic (e.g. fit to learning curve) to rank each node 4. Expand from a new node in the heuristic order by going back to step 2

Page 12: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

System: Best-first Search an informed graph search algorithm

guided by a heuristic Heurisitcs – AIC, BIC Start from an existing model

Page 13: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

System: Best-first Search an informed graph search algorithm

guided by a heuristic Heurisitcs – AIC, BIC Start from an existing model

OriginalModel

AIC = 5328

5301 5312 53205322

Split by Embed Split by Backward Add Formula

50+

Page 14: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

System: Best-first Search an informed graph search algorithm

guided by a heuristic Heurisitcs – AIC, BIC Start from an existing model

OriginalModel

AIC = 5328

5301 5312

5320

53205322

Split by Embed Split by BackwardAdd Formula

53135322

50+

Page 15: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

System: Best-first Search an informed graph search algorithm

guided by a heuristic Heurisitcs – AIC, BIC Start from an existing model

OriginalModel

AIC = 5328

5301 5312

5320

53205322

Split by Embed Split by Backward Add Formula

53135322

50+

5322 53245325

Page 16: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

System: Best-first Search an informed graph search algorithm

guided by a heuristic Heurisitcs – AIC, BIC Start from an existing model

OriginalModel

AIC = 5328

5301 5312

5320

53205322

Split by Embed Split by Backward Add Formula

53135322

50+

5322 53245325

Page 17: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

System: Best-first Search an informed graph search algorithm

guided by a heuristic Heurisitcs – AIC, BIC Start from an existing model

OriginalModel

AIC = 5328

5301 5312

5320

53205322

Split by Embed Split by Backward Add Formula

53135322

5248

50+

5322 53245325

15 expansions later

Page 18: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

The Split

Binary Split -- splits a skill a skill with a factor value, & a skill without the factor value.

Student Step Skill Opportunity

A p1s1 Circle-area-alone 1

A p2s1 Circlearea-embed 1

A p2s2 Rectangle-area 1

A p2s3Compose-by-addition 1

A p3s1 Circle-area-alone 2

Student Step Skill Opportunity Factor- Embed

A p1s1 Circle-area 1 alone

A p2s1 Circle-area 2 embed

A p2s2 Rectangle-area 1

A p2s3Compose-by-

addition 1

A p3s1 Circle-area 3 alone

After Splitting Circle-area by Embed

Page 19: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

The Heuristics

Good model captures sufficient variation in data but is not overly complicated balance between model fit & complexity minimizing

prediction risk (Wasserman 2005) AIC and BIC used as heuristics in the search

two estimators for prediction risk balance between fit & parisimony select models that fit well without being too complex AIC = -2*log-likelihood + 2*number of parameters BIC = -2*log-likelihood + number of parameters *

number of observations

Page 20: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Overview

Cognitive Models & Cognitive Tutors Literature Reviews on Model Improvement A Geometry Cognitive Model and Log Data Learning Factors Analysis algorithm Experiments and Results

Page 21: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Experiment 1

Q: How can we describe learning behavior in terms of an existing cognitive model?

A: Fit logistic regression model in equation above (slide 27) & get coefficients

Page 22: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Experiment 1

Results:

Skill Intercept SlopeAvg Opportunties Initial Probability Avg

ProbabilityFinal Probability

Parallelogram-area 2.14 -0.01 14.9 0.95 0.94 0.93

Pentagon-area -2.16 0.45 4.3 0.2 0.63 0.84

Student Intercept

student0 1.18

student1 0.82

student2 0.21

Model Statistics

AIC 3,950

BIC 4,285

MAD 0.083

Higher intercept of skill -> easier skill

Higher slope of skill -> faster students learn it

Higher intercept of student -> student initially knew more

The AIC, BIC & MAD statistics provide alternative ways to evaluate models

MAD = Mean Absolute Deviation

Page 23: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Experiment 2

Q: How can we improve a cognitive model? A: Run LFA on data including factors &

search through model space

Page 24: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Experiment 2 – Results with BIC

Splitting Compose-by-multiplication into two skills – CMarea and CMsegment, making a distinction of the geometric quantity being multiplied

Model 1 Model 2 Model 3

Number of Splits:3 Number of Splits:3 Number of Splits:2

1. Binary split compose-by-multiplication by figurepart segment

2. Binary split circle-radius by repeat repeat

3. Binary split compose-by-addition by backward backward

1. Binary split compose-by-multiplication by figurepart segment

2. Binary split circle-radius by repeat repeat

3. Binary split compose-by-addition by figurepart area-difference

1. Binary split compose-by-multiplication by figurepart segment

2. Binary split circle-radius by repeat repeat

Number of Skills: 18 Number of Skills: 18 Number of Skills: 17

AIC: 3,888.67BIC: 4,248.86MAD: 0.071

AIC: 3,888.67BIC: 4,248.86MAD: 0.071

AIC: 3,897.20BIC: 4,251.07MAD: 0.075

Page 25: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Experiment 3

Q: Will some skills be better merged than if they are separate skills? Can LFA recover some elements of original model if we search from a merged model, given difficulty factors?

A: Run LFA on the data of a merged model, and search through the model space

Page 26: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Experiment 3 – Merged Model

Merge some skills in the original model to remove some distinctions, add as a difficulty factors to consider

The merged model has 8 skills: Circle-area, Circle-radius => Circle Circle-circumference, Circle-diameter => Circle-CD Parallelogram-area and Parallelogram-side => Parallelogram Pentagon-area, Pentagon-side => Pentagon Trapezoid-area, Trapezoid-base, Trapezoid-height => Trapezoid Triangle -area, Triangle -side => Triangle Compose-by-addition Compose-by-multiplication

Add difficulty factor “direction”: forward vs. backward

Page 27: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Experiment 3 – ResultsModel 1 Model 2 Model 3

Number of Splits: 4 Number of Splits: 3 Number of Splits: 4

Number of skills: 12 Number of skills: 11 Number of skills: 12

Circle *areaCircle *radius*initialCircle *radius*repeatCompose-by-additionCompose-by-addition*area-differenceCompose-by-multiplication*area-combinationCompose-by-multiplication*segment

All skills are the same as those in model 1 except that 1. Circle is split into Circle *backward*initial, Circle *backward*repeat, Circle*forward,2. Compose-by-addition is not split

All skills are the same as those in model 1 except that 1. Circle is split into Circle *backward*initial, Circle *backward*repeat, Circle *forward,2. Compose-by-addition is split into Compose-by-addition and Compose-by-addition*segment

AIC: 3,884.95 AIC: 3,893.477 AIC: 3,887.42

BIC: 4,169.315 BIC: 4,171.523 BIC: 4,171.786

MAD: 0.075 MAD: 0.079 MAD: 0.077

Page 28: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Experiment 3 – Results

Recovered three skills (Circle, Parallelogram, Triangle) => distinctions made in the original model are necessary

Partially recovered two skills (Triangle, Trapezoid)=> some original distinctions necessary, some are not

Did not recover one skill (Circle-CD)=> original distinction may not be necessary

Recovered one skill (Pentagon) in a different way=> Original distinction may not be as significant as distinction caused by another factor

Page 29: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Beyond Experiments 1-3

Q: Can we use LFA to improve tutor curriculum by identifying over-taught or under-taught rules? Thus adjust their contribution to curriculum length

without compromising student performance A: Combine results from experiments 1-3

Page 30: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Beyond Experiments 1-3 -- Results Parallelogram-side is over taught.

high intercept (2.06), low slope (-.01). initial success probability .94, average number of practices per student is

15 Trapezoid-height is under taught.

low intercept (-1.55), positive slope (.27). final success probability is .69, far away from the level of mastery, the

average number of practices per student is 4. Suggestions for curriculum improvement

Reducing the amount of practice for Parallelogram-side should save student time without compromising their performance.

More practice on Trapezoid-height is needed for students to reach mastery.

Page 31: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Beyond Experiments 1-3 -- Results How about Compose-by-multiplication?

Intercept slope Avg Practice Opportunties

Initial Probability Avg Probability

Final Probability

CM -.15 .1 10.2 .65 .84 .92

With final probability .92 students seem to have mastered Compose-by-multiplication.

Page 32: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Beyond Experiments 1-3 -- Results

However, after split

CMarea does well with final probability .96

But CMsegment has final probability only .60 and an average amount of practice less than 2

Suggestions for curriculum improvement: increase the amount of practice for CMsegment

Intercept slope Avg Practice Opportunties

Initial Probability

Avg Probability

Final Probability

CM -.15 .1 10.2 .65 .84 .92

CMarea -.009 .17 9 .64 .86 .96

CMsegment -1.42 .48 1.9 .32 .54 .60

Page 33: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Conclusions and Future Work

Learning Factors Analysis combines statistics, human expertise, & combinatorial search to evaluate & improve a cognitive model

System able to evaluate a model in seconds & search 100s of models in 4-5 hours Model statistics are meaningful Improved models are interpretable & suggest tutor

improvement Planning to use LFA for datasets from other tutors to

test potential for model & tutor improvement

Page 34: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

Acknowledgements

This research is sponsored by a National Science Foundation grant to the Pittsburgh Science of Learning Center. We thank Joseph Beck, Albert Colbert, and Ruth Wylie for their comments.

Page 35: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

END

Page 36: Learning Factors Analysis – A General Method for Cognitive Model Evaluation and Improvement Hao Cen, Kenneth Koedinger, Brian Junker Human-Computer Interaction

To do

Reduce DFA-LFA.ppt, get from ERM lecture Go over 2nd exercise on creating learning curves (from

web site) in this talk & finish in 2nd session? Print paper ….

Other Mail LOI feedback to Bett, add Kurt’s refs