Using Decision Trees to Predict Course Success in the Multiple Measures Assessment Pilot
RP CONFERENCE, APRIL 9, 2015
Your MMAP to Today's Presentation
• Overview of Multiple Measures Assessment Project (MMAP)
• Data sources for MMAP
• Why multiple measures? Why decision trees?
• Code for creating decision trees in R
• Impact of disjunctive application of MM rule sets
• Equity implications
• Implementation
Multiple Measures Assessment Project
• MMAP research team & MMAP Work Group
• Component of the Common Assessment Initiative
• Analyzing high school performance data to create models that predict success in CC English & math
• Also working on assessing predictive validity of noncognitive variables
• Focused on predictive validity and reducing inappropriate underplacement
• Webinars, summits, professional development
Why MMs?
• Tests alone have proven to have poor predictive validity
• MMs can reduce underplacement
• Provide a more holistic picture
• Multiple measures assessment is mandated
• Multiple measures have traditionally involved a few survey-type questions
• More of a nudge than a true determinant
Correlations with College Grade: 11th Grade Cumulative GPA vs. Accuplacer

English          11th Grade GPA   Accuplacer
Transfer         0.27             0.10
1 level below    0.24             0.12
2 levels below   0.25             0.12
3 levels below   0.18             0.12
4 levels below   0.21             0.07

Math             11th Grade GPA   Accuplacer
Transfer – STEM  0.24             0.19
Transfer – Stats 0.31             0.16
Transfer – LAM   0.26             0.09
1 level below    0.28             0.21
2 levels below   0.26             0.11
3 levels below   0.23             0.11
4 levels below   0.19             0.05
Lessons From Other States
A number of assessment reform challenges have been identified (RFA, 2013)
Using data to inform the work: Developmental education student performance data helps make the case for system-wide reform. Using the data to educate about the need for developmental education reform is an essential first step in developing the political will to move forward.
Including faculty in the process: Engaging faculty in every part of the work has been central to the successes experienced to date.
Communicating across system colleges: Important to inform the field about the work that is taking place and also improve the results by incorporating feedback.
National legislative & policy context
State-level
◦ North Carolina – 2.6 GPA & 4 years of math transfer-level
◦ Florida – developmental education optional
◦ Virginia – re-design of assessment, modularization of math
◦ Connecticut – dev. education cut to 1 semester, coreq only
◦ Hawaii CCs – pilots using HS GPA in assessment
◦ Colorado, Montana, Mississippi* – move to corequisite models
◦ Massachusetts – 2.7 GPA and coreq. for non-STEM math
National level
◦ 2013: US Government Accountability Office report on Developmental Education in Community Colleges
◦ 2014: Center for the Analysis of Postsecondary Readiness established by CCRC, MDRC
◦ 2015: Foundation policy advocacy and funding priority
Data for MMAP
Sources
• COMIS
• K-12 transcript data
• CST, EAP, ACCUPLACER, CAHSEE

Structure
• High School (grade)
• Community College (CB21)
Pros and Cons of Decision Trees

Strengths
• Visualization
• Easy-to-understand output
• Easy to code rules
• Model complex relationships easily
• Linearity and normality not assumed
• Handles large data sets
• Can use categorical and numeric inputs

Weaknesses
• Results dependent on training data set – can be unstable, especially with small N
• Can easily overfit data
• Out-of-sample predictions can be problematic
• Greedy method selects only the 'best' predictor
• Must re-grow trees when adding new observations
[Tree diagram: a mixed collection of A, B, and Z items is repeatedly split, with homogeneity increasing at each split until the leaves contain only AAA, BBB, or ZZZ; labels identify the root node, internal nodes, branches, and leaf nodes]
How is homogeneity measured?
Gini-Simpson Index
If selecting two items at random from a collection, what is the probability that they are in different categories?

The Gini coefficient is a measure of the inequality of a distribution, with a value of 0 expressing total equality and a value of 1 expressing maximal inequality.
Shannon Information Index
Measures diversity of a collection of items. Higher values indicate greater diversity.
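Both measures can be computed directly from a node's class labels. A minimal sketch (in Python for illustration; the deck's own code is R, and the function names here are made up): the Gini-Simpson index is 1 − Σ p_i² and the Shannon index is −Σ p_i log₂ p_i, where p_i is the share of items in category i.

```python
import math
from collections import Counter

def gini_simpson(labels):
    """Probability that two items drawn at random (with replacement)
    belong to different categories: 1 - sum(p_i^2)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def shannon_entropy(labels):
    """Shannon information index: -sum(p_i * log2(p_i)).
    Higher values indicate greater diversity."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

pure = ["A"] * 10              # perfectly homogeneous leaf
mixed = ["A"] * 5 + ["B"] * 5  # maximally mixed two-class node

gini_simpson(pure)      # 0.0
gini_simpson(mixed)     # 0.5
shannon_entropy(mixed)  # 1.0 (one bit of entropy)
```

A perfectly pure leaf scores 0 on both measures, which is why each split in the tree tries to drive these values down in the child nodes.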
Libraries and Code for R: Your Basic Classification Decision Tree
Data <- read.csv("C:/Folder/Document.csv", header = TRUE)
Data.df <- data.frame(Data)
DataTL <- subset(Data.df, EnglishLevel == "Transfer Level")   # transfer-level English records
library(rpart)        # recursive partitioning trees
library(rpart.plot)   # tree plotting (prp)
ctrl <- rpart.control(minsplit = 100, minbucket = 1, cp = 0.001)
DataTransferLevel <- subset(Data.df, CourseLevel == 1)        # alternative subset by course level
fitTL <- rpart(formula = success ~ Delay + CBEDS_rank + course_gp + A2G +
                 cst_ss + grade_level + GPA_sans,
               data = DataTL, method = "class", control = ctrl)
printcp(fitTL)        # cross-validated error by complexity parameter
prp(fitTL)            # plot the fitted tree
rsq.rpart(fitTL)      # note: only meaningful for method = "anova" trees
print(fitTL)
print(fitTL, minlength = 0, spaces = 2, digits = getOption("digits"))
summary(fitTL)

More code available at: http://bit.ly/MMAP_resources_2015
Considerations
• Splitting criterion: how small should the leaves be? What is the minimum number of splits?
• Stopping criterion: when should one stop growing the branch of the tree?
• Pruning: avoiding overfitting of the tree and improving generalization
• Understanding classification performance
Two Approaches to Avoid Overfitting
Forward pruning: stop growing the tree earlier.
• Stop splitting a node if the number of samples is too small to make reliable decisions.
• Stop if the proportion of samples from a single class (node purity) is larger than a given threshold.

Post-pruning: allow overfitting, then prune the tree back.
• Estimate errors and tree size to decide which subtrees should be pruned.
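The two forward-pruning stopping rules above reduce to a single check per node. A sketch (Python for illustration; the function name and thresholds are hypothetical — in the deck's R workflow the analogous knobs are rpart.control's minsplit, minbucket, and cp):

```python
from collections import Counter

def should_stop_splitting(labels, min_samples=20, purity_threshold=0.9):
    """Forward pruning: return True when a node should not be split further."""
    # Rule 1: too few samples to make a reliable splitting decision.
    if len(labels) < min_samples:
        return True
    # Rule 2: the node is already dominated by a single class (purity).
    majority_share = max(Counter(labels).values()) / len(labels)
    return majority_share >= purity_threshold

should_stop_splitting(["pass"] * 10)                  # True: too few samples
should_stop_splitting(["pass"] * 95 + ["fail"] * 5)   # True: 95% pure
should_stop_splitting(["pass"] * 60 + ["fail"] * 40)  # False: keep splitting
```

Post-pruning takes the opposite route: grow the full tree, then cut back subtrees whose estimated error does not justify their size (in rpart, by inspecting printcp() output and calling prune() with a chosen cp).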
Fit Statistics: Evaluating your tree
• Misclassification rate - the number of incorrect predictions divided by the total number of classifications.
• Sensitivity - the percentage of cases that actually experienced the outcome (e.g., "success") that were correctly predicted by the model (i.e., true positives).
• Specificity - the percentage of cases that did not experience the outcome (e.g., "unsuccessful") that were correctly predicted by the model (i.e., true negatives).
• Positive predictive value - the percentage of correctly predicted successful cases relative to the total number of cases predicted as being successful.
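All four fit statistics fall out of the confusion matrix. A minimal sketch (Python for illustration; the function name and the example counts are hypothetical, not MMAP results):

```python
def fit_statistics(tp, fp, tn, fn):
    """Fit statistics from confusion-matrix counts:
    tp/fn = successes predicted correctly/incorrectly,
    tn/fp = non-successes predicted correctly/incorrectly."""
    total = tp + fp + tn + fn
    return {
        "misclassification_rate": (fp + fn) / total,  # wrong / all predictions
        "sensitivity": tp / (tp + fn),                # true-positive rate
        "specificity": tn / (tn + fp),                # true-negative rate
        "positive_predictive_value": tp / (tp + fp),  # precision
    }

stats = fit_statistics(tp=80, fp=20, tn=60, fn=40)
# misclassification_rate 0.3, sensitivity ~0.667, specificity 0.75, PPV 0.8
```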
Variables that predict CC success
• English
  – Cumulative HS GPA
  – C+ or better in AP English class
  – Score on English CST
• Math
  – Cumulative HS GPA
  – Grades in Algebra II, Calculus, Statistics, Trigonometry, Geometry, Pre-calculus
  – Taking a challenging CST
  – Score on math CST
  – Delay
English transfer-level rule sets

Direct matriculants:
11th grade HS cumulative GPA >= 2.7
OR
C+ or better in AP English

Non-direct matriculants:
12th grade HS cumulative GPA >= 2.7
OR
C or better in AP English
OR
Cumulative GPA >= 2.3 AND 12th grade course GP >= B-
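Applied disjunctively, a student places at transfer level if any rule in the set is satisfied. A sketch of the direct-matriculant English rules (Python for illustration; the function name and the grade-point mapping C+ = 2.3 are assumptions, not part of the published rule set):

```python
def english_transfer_direct(gpa_11, ap_english_gp=None):
    """Direct matriculants place at transfer-level English if
    11th-grade cumulative HS GPA >= 2.7 OR C+ or better in AP English.
    ap_english_gp is the AP English grade on a 4-point scale
    (C+ assumed to be 2.3); None means no AP English grade on record."""
    meets_gpa_rule = gpa_11 >= 2.7
    meets_ap_rule = ap_english_gp is not None and ap_english_gp >= 2.3
    return meets_gpa_rule or meets_ap_rule

english_transfer_direct(2.8)                     # True via the GPA rule
english_transfer_direct(2.1, ap_english_gp=3.0)  # True via the AP rule
english_transfer_direct(2.1)                     # False: no rule satisfied
```

Because the rules are ORed, adding a rule to the set can only move students up, never down — which is the "disjunctive application" effect the pilot examines.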
GE Math transfer-level rule sets

Direct matriculants:
11th grade cumulative GPA >= 3.2
OR
Cumulative GPA >= 2.5 AND CST >= 288 AND C or better in Algebra II
OR
Cumulative GPA >= 2.9 AND C or better in Algebra II

Non-direct matriculants:
12th grade cumulative GPA >= 2.8
OR
CST >= 284 AND took Trigonometry in high school
OR
B or better in Trigonometry
Disproportionate Impact
• Renewed interest in equity across gender, ethnicity, age, disability, foster youth and veteran status
• Do a student's demographics predict placement level?
• If so, what is the degree of impact, and what can be done to mitigate it?
Timeline and feedback
• Pilot running now through fall 2015
• Testing rule sets via retroactive "what ifs"
• Collecting social-psychological and other MMs
• Active implementations (some venues)
• Need to collect & analyze fall 2015 data in winter/early spring of 2016
• Fall 2016 system go-live deadline
• All feedback appreciated and encouraged
Contacts:
Terrence Willett, The RP Group, [email protected]
Craig Hayward, The RP Group, [email protected]
Alyssa Nguyen, The RP Group, [email protected]
John Hetts, Educational Results Partnership, [email protected]
Ken Sorey, Educational Results Partnership, [email protected]
Danielle Duran, Educational Results Partnership, [email protected]
Daniel Lamoree, Educational Results Partnership, [email protected]