David Onder, Director of Assessment
Casey Iannone, Consultant
Background
Freshmen Only
The Problem
• We are asked for program-level data
• Lack of program-level retention and graduation rates
• Complicated (particularly for undergrads)
• Different programs serve different purposes
• High stakes – program prioritization (AA driven)
• Reports lumped all non-retained together (whether they graduated or stopped out)
Reflecting on the past
What We Wanted
• Solid & simple approach (easy to explain and defend)
• Fair
• Useful for all types of programs
• Meaningful for decision-making (high- and low-level)
• Not an overly complicated display
• Illuminates
• Overall performance
• Historic trends
• When students are lost
• Something that can be generated yearly without too much effort
5 Outcomes
• Five possible outcomes for each student who declares a given major
• Retained in program
• Graduated in program
• Retained in different program
• Graduated in different program
• Not retained (stop-out/drop-out)
• Exclusive and exhaustive
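The five flags can be assigned with a simple decision ladder. A minimal sketch (in Python rather than the R used later in the deck; the function and argument names are hypothetical):

```python
def outcome_flag(enrolled_in, graduated_in, cohort_program):
    """Classify one cohort member at one semester interval.

    enrolled_in:    program the student is enrolled in this interval, or None
    graduated_in:   program the student has graduated from, or None
    cohort_program: the program whose cohort is being reported

    Exactly one branch fires, so the five outcomes are
    exclusive and exhaustive by construction.
    """
    if graduated_in == cohort_program:
        return "Graduated in program"
    if graduated_in is not None:
        return "Graduated in different program"
    if enrolled_in == cohort_program:
        return "Retained in program"
    if enrolled_in is not None:
        return "Retained in different program"
    return "Not retained (stop-out/drop-out)"
```

Graduation is checked before retention, so a student who finishes one program while enrolled in another still counts as a completer.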
General Approach
• Based on cohorts:
• A student is placed in a program cohort the 1st time they declare a given program
• Each student in the cohort is flagged as one of the 5 possible outcomes for each ½ year interval (each regular semester)
• At each interval we report where the members of the cohort fall
• Each student will only appear in one cohort for a program (usually)
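The first-declaration rule above can be sketched as follows, assuming a chronologically ordered list of (term, student, program) declaration records (all names here are hypothetical, not the authors' actual implementation):

```python
def build_cohorts(declarations):
    """Place each student in a program cohort the first time they
    declare that program; later re-declarations are ignored.

    declarations: iterable of (term, student_id, program),
                  assumed sorted chronologically.
    Returns {(program, term): set of student_ids}.
    """
    cohorts = {}
    seen = set()  # (student_id, program) pairs already placed in a cohort
    for term, student, program in declarations:
        if (student, program) in seen:
            continue  # only the first declaration of a program counts
        seen.add((student, program))
        cohorts.setdefault((program, term), set()).add(student)
    return cohorts
```

Because `seen` is keyed by (student, program), a student can still start a new cohort in a *different* program, matching the "one cohort per program (usually)" rule.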
Technical Approach
Multiple Iterations
Started with term-level data. This is helpful for programs, in the context of program history, but NOT for administrators.
Academic Year
Next, we combined the data into academic years.
Summary
All students are lumped into 3 groups. This is the most summarized data we can (read: are willing to) provide. The bottom line = the bold 3-group number.
But how does that compare?
This is the average program data, used as a comparison.
Horizontal stacked bar
• Normally we would never do this, but…
What are people asking?
1. How is the program performing?
2. How are students performing overall at the institution?
3. How many are dropping out?
We graph the 5 flags too
@ 1 year, @ 4 years, @ 6 years
Transitions to completers and drop-outs
Compare performance over time
Main question
But what about next year?
If we study learning as a data science, we can reverse engineer the human brain and tailor learning techniques to maximize the chances of student success. This is the biggest revolution that could happen in education, turning it into a data-driven science, and not such a medieval set of rumors professors tend to carry on.
-Sebastian Thrun
Most data is only viewed in one direction.
A small window of time exists to reach students.
The rising tide of data.
What is Machine Learning?
Machine learning is the science of getting computers to act without being explicitly programmed.
-Stanford University
Supervised Learning
Unsupervised Learning
Discrete
• Supervised: Classification or Categorization
• Unsupervised: Clustering
Continuous
• Supervised: Regression
• Unsupervised: Dimensionality reduction
Supervised
• Classification (discrete): KNN, Trees, Logistic Regression, Naïve-Bayes, SVM
• Regression (continuous): Linear, Polynomial, Decision Trees, Random Forests
Unsupervised
• Association Analysis (discrete): Apriori, FP-Growth; Hidden Markov Model
• Clustering & Dimensionality reduction (continuous): SVD, PCA, K-means
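KNN, listed under supervised classification, is simple enough to sketch in full: predict the majority label of the k nearest training points. A toy illustration in Python (the data and names are made up for this example, not drawn from the deck):

```python
from math import dist  # Euclidean distance, Python 3.8+

def knn_predict(train, query, k=3):
    """train: list of (feature_tuple, label) pairs.
    Returns the majority label among the k nearest neighbors."""
    nearest = sorted(train, key=lambda pair: dist(pair[0], query))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

# Toy retention-style data: (SAT/100, HS GPA) -> retained?
train = [((9.0, 2.1), "no"), ((9.5, 2.4), "no"),
         ((12.0, 3.6), "yes"), ((12.5, 3.8), "yes"), ((13.0, 3.9), "yes")]
```

In practice the features would need scaling so that no single variable dominates the distance, but the voting idea is exactly this.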
Ready to enter the Lab
First-year Retention Example
Student Retention
Binary response: retained yes/no
### CREATING MODEL ###
library(rpart)       # recursive partitioning trees
library(rpart.plot)  # prp() tree plotting

train_new <- DT[1:3032, ]
test_new  <- DT[3033:3790, ]
test_new$fake_retention <- NULL

# Create a new model `my_tree`
my_tree <- rpart(fake_retention ~ SAT + HS.GPA + Gender_Factor + Race_Eth,
                 data = train_new, method = "class",
                 control = rpart.control(cp = 0.0001))

# Visualize your new decision tree
prp(my_tree, type = 4, extra = 100, faclen = 0, under = TRUE,
    compress = FALSE, ycompress = FALSE)

# Make your prediction using `my_tree` and `test_new`
my_prediction <- predict(my_tree, test_new, type = "class")

# Create a data frame with two columns: ID & Retained
# (Retained contains your predictions)
vector_studentID <- test_new$ID
my_solution <- data.frame(ID = vector_studentID, Retained = my_prediction)
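Once predictions exist, the natural next step (not shown in the R above) is to compare them with the actual outcomes. A language-neutral sketch in Python of a binary confusion count and accuracy, with hypothetical names:

```python
def confusion(actual, predicted, positive="yes"):
    """Return TP/FP/TN/FN counts for a binary retained yes/no outcome."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts

def accuracy(counts):
    """Fraction of predictions that matched the actual outcome."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total if total else 0.0
```

Because retention data is often imbalanced, accuracy alone can mislead; the separate FP/FN counts show whether the model misses at-risk students or flags too many.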
Brainstorm
Campus Assessment/Effectiveness Framework
Data Dissemination
Key Stakeholders
Education