Big Data Meets Learning Science Apache Spark Summit East 2017 Alfred Essa VP, Research and Data Science McGraw-Hill Education @malpaso

Big Data Meets Learning Science: Keynote by Al Essa


Page 1: Big Data Meets Learning Science: Keynote by Al Essa

Big Data Meets Learning Science

Apache Spark Summit East 2017

Alfred Essa
VP, Research and Data Science
McGraw-Hill Education
@malpaso

Page 2: Big Data Meets Learning Science: Keynote by Al Essa

Our Journey from Print to Digital

1 Innovation Pipeline

2 McGraw-Hill Learning Science

3 Spark, Databricks

Page 3: Big Data Meets Learning Science: Keynote by Al Essa

Speed of innovation, not data, is the differentiator.

Page 4: Big Data Meets Learning Science: Keynote by Al Essa

Technology Time to Market

Spark Factor

People Process

Apache Spark

Databricks

Page 5: Big Data Meets Learning Science: Keynote by Al Essa

Innovation Pipeline

Research

Product Validation

Product Development

Page 6: Big Data Meets Learning Science: Keynote by Al Essa

Databricks underpins our innovation pipeline and workflow.

Page 7: Big Data Meets Learning Science: Keynote by Al Essa

2 McGraw-Hill Learning Science

Page 8: Big Data Meets Learning Science: Keynote by Al Essa

From Print to Digital: 128-year Journey

K-12, Higher Ed & Professional businesses

~4,800 employees

Page 9: Big Data Meets Learning Science: Keynote by Al Essa


Adaptive Platform Leverages MHE Reach and Scale

How do we learn and how can we learn better?

Introduction of SmartBook

May 2013

Now: 1,500+ adaptive products available

Learners who have used MHE Adaptive: ~5,500,000

Student interactions: ~10,000,000,000

Authors trained to use MHE Adaptive: ~4,000

Page 10: Big Data Meets Learning Science: Keynote by Al Essa


Research Phase

Research Step: Build Models and Algorithms

Model Iteration: Ship “Data Instrumented” product to improve and tune models

Product Focus: Iterate prototypes with business and customers

Research Question: How do we learn and how can we learn better?

Page 11: Big Data Meets Learning Science: Keynote by Al Essa

Stacked Algorithm

Learning Tool for Optimizing Acquisition and Recall


Learning Science Principles

Effortful Recall

Spaced Practice

Interleaving

Cognitive Science Model

Mobile App

Page 12: Big Data Meets Learning Science: Keynote by Al Essa
Page 13: Big Data Meets Learning Science: Keynote by Al Essa
Page 14: Big Data Meets Learning Science: Keynote by Al Essa
Page 15: Big Data Meets Learning Science: Keynote by Al Essa

3 Spark, Databricks

Page 16: Big Data Meets Learning Science: Keynote by Al Essa

The Problem

1 Students drop out or fail their course

2 At-risk students can be difficult for instructors to identify

Identify at-risk students pre-emptively

Page 17: Big Data Meets Learning Science: Keynote by Al Essa

The Solution

A classifier to predict abandonment

Jacqueline Feild, Data Scientist

Nicholas Lewkow, Data Scientist

Page 18: Big Data Meets Learning Science: Keynote by Al Essa

Solution: A Classifier to Predict Abandonment

[Figure: labeled training rows of features F1 to F6, each with a label of 1 or 0, feed a classification algorithm, which assigns a label (0 in the example) to a new feature row F1 to F6.]

• Logistic regression used for the initial classification algorithm

• Simple algorithm to interpret

• Provides probability estimates instead of a hard classification label

• Allows for simple interpretation of feature importance

• One classifier works for all disciplines

Page 19: Big Data Meets Learning Science: Keynote by Al Essa


Parallel Pipeline for Creating Classifier

How do we learn and how can we learn better?

Models and Algorithms

Ship “Data Instrumented” Product

Iterate Prototypes with Customers

Page 20: Big Data Meets Learning Science: Keynote by Al Essa


Spark Transformation

How do we learn and how can we learn better?

Models and Algorithms

Ship “Data Instrumented” Product

Iterate Prototypes with Customers

Page 21: Big Data Meets Learning Science: Keynote by Al Essa


Speedup with Spark

How do we learn and how can we learn better?

Models and Algorithms

Ship “Data Instrumented” Product

Iterate Prototypes with Customers

S_n: Speedup from n cores

t_1: Time to run on 1 core

t_n: Time to run on n cores
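The slide's formula did not survive extraction. The three definitions above match the standard definition of parallel speedup, S_n = t_1 / t_n, which is assumed here. The sketch below computes it; the timings and core count are hypothetical and are not measurements from the talk.

```python
def speedup(t1, tn):
    """Speedup S_n = t_1 / t_n (standard definition, assumed here)."""
    if t1 <= 0 or tn <= 0:
        raise ValueError("run times must be positive")
    return t1 / tn

# Hypothetical timings in seconds (not from the talk).
t1 = 3600.0   # time to run on 1 core
tn = 300.0    # time to run on n cores
n = 16        # number of cores used for the parallel run

s_n = speedup(t1, tn)
print(f"Speedup from {n} cores: {s_n:.1f}x")
```

A speedup equal to n would be ideal linear scaling; in this made-up example 16 cores give a 12x speedup.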

Page 22: Big Data Meets Learning Science: Keynote by Al Essa


Evaluate Model Accuracy

How do we learn and how can we learn better?

Models and Algorithms

Iterate Prototypes with Customers

• Use area under the receiver operating characteristic curve (AUC-ROC) as another measure of model accuracy

• 0.9 - 1.0 = excellent

• 0.8 - 0.9 = good

• 0.7 - 0.8 = fair

• 0.6 - 0.7 = poor

• 0.5 - 0.6 = fail

• Look at how the AUC-ROC for a model changes throughout the semester
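As a rough illustration of the scale above, the sketch below computes AUC-ROC in plain Python as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one (ties count half), then maps the value to the slide's rating bands. The function names, the toy labels and scores, and the choice to treat each band's lower bound as inclusive are assumptions; the slide does not say how boundary values are assigned.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of positive and negative scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

def rate_auc(auc):
    """Map an AUC value to the slide's bands (lower bound inclusive: an assumption)."""
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "good"
    if auc >= 0.7:
        return "fair"
    if auc >= 0.6:
        return "poor"
    if auc >= 0.5:
        return "fail"
    return "below the slide's scale"

# Hypothetical predictions for six students (1 = abandoned the course).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc = auc_roc(labels, scores)
print(f"AUC-ROC = {auc:.3f} ({rate_auc(auc)})")
```

To follow the last bullet, the same function could be run on the predictions available at each week of the semester and the resulting values compared over time.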


Page 23: Big Data Meets Learning Science: Keynote by Al Essa

Evaluate Intervention Window

Intervention Window:

How much time in advance can we provide for an intervention to occur prior to abandonment?
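One plausible way to quantify this question is the number of days between the date a student is first flagged as at-risk and the date the student abandons the course. The sketch below takes that reading; it is an assumption rather than the presenters' definition, and the function name and dates are hypothetical.

```python
from datetime import date

def intervention_window_days(first_flagged, abandoned):
    """Days between the first at-risk flag and abandonment.

    Returns None when the flag did not come before abandonment,
    since no time is left for an intervention in that case.
    """
    delta = (abandoned - first_flagged).days
    return delta if delta > 0 else None

# Hypothetical dates (not from the talk).
flagged = date(2017, 2, 6)
abandoned = date(2017, 3, 6)

window = intervention_window_days(flagged, abandoned)
print("Intervention window (days):", window)
```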

Page 24: Big Data Meets Learning Science: Keynote by Al Essa

Conclusions

Technology is important, but build an agile innovation workflow with Databricks.