
VSS 2011 Data Mining (Thursday, 10:45)


Page 1: VSS 2011 Data Mining (Thursday, 10:45)

Kerry Rice, Ed.D., Associate Professor and Chair
Andy Hung, Ed.D., Assistant Professor
Yu-Chang Hsu, Ph.D., Assistant Professor

Towards the Development of a Real-Time Decision Support System for Online Learning, Teaching and Administration

Page 2: VSS 2011 Data Mining (Thursday, 10:45)

M.S. in Educational Technology

Master's in Educational Technology

Ed.D. in Educational Technology

K-12 Online Teaching Endorsement

Graduate Certificates: Online Teaching – K-12 & Adult Learner

Technology Integration Specialist

School Technology Coordinator

Online Teacher PD Portal

Game Studio: Mobile Game Design

Learning Technology Design Lab

Page 3: VSS 2011 Data Mining (Thursday, 10:45)

EDTECH Fast Facts

• Largest graduate program at BSU
• Fully online, self-support program
• Served over 1,200 unique students last year
• Interdisciplinary partnerships with Math, Engineering, Geoscience, Nursing, Psychology, Literacy, Athletics
• Partnerships with iNACOL, AECT, ISTE, Google, Stanford, IDLA, Connections Academy, K12, Inc., ID State Department of Education, Discovery Education, Nicolaus Copernicus University, Poland
• First dual degree program – National University of Tainan
• Saves 200+ tons of CO2 emissions annually

Page 4: VSS 2011 Data Mining (Thursday, 10:45)

Image created using wordle: http://www.wordle.net/

Page 5: VSS 2011 Data Mining (Thursday, 10:45)

Going Virtual! Research Series

Page 6: VSS 2011 Data Mining (Thursday, 10:45)

Going Virtual! Research Series

2007: The Status of Professional Development
• Who delivered/received PD?
• When and how was PD delivered?
• Content and sequence of PD?

2008: Unique Needs and Challenges
• Amount of PD?
• Preferred delivery format?
• Most important topics for PD?

2009: Effective Professional Development of K-12 Online Teachers
• Program evaluations
• Complexities of measuring “effectiveness”

2010: The Status of PD and Unique Needs of K-12 Online Teachers
• Revisited questions from 2007 & 2008
• What PD have you had? What do you need?

2011: Development of an Educational Data Mining Model
• Pass rate predictive model
• Engagement
• Association rules

Page 7: VSS 2011 Data Mining (Thursday, 10:45)

Going Virtual! Research Series

2007 – 258 respondents
• 167 K-12 online teachers
• 61 administrators
• 14 trainers
Over 40 virtual schools and online programs; over 30 states

2008 – 884 K-12 online teachers
• 727 virtual schools
• 99 supplemental programs
• 54 brick-and-mortar online programs
Over 60 virtual schools and online programs; over 30 states

2010 – 830 K-12 online teachers
• 417 virtual school
• 318 supplemental
• 81 blended
• 12 brick and mortar
Over 50 virtual schools and online programs; over 40 states & 24 countries

2011 – Traditional
• Virtual charter
• Supplemental program
With DATA MINING:
• Online teacher PD workshops
• Online graduate courses
• End-of-year program evaluation

Descriptive → Evaluative
Goals:
• Program evaluation
• Develop a cloud-based, real-time Decision Support System (DSS)
• Link PD effectiveness to student outcomes

Page 8: VSS 2011 Data Mining (Thursday, 10:45)

Traditional Evaluation Systems

Teacher Effectiveness
• Highly qualified?
• Parent satisfaction
• Annual performance
• Range of implementation
• Student satisfaction
• Knowledge of STS

Program
• AYP?
• Improved test scores
• Parent satisfaction

Student Outcomes
• Performance
• Participation
• Attendance
• ISAT/DWA
• Self-efficacy
• Satisfaction

Page 9: VSS 2011 Data Mining (Thursday, 10:45)

Leveraging Data Systems

PD Effectiveness
• Quality
• Usefulness
• Engagement

Teacher Effectiveness
• Change in teaching practice
• Quantity AND quality of interaction
• Course design

Student Outcomes
• Satisfaction
• Engagement
• Dropout rate
• Performance
• Learning patterns

(Diagram annotations: three measures marked “self report,” two marked “low-level data.”)

Page 10: VSS 2011 Data Mining (Thursday, 10:45)

Data Mining

Data mining techniques can be applied in online environments to uncover hidden relationships between logged activities, learner experiences, and performance. In education, data mining can be used to track learner behaviors, identify struggling students, depict learning preferences, improve course design, personalize instruction, and predict student performance.

Page 11: VSS 2011 Data Mining (Thursday, 10:45)

Educational Data Mining

Special Challenges
• Learning behaviors are complex.
• Target variables (learning outcomes/performance) require a wide range of assessments and indicators.
• The goal of improving online teaching and learning is hard to quantify.
• A limited number of DM techniques are suitable for meeting educational goals.
• Only interactions that occur in the LMS can be tracked through data mining. What if learning occurs outside the LMS?
• Identifying rules and patterns is still a very labor-intensive process.

Page 12: VSS 2011 Data Mining (Thursday, 10:45)

DM Applications in Education

• Pattern discovery (data visualization, clustering, sequential path analysis)
  – Track students’ learning progress
  – Identify outliers (outstanding or at-risk students)
  – Depict students’ learning preferences (learner profiling)
  – Identify relationships of course components (web mining)
• Predictive modeling (decision tree analysis)
  – Suggest personalized activities (classification prediction)
  – Foresee student performance (numeric prediction)
  – Adaptive evaluation system development
• Algorithm generation: analysis methods can be integrated into platforms.
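To make the pattern-discovery idea concrete, here is a minimal Python sketch of learner profiling via k-means clustering. Everything in it (the file name, column names, the choice of scikit-learn, and the number of clusters) is an illustrative assumption, not part of the studies described here.

```python
# Hypothetical sketch: cluster learners on engagement features to build
# profiles and spot outliers (outstanding or at-risk students).
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

students = pd.read_csv("engagement_features.csv")  # assumed per-student table
cols = ["logins", "content_views", "discussion_posts"]

# Standardize so no single metric dominates the distance computation.
X = StandardScaler().fit_transform(students[cols])

# Four clusters mirror the engagement/performance quadrants used later in the deck.
students["cluster"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# Cluster means describe each learner profile.
print(students.groupby("cluster")[cols].mean())
```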

Page 13: VSS 2011 Data Mining (Thursday, 10:45)

Data Preprocessing

• Data Collection
• Data Cleaning
• Session Identification
• Behavior Identification
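Session identification is typically done by splitting each user's time-ordered log events wherever the gap between consecutive events exceeds an inactivity timeout. The sketch below assumes a 30-minute timeout and hypothetical column names; neither is specified in the slides.

```python
# Hypothetical sketch: split each user's time-ordered events into sessions
# wherever the gap exceeds an inactivity timeout. The 30-minute threshold
# and the column names are assumptions, not values from the study.
import pandas as pd

logs = pd.read_csv("lms_logs.csv", parse_dates=["timestamp"])
logs = logs.sort_values(["user_id", "timestamp"])

TIMEOUT = pd.Timedelta(minutes=30)
gap = logs.groupby("user_id")["timestamp"].diff()

# A new session starts at a user's first event or after a gap longer than the
# timeout; the cumulative sum of these flags yields a unique session id.
logs["session_id"] = (gap.isna() | (gap > TIMEOUT)).cumsum()
```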

Page 14: VSS 2011 Data Mining (Thursday, 10:45)

Data Transformation
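The transformation step turns cleaned, sessionized log rows into one feature row per learner, which the mining algorithms can consume. A rough pandas sketch, with assumed file and column names:

```python
# Hypothetical sketch: pivot event-level logs into one feature row per learner.
import pandas as pd

logs = pd.read_csv("sessionized_logs.csv")  # assumed output of preprocessing

# Count each behavior type (e.g. read, post, reply) per learner.
features = logs.groupby(["user_id", "behavior"]).size().unstack(fill_value=0)

# Add a session count so engagement can also be normalized per visit.
features["sessions"] = logs.groupby("user_id")["session_id"].nunique()
```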

Page 15: VSS 2011 Data Mining (Thursday, 10:45)

3 Data Mining Studies

• Study #1: Teacher Training Workshops 2010
  – Survey data + data mining + student outcomes
• Study #2: Graduate Courses 2010
  – Data mining + student outcomes (no demographic data)
• Study #3: End-of-Year K-12 Program Evaluation (2009–2010)
  – Data mining + student outcomes + demographic data + survey data

Page 16: VSS 2011 Data Mining (Thursday, 10:45)

Study #1: Teacher Training Workshops 2010

• Survey data + data mining + student outcomes
• Research goal: to demonstrate the potential applications of data mining with a case study
  – Program evaluation of workshop quality for continuous improvement of design and delivery
  – Evaluation of PD impact on both teachers and students

Page 17: VSS 2011 Data Mining (Thursday, 10:45)

Study #1: Teacher Training Workshops 2010
• Blackboard
• 103 participants
• 31,417 learning logs
• Clustering analysis, sequential association analysis, and decision tree analysis
• Engagement variables:
  – Frequency of logins
  – Length of time online (survey and DM)
  – Frequency of content access
  – Number of discussion posts

Page 18: VSS 2011 Data Mining (Thursday, 10:45)

Learning Paths

• Association rule analysis
  – Participants tended to switch between content and discussion within one session.
  – Different types of interactions (content-participant, participant-instructor, and participant-participant) were well facilitated in the workshops overall.
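As an illustration of this step, the sketch below uses the mlxtend library (the slides do not name a tool) to find tool types that co-occur within a session, which is how a rule like "content => discussion" would surface. The file name, column names, and thresholds are assumptions.

```python
# Hypothetical sketch: mine association rules over the tool types used
# within each session (names and thresholds are illustrative assumptions).
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

logs = pd.read_csv("sessionized_logs.csv")

# One "transaction" per session: the set of tools touched in that session.
transactions = logs.groupby("session_id")["tool"].apply(lambda s: sorted(set(s))).tolist()

te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

itemsets = apriori(onehot, min_support=0.1, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.6)

# A rule such as {content} -> {discussion} would reflect within-session switching.
print(rules[["antecedents", "consequents", "support", "confidence"]])
```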

Page 19: VSS 2011 Data Mining (Thursday, 10:45)

Performance

Pass Rate Predictive Model
• Decision tree analysis
  – Grades and pass rates improved (from 88% to 92% and from 89% to 94%, respectively) when participants logged into the LMS more than 10 times over six weeks. Both averages improved further, to 98%, when the frequency of logins increased to 17.

Increased logins = Increased performance
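A minimal sketch of how such a pass-rate decision tree could be fit and read. scikit-learn, the feature names, and the depth limit are illustrative assumptions; thresholds such as "more than 10 logins" come out of the fit rather than being hard-coded.

```python
# Hypothetical sketch: fit a shallow decision tree on engagement features
# and read off the learned splits (feature names are assumptions).
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.read_csv("workshop_features.csv")  # assumed per-participant table
X = data[["logins", "time_online_hours", "content_accesses", "discussion_posts"]]
y = data["passed"]  # 1 = passed, 0 = did not pass

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Prints the tree as readable if/else rules, e.g. splits on login frequency.
print(export_text(tree, feature_names=list(X.columns)))
```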

Page 20: VSS 2011 Data Mining (Thursday, 10:45)

Quality of Experience

Engagement
• Clustering + survey questions
  – More time spent online = more time spent offline.
  – Previous online teaching experience = more hours spent both online and offline.

Page 21: VSS 2011 Data Mining (Thursday, 10:45)

DM Conclusions

• Interaction and engagement were important factors in learning outcomes.

• The results indicate that the workshops were well facilitated, in terms of interaction.

• Participants who had online teaching experience could be expected to have a higher engagement level but prior online learning experience did NOT show a similar relationship.

• Both the amount of time learners spent online and their average course logins were directly related to engagement and performance: more time spent online and a higher frequency of logins equated to increased engagement and improved performance.

Page 22: VSS 2011 Data Mining (Thursday, 10:45)

Overall Conclusions
• Two factors influenced expectation ratings:
  – Practical new knowledge
  – Ease of locating information
• Three factors influenced satisfaction ratings:
  – Usefulness of subject matter
  – Well-structured website
  – Sufficient technical support
• Instructor quality was related to:
  – Stimulated interest
  – Preparation for class
  – Respectful treatment of students
  – Peer collaboration
  – Assessments aligned to course objectives
  – Support services for technical problems

Page 23: VSS 2011 Data Mining (Thursday, 10:45)

Study #2: Graduate Courses 2010

• Data mining + student outcomes (no demographic data)
• Research goal: to demonstrate the potential applications of data mining with a case study
  – Generate personalized advice
  – Identify struggling students
  – Adjust teaching strategies
  – Improve course design
  – Data visualization
• Study design
  – Comparative (between and within courses)
  – Random course selection

Page 24: VSS 2011 Data Mining (Thursday, 10:45)

Study #2: Graduate Courses 2010

• Moodle
• Two graduate courses (X and Y), each with two sections:
  – X1 (18 students)
  – X2 (19 students)
  – Y1 (18 students)
  – Y2 (22 students)

• 2,744,433 server logs

Page 25: VSS 2011 Data Mining (Thursday, 10:45)

Study #2: Graduate Courses 2010

• Variables
  – IDs (user and session)
  – Learning behaviors (reading materials, posting discussions)
  – Time/duration
  – Grades or pass/fail (outcome variables)

Page 26: VSS 2011 Data Mining (Thursday, 10:45)

Learner Behaviors
(Figures: weekday course patterns; weekday student patterns)

Page 27: VSS 2011 Data Mining (Thursday, 10:45)

Weekday and Time Patterns of Learning Behaviors

• Reading was the major activity, with similar patterns across courses
• Sundays => replying to discussions
• Mondays & Tuesdays, between 1 pm and midnight

Page 28: VSS 2011 Data Mining (Thursday, 10:45)

Shared Student Characteristics Course X

CLUSTER ANALYSIS

1) LOW ENGAGED – LOW PERFORMING

2) HIGH ENGAGED – HIGH PERFORMING

3) HIGH ENGAGED – LOW PERFORMING

4) LOW ENGAGED – HIGH PERFORMING

Page 29: VSS 2011 Data Mining (Thursday, 10:45)

Shared Student Characteristics Course Y

CLUSTER ANALYSIS

1) LOW ENGAGED – LOW PERFORMING

2) HIGH ENGAGED – HIGH PERFORMING

3) HIGH ENGAGED – LOW PERFORMING

4) LOW ENGAGED – HIGH PERFORMING

Page 30: VSS 2011 Data Mining (Thursday, 10:45)

Learner Behaviors
(Figure: association rule / path analysis of course design)

Page 31: VSS 2011 Data Mining (Thursday, 10:45)

Predictive Analysis – Course X

Discussion board posts and replies were the most important variables for predicting performance (27+ replies = better performance)

Some lower performers had high reply numbers (> 43)

Cluster analysis revealed that some students tended only to read discussions.

Page 32: VSS 2011 Data Mining (Thursday, 10:45)

Predictive Analysis – Course Y

Number of discussion board posts read was the most important predictor of performance (378+ = better performance)

Fewer discussions read + more replies (54+ = better performance)

The design of course Y improved the quality of discussions and influenced student behaviors.

Page 33: VSS 2011 Data Mining (Thursday, 10:45)

Study #3: End-of-Year K-12 Program Evaluation
• Demographics + survey data + data mining + student outcomes
• Research goal: large-scale program evaluation
  – How can the proposed program evaluation framework support decision making at the course and institutional levels?
  – Identify key variables and examine potential relationships between teacher and course satisfaction, student behaviors, and student performance outcomes.

Page 34: VSS 2011 Data Mining (Thursday, 10:45)

Study #3: End of Year K-12 Program Evaluation (2009 – 2010)

• Blackboard LMS
• 7,500 students
• 883 courses
• 23,854,527 learning logs (over 1 billion records)

Page 35: VSS 2011 Data Mining (Thursday, 10:45)

Total Variables = 22

stuID, Age, City, District, Grade_Avg, Click_Avg, Content_Access_Avg, Course_Access_Avg, Page_Access_Avg, DB_Entry_Avg, Tab_Access_Avg, Login_Avg, Module_Avg, Gender, HSGradYear, School, No_Course, No_Fail, No_Pass, Pass rate, cSatisfaction_Avg, iSatisfaction_Avg

Page 36: VSS 2011 Data Mining (Thursday, 10:45)

Engagement
• Average frequency of logins per course
• Average frequency of tabs accessed per course
• Average frequency of modules accessed per course
• Average frequency of clicks per course
• Average frequency of courses accessed (from the Blackboard portal)
• Average frequency of pages accessed per course (page tool)
• Average frequency of course content accessed per course (content tool)
• Average number of discussion board entries per course
(A sketch of computing these averages follows below.)
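A sketch of how these per-course averages could be computed from raw Blackboard logs. The file and column names are assumptions about the log extract, not documented fields.

```python
# Hypothetical sketch: compute the *_Avg engagement variables by counting
# each event type per student per course, then averaging across courses.
import pandas as pd

logs = pd.read_csv("bb_logs.csv")  # assumed one row per logged Blackboard event

per_course = logs.pivot_table(index=["stu_id", "course_id"],
                              columns="event",   # login, tab, module, click, ...
                              aggfunc="size", fill_value=0)

engagement = per_course.groupby(level="stu_id").mean()
engagement.columns = [f"{c}_Avg" for c in engagement.columns]
```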

Page 37: VSS 2011 Data Mining (Thursday, 10:45)

Cluster Analysis – by Student (Spring 2010)

Page 38: VSS 2011 Data Mining (Thursday, 10:45)

Cluster Analysis - by Student

• High engagement = high performance
• The optimal number of courses = 1 to 2 per semester
• Older students (age > 16.91) tended to take more than two courses, with pass rates ranging from 54.09% to 56.11%
• High-engaged students demonstrated engagement levels twice those of low-engaged students
• Female students were more active than male students in online discussions (higher average DB_Entry frequency)
• Female students had higher pass rates than male students

Page 39: VSS 2011 Data Mining (Thursday, 10:45)

Cluster Analysis – by Course
The identified lowest-performing courses (Math, Science, and English) were analyzed with cluster analysis.
• High-engaged + high performance = good design and good implementation?
• High-engaged + low performance = bad design and good implementation?
• Low-engaged + low performance = bad design and bad implementation?

Page 40: VSS 2011 Data Mining (Thursday, 10:45)

Cluster Analysis – by Course
Subject areas in which the level of activity was consistent with student outcomes:
• High performance and high engagement = Driver Education, Electives, Foreign Language, Health, and Social Studies
• Low engagement and low performance = English
Subject areas in which the level of activity was inconsistent with student outcomes:
• High engagement and low performance = Math and Science. Why?

Page 41: VSS 2011 Data Mining (Thursday, 10:45)

Cluster Analysis – by Course
• Regardless of content area (Math, Science, or English) or level of engagement, the low-performance courses were entry-level: entry-level courses tended to have lower performance whether students were categorized as low-engaged or high-engaged.
• Most high-engaged, high-performance courses were advanced-level courses.
• The reasons students enrolled in a course may have influenced their engagement level and performance. Student survey responses indicated that students who retook courses they had previously failed tended to demonstrate lower engagement and lower performance.

Page 42: VSS 2011 Data Mining (Thursday, 10:45)

Predictive Analysis – Pass Rate
• Positive correlation between engagement level and performance (higher engagement => higher performance)
• Engagement level and gender had stronger effects on students' final grades than age, school district, school, and city. For most students, high engagement => high performance.
• Overall, female students performed better than male students.
• Students who were around 16 years old or younger performed better than those who were 18 years or older.
• Compared with other Blackboard components such as discussion board entries and content access, tab access had a negative effect on student performance (higher tab access => lower performance).

Page 43: VSS 2011 Data Mining (Thursday, 10:45)

Predictive Analysis – Course Satisfaction
• Students with higher average final grades (> 73.25) had higher course satisfaction.
• Students who passed all or some of their courses had higher course satisfaction than students who failed all of theirs.
• Students who took two or more courses in Spring 2010, whether they passed them or not, had higher course satisfaction.
• Female students had higher course satisfaction than male students.
• Online behaviors (i.e., frequency of pages accessed and number of discussion board entries) had minor effects on course satisfaction (higher frequency/number => higher course satisfaction).

Page 44: VSS 2011 Data Mining (Thursday, 10:45)

Predictive Analysis – Instructor Satisfaction
• Students with higher average final grades (> 73.25%) indicated higher instructor satisfaction.
• Students who took two or more courses in Spring 2010, whether they passed them or not, showed higher instructor satisfaction.
• Female students indicated higher instructor satisfaction than male students.
• Online behaviors (frequency of modules accessed) had a minor effect on instructor satisfaction (higher frequency => higher instructor satisfaction).
• Older students (> 17.5 years old) had higher instructor satisfaction.

Page 45: VSS 2011 Data Mining (Thursday, 10:45)

Regression Analysis

• Spring 2010 – survey data + data mining
• Purpose: to identify which variables contributed significantly to students' average final grade
• Positive (higher values, higher average final grade):
  – Self-reported GPA (Likert-scale response)
  – Satisfaction with positive experience (Likert-scale response)
  – Satisfaction with course content (Likert-scale response)
  – Time on coursework (Likert-scale response)
  – Course access (based on LMS server log data)
• Negative (higher values, lower average final grade):
  – Effort and challenge (based on Likert-scale survey response)
  – Tab access (based on LMS server log data)
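A sketch of the regression step: with standardized predictors, the signed coefficients separate positive from negative contributors to the average final grade. The library choice, file name, and variable names are assumptions for illustration.

```python
# Hypothetical sketch: regress average final grade on standardized predictors
# so the signed coefficients separate positive from negative contributors.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

data = pd.read_csv("spring2010_merged.csv")  # assumed survey + log extract
predictors = ["gpa_self_report", "positive_experience", "content_satisfaction",
              "time_on_coursework", "course_access", "effort_challenge", "tab_access"]

X = StandardScaler().fit_transform(data[predictors])
model = LinearRegression().fit(X, data["grade_avg"])

# Positive coefficients raise the predicted grade; negative ones lower it.
for name, coef in sorted(zip(predictors, model.coef_), key=lambda t: -abs(t[1])):
    print(f"{name:22s} {coef:+.3f}")
```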

Page 46: VSS 2011 Data Mining (Thursday, 10:45)

Conclusions

• Higher-engaged students usually had higher performance, but this was limited to courses that were well designed and implemented. In this study, entry-level courses tended to have lower performance whether students were categorized as low-engaged or high-engaged.
• Satisfaction and engagement levels could not guarantee high performance.

Page 47: VSS 2011 Data Mining (Thursday, 10:45)

Characteristics of successful students

• Female
• 16.5 years old or younger
• Took one or two courses per semester
• Took a Foreign Language or Health course
• Lived in larger cities

Page 48: VSS 2011 Data Mining (Thursday, 10:45)

Characteristics of at-risk students

• Male
• 18 years old or older
• Took more than two courses per semester
• Took entry-level courses in Math, Science, or English
• Lived in smaller cities

Page 49: VSS 2011 Data Mining (Thursday, 10:45)

AUTOMATED INSTRUCTIONAL RESPONSE SYSTEM (AIRS)

We are looking for partners.
