Using metrics and cluster analysis for analyzing learner video viewing behaviors in educational videos

Alexandros Kleftodimos, TEI of Western Macedonia & University of Macedonia
Georgios Evangelidis, University of Macedonia

Using Metrics and Cluster Analysis in Video Based Learning



Online video is increasingly used in education, as is evident from a number of reports, research papers and university initiatives.

Reports & research papers

• “Video Use in Higher Education”, designed and funded by Copyright Clearance Center and conducted by Intelligent Television with the cooperation of New York University, 2009.

• PBS (Public Broadcasting Service), “Digitally Inclined & Deepening connections”, 8th annual survey conducted by Grunwald Associates, 2010.

• R.H. Kay, “Exploring the use of video podcasts in education: A comprehensive review of the literature”

Initiatives

Research on viewing behaviours

• The use of online video in education keeps expanding. As a consequence, research on how students view educational videos (viewing patterns) is of particular interest.

• Most papers in this research area rely on surveys and focus groups. Only a few take a different approach and analyze viewing behaviors through streaming server log files and data mining.

A framework for recording, monitoring and analyzing learner behavior while watching and interacting with online educational videos

Login

Web Page with Videos

Videos can be linear or interactive, and they can contain

• Quiz questions

• Captions

• Screen captures

• Animations, etc.

The Database contains

Descriptive information

• Users

• Videos

• Video sections

• Interactive elements

• Quiz questions that exist

Session information

• Sessions started by users

• Videos watched by users

• Sections of a video watched (sections are defined on a topic basis)

• Actions: stop-resume, backward and forward jumps

• Interactive elements attempted

• Quiz questions answered

Two modules have been developed to help the educator monitor individual learner activity

– A module for navigating through the database entries

– A module that provides graphs picturing the sequence of viewed videos, sections and interactive items (or quiz questions) attempted

Sections viewed by a learner

The Educational Settings

Video lessons were created as support material for the courses “Introduction to Computers” (1st semester) and “Communication Technologies” (6th semester) of the Department of Digital Media and Communication at the Technological Institute of Western Macedonia.

The lectures supported by the video lessons had obligatory attendance, and students were taught the same topics in class as well.

The Educational Settings

• Linear videos cover all topics that the learner needs to know

• Interactive videos are videos where the learner is asked to perform a series of actions and is guided through the completion of certain tasks (e.g., embedding a video in a website, creating frames and links, etc.).

• In interactive videos, the learner gains and consolidates knowledge through interaction, as well as trial and error.

• The duration of the videos ranges from approximately 1 to 10 minutes. Most of the videos are shorter than 3.5 minutes.

Two webpages were created to host the videos. Users were required to log in to access these webpages.

Example: webpage for “Communication Technologies”

Educational settings

Introduction to Computers (for learning Microsoft Office Word and Excel concepts)

• 25 linear demonstration videos

• 20 interactive videos

• 2 videos consisting mostly of quizzes

• Fall semesters of 2012-2013 and 2013-2014

Communication Technologies (learning web page design with Dreamweaver)

• 28 linear demonstration videos

• 14 interactive videos

• 28 document files

• Spring semester of 2012-2013

Educational settings

• 350 learners accessed the video platform and viewed at least one video.

– 147 of these learners attended the course “Introduction to Computers” (fall semesters 2012-2013 & 2013-2014)

– 203 learners attended the course “Communication Technologies” (spring semester 2012-2013)

• These learners performed over 11,000 viewings of linear and interactive videos (9403 of them of linear videos).

Metrics & Cluster Analysis

• Provide Metrics to measure learner engagement and video popularity

• Derive these metrics from the dataset and use cluster analysis to obtain groupings of similar learner behaviors

Learner engagement

• Number of videos started by the learner

• Percent of distinct videos started by the learner

• Percent of videos watched by the learner = overall frames watched in videos / number of overall frames in videos

• Number of days on which the viewings took place

• Percent of interactive elements attempted by the learner = distinct elements attempted in all interactive videos / number of elements in all videos

• Number of actions performed
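
As a rough illustration of how the engagement metrics above could be derived from recorded viewings, the following Python sketch computes them for one learner. The `Viewing` record, its field names and the `engagement_metrics` function are hypothetical and not part of the framework described here. Counting each video's frames only once (from its most complete viewing) is also an assumption, since the slides do not say how repeated viewings are handled. The interactive-elements ratio is left out; it would follow the same pattern.

```python
from dataclasses import dataclass


@dataclass
class Viewing:
    """One viewing of one video by one learner (hypothetical record layout)."""
    learner: str
    video: str
    day: str             # calendar date of the viewing, e.g. "2013-01-14"
    frames_watched: int  # frames of the video seen during this viewing
    actions: int         # stop-resume actions plus backward and forward jumps


def engagement_metrics(viewings, video_frames):
    """Compute engagement metrics for the viewings of a single learner.

    video_frames maps every video on the platform to its total frame count.
    """
    # Assumption: for a video viewed several times, keep the most complete
    # viewing, capped at the video length, so no frame is counted twice.
    best = {}
    for v in viewings:
        capped = min(v.frames_watched, video_frames[v.video])
        best[v.video] = max(best.get(v.video, 0), capped)

    return {
        "videos_started": len(viewings),
        "pct_distinct_videos_started": 100.0 * len(best) / len(video_frames),
        "pct_videos_watched": 100.0 * sum(best.values()) / sum(video_frames.values()),
        "viewing_days": len({v.day for v in viewings}),
        "actions": sum(v.actions for v in viewings),
    }


# Made-up example: a platform with two videos and one learner.
video_frames = {"a": 1000, "b": 3000}
viewings = [
    Viewing("u1", "a", "2013-01-14", 500, 2),
    Viewing("u1", "a", "2013-01-15", 1000, 0),
    Viewing("u1", "a", "2013-01-15", 200, 3),
]
metrics = engagement_metrics(viewings, video_frames)
```

One such dictionary per learner gives the rows of the dataset that is later passed to the cluster analysis.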

Video popularity

• Number of times the video was started

• Number of learners that accessed the video

• Percentage of learners that accessed the video = number of distinct learners that visited the video at least once / total number of learners

• Number of times the video was abandoned (threshold of 60%)

• Percent abandoned = number of times the video was abandoned / number of times the video was started

• Number of actions
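
The popularity metrics can be sketched in the same way. In the Python example below, the tuple layout `(learner, video, fraction_watched)` and the function name `popularity_metrics` are hypothetical. The slides only state a 60% threshold; treating a viewing as abandoned when less than 60% of the video was watched is an assumption about how that threshold is applied.

```python
def popularity_metrics(viewings, video, total_learners, threshold=0.60):
    """Compute popularity metrics for one video.

    viewings is a list of (learner, video, fraction_watched) tuples, where
    fraction_watched is between 0.0 and 1.0 (hypothetical layout).
    """
    rows = [v for v in viewings if v[1] == video]
    learners = {v[0] for v in rows}
    # Assumption: a viewing is abandoned if it stops before the threshold.
    abandoned = sum(1 for v in rows if v[2] < threshold)
    return {
        "times_started": len(rows),
        "learners": len(learners),
        "pct_learners": 100.0 * len(learners) / total_learners,
        "times_abandoned": abandoned,
        "pct_abandoned": 100.0 * abandoned / len(rows) if rows else 0.0,
    }


# Made-up example: four learners on the platform, three viewings of "v1".
log = [
    ("u1", "v1", 1.0),
    ("u1", "v1", 0.3),
    ("u2", "v1", 0.6),
    ("u3", "v2", 0.1),
]
v1 = popularity_metrics(log, "v1", total_learners=4)
```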

Cluster analysis with k-means

The optimum number of clusters was obtained using the within-cluster sum of squared errors (SSE).

To obtain the optimum k, we start from one cluster and keep adding clusters until diminishing returns are reached, meaning that an additional cluster no longer reduces the within-cluster SSE significantly.

Another important criterion for accepting the resulting clusters is that we should be able to interpret them.

WEKA, a well-known open source data mining tool, was used for the cluster analysis.
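
The "add clusters until diminishing returns" rule can be illustrated with a small self-contained Python sketch. This is not WEKA's implementation: it is a plain Lloyd-style k-means with a deterministic farthest-point initialisation chosen here for reproducibility, and the stopping rule (stop when one more cluster removes less than `min_gain` of the one-cluster SSE) is only one possible reading of "no significant reduction". All function names, the `min_gain` value and the sample points are assumptions for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(points, k):
    """Farthest-point initialisation: start at the first point, then keep
    adding the point that is farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_sse(points, k, max_iters=100):
    """Run k-means and return the within-cluster sum of squared errors."""
    centers = init_centers(points, k)
    for _ in range(max_iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def pick_k(points, max_k=6, min_gain=0.10):
    """Return (k, SSE curve): the first k after which adding one more cluster
    reduces the SSE by less than min_gain of the one-cluster SSE."""
    curve = [kmeans_sse(points, k) for k in range(1, max_k + 1)]
    if curve[0] == 0:
        return 1, curve
    for k in range(1, max_k):
        if (curve[k - 1] - curve[k]) / curve[0] < min_gain:
            return k, curve
    return max_k, curve


# Made-up data: three well-separated groups of four points each.
blobs = [(0, 0), (10, 10), (20, 0)]
offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
points = [(cx + dx, cy + dy) for cx, cy in blobs for dx, dy in offsets]
best_k, sse_curve = pick_k(points)
```

On real data the SSE curve rarely shows such a clean elbow, which is why the interpretability of the resulting clusters is used as a second criterion.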

Clustering learners using learner engagement metrics

Attribute                                                 Full data   Cluster 0   Cluster 1   Cluster 2   Cluster 3   Cluster 4
Learners (share of total)                                 350         26 (~7%)    71 (~20%)   84 (~24%)   29 (~8%)    140 (~40%)
Percent of linear videos watched by the learner           37.4        22.7        73.9        44.7        79.4        8.5
Percent of interactive elements performed by the learner  16.5        73.5        7.1         6.3         78.3        3.8
Number of days that the viewings took place               3.6         3.8         6.2         3.1         5.9         2
Number of actions performed                               49.2        16.7        139.2       26.8        115.6       9.3

Learner engagement clusters

• Projecting these clusters onto the 0-10 scale of marks obtained by the learners shows no visible association.

[Figure: learner clusters plotted against the 0-10 scale of marks]

Clustering videos using video popularity metrics

Videos for the course “Communication Technologies”
Number of iterations: 2
Within cluster sum of squared errors: 4.00

Attribute           Full data (42)  Cluster 0 (20, ~48%)  Cluster 1 (22, ~52%)
Times started       144.8333        227.35                69.8182
Times abandoned     23.7857         37.45                 11.3636
Number of learners  75.1429         107                   46.1818
Actions per video   112.9286        199.5                 34.2273

Clustering videos using video popularity metrics

• The clustering scheme revealed that all interactive videos fall in the second cluster (Cluster 1), meaning that interactive videos failed to attract learners.

• The second cluster also contained a few linear videos that covered topics of low difficulty or topics excluded from the examination syllabus.

Research on viewing and listening styles

De Boer, Kommers & de Brock [9] viewing behaviours:

• linear (watching a complete video once)

• elaborative (watching a complete video twice)

• maintenance rehearsal (watching part of a video repeatedly)

• zapping (skipping through video and watching brief segments).

Moran et al. [12] listening profiles:

• straight-through (listening to an audio file sequentially),

• stop-start (performing a number of pauses and resumes, from the same point, at a number of points in the audio track),

• re-listen (stopping and going back at specific points to re-listen to portions of the audio),

• skip ahead (pausing and moving forward, skipping portions of the audio),

• non-sequential (going back and forth and listening to portions of the audio).

Traces of these styles are also found in our data (e.g. by observing the graphs)

Clustering viewing patterns (8 clusters formed)

Attribute         Full data     Cluster 1     Cluster 7     Cluster 3     Cluster 6
Viewings (share)  9403          3998 (~43%)   1702 (~18%)   1176 (~13%)   1018 (~11%)
sequential_view   0.6062        1             1             0             0
forward_jump      0.3615        0             0             0.108         0.4902
backward_jump     0.4298        0             0             1.8206        0.2102
stop_resume       1.1362        0             0             2.4039        2.1552
drop_out          0.2737        0             1             0             0

Four of the eight clusters are shown.
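
The five attributes used for clustering viewings could be extracted from a per-viewing event log roughly as in the Python sketch below. The event format and the function name `viewing_features` are hypothetical. The attribute definitions are inferred from the cluster centroids rather than stated in the slides: `sequential_view` is assumed to flag a viewing without any media action, and `drop_out` a viewing that stopped before the end of the video.

```python
def viewing_features(events, reached_end):
    """Turn the events of one viewing into the five clustering attributes.

    events is a list of tuples (hypothetical format):
      ("pause", position)             a stop that was later resumed
      ("seek", from_position, to_pos) a jump within the video
    reached_end tells whether the viewing ran to the end of the video.
    """
    forward = sum(1 for e in events if e[0] == "seek" and e[2] > e[1])
    backward = sum(1 for e in events if e[0] == "seek" and e[2] < e[1])
    stops = sum(1 for e in events if e[0] == "pause")
    return {
        # Assumption: "sequential" means no jumps and no stop-resume actions.
        "sequential_view": int(forward + backward + stops == 0),
        "forward_jump": forward,
        "backward_jump": backward,
        "stop_resume": stops,
        # Assumption: a drop-out is a viewing that did not reach the end.
        "drop_out": int(not reached_end),
    }


# Made-up examples: an uninterrupted full viewing and an interrupted one.
linear = viewing_features([], reached_end=True)
mixed = viewing_features(
    [("pause", 5.0), ("seek", 30.0, 10.0), ("seek", 10.0, 50.0)],
    reached_end=False,
)
```

Under these assumptions, an uninterrupted viewing that is dropped still counts as sequential, which matches the Cluster 7 centroid (sequential_view = 1, drop_out = 1).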

Projecting the clusters onto learners showed that learners follow different styles from viewing to viewing.

43% of the viewings were linear. Possible explanations:

• Videos were used for revision purposes, since almost half of the viewings took place only a week before the exams. Learners did not have the time to engage in more interactive ways of learning, such as pausing the video at various points in the timeline to perform the taught actions in the software.

• The videos were short, which can lead to fewer media actions and fewer dropouts.

• The interface provided limited ways of accessing different video segments.

• The videos were easy for the learners to understand, which led to fewer media actions.

Key findings

Clustering learners

a) Three clusters of learners with a preference for linear videos but with different levels of engagement

b) A smaller cluster of learners who gave equal emphasis to linear and interactive videos

c) An even smaller cluster of learners who preferred a more interactive way of learning, attempting mainly interactive videos

d) The analysis showed no association between the clusters and learner performance

Clustering videos

Two clusters of videos, separating videos of high interest from videos of low interest

Clustering viewing behaviors

a) Sequential viewings formed the largest cluster, followed by sequential viewings that ended in a drop-out

b) A cluster of non-sequential viewings with mostly stop-resume actions and backward jumps was the third largest cluster

Future Steps

• Conduct interviews with a sample of students from each learner cluster to gain more insight into viewing behaviors

• Deploy open source tools to

– use the framework principles with any video that resides on open platforms (YouTube, Vimeo)

– provide more meaningful visualizations related to video viewing

– aggregate video content with content coming from webpages and open web services (e.g. Google Maps, Google Forms, SlideShare, Wikipedia)

Thank you for listening