Informatiseringscentrum Patterns in Usage Data Victor Maijer University of Amsterdam 2 June 2006, Vancouver

Overview

- Introduction

- Data Mining

- Results

- Sakai & DM

- Conclusion

Introduction

• UvA founded in 1632 (Athenaeum Illustre)

• 7 schools (faculties), 1518 study programmes

• 25,000 students, 3,500 employees (2,000 academic staff)

• Blackboard has been our VLE since 1999, with 13,000 users per day

• We run OSP and regard Sakai as a potential successor to Blackboard

Strategic Information

Stakeholders need strategic information in order to make decisions

Stakeholders are:

Instructors

Administrators

Management

Support

Etc.

Data Warehouse

Provides an integrated and total view of learning/collaboration systems

Makes the systems' current and historical information easily available for decision making

Makes decision-support transactions possible without hindering operational systems

Presents a flexible and interactive source of strategic information

Architecture

Info for Administrators & Management

Why I went mining

• I had data, a lot of it

• I had done it before

• I wanted to do some fun stuff

Official reason (the one I tell my boss):

• We needed strategic information about how our VLE evolved

What is Data Mining?

• Data mining is the extraction of implicit, previously unknown, and potentially useful information from data.

• Clustering is a data mining technique that applies when instances are to be divided into natural groups.

Example

Course       Documents
ABBA         36
BEATLES      4
COLDPLAY     30
DARKHORSES   2
ELASTICA     24

Group   Members                     Average Docs
A       ABBA, COLDPLAY, ELASTICA    30
B       BEATLES, DARKHORSES         3
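The grouping in this example can be reproduced with a few lines of Python. This is only a minimal sketch: the slides do not say which clustering algorithm was used, so one-dimensional k-means with two clusters is an assumption, and the function names `kmeans_1d` and `group_courses` are made up for illustration.

```python
# Documents per course, taken from the example above.
docs = {"ABBA": 36, "BEATLES": 4, "COLDPLAY": 30, "DARKHORSES": 2, "ELASTICA": 24}


def kmeans_1d(values, k=2, max_iter=100):
    """One-dimensional k-means (k >= 2); returns the final centroids."""
    ordered = sorted(values)
    # Assumption: seed the centroids with evenly spaced values of the sorted data.
    centroids = [float(ordered[i * (len(ordered) - 1) // (k - 1)]) for i in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest centroid.
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Recompute each centroid as the mean of its cluster.
        updated = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def group_courses(docs, k=2):
    """Map cluster index -> (sorted member names, average documents)."""
    centroids = kmeans_1d(list(docs.values()), k)
    groups = {}
    for course, n in docs.items():
        idx = min(range(k), key=lambda c: abs(n - centroids[c]))
        groups.setdefault(idx, []).append(course)
    return {idx: (sorted(members), centroids[idx]) for idx, members in groups.items()}


groups = group_courses(docs)
for idx in sorted(groups, key=lambda i: -groups[i][1]):
    members, average = groups[idx]
    print(", ".join(members), average)
```

With this seeding the heavy-use courses (ABBA, COLDPLAY, ELASTICA) end up in one group and the light-use courses (BEATLES, DARKHORSES) in the other, matching groups A and B.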

Procedure

• Determine mining questions

• Determine source (tables)

• Verify by changing items via GUI

• Identify needed output formats for analysis

• Define SQL-queries

• Program scripts (Perl)

• Determine which clustering techniques you want to apply

• Analyze (cluster)

‘Weka’ is an excellent open-source Java tool for data mining: http://www.cs.waikato.ac.nz/ml/weka/
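As an illustration of the "output formats" step, Weka reads ARFF files, so query results have to be written out in that layout before clustering. The talk used Perl scripts for this; the sketch below uses Python instead, and the relation name, attribute names and counts are hypothetical placeholders, not the actual Blackboard tables or data.

```python
def to_arff(relation, attributes, rows):
    """Build an ARFF document in which every attribute is numeric."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in attributes]
    lines += ["", "@data"]
    for row in rows:
        # One comma-separated line per instance, in attribute order.
        lines.append(",".join(str(row[name]) for name in attributes))
    return "\n".join(lines) + "\n"


# Hypothetical per-site counts, shaped like the rows a SQL query might return.
rows = [
    {"content": 36, "announcements": 5, "tests": 0},
    {"content": 4, "announcements": 1, "tests": 2},
]

arff = to_arff("course_sites", ["content", "announcements", "tests"], rows)
print(arff)
```

The resulting text can be saved with an `.arff` extension and opened in Weka, where a clusterer can then be chosen.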

Domains clustered

• Course sites and their content

• Users (instructors)

• Sessions (students)

Site clusters

[Figure: Basic usage (content + announcements) for Clusters A to D; series: Content, Announcement]

[Figure: Extended usage for Clusters A to D; series: DiscussionFora, Gradebook, Tests, Groups]

Cluster   Size (%)   N
A         87         1547
B         7          122
C         4          66
D         2          43

Content clusters

[Figure: content items for Clusters A to D; series: Test, Assignment, Document, External Link, Folder]

Cluster   Size (%)   N
A         91         1636
B         3          62
C         3          57
D         3          45

Instructor activity clusters

[Figure: instructor activity for Clusters A to D; series: Announcements, Content, Dropbox, DiscussionBoard, Gradebook, Test]

Cluster   Size (%)   N
A         88         1443
B         7          115
C         4          61
D         1          15

Student session clusters

[Figure: Clicks and Dur(min) per session for Clusters A to C; data labels as extracted: 8,7; 3,8; 25,5; 63,4; 29,7; 171,45]

Cluster   Size (%)   N
A         91         1294K
B         6          90K
C         2          32K
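The two features used here, clicks and duration in minutes per session, have to be derived from raw click events before clustering. A minimal sketch of that preprocessing step follows. It assumes a log of (session id, timestamp) pairs and takes duration as the time between the first and last click; the slides do not state how duration was actually measured, and the sample events are invented.

```python
from datetime import datetime


def session_features(events):
    """events: iterable of (session_id, timestamp) pairs, one per click.

    Returns {session_id: (clicks, duration_in_minutes)}.
    """
    first, last, clicks = {}, {}, {}
    for session_id, ts in events:
        clicks[session_id] = clicks.get(session_id, 0) + 1
        first[session_id] = min(first.get(session_id, ts), ts)
        last[session_id] = max(last.get(session_id, ts), ts)
    return {
        s: (clicks[s], (last[s] - first[s]).total_seconds() / 60)
        for s in clicks
    }


# Invented sample log: one three-click session and one single-click session.
events = [
    ("s1", datetime(2006, 5, 1, 9, 0)),
    ("s1", datetime(2006, 5, 1, 9, 2)),
    ("s1", datetime(2006, 5, 1, 9, 3, 30)),
    ("s2", datetime(2006, 5, 1, 10, 0)),
]

features = session_features(events)
print(features)
```

Note that under this definition a single-click session has a duration of zero, which is one reason the choice of duration measure matters when interpreting the clusters.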

Extra

• Female students click significantly more than male students and have significantly longer sessions

• Any ideas?

Sakai & Data mining

• Our UvA Pilots were too small to analyze

• Content can be clustered

• Events are difficult to cluster (not enough logging compared to Blackboard)

Implications

• Put rumours into perspective

• Differentiate by user group
  – Support
  – Functionality

Conclusion

• Methods
  – Clustering can be used to discover usage patterns
  – You need appropriate hardware for preprocessing and clustering

• Results
  – Basic usage (documents and announcements)
  – Duration of a session is a couple of minutes
  – Extended usage grows but is limited

• Sakai needs more logging if it wants to compete with Blackboard

• A Sakai warehouse would be nice

Evolvement

[Figure: chart labelled Users and Usage]