Presentation given at the dataTEL Marketplace at the RecSysTEL workshop at the RecSys and EC-TEL conferences, Barcelona, Spain.
Free the data

Why?

Because we will get new insights

Photo by Tom Raftery, http://www.flickr.com/photos/traftery/4773457853/
Hendrik Drachsler
Centre for Learning Sciences and Technology @ Open University of the Netherlands

Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

29.09.2010, RecSysTEL workshop at the RecSys and EC-TEL conferences, Barcelona, Spain
#dataTEL
What is dataTEL?

• dataTEL is a Theme Team funded by the STELLAR Network of Excellence.
• It addresses two STELLAR Grand Challenges: 1. Connecting Learners, 2. Contextualisation.
Recommenders in TEL
TEL recommenders are a bit like this...
We need to select for each application an appropriate recommender system that fits its needs.

But...

“The performance results of different research efforts in recommender systems are hardly comparable.” (Manouselis et al., 2010)

TEL recommender experiments lack transparency. They need to be repeatable in order to:
• test validity
• allow verification
• compare results

Photo by Kaptain Kobold, http://www.flickr.com/photos/kaptainkobold/3203311346/
How others compare their recommenders
What kind of data sets can we expect in TEL?
Find an appropriate recommender system for a set of goals, tasks, limitations, and constraints. More accurately: choose the most appropriate system from a set of candidates. Major claim: there is no silver bullet, that is, no system that is at once the most accurate, the fastest, the cheapest, ...

We need to select for each application an appropriate recommender system that fits its needs.

Graphic by J. Cross, Informal Learning, Pfeiffer (2006)
The Long Tail of Learning

Formal Data / Informal Data

Graphic by Wilkins, D. (2009); long tail concept by Anderson, C. (2004)
Guidelines for Data Set Aggregation

1. Create a data set that realistically reflects the variables of the learning setting.
2. Use a sufficiently large set of user profiles.
3. Create data sets that are comparable to others (see the sketch below).

Justin Marshall, Coded Ornament by rootoftwo, http://www.flickr.com/photos/rootoftwo/267285816
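To make guideline 3 concrete: one practical way to keep data sets comparable is to publish them in the widely used user / item / rating / timestamp layout of the MovieLens sets. The Python sketch below only illustrates that idea under assumed field names; it is not a format prescribed by dataTEL.

    import csv

    # Minimal sketch: write interaction events in the MovieLens-style
    # user / item / rating / timestamp layout that most recommender
    # toolkits can read out of the box.
    def export_tsv(events, out_path):
        with open(out_path, "w", newline="") as f:
            writer = csv.writer(f, delimiter="\t")
            writer.writerow(["user_id", "item_id", "rating", "timestamp"])
            writer.writerows(events)

    # Hypothetical events from a TEL system (learning objects as items)
    export_tsv(
        [("u1", "lo-42", 4, 1285747200),
         ("u2", "lo-42", 5, 1285747260)],
        "dataset.tsv",
    )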
Standardize research on recommender systems in TEL

Five core questions:
1. How can data sets be shared according to privacy and legal protection rights?
2. How to develop a policy to use and share data sets?
3. How to pre-process data sets to make them suitable for other researchers?
4. How to define common evaluation criteria for TEL recommender systems?
5. How to develop overview methods to monitor the performance of TEL recommender systems on data sets?
dataTEL::Objectives
1. Protection Rights

O V E R S H A R I N G

Were the founders of PleaseRobMe.com actually allowed to grab the data from the web and present it in that way?

Are we allowed to use data from social handles and reuse it for research purposes?

And there are more company protection rights on top of that!
2. Policies to use Data
3. Pre-Process Data Sets

For formal data sets from an LMS:
1. Data storing scripts
2. Anonymisation scripts (a minimal sketch follows below)

For informal data sets:
1. Collect data
2. Process data
3. Share data
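As an illustration of what an anonymisation script could look like, the sketch below replaces the user identifiers in an LMS log with salted hashes before the data is shared; records stay linkable per user without exposing identities. The column name user_id and the CSV layout are assumptions for the example.

    import csv
    import hashlib

    # Minimal sketch: pseudonymise the user_id column of an LMS log.
    def anonymise_log(in_path, out_path, salt):
        with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
            reader = csv.DictReader(src)
            writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
            writer.writeheader()
            for row in reader:
                digest = hashlib.sha256((salt + row["user_id"]).encode()).hexdigest()
                row["user_id"] = digest[:16]  # stable pseudonym per user
                writer.writerow(row)

    anonymise_log("lms_log.csv", "lms_log_anon.csv", salt="keep-this-secret")

Note that hashing is only pseudonymisation; depending on the protection rights discussed above, stronger measures such as aggregation may be needed before sharing.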
4. Evaluation Criteria

Combined approach by Drachsler et al. 2008:
• 1. Accuracy, 2. Coverage, 3. Precision (see the sketch below)
• 1. Effectiveness of learning, 2. Efficiency of learning, 3. Drop-out rate, 4. Satisfaction

Kirkpatrick model by Manouselis et al. 2010:
• 1. Reaction of learner, 2. Learning improved, 3. Behaviour, 4. Results
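To make the first set of criteria concrete, here is a minimal Python sketch of precision@k and catalog coverage on toy data. It illustrates the kind of measures meant, not the exact evaluation protocol of the cited papers.

    def precision_at_k(recommended, relevant, k):
        # Fraction of the top-k recommended items that are relevant.
        hits = sum(1 for item in recommended[:k] if item in relevant)
        return hits / k

    def catalog_coverage(recommendation_lists, catalog):
        # Share of the catalog that appears in at least one list.
        recommended = {item for recs in recommendation_lists for item in recs}
        return len(recommended & set(catalog)) / len(catalog)

    # Toy example: two users, a five-item catalog
    catalog = ["lo-1", "lo-2", "lo-3", "lo-4", "lo-5"]
    recs = {"u1": ["lo-1", "lo-3", "lo-5"], "u2": ["lo-2", "lo-3"]}
    relevant = {"u1": {"lo-1", "lo-2"}, "u2": {"lo-3"}}

    for user in recs:
        print(user, precision_at_k(recs[user], relevant[user], k=3))
    print("coverage:", catalog_coverage(recs.values(), catalog))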
5. Data Set Framework to Monitor Performance

Formal data sets ↔ Informal data sets

                      Data A               Data B               Data C
Algorithms:           Algorithm A, B, C    Algorithm D, E       Algorithm B, D
Models:               Learner Model A, B   Learner Model C, E   Learner Model A, C
Measured attributes:  Attribute A, B, C    Attribute A, B, C    Attribute A, B, C
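A minimal sketch of what such a monitoring framework could look like in code: run every algorithm on every data set and record the measured attributes in one results table. All names here (benchmark, the toy popularity recommender, the attribute function) are hypothetical illustrations of the idea, not dataTEL software.

    # Minimal sketch of a cross-data-set benchmark loop.
    def benchmark(datasets, algorithms, attributes):
        results = []
        for ds_name, dataset in datasets.items():
            for algo_name, algorithm in algorithms.items():
                recs = algorithm(dataset)  # train/recommend on this data set
                row = {"dataset": ds_name, "algorithm": algo_name}
                for attr_name, measure in attributes.items():
                    row[attr_name] = measure(recs, dataset)
                results.append(row)
        return results

    # Toy example: one popularity-style recommender, one attribute
    datasets = {"Data A": [("u1", "lo-1"), ("u1", "lo-2"), ("u2", "lo-1")]}
    algorithms = {"popularity": lambda ds: sorted({item for _, item in ds})}
    attributes = {"n_recommended": lambda recs, ds: len(recs)}
    print(benchmark(datasets, algorithms, attributes))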
Join us for a Coffee ...
http://www.teleurope.eu/pg/groups/9405/datatel/
Upcoming event ...

Workshop: dataTEL - Data Sets for Technology Enhanced Learning
Date: March 30th to March 31st, 2011
Location: Ski resort La Clusaz in the French Alps, Massif des Aravis
Funding: Food and lodging for 3 nights for 10 selected participants
Submissions: To be considered for funding, please send an extended abstract to http://www.easychair.org/conferences/?conf=datatel2011
Deadline for submissions: October 25th, 2010
CfP: http://www.teleurope.eu/pg/pages/view/46082/
This slide deck is available at: http://www.slideshare.com/Drachsler
Email: [email protected]
Skype: celstec-hendrik.drachsler
Blogging at: http://www.drachsler.de
Twittering at: http://twitter.com/HDrachsler
Many thanks for your interest!