Model-based Context-Aware Recommendation
Intelligent Database Systems Lab, School of Computer Science & Engineering
Seoul National University, Seoul, Korea
Dongjoo Lee
Center for E-Business Technology, Seoul National University, Seoul, Korea
Copyright 2008 by CEBT
Introduction
Traditional recommendation methods
Content-based recommendation
– Which features can describe the item?
Collaborative filtering
– Item based CF
– User based CF
– Hybrid CF
Issues in using context information in recommendation
1) What is context information?
2) How to use context information?
3) Is it really useful to use context information in recommendation?
[Figure: user-item matrix of preference values between 0 and 1, with items on one axis and users on the other.]
Recommendation
Recommendation: $\mathrm{score}(m) = P(m_{\mathrm{like}} \mid u)$
Context-Aware Recommendation: $\mathrm{score}(m) = P(m_{\mathrm{like}} \mid c, u)$
Model-based Context-Aware Recommendation
1) Context abstraction
2) Item abstraction
3) Model construction
1) Context-item association rule mining
2) User profiling
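Model-based Context-Aware Recommendation
[Figure: the user and the active context connect to context clusters cg1, cg2, …, cgm (user and context cluster); these connect to item clusters mg1, mg2, …, mgn (item cluster), which connect to items.]
Model based recommendation
$$
\begin{aligned}
\mathrm{score}(m) &= P(m_{\mathrm{like}} \mid c, u) \\
&= \sum_{mg \in MG} P(m \mid mg)\, P(mg \mid c, u) \\
&= \sum_{mg \in MG} \sum_{cg \in CG} P(m \mid mg)\, P(mg \mid cg)\, P(cg \mid c, u)
\end{aligned}
$$
m: target item
u: user
c: context
mg: item group
cg: context group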
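The model-based score marginalizes over item groups and context groups. A minimal Python sketch of that double sum follows, assuming the three conditional probability tables are already available as nested dictionaries. The function name, the dictionary layout, and every probability value are hypothetical and only illustrate the arithmetic.

```python
from collections import defaultdict

def score_items(p_m_given_mg, p_mg_given_cg, p_cg_given_cu):
    """score(m) = sum over mg, cg of P(m|mg) * P(mg|cg) * P(cg|c,u)."""
    # Marginalize over context groups first: P(mg | c, u).
    p_mg = defaultdict(float)
    for cg, p_cg in p_cg_given_cu.items():
        for mg, p in p_mg_given_cg.get(cg, {}).items():
            p_mg[mg] += p * p_cg
    # Then marginalize over item groups to score each item.
    scores = defaultdict(float)
    for mg, p_group in p_mg.items():
        for m, p in p_m_given_mg.get(mg, {}).items():
            scores[m] += p * p_group
    return dict(scores)

# Hypothetical probabilities, invented for illustration only.
p_cg_given_cu = {"night": 0.8, "winter": 0.2}
p_mg_given_cg = {"night": {"rock": 0.5, "jazz": 0.5}, "winter": {"jazz": 1.0}}
p_m_given_mg = {"rock": {"song_a": 1.0}, "jazz": {"song_b": 0.5, "song_c": 0.5}}
scores = score_items(p_m_given_mg, p_mg_given_cg, p_cg_given_cu)
```

Summing over context groups before item groups keeps the work proportional to the number of non-zero table entries rather than to every (item, mg, cg) triple.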
Learning from Logs
[Figure: learning pipeline. User listen logs combine User attributes (Age, Gender, Job, …), Context attributes (Time, Location, …), and Music attributes (Title, Genre, Bit, Artist, Album, …). They are materialized from the Users table, the Music table, and the listen log (User, Music, Time, Location). Fuzzy abstraction (quantization) turns each log row into membership degrees over concepts such as Male, Female, 10s, …, Winter, Home, …, Rock, Jazz, Ballad, 16bit, 4bit, …; example row: 1.0 0.0 0.9 0.8 1.0 1.0 0.2 0.0 0.0 1.0. Learning (association rule mining) links context groups cg1, cg2, …, cgm to item groups mg1, mg2, …, mgn. Scoring takes u, c, and m as input.]
Example rules:
r1: (Male, 40s, Winter, Home) -> (Jazz, 16Bit); <0.07, 0.7>
r2: (20s, Night) -> (Rock); <0.03, 0.3>
r3: (20s, Night) -> (Dance); <0.02, 0.4>
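Definition
Database Definition
$T = \{t_1, t_2, \ldots, t_i, \ldots, t_n\}$
$A = \{a_1, a_2, \ldots, a_j, \ldots, a_m\}$
$D_{a_j}$: Domain of attribute $a_j$
$t = (v_1, v_2, \ldots, v_j, \ldots, v_m),\ v_j \in D_{a_j}$
Fuzzy Set
$F_{a_j} = \{f_{a_j,1}, f_{a_j,2}, \ldots, f_{a_j,k}, \ldots, f_{a_j,l}\}$
$m_{f_{a_j,k}} : D_{a_j} \to [0, 1]$
$f_{a_j,k} = \{(t, m) \mid t \in T,\ m = m_{f_{a_j,k}}(t[a_j])\}$
$F = \bigcup_{j=1}^{m} F_{a_j}$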
Definition (cont’d)
Fuzzy Predicate
$p_{f_{a_j,k}}(t) : t \in f_{a_j,k},\ f_{a_j,k} \in F$
Fuzzy Association Rule
$r : p_A(t) \rightarrow p_B(t)$
r states that if $p_A(t)$ is satisfied, $p_B(t)$ is implied. In other words, if a tuple t is a member of fuzzy set A, it is also a member of fuzzy set B.
Association Rule Mining
The goal of association rule mining is to find all association rules whose confidence and support are greater than or equal to the minimum confidence and the minimum support.
$R = \{r \mid \mathrm{support}(r) \ge \mathit{minsup},\ \mathrm{confidence}(r) \ge \mathit{minconf}\}$
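The slides do not spell out how support and confidence are computed for fuzzy rules. The sketch below assumes one common choice: a tuple's degree for a set of concepts is the minimum of its membership degrees, support is the mean degree of antecedent and consequent together, and confidence is that sum divided by the antecedent's sum. The function name and the two sample rows are hypothetical.

```python
def fuzzy_support_confidence(rows, antecedent, consequent):
    """Support and confidence of the fuzzy rule antecedent -> consequent.

    rows: list of dicts mapping concept name -> membership degree in [0, 1].
    Assumption: conjunction is modelled with min (not taken from the slides).
    """
    def degree(row, concepts):
        return min(row.get(c, 0.0) for c in concepts)

    both = sum(degree(r, antecedent + consequent) for r in rows)
    ante = sum(degree(r, antecedent) for r in rows)
    support = both / len(rows) if rows else 0.0
    confidence = both / ante if ante > 0 else 0.0
    return support, confidence

# Two hypothetical materialized log rows.
rows = [
    {"20s": 1.0, "night": 0.8, "rock": 1.0},
    {"20s": 0.9, "night": 0.2, "rock": 0.0},
]
support, confidence = fuzzy_support_confidence(rows, ["20s", "night"], ["rock"])
```

A rule would then be kept when both values reach the chosen minimum support and minimum confidence.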
Use Fuzzy Association Rule in Recommendation
Learning: Association Rule Mining
Example
[Table: materialized log rows. Column groups: User (Male, Female, 10s, 20s), Context (Winter, …, Fall, Morning, Night, Home, Office, Warm, Cold, …), Music (Rock, Jazz, Ballad, 16bit, 4bit, …). Membership degrees of the first row as extracted: 1.0 0.0 0.9 0.1 0.8 1.0 1.0 0.0 0.8 0.3 1.0 0.2 0.0 0.0 1.0. A second row begins with 1.0 0.0.]
Formal Expression
1) Context Abstraction – Context Representation
[Figure: sensors produce sensed data; filters turn it into filtered data; fuzzy membership functions map the filtered data to context concepts (for example Cool), which together make up the context.]
$c = \{(cc, w) \mid cc \in CC,\ w \in \mathbb{R} \text{ and } 0 \le w \le 1\}$
CC: Context concepts
(2008-09-10 14:34:00, 7℃) = {(cool, 0.5), (cold, 0.5), (fall, 0.9), (afternoon, 1.0), …}
Calculate weight by using fuzzy membership functions
1) Context Abstraction – Fuzzy Join
Context data (Temp.): 39, 28, 17, 7, -1, -20
Abstract context concepts (Concept): Hot, Cool, Cold, …
[Figure: fuzzy join functions f. Membership curves for Cold, Cool, and Hot, with temperature on the x-axis and fuzziness on the y-axis.]
Product of two relations
Temp.  Concept  Fuzziness
39     Hot      0.98
28     Hot      0.84
17     Hot      0.20
7      Hot      0
-1     Hot      0
-20    Hot      0
39     Cool     0
28     Cool     0.1
17     Cool     0.87
7      Cool     0.5
-1     Cool     0.05
-20    Cool     0
39     Cold     0
28     Cold     0
17     Cold     0
7      Cold     0.5
-1     Cold     0.87
-20    Cold     0.99
Fuzzy join result
Temp.  Concept  Fuzziness
39     Hot      0.98
28     Hot      0.84
17     Hot      0.20
17     Cool     0.87
7      Cool     0.5
7      Cold     0.5
-1     Cold     0.87
-20    Cold     0.99
α-cut may improve query performance
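The fuzzy join on this slide can be sketched as a cross product followed by an α-cut. The Python sketch below reuses the membership degrees printed in the product table instead of real membership functions. The cut value 0.2 is an assumption: the slide does not state it, but this value reproduces the eight rows of the join result. The names `FUZZINESS` and `fuzzy_join` are hypothetical.

```python
from itertools import product

# Membership degrees copied from the slide's product table.
FUZZINESS = {
    "Hot":  {39: 0.98, 28: 0.84, 17: 0.20, 7: 0.0, -1: 0.0, -20: 0.0},
    "Cool": {39: 0.0, 28: 0.1, 17: 0.87, 7: 0.5, -1: 0.05, -20: 0.0},
    "Cold": {39: 0.0, 28: 0.0, 17: 0.0, 7: 0.5, -1: 0.87, -20: 0.99},
}

def fuzzy_join(temps, concepts, membership, alpha):
    """Pair every temperature with every concept, keep degrees >= alpha."""
    result = []
    for concept, temp in product(concepts, temps):
        degree = membership(temp, concept)
        if degree >= alpha:  # the alpha-cut drops weak memberships
            result.append((temp, concept, degree))
    return result

temps = [39, 28, 17, 7, -1, -20]
joined = fuzzy_join(temps, ["Hot", "Cool", "Cold"],
                    lambda t, c: FUZZINESS[c][t], alpha=0.2)
```

With a higher α fewer pairs survive, which is presumably why the slide expects the cut to help query performance.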
1) Context Abstraction – Fuzzy Equi-Join
Normal Equi-Join
  SELECT T1.*, T2.*
  FROM table1 T1 JOIN table2 T2 ON T1.a = T2.b
Fuzzy Equi-Join
  SELECT T1.*, T2.*, FuzzyValue
  FROM table1 T1 JOIN table2 T2 ON T1.a ≈ T2.b
  WHERE FuzzyValue > THETA
The most important part is the fuzzy function (≈) that compares two values
Obtain fuzzy membership degree
Performance Improvement
– Sort-Merge Join using partial order of fuzzy similarity
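As a rough illustration of the fuzzy equi-join, the Python sketch below replaces the equality test with a similarity function and keeps pairs whose degree exceeds THETA. It is a naive nested-loop version, not the sort-merge variant the slide proposes. The similarity function `closeness` and its `width` parameter are hypothetical stand-ins for the ≈ operator.

```python
def fuzzy_equi_join(left, right, similarity, theta):
    """Return (l, r, degree) for every pair whose similarity exceeds theta."""
    return [
        (l, r, degree)
        for l in left
        for r in right
        if (degree := similarity(l, r)) > theta
    ]

def closeness(a, b, width=10.0):
    # Hypothetical similarity: 1 when equal, falling linearly to 0 at `width` apart.
    return max(0.0, 1.0 - abs(a - b) / width)

pairs = fuzzy_equi_join([10, 20], [12, 31], closeness, theta=0.5)
```

Only the pair (10, 12) is close enough to pass the threshold here; the other three pairs fall at or below 0.5.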
1) Context Abstraction – Periodic Membership Function
Modified Cosine Function
Because temporal values are periodic, a periodic function is appropriate for calculating the membership degree of a time value in a temporal concept.
$f(x) = \max(\min(a \cos(2\pi (x - b) / c) - d,\ 1),\ 0)$
[Figure: membership curves over time.]
dawn: f(x) = max(min(10.0 * cos(2pi * (x - 150) / 1440) - 8.5, 1), 0)
midnight: f(x) = max(min(7.0 * cos(2pi * (x - 60) / 1440) - 5.5, 1), 0)
Spring: f(x) = max(min(4.0 * cos(2pi * (x - 172800) / 525600) - 2.4, 1), 0)
Dawn, Morning, Noon, Afternoon, Evening, Night, Midnight
Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
Spring, Summer, Autumn, Winter
New Year’s Day, Valentine’s Day, White Day, Children’s Day, Parents’ Day, Christmas
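The modified cosine function is short enough to write out directly. The Python sketch below implements it and instantiates it with one of the parameter sets shown on the slide (a = 7.0, b = 60, c = 1440, d = 5.5, which appears to be the midnight curve). Here x is taken to be minutes since 00:00, an assumption based on the period 1440; the function names are hypothetical.

```python
import math

def periodic_membership(x, a, b, c, d):
    """f(x) = max(min(a * cos(2*pi*(x - b)/c) - d, 1), 0).

    b is the center of the concept, c the period, and a, d shape the
    width of the plateau and of the slopes.
    """
    return max(min(a * math.cos(2 * math.pi * (x - b) / c) - d, 1.0), 0.0)

def near_midnight(x):
    # Parameters from the slide; x in minutes since 00:00 (assumed unit).
    return periodic_membership(x, a=7.0, b=60, c=1440, d=5.5)
```

For instance, `near_midnight(60)` is clipped to 1.0 at the center of the concept, `near_midnight(720)` is 0.0 at noon, and because the cosine repeats every 1440 minutes the same degrees recur on the following day.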
Context Concept and Membership Function
f(x) = max(min(a * cos(2pi * (x - b) / c) - d, 1), 0)
Columns: type, id, temporal concept, base, index, a, b, c (period), d, alpha-min, alpha-max, f(x). The base is 2008-01-01 00:00 for every row, and each row's f(x) is the formula above with that row's a, b, c, and d.
type         id  concept          index  a    b       c       d      alpha-min  alpha-max
season       1   spring           0      3    151200  525600  1.7    88663      213737
season       2   summer           0      3    280800  525600  1.7    218263     343337
season       3   autumn           0      3    410400  525600  1.7    347863     472937
season       4   winter           0      3    21600   525600  1.7    -40937     84137
month        5   february         0      20   21600   525600  18.5   -4965      48165
month        6   january          0      20   66240   525600  18.5   39675      92805
month        7   march            0      20   106560  525600  18.5   79995      133125
month        8   april            0      20   149760  525600  18.5   123195     176325
month        9   may              0      20   194400  525600  18.5   167835     220965
month        10  june             0      20   239040  525600  18.5   212475     265605
month        11  july             0      20   282240  525600  18.5   255675     308805
month        12  august           0      20   326880  525600  18.5   300315     353445
month        13  september        0      20   371520  525600  18.5   344955     398085
month        14  october          0      20   416160  525600  18.5   389595     442725
month        15  november         0      20   459360  525600  18.5   432795     485925
month        16  december         0      20   504000  525600  18.5   477435     530565
special day  17  new year's day   0      365  720     525600  364.2  -2672      4112
special day  18  valentine's day  0      365  64080   525600  364.2  60688      67472
special day  19  christmas        0      365  516420  525600  363.8  511443     521037
day of week  20  Monday           1440   7    720     10080   5.6    -102       1542
day of week  21  Tuesday          1440   7    2160    10080   5.6    1338       2982
day of week  22  Wednesday        1440   7    3600    10080   5.6    2778       4422
day of week  23  Thursday         1440   7    5040    10080   5.6    4218       5862
day of week  24  Friday           1440   7    6480    10080   5.6    5658       7302
day of week  25  Saturday         1440   7    7920    10080   5.6    7098       8742
day of week  26  Sunday           1440   7    9360    10080   5.6    8538       10182
day of week  27  holiday          1440   7    8496    10080   4      7059       9933
time of day  28  dawn             0      24   240     1440    22.5   174        306
time of day  29  sunrise          0      24   360     1440    22.5   294        426
time of day  30  morning          0      12   480     1440    10.5   386        574
time of day  31  forenoon         0      6    720     1440    22.5   466        734
time of day  32  noon             0      24   720     1440    22.5   654        786
time of day  33  afternoon        0      6    900     1440    4.5    766        1034
time of day  34  sunset           0      24   180     1440    22.5   1014       1146
time of day  35  evening          0      8.5  1140    1440    5.5    960        1320
time of day  36  night            0      4    0       1440    2.3    -182       182
time of day  37  midnight         0      6    60      1440    4.7    -60        180
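The alpha-min and alpha-max columns are not explained on the slide. For the rows checked here (spring, Monday, dawn) they are consistent with the two points where the unclipped function a * cos(2pi * (x - b) / c) - d equals 0.5, so the Python sketch below solves for those points under that assumption. The function name and the default alpha of 0.5 are hypothetical.

```python
import math

def alpha_cut_bounds(a, b, c, d, alpha=0.5):
    """Solve a*cos(2*pi*(x - b)/c) - d = alpha for the two x nearest to b.

    Requires (alpha + d) / a <= 1, i.e. the curve actually reaches alpha.
    """
    half_width = c / (2 * math.pi) * math.acos((alpha + d) / a)
    return b - half_width, b + half_width

# Parameters of the "spring" row: a=3, b=151200, c=525600, d=1.7.
spring_lo, spring_hi = alpha_cut_bounds(a=3, b=151200, c=525600, d=1.7)
# Parameters of the "dawn" row: a=24, b=240, c=1440, d=22.5.
dawn_lo, dawn_hi = alpha_cut_bounds(a=24, b=240, c=1440, d=22.5)
```

Precomputed bounds like these could let a query discard most (log, concept) pairs with a plain range comparison before evaluating the cosine.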
Fuzzy join query
  select log_id, concept_id, fuzziness
  from (
    select t2.id log_id, t2.user_id, t2.track_id,
           t1.id concept_id, t1.name, t2.m_date, t2.m_val,
           case
             when t1.a*cos(2*3.141592*(t2.M_VAL + t1.IDX - t1.b)/t1.c) - t1.d > 1 then 1
             when t1.a*cos(2*3.141592*(t2.M_VAL + t1.IDX - t1.b)/t1.c) - t1.d < 0 then 0
             else t1.a*cos(2*3.141592*(t2.M_VAL + t1.IDX - t1.b)/t1.c) - t1.d
           end fuzziness
    from test_lfm_temp_concept t1, test_lfm_user_track t2
    order by log_id asc
  )
  where fuzziness > 0.5
3) Music Abstraction – Music Representation
[Figure: song annotations link songs (My Fist Your Face, Rose, I've Got to See You Again, Thinking of You, Sleeping Beauty) to tags (rock, alternative, seen live, indie, 90s, electro, romance, jazz).]
Representation
$m = \{(dc, w) \mid dc \in DC,\ w \in \mathbb{R} \text{ and } 0 \le w \le 1\}$
DC: Domain concepts
m: Target items
Sleeping Beauty = {(rock, 1.0), (90s, 0.9)}
Rose = {(rock, 1.0), (indie, 0.8)}
…
Membership degree calculation
$w_{m_i, dc_j} = \dfrac{\log_2\left(count_{dc_j} / \max_i count_{dc_i}\right) + 3 - \log_2 0.0005}{3 - \log_2 0.0005}$
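The tag weight rescales the logarithm of a tag's count ratio so that the most frequent tag of a track gets weight 1. The Python sketch below follows the expression used in the tag query of this deck, which adds an offset of 5 where the slide formula shows 3, so the offset is left as a parameter. The function and parameter names are hypothetical.

```python
import math

def tag_weight(count, max_count, offset=5.0, floor=0.0005):
    """(log2(count/max_count) + offset - log2(floor)) / (offset - log2(floor)).

    count: how often the tag was applied to the track.
    max_count: count of the track's most frequent tag.
    """
    ratio = count / max_count
    span = offset - math.log2(floor)
    return (math.log2(ratio) + span) / span
```

The logarithm keeps rarely used tags from collapsing to near-zero weights: `tag_weight(10, 10)` is exactly 1.0, while a tag used one thousandth as often as the top tag still keeps a weight well above zero.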
Fuzzy join query
  select a.track_id, c.id tag_id, a.tag,
         a.count/b.tag_count_max maxratio,
         (log(2, a.count/b.tag_count_max) + 5 - log(2, 0.0005)) / (5 - log(2, 0.0005)) fuzziness
  from test_lfm_track_tag a
    join test_lfm_track b on a.track_id = b.id
    join test_lfm_tag c on a.tag = c.tag
  where a.count/b.tag_count_max > 0.0001
5) User Profiling – Fuzzy Join and Aggregation
User listen logs with context
User     Time             Music
urisj27  2008.02.25 8:30  Beautiful Day
Music with annotations
Music          Tag     Fuzziness
Beautiful Day  Dreamy
Context concepts and fuzzy function
Concept    Fuzzy Function
Morning
Afternoon
…          …
Fuzzy-equivalent Join (Time)
User     Time             Music          Context  Fuzziness
urisj27  2008.02.25 8:30  Beautiful Day  Morning  1.0
Equivalent Join (Music)
User     Context  Domain  Fuzziness  Fuzziness
urisj27  Morning  Dreamy  1.0
Aggregation
User     Context  Domain  p(mg|cg,u)
urisj27  Morning  Dreamy  0.7
$L = \{l(u, cg, mg, fuzziness) \mid u \in U,\ cg \in CG,\ mg \in MG\}$
$w_{u, cg_i, mg_j} = p(mg_j \mid cg_i, u) = \dfrac{\sum_{l \in L,\ l.u = u,\ l.cg = cg_i,\ l.mg = mg_j} l.fuzziness}{\sum_{l \in L,\ l.u = u,\ l.cg = cg_i} l.fuzziness}$
Context grouping, Item grouping
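The aggregation step normalizes summed fuzziness per user and context group. A minimal Python sketch follows, assuming each joined row already carries a single combined fuzziness value; how the two fuzziness columns of the join are combined is not stated on the slide. The function name and the sample rows are hypothetical, apart from the user, context, and tag names taken from the slide.

```python
from collections import defaultdict

def user_profile(logs):
    """logs: iterable of (user, cg, mg, fuzziness) rows.

    Returns profile[(user, cg)][mg] = p(mg | cg, user), the share of the
    fuzziness mass of (user, cg) that falls on item group mg.
    """
    joint = defaultdict(lambda: defaultdict(float))
    for user, cg, mg, fuzziness in logs:
        joint[(user, cg)][mg] += fuzziness
    profile = {}
    for key, by_mg in joint.items():
        total = sum(by_mg.values())
        if total > 0:  # skip groups with no fuzziness mass
            profile[key] = {mg: v / total for mg, v in by_mg.items()}
    return profile

# Hypothetical joined rows for one user in one context group.
logs = [
    ("urisj27", "Morning", "Dreamy", 0.7),
    ("urisj27", "Morning", "Rock", 0.3),
    ("urisj27", "Morning", "Dreamy", 0.7),
    ("urisj27", "Morning", "Rock", 0.3),
]
profile = user_profile(logs)
```

Each inner dictionary sums to 1, so it can be used directly as the conditional distribution over item groups for that user and context group.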
Contribution
Model based context aware recommendation
Does not depend on ambiguous relationships among concepts, users, and items
Not from the name or description
But from the semantic annotations (tags)
Abstract context concepts by using fuzzy membership functions
Distinguish context concepts from target domain concepts
There is no reason to put them together
Even when they have the same name, they have to be treated as different.
– Domain concepts are meaningful only within their own domain. They may have a different meaning when they are used in a different domain.
How to Evaluate?
How to evaluate the effect of the context?
Divide logs into a training set and a test set
Give the same information to the path that does not use context and to the path that uses context, and compare their results
– A recommendation counts as correct if the recommended song list contains the song from the test log.
– Use the top k recommendation results.
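The evaluation idea above can be sketched as a top-k hit rate computed for both paths on the same test logs. In the Python sketch below, the two recommenders and the test logs are hypothetical placeholders; only the counting rule (a hit when the held-out song is in the top k) comes from the slide.

```python
def hit_rate_at_k(test_logs, recommend, k):
    """Share of test logs whose song appears in the top-k recommendations.

    test_logs: list of (user, context, song) tuples held out from training.
    recommend: function (user, context) -> ranked list of songs.
    """
    hits = 0
    for user, context, song in test_logs:
        if song in recommend(user, context)[:k]:
            hits += 1
    return hits / len(test_logs) if test_logs else 0.0

# Hypothetical test logs and recommenders, for illustration only.
test_logs = [("u1", "night", "song_a"), ("u1", "morning", "song_b")]

def no_context(user, context):
    return ["song_a", "song_c"]  # ignores the context argument

def with_context(user, context):
    ranked = {"night": ["song_a", "song_c"], "morning": ["song_b", "song_a"]}
    return ranked[context]

baseline = hit_rate_at_k(test_logs, no_context, k=1)
contextual = hit_rate_at_k(test_logs, with_context, k=1)
```

Comparing the two rates on identical logs isolates the effect of the context, since everything else given to the two paths is the same.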
Publication Schedule
Target conference
The 2009 IEEE/WIC/ACM International Conference on Web Intelligence (WI ’09)
– Info: 15-18 September 2009, Milan, Italy
– Due date: April 10, 2009
– Notification: June 3, 2009
– Format: IEEE 2 column format, max 8 pages
Additional Issues
Crawling
Data sampling
Relationship extraction
Approximate string matching (not applied)
Crawling last.fm
735,000 users
South Korea, North Korea, Japan, United Kingdom, USA
5,855,000 tracks
includes multiple duplicated tracks
913,720/3,322,000 … still crawling
69,725,000 user listen recent tracks
69,000,000 listen tracks of thousands of users
6,659,000 user loved tracks
2,311,000 user tags
Data Sampling
Test only on roughly the top 100 users of US nationality who listened to the most music
  select * from lfm_user
  where country = 'United States' and track_count2 > 1000
  order by track_count2 desc
Select the songs that these top 100 or so users listened to most
  select * from lfm_rel_user_track_2 where user_id = 'thetasteofink'
Query for reading tags
  select a.id, a.artist, a.name, b.tag, b.count
  from test_lfm_track_match a join lfm_rel_track_tag b on a.id = b.track_id
Abstract music using the tags attached to the songs that these top 100 or so users listened to most
How to use artist and album is put on hold for now
Consider how to handle album names and track titles that do not match.
Applying approximate string matching is a separate problem
Experiments
Two domains
Music domain
– Last.fm
Movie domain
– IMDb
They have different characteristics
Data Sets
Sampling from last.fm
Data | Table Name | Description | Size
User | TEST_MST_USER | User information | 92
Track | TEST_MST_TRACK | Track information (tracks that no user listened to are excluded) | 74058
Tag | TEST_MST_TAG | Tags | 245092
Track-tag | TEST_REL_TRACK_TAG_FUZZINESS | Tags attached to tracks | 2125879
User listen logs | TEST_REL_USER_TRACK | Logs of users listening to music | 587037
Temporal concepts | TEST_MST_TEMP_CONCEPT | Time-related concepts and the membership degree functions that compute each concept's membership value from a time value | 37
Log-temporal concepts | TEST_REL_LOG_CONCEPT_FUZZINESS | Temporal concepts related to each log timestamp and their membership values | 2929362