ON SOCIAL-TEMPORAL GROUP QUERY WITH ACQUAINTANCE CONSTRAINT
D.N. Yang, Y.L. Chen, W.C. Lee, M.S. Chen
Sep 2011
Presented by Roi Fridburg
AGENDA
Activity Planning
Social Graphs
Proposed Algorithms
SGSelect
STGSelect
Experimental Results
Efficiency and Reliability
Conclusions
ACTIVITY PLANNING
How do we plan an activity?
Who do we invite?
Familiarity with the initiator
Social relations with the members
Available time
ACTIVITY PLANNING – CONT.
Proposal: Social-Temporal Group Query (STGQ)
Find the activity time and the attendees with the minimum total social distance to the initiator, subject to an acquaintance constraint
Definitions SGQ STGQ Results Conclusions
ILLUSTRATING EXAMPLE
SOCIAL GRAPH OF CASEY AFFLECK
q = Casey Affleck
ILLUSTRATING EXAMPLE
SOCIAL GRAPH OF MATT DAMON
q = Matt Damon
k = 0: each attendee knows everybody else
F: {M. Damon, K. Costner, R. De Niro}
F: {M. Damon, R. De Niro, A. Jolie}
F: {M. Damon, N. Cage, J. Roberts}
F: {M. Damon, R. De Niro, J. Roberts}
Which one should we choose?
Answer: the one that has the minimum total distance between q and every vertex v: $\min \sum_{v \in F} d_{v,q}$
$d_{v_i,v_j}$ ($i \neq j$) is not counted
ILLUSTRATING EXAMPLE – CONT.
SOCIAL GRAPH OF MATT DAMON
q = Matt Damon
k = 1: each attendee has at most 1 unfamiliar attendee
For example:
F: {M. Damon, K. Costner, R. De Niro, A. Jolie}
There is no solution for k = 0 in this case!
BASIC TERMS AND PARAMETERS
q – activity initiator
G – social graph of q
p – activity size: number of expected attendees
s – social radius: scope of candidate attendees
All acquainted attendees are located no more than s edges away from the initiator on his social graph
k – acquaintance constraint: social relationship between attendees
Each attendee can have at most k other unacquainted attendees
m – activity length: number of time slots
d – social distance: distance between vertices in the group
How “close” the vertices are to each other
ACTIVITY EXAMPLES
A person has a given number of tickets for a movie and would like to invite some friends
Find a set of mutually close friends and a time at which all of them are available
small k: all attendees know each other very well
s = 1: all attendees are acquainted with the initiator
ACTIVITY EXAMPLES – CONT.
A person would like to arrange a party and invite some friends
large s: friends of friends can also be invited
large k: more diverse event
WHERE IS ALL THE DATA TAKEN FROM?
Social network websites: social relations and contact information
Web collaboration tools: shared available time
Google Calendar
Enterprise tools
MS Outlook
ILLUSTRATING EXAMPLE
WITH TIME CONSTRAINT
{v7}
{v7,v2}
{v7,v2,v3}
{v7,v2,v3,v4}
{v7,v2,v3,v6}
{v7,v2,v3,v8}
{v7,v2,v4}
{v7,v2,v4,v6}
{v7,v2,v4,v8}
{v7,v2,v6}
{v7,v2,v6,v8}
{v7,v3}
{v7,v3,v4}
{v7,v3,v4,v6}
{v7,v3,v4,v8}
{v7,v3,v6}
{v7,v3,v6,v8}
{v7,v4}
{v7,v4,v6}
{v7,v4,v6,v8}
$C^5_3 = 10$
d = 64
d = 65
t1 t2 t3 t4 t5 t6
v1 o o o o
v2 o o o o
v3 o o o o o
v4 o o o o o o
v5 o o o o
v6 o o o o
v7 o o o o o
v8 o o o o o
COMPLEXITY AND CANDIDATE GROUPS
Simple approach: consider every possible group of p attendees and find its total social distance
Needs to evaluate $C^{f-1}_{p-1}$ candidate groups
f is the number of candidate attendees
If an initiator would like to invite 10 attendees out of his 100 friends, the number of candidate groups is on the order of $10^{13}$
This scenario considers only the friends of the user
This problem is NP-hard
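The order of magnitude quoted for the simple approach can be checked directly. The sketch below assumes the slide's figures (10 attendees chosen out of 100 friends) and uses Python's standard `math.comb`; the function name `candidate_groups` is illustrative only.

```python
import math

def candidate_groups(friends: int, invited: int) -> int:
    """Number of ways to choose `invited` attendees out of `friends` candidates."""
    return math.comb(friends, invited)

# Slide figures: invite 10 attendees out of 100 friends.
n = candidate_groups(100, 10)
print(f"{n:.2e} candidate groups")
```

Even for this modest input, exhaustive enumeration is impractical, which motivates the pruning conditions used by SGSelect.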
SOCIAL GROUP QUERY (SGQ)
PROBLEM DEFINITION
Given an activity initiator q and his social graph G = (V, E)
v is a candidate attendee
$e_{u,v}$ is the distance between u and v, representing their social closeness
Need to find SGQ(p, s, k) which satisfies all the conditions
SOCIAL GROUP QUERY (SGQ)
ALGORITHM DESIGN
Proposed algorithm: SGSelect
Step 1: derive the feasible graph $G_F = (V_F, E_F)$ from G, which satisfies:
Social radius constraint
$d_{v,q}$ is the minimal social distance between v and q
q = Ryan Giggs
$d_{v_3,q} = ?$ → $d_{v_3,q} = 8$
$d_{v_6,q} = ?$ → $d_{v_6,q} = 16$
RADIUS GRAPH EXTRACTION
How to create the feasible graph?
Define $d^i_{v,q}$: the i-edge minimum distance between the vertex v and the vertex q
$1 \le i \le s$
RADIUS GRAPH EXTRACTION – CONT.
Target: $G(V, E) \rightarrow G_F(V_F, E_F)$, with $d_{v,q}$ satisfying the s constraint
$d^0_{q,q} = 0$, $d^0_{u,q} = \infty$ for $u \neq q$;
for i = 1 to s do
    $d^i_{q,q} = 0$;
    for all vertex $u \neq q$ in V do
        $d^i_{u,q} = d^{i-1}_{u,q}$;
        for all vertex v in $N_u$ do
            if $d^{i-1}_{v,q} + e_{u,v} < d^i_{u,q}$ then
                $d^i_{u,q} = d^{i-1}_{v,q} + e_{u,v}$;
Extract all vertices w in V with $d^s_{w,q} < \infty$ and form the set $V_F$
q = Ryan Giggs
Values of $d^i_{u,q}$:
vertex   i = 0   i = 1   i = 2
q        0       0       0
u2       ∞       3       3
u3       ∞       10      8
u4       ∞       4       4
u5       ∞       ∞       10
u6       ∞       ∞       16
u7       ∞       ∞       ∞
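The radius graph extraction can be sketched in Python as a hop-bounded relaxation. The slide's graph figure is not available, so the edge list `EDGES` is a hypothetical one chosen only to reproduce the distances in the trace; the function names are also illustrative.

```python
import math

# Hypothetical weighted edges (assumption: picked to match the slide's trace).
EDGES = [
    ("q", "u2", 3), ("q", "u3", 10), ("q", "u4", 4),
    ("u2", "u3", 5), ("u4", "u5", 6), ("u3", "u6", 6), ("u6", "u7", 2),
]

def radius_distances(edges, q, s):
    """Minimum distance from every vertex to q using at most s edges."""
    neighbors = {}
    for u, v, weight in edges:
        neighbors.setdefault(u, {})[v] = weight
        neighbors.setdefault(v, {})[u] = weight
    dist = {u: math.inf for u in neighbors}
    dist[q] = 0
    for _ in range(s):
        new_dist = dict(dist)  # d^i starts from d^(i-1)
        for u in neighbors:
            if u == q:
                continue
            for v, weight in neighbors[u].items():
                # Relax through neighbor v using the previous round's distance.
                if dist[v] + weight < new_dist[u]:
                    new_dist[u] = dist[v] + weight
        dist = new_dist
    return dist

def feasible_vertices(edges, q, s):
    """Vertices reachable from q within s edges (the set V_F)."""
    return {u for u, d in radius_distances(edges, q, s).items() if d < math.inf}

print(radius_distances(EDGES, "q", 2))
print(sorted(feasible_vertices(EDGES, "q", 2)))
```

Each round reads only the previous round's distances, so after i rounds a value can depend on paths of at most i edges. This is what enforces the social radius s.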
ACCESS ORDERING
The selection of a vertex at each iteration is critical
Avoid selecting a vertex v that
Increases the total social distance
Leads to violation of the acquaintance constraint
It is preferable to first include a well-connected vertex
Early pruning of unqualified solutions is a key factor in the overall performance
Terminology
$G_F = (V_F, E_F)$: feasible social graph of q
$V_S$: intermediate solution obtained so far
$V_A$: remaining vertices in $V_F$ ($V_A = V_F - V_S$)
ACCESS ORDERING – CONT.
Step 1: construct the feasible graph $G_F$
Step 2: explore $G_F$ to find the optimal solution
Initially $V_S$ includes only q
The remaining vertex set $V_A$ is $V_F - \{q\}$
At each iteration, a vertex is moved from $V_A$ to $V_S$
$V_S$ represents a feasible solution when $|V_S| = p$
[Figure: vertex sets $V_S$ and $V_A$ with vertices q, v1, v2, v3, v4, vi]
ACCESS ORDERING – CONT.
Two conditions should be satisfied to exploit the acquaintance constraint:
Interior unfamiliarity
The maximum number of unacquainted vertices that a vertex v in $V_S$ has
Exterior expansibility
The maximum number of vertices that $V_S$ can be extended from
[Figure: vertex sets $V_S$ and $V_A$ with vertices q, v2, v3, v4, v6, v8]
INTERIOR UNFAMILIARITY
Interior unfamiliarity (U) – definition:
$U(V_S) = \max_{v \in V_S} |V_S - \{v\} - N_v|$
$N_v$ is the set of neighboring vertices of v in $G_F$
$V_S - \{v\} - N_v$ is the set of non-neighboring vertices of v in $V_S$
In other words:
The maximum number of unacquainted vertices that a vertex v in $V_S$ has
A small value of U indicates that every vertex $v \in V_S$ has plenty of neighboring vertices in $V_S$
It is preferable to first include a well-connected vertex (low U), since it will make the selection of other vertices in later iterations easier
INTERIOR UNFAMILIARITY – CONT.
Interior unfamiliarity (U) – condition:
$U(V_S \cup \{v\}) \le k \left( \frac{|V_S \cup \{v\}|}{p} \right)^{\theta}$, with $\theta \ge 0$
$\frac{|V_S \cup \{v\}|}{p}$ is the proportion of attendees that have been considered
With θ = 0 the right-hand side reaches its maximum (k)
A large θ tightens the bound, so the vertex chosen from $V_A$ must connect to more vertices in $V_S$
At early stages ($\frac{|V_S \cup \{v\}|}{p} \ll 1$) it is preferable to add well-connected vertices
Algorithm SGSelect reduces θ if no vertex in $V_A$ satisfies the above condition
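A minimal sketch of the interior unfamiliarity definition and its condition follows. The adjacency `NEIGHBORS` is an assumption: the social graph figure is not part of the text, so the edges were chosen to be consistent with the numbers in the worked example (q = v7, p = 4, k = 1, θ = 2). Function names are illustrative.

```python
# Hypothetical adjacency (assumption: consistent with the worked example's numbers).
NEIGHBORS = {
    "v7": {"v2", "v3", "v4", "v6", "v8"},
    "v2": {"v7", "v4", "v6"},
    "v3": {"v7", "v4"},
    "v4": {"v7", "v2", "v3"},
    "v6": {"v7", "v2"},
    "v8": {"v7"},
}

def interior_unfamiliarity(group, neighbors):
    """U(V_S): largest number of non-neighbors any member has inside the group."""
    return max(len(group - {v} - neighbors[v]) for v in group)

def unfamiliarity_bound(k, size, p, theta):
    """Right-hand side of the condition: k * (size / p) ** theta."""
    return k * (size / p) ** theta

def unfamiliarity_holds(group, neighbors, k, p, theta):
    bound = unfamiliarity_bound(k, len(group), p, theta)
    return interior_unfamiliarity(group, neighbors) <= bound

print(unfamiliarity_holds({"v7", "v2"}, NEIGHBORS, k=1, p=4, theta=2))        # bound 1/4
print(unfamiliarity_holds({"v7", "v2", "v3"}, NEIGHBORS, k=1, p=4, theta=2))  # bound 9/16
print(unfamiliarity_holds({"v7", "v2", "v3"}, NEIGHBORS, k=1, p=4, theta=0))  # bound k
```

Under these assumptions the group {v7, v2, v3} fails with θ = 2 but passes once θ is reduced to 0, which is the relaxation SGSelect falls back on.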
EXTERIOR EXPANSIBILITY
Exterior expansibility (A) – definition:
$A(V_S) = \min_{v \in V_S} \{ |V_A \cap N_v| + (k - |V_S - \{v\} - N_v|) \}$
$V_A \cap N_v$ contains the neighboring vertices of v in $V_A$
$V_S - \{v\} - N_v$ is the set of non-neighboring vertices of v in $V_S$
In other words:
The maximum number of vertices that $V_S$ can be extended from
We can select at most $k - |V_S - \{v\} - N_v|$ extra non-neighboring vertices of v in $V_A$
EXTERIOR EXPANSIBILITY – CONT.
Exterior expansibility (A) – condition:
$A(V_S \cup \{v\}) \ge p - |V_S \cup \{v\}|$
$p - |V_S \cup \{v\}|$ is the number of attendees that have not been considered
If the inequality doesn’t hold, the new intermediate solution set obtained by adding v is not expansible
There is no degree of freedom in this case
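The exterior expansibility definition and condition can be sketched the same way. The adjacency `NEIGHBORS` is again hypothetical, chosen to agree with the numbers of the worked example (q = v7, p = 4, k = 1), and the function names are illustrative.

```python
# Hypothetical adjacency (assumption: consistent with the worked example's numbers).
NEIGHBORS = {
    "v7": {"v2", "v3", "v4", "v6", "v8"},
    "v2": {"v7", "v4", "v6"},
    "v3": {"v7", "v4"},
    "v4": {"v7", "v2", "v3"},
    "v6": {"v7", "v2"},
    "v8": {"v7"},
}

def exterior_expansibility(group, remaining, neighbors, k):
    """A(V_S): min over members of (neighbors left in V_A) + (unused unfamiliarity quota)."""
    return min(
        len(remaining & neighbors[v]) + (k - len(group - {v} - neighbors[v]))
        for v in group
    )

def expansibility_holds(group, remaining, neighbors, k, p):
    """Condition: A(V_S) must cover the p - |V_S| attendees still to be chosen."""
    return exterior_expansibility(group, remaining, neighbors, k) >= p - len(group)

group = {"v7", "v2"}
remaining = {"v3", "v4", "v6", "v8"}
print(exterior_expansibility(group, remaining, NEIGHBORS, k=1))
print(expansibility_holds(group, remaining, NEIGHBORS, k=1, p=4))
```

A negative value means some member has already exceeded its quota of k unacquainted attendees, so the group can be pruned immediately.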
EXAMPLE
SOCIAL GRAPH OF CASEY AFFLECK
$V_S = \{v_7\}$, $V_A = \{v_2, v_3, v_4, v_6, v_8\}$
Select v2 (since it has the smallest social distance)
Calculate exterior expansibility $A(V_S \cup \{v_2\})$
For v2: $|V_A \cap N_{v_2}| + (k - |V_S - \{v_2\} - N_{v_2}|) = 2 + (1 - 0) = 3$
For v7: $|V_A \cap N_{v_7}| + (k - |V_S - \{v_7\} - N_{v_7}|) = 4 + (1 - 0) = 5$
Choose the smallest one → $A(V_S \cup \{v_2\}) = 3$
Validate exterior expansibility condition
$p - |V_S \cup \{v_2\}| = 4 - 2 = 2$
Exterior expansibility holds for v2
Social distances to q = v7:
vi   $d_{i,7}$
v2   17
v3   18
v4   27
v6   23
v8   25
$G_F$ parameters: q = v7, p = 4, s = 1, k = 1
EXAMPLE – CONT.
SOCIAL GRAPH OF CASEY AFFLECK
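$V_S = \{v_7\}$, $V_A = \{v_2, v_3, v_4, v_6, v_8\}$
Calculate interior unfamiliarity $U(V_S \cup \{v_2\})$
For v2: $|V_S - \{v_2\} - N_{v_2}| = |\emptyset| = 0$
For v7: $|V_S - \{v_7\} - N_{v_7}| = |\emptyset| = 0$
Choose the largest one → $U(V_S \cup \{v_2\}) = 0$
Validate interior unfamiliarity condition
$k \left( \frac{|V_S \cup \{v_2\}|}{p} \right)^{\theta} = 1 \times \left( \frac{2}{4} \right)^{2} = \frac{1}{4}$ (assuming that θ = 2)
Interior unfamiliarity holds for v2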
EXAMPLE – CONT.
SOCIAL GRAPH OF CASEY AFFLECK
$V_S = \{v_7, v_2\}$, $V_A = \{v_3, v_4, v_6, v_8\}$
Select v3
Calculate exterior expansibility: $A(V_S \cup \{v_3\}) = 1$
Validate exterior expansibility condition: $p - |V_S \cup \{v_3\}| = 1$
Exterior expansibility holds for v3
Calculate interior unfamiliarity: $U(V_S \cup \{v_3\}) = 1$
Validate interior unfamiliarity condition: $k \left( \frac{|V_S \cup \{v_3\}|}{p} \right)^{\theta} = \frac{9}{16}$
Interior unfamiliarity is violated for v3
At this stage θ is not reduced, since there are still more vertices in $V_A$
Put v3 in parentheses and temporarily skip it
EXAMPLE – CONT.
SOCIAL GRAPH OF CASEY AFFLECK
$V_S = \{v_7, v_2\}$, $V_A = \{(v_3), v_4, v_6, v_8\}$
Select v6
Both conditions hold
$V_S = \{v_7, v_2, v_6\}$, $V_A = \{v_3, v_4, v_8\}$
Select v3
Interior unfamiliarity is violated for v3
Put v3 in parentheses and temporarily skip it
$V_S = \{v_7, v_2, v_6\}$, $V_A = \{(v_3), v_4, v_8\}$
Select v8
Exterior expansibility is violated for v8
Remove v8 from $V_A$
$V_S = \{v_7, v_2, v_6\}$, $V_A = \{(v_3), v_4\}$
Select v4
Both conditions hold and $|V_S| = p = 4$
Obtain the first feasible solution $V_S = \{v_7, v_2, v_6, v_4\}$
Total social distance = 67
EXAMPLE – CONT.
SOCIAL GRAPH OF CASEY AFFLECK
$V_S = \{v_7, v_2, v_6, v_4\}$ (total social distance = 67)
Is this the optimal solution?
Does it have the lowest social distance?
By backtracking to the steps where the interior unfamiliarity condition was violated and reducing θ to zero, it is possible to get a better solution
$V_S = \{v_7, v_2, v_3, v_4\}$ (total social distance = 62)
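The optimum quoted in this example can be cross-checked by exhaustive search. This is a brute-force baseline, not SGSelect. The distances `DIST` are the ones listed in the example; the adjacency `NEIGHBORS` is hypothetical, chosen to be consistent with the example's numbers since the graph figure is unavailable.

```python
from itertools import combinations

# Social distances to q = v7, as listed in the example.
DIST = {"v2": 17, "v3": 18, "v4": 27, "v6": 23, "v8": 25}

# Hypothetical adjacency (assumption: consistent with the worked example's numbers).
NEIGHBORS = {
    "v7": {"v2", "v3", "v4", "v6", "v8"},
    "v2": {"v7", "v4", "v6"},
    "v3": {"v7", "v4"},
    "v4": {"v7", "v2", "v3"},
    "v6": {"v7", "v2"},
    "v8": {"v7"},
}

def satisfies_acquaintance(group, neighbors, k):
    """Every attendee has at most k unacquainted attendees in the group."""
    return all(len(group - {v} - neighbors[v]) <= k for v in group)

def brute_force_sgq(q, dist, neighbors, p, k):
    """Try every group of p attendees containing q; keep the cheapest feasible one."""
    best = None
    for invited in combinations(sorted(dist), p - 1):
        group = {q, *invited}
        if not satisfies_acquaintance(group, neighbors, k):
            continue
        total = sum(dist[v] for v in invited)
        if best is None or total < best[1]:
            best = (group, total)
    return best

group, total = brute_force_sgq("v7", DIST, NEIGHBORS, p=4, k=1)
print(sorted(group), total)
```

With these inputs the search examines all 10 candidate groups and agrees with the backtracked solution {v7, v2, v3, v4}.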
SOCIAL-TEMPORAL GROUP QUERY (STGQ)
Extension of SGQ by exploring the temporal dimension (m)
STGQ is more complex than SGQ
Intuitive approach
Find the SGQ solution for each individual activity period and then select the one with the minimum total social distance
Computationally expensive
SOCIAL-TEMPORAL GROUP QUERY (STGQ)
ALGORITHM DESIGN
Proposed algorithm: STGSelect
STGSelect constructs $V_S$ from vertices which have more available time slots in common
A pivot time slot is chosen
It is better to choose a time slot in which plenty of vertices are available
It is mandatory to choose a time slot which has more than p vertices
TEMPORAL EXTENSIBILITY
Temporal extensibility (X) – definition:
$X(V_S) = |T_S| - m$
$T_S$ is the set of consecutive time slots available for all vertices in $V_S$
In other words:
How many extra mutual time slots are available
Temporal extensibility (X) – condition:
$X(V_S \cup \{v\}) \ge (m - 1) \left( \frac{p - |V_S \cup \{v\}|}{p} \right)^{\varphi}$, with $\varphi \ge 1$
$\frac{p - |V_S \cup \{v\}|}{p}$ is the proportion of attendees that have not been considered
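A minimal sketch of temporal extensibility around a pivot slot follows. The availability sets are assumptions: the column positions of the slide's availability table were lost, so `AVAILABILITY` is one assignment consistent with the counts shown there (v2 free in all 7 slots, v7 free in 6) and with the common run t1 to t5 used in the example. Slots are numbered 1 to 7, and the function names are illustrative.

```python
# Assumed availability (slot numbers); see the lead-in for how it was chosen.
AVAILABILITY = {
    "v7": {1, 2, 3, 4, 5, 7},
    "v2": {1, 2, 3, 4, 5, 6, 7},
}

def common_run(availability, group, pivot):
    """Consecutive slots around `pivot` that are free for every member of the group."""
    common = set.intersection(*(availability[v] for v in group))
    if pivot not in common:
        return []
    low = high = pivot
    while low - 1 in common:
        low -= 1
    while high + 1 in common:
        high += 1
    return list(range(low, high + 1))

def temporal_extensibility(availability, group, pivot, m):
    """X(V_S) = |T_S| - m: spare mutual slots beyond the activity length m."""
    return len(common_run(availability, group, pivot)) - m

def temporal_bound(m, p, size, phi):
    """Right-hand side of the condition: (m - 1) * ((p - size) / p) ** phi."""
    return (m - 1) * ((p - size) / p) ** phi

group = {"v7", "v2"}
x = temporal_extensibility(AVAILABILITY, group, pivot=3, m=3)
print(x, temporal_bound(m=3, p=4, size=len(group), phi=2))
```

Anchoring the run at the pivot keeps the check cheap: only slots adjacent to the pivot are inspected as the group grows.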
EXAMPLE (ONCE AGAIN, BUT WITH TEMPORAL CONSTRAINT)
SOCIAL GRAPH OF CASEY AFFLECK
Choose t3 as pivot time slot
$V_S = \{v_7\}$, $V_A = \{v_2, v_3, v_4, v_6, v_8\}$
Select v2 (since it has the smallest social distance)
Both exterior expansibility and interior unfamiliarity conditions hold
Calculate temporal extensibility $X(V_S \cup \{v_2\})$
$T_S = \{t_1, t_2, t_3, t_4, t_5\}$ → $|T_S| = 5$ → $X(V_S \cup \{v_2\}) = 5 - 3 = 2$
Validate temporal extensibility condition
$(m - 1) \left( \frac{p - |V_S \cup \{v_2\}|}{p} \right)^{\varphi} = 2 \times \left( \frac{2}{4} \right)^{2} = \frac{1}{2}$ → condition holds
$G_F$ parameters: q = v7, p = 4, s = 1, k = 1, m = 3
t1 t2 t3 t4 t5 t6 t7
v2 o o o o o o o
v3 o o o o
v4 o o o o o o
v6 o o o o o o
v7 o o o o o o
v8 o o o o
EXPERIMENTAL RESULTS
Experiment setup – need to find G
Scheduling: real dataset of 194 people from various communities (schools, government, business, industry)
Google Calendar is utilized for scheduling
Social graph: synthetic dataset with 12800 people, generated from a social network website
The schedule of each person on each day is randomly assigned from the above 194-people real dataset
Existing algorithm: PCArrange
Imitates the behavior of manual coordination via phone calls, where the initiator q sequentially invites close friends first and then finds the common available time slots
Doesn’t include an acquaintance constraint
They obtain k from each activity manually
RESULT ANALYSIS
[Figure: four plots of execution time [ns] (log scale), comparing SGSelect and Baseline]
Running time with different p (k=2, s=1)
Running time with different s (k=2, p=4)
Running time with different k (p=5, s=2)
Running time with different network sizes (p=5, k=3, s=1)
RESULT ANALYSIS – CONT.
[Figure: two plots comparing SGSelect and Baseline]
Relation of p and k (k against p)
Relation of p and the total social distance (total social distance against p)
ARE WE ALWAYS GETTING WHAT WE WANT?
A time constraint can sometimes lead to an undesirable solution, e.g. arranging a party at 09:00 – 14:00
Possible solution:
Also define a time interval, e.g. m = 3 and 18:00 < T < 02:00
ARE WE ALWAYS GETTING WHAT WE WANT?
The event may be too diverse for the initiator
p = 8, s = 2, k = 6
Possible solution
Give more “weight” to s = 1: include more direct friends of the initiator
ARE WE ALWAYS GETTING WHAT WE WANT?
Data is incomplete: currently there are no social graphs that also include timetables
Reliability: “inactive” users lead to wrong social distance calculations
The social graph may not contain actual friends
CONCLUSIONS
Novel system for automatic activity planning based on social and temporal relationships
Two algorithms proposed:
SGSelect
STGSelect
The proposed algorithms significantly outperform the existing algorithms
The research results can be adopted in social network websites