Presenter: Jian-Ren Chen
Authors: Cihan Kaleli, Huseyin Polat
2012, KBS

Intelligent Database Systems Lab


Privacy-preserving SOM-based recommendations on horizontally distributed data


Outlines
• Motivation
• Objectives
• Methodology
• Privacy analysis
• Experiments
• Conclusions
• Comments


Motivation
• Collaborative Filtering (CF) systems are used to suggest web pages.
  – A limited number of users’ data -> lack of accuracy -> cold start problem
• Users’ data may be horizontally partitioned among multiple vendors.


Objectives
• Companies that hold an inadequate number of users’ data might decide to combine their data.
  – Accurate predictions
  – Performance
• Privacy-preserving scheme


Methodology
• Privacy-preserving SOM clustering on horizontally distributed data
• Privacy-preserving k-nn-based predictions on horizontally distributed data

a. Off-line
   i. Cluster users’ data distributed among multiple parties using SOM while preserving data owners’ privacy.
   ii. Compute the aggregate data values required for recommendation estimations.
b. Online
   i. Determine the cluster of the active user a.
   ii. Estimate the prediction after receiving the required aggregate data from the other parties, and return the referral to a.


Methodology
SOM clustering
1. Determine values of initial constants.
2. Find the winning Kohonen layer neuron.
3. Update the weight vectors of all neurons.
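The formulas for these three SOM steps did not survive in this transcript. As a rough illustration only, the sketch below follows a standard one-dimensional Kohonen layer. The Euclidean winner rule, the Gaussian neighbourhood, the linear decay schedule, and every name and default value (`train_som`, `assign_clusters`, `lr0`, `sigma0`) are assumptions, not the authors' exact formulation.

```python
import numpy as np

def train_som(data, n_clusters, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D Kohonen layer; returns weights of shape (n_clusters, n_features)."""
    rng = np.random.default_rng(seed)
    # Step 1: initial constants and random initial weight vectors (assumed values).
    weights = rng.random((n_clusters, data.shape[1]))
    positions = np.arange(n_clusters)
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # linear decay (assumption)
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 1e-3)
        for x in data:
            # Step 2: the winning neuron is the one closest to the input vector.
            winner = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Step 3: update all neurons, scaled by their distance to the winner.
            h = np.exp(-((positions - winner) ** 2) / (2 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
    return weights

def assign_clusters(data, weights):
    """Assign every row of data to the index of its nearest weight vector."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)
```

Calling `assign_clusters(data, train_som(data, n_clusters=2))` would give each user vector a cluster label; in the CF setting each row would be one user's rating vector.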


Methodology
k-nn-based collaborative filtering
• Pearson correlation coefficient
• The prediction for a on q
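The slide's equations are not reproduced in this transcript. The sketch below shows one common way to compute a Pearson similarity and a neighbour-based prediction for user a on item q. It assumes unrated items are stored as NaN, that the prediction is a's mean rating plus a z-score-weighted correction, and that weights are normalised by their absolute values. The function names are hypothetical, and the exact form used in the paper may differ.

```python
import numpy as np

def pearson(a, u):
    """Pearson correlation over the items both users rated (NaN = unrated)."""
    both = ~np.isnan(a) & ~np.isnan(u)
    if both.sum() < 2:
        return 0.0
    da = a[both] - a[both].mean()
    du = u[both] - u[both].mean()
    denom = np.sqrt((da ** 2).sum() * (du ** 2).sum())
    return float((da * du).sum() / denom) if denom > 0 else 0.0

def predict(a, neighbours, q):
    """Predict a's rating on item q from the neighbours who rated q."""
    v_a, sigma_a = np.nanmean(a), np.nanstd(a)
    num = den = 0.0
    for u in neighbours:
        if np.isnan(u[q]):
            continue                      # this neighbour did not rate q
        sigma_u = np.nanstd(u)
        if sigma_u == 0:
            continue                      # z-score undefined for constant raters
        w = pearson(a, u)
        z_uq = (u[q] - np.nanmean(u)) / sigma_u
        num += w * z_uq
        den += abs(w)
    correction = sigma_a * num / den if den > 0 else 0.0
    return v_a + correction
```

With no usable neighbours the function falls back to a's own mean rating, which is one simple way to handle the cold start case.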


Methodology
Privacy-preserving SOM clustering on horizontally distributed data

1. Determine values of initial constants:
   a. number of clusters
   b. sequence of active parties
2. The first party runs SOM on its own data:
   a. all users it holds are assigned to a cluster
   b. it sends the updated Wj vectors to the second party
3. Each subsequent party:
   a. repeats step 2 on its own data
   b. sends the newly updated Wj vectors to the next party
4. The last party sends the updated Wj vectors to the IP.
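The hand-off of Wj vectors described on this slide can be sketched as a loop over parties in which only the weight matrix moves and each party's rating rows stay local. This is a simplified illustration under stated assumptions: it uses a winner-only update (no neighbourhood function), a linear learning-rate decay, and hypothetical names (`local_som_pass`, `distributed_som`). It does not model the additional protections the paper may apply.

```python
import numpy as np

def local_som_pass(weights, local_data, lr):
    """One party's turn: assign its users and move winning neurons toward them."""
    weights = weights.copy()              # the received W_j is left untouched
    labels = np.empty(len(local_data), dtype=int)
    for i, x in enumerate(local_data):
        winner = np.argmin(np.linalg.norm(weights - x, axis=1))
        weights[winner] += lr * (x - weights[winner])
        labels[i] = winner
    return weights, labels

def distributed_som(parties, n_clusters, rounds=10, lr0=0.5, seed=0):
    """Pass W_j through the parties in a fixed sequence; raw rows never leave a party."""
    rng = np.random.default_rng(seed)
    weights = rng.random((n_clusters, parties[0].shape[1]))
    labels = [None] * len(parties)
    for r in range(rounds):
        lr = lr0 * (1.0 - r / rounds)
        for p, local_data in enumerate(parties):   # agreed sequence of active parties
            weights, labels[p] = local_som_pass(weights, local_data, lr)
    return weights, labels
```

Each entry of `labels` stays with the party that produced it; only `weights` would be forwarded at the end of the sequence.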


Methodology
Privacy-preserving k-nn-based predictions on horizontally distributed data

• Among C parties, the prediction can be written p_aq = v_a + P, where P is:
• choose j percent of the users who did not rate q, where j in (0,)
• choose j percent of their z_uj values, remove these values, and replace them with zero, where j in (0,]
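The expression for P and the bounds on j are missing from this transcript, so the sketch below is only an assumption-laden illustration of the idea. Each party returns two local aggregates for item q, and the party serving user a adds them up to obtain p_aq = v_a + P. The weighted z-score form of P, the absolute-value normalisation, and the helper `zero_out_fraction` (which mimics the "replace with zero" step for a caller-chosen fraction) are all hypothetical.

```python
import numpy as np

def party_partial_sums(weights_au, z_uq):
    """Local aggregates for item q: (sum of w_au * z_uq, sum of |w_au|)."""
    w = np.asarray(weights_au, dtype=float)
    z = np.asarray(z_uq, dtype=float)
    return float((w * z).sum()), float(np.abs(w).sum())

def zero_out_fraction(values, fraction, rng):
    """Replace a randomly chosen fraction of the values with zero before sharing."""
    values = np.asarray(values, dtype=float).copy()
    k = int(round(fraction * len(values)))
    idx = rng.choice(len(values), size=k, replace=False)
    values[idx] = 0.0
    return values

def combine_prediction(v_a, sigma_a, partials):
    """Combine per-party aggregates into p_aq = v_a + P (assumed form of P)."""
    num = sum(p[0] for p in partials)
    den = sum(p[1] for p in partials)
    P = sigma_a * num / den if den > 0 else 0.0
    return v_a + P
```

Because only the two sums leave each party, the receiving party sees aggregates rather than individual ratings; how much the zeroing step costs in accuracy depends on the fraction chosen.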


Privacy analysis
• Attacks and vulnerabilities:
  1) A1: Parties can collude to capture a target party’s data
  2) A2: Paying-off
  3) V1: Not able to return any result
  4) V2: Missing values in the aggregate values vector


Experiments

• Data sets


Conclusions
• Integrating split data significantly improves preciseness.
• Although privacy-preserving measures reduce accuracy, the accuracy losses are smaller than the accuracy gains due to collaboration.


Comments
• Advantages
  – accuracy, performance, and privacy
• Disadvantages
  – cost, accuracy
• Applications
  – Collaborative Filtering
  – Privacy-preserving scheme
