Privacy in Today’s World: Solutions and Challenges Rebecca Wright Stevens Institute of Technology 26 June, 2003

Page 1: Privacy in Today’s World: Solutions and Challenges

Privacy in Today’s World: Solutions and Challenges

Rebecca Wright

Stevens Institute of Technology

26 June, 2003

Page 2: Privacy in Today’s World: Solutions and Challenges

Talk Outline

• Overview of privacy landscape

• Privacy-preserving data mining

• Privacy-protecting statistical analysis of large databases

• Selective private function evaluation

• Conclusions

Page 3: Privacy in Today’s World: Solutions and Challenges

Erosion of Privacy

“You have zero privacy. Get over it.”

- Scott McNealy, 1999

• Changes in technology are making privacy harder.

– reduced cost for data storage

– increased ability to process lots of data

• Increased need for security may make privacy seem less critical.

Page 4: Privacy in Today’s World: Solutions and Challenges

Historical Changes

• Small towns, little movement:

– very little privacy, social mechanisms helped prevent abuse

• Large cities, increased movement:

– lost social mechanisms, but gained privacy through anonymity

• Now:

– advancing technology is reducing privacy, social mechanisms not replaced.

Page 5: Privacy in Today’s World: Solutions and Challenges

What Can We Do?

• Use technology, policy, and education to

– maintain/increase privacy

– provide new social mechanisms

– create new mathematical models for better understanding

Problem: Using old models and old modes of thought in dealing with situations arising from new technology.

Page 6: Privacy in Today’s World: Solutions and Challenges

What is Privacy?

• May mean different things to different people

– seclusion: the desire to be left alone

– property: the desire to be paid for one’s data

– autonomy: the ability to act freely

• Generally: the ability to control the dissemination and use of one’s personal information.

Page 7: Privacy in Today’s World: Solutions and Challenges

Privacy of Data

• Stored data

– encryption, computer security, intrusion detection, etc.

• Data in transit

– encryption, network security, etc.

• Release of data

– current privacy-oriented work: P3P, Privacy Bird, EPA, Internet Explorer 6.0, etc.

Page 8: Privacy in Today’s World: Solutions and Challenges

Internet Explorer V.6

Cookie privacy settings, from most to least restrictive:

• Block All Cookies: not usable

• High … Low: reasonable range of behavior, blocking some cookies based on their privacy policies

• Accept All Cookies: no privacy

Fairly simple, deals only with cookies, limited info

Page 9: Privacy in Today’s World: Solutions and Challenges

Different Types of Data

• Transaction data

– created by interaction between stakeholder and enterprise

– current privacy-oriented solutions useful

• Authored data

– created by stakeholder

– digital rights management (DRM) useful

• Sensor data

– stakeholders not clear at time of creation

– presents a real and growing privacy threat

Page 10: Privacy in Today’s World: Solutions and Challenges

Product Design as Policy Decision

• product decisions by large companies or public organizations become de facto policy decisions

• often such decisions are made without conscious thought to privacy impacts, and without public discussion

• this is particularly true in the United States, where there is not much relevant legislation

Page 11: Privacy in Today’s World: Solutions and Challenges

Example: Metro Cards

Washington, DC

- no record kept of per card transactions

- damaged card can be replaced if printed value still visible

New York City

- transactions recorded by card ID

- damaged card can be replaced if card ID still readable

- have helped find suspects, corroborate alibis

Page 12: Privacy in Today’s World: Solutions and Challenges

Privacy Tradeoffs?

• Privacy vs. security: maybe, but doesn’t mean giving up one gets the other (who is this person? is this a dangerous person?)

• Privacy vs. usability: reasonable defaults, easy and extensive customizations, visualization tools

The tradeoffs are in cost or power, rather than an inherent conflict with privacy.

Page 13: Privacy in Today’s World: Solutions and Challenges

Surveillance and Data Mining

• Analyze large amounts of data from diverse sources.

• Law enforcement and homeland security:

– detect and thwart possible incidents before they occur

– identify and prosecute criminals after incidents occur

• Companies like to do this, too.

– Marketing, personalized customer service

Page 14: Privacy in Today’s World: Solutions and Challenges

Privacy-Preserving Data Mining

Allow multiple data holders to collaborate to compute important (e.g. security-related) information while protecting the privacy of other information.

Particularly relevant now, with increasing focus on security even at the expense of privacy (e.g. TIA).

Page 15: Privacy in Today’s World: Solutions and Challenges

Advantages of privacy protection

• protection of personal information

• protection of proprietary or sensitive information

• fosters collaboration between different agencies (since they may be more willing to collaborate if they need not reveal their information)

Page 16: Privacy in Today’s World: Solutions and Challenges

Cryptographic Approach

• Using cryptography, provably does not reveal anything except output of computation.

– Privacy-preserving computation of decision trees [LP00]

– Secure computation of approximate Hamming distance of two large data sets [FIMNSW01]

– Privacy-protecting statistical analysis [CIKRRW01]

– Privacy-preserving association rule mining [KC02]

Page 17: Privacy in Today’s World: Solutions and Challenges

Randomization Approach

• Randomizes data before computation (which can then either be distributed or centralized).

• Induces a tradeoff between privacy and computation error.

– Distribution reconstruction algorithm from randomized data [AS00]

– Association rule mining [ESAG02]
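The tradeoff between privacy and computation error can be illustrated with a small sketch. It is not the reconstruction algorithm of [AS00]; it only assumes that each value is perturbed with independent uniform noise before release and that the analyst estimates a mean from the perturbed values. The function name `randomize`, the noise width, and the synthetic ages are all hypothetical.

```python
import random


def randomize(values, spread, rng):
    """Release each value plus independent uniform noise in [-spread, spread]."""
    return [v + rng.uniform(-spread, spread) for v in values]


rng = random.Random(2003)  # fixed seed so the sketch is repeatable

# Synthetic data (assumption): 10,000 ages between 20 and 60.
true_ages = [rng.randint(20, 60) for _ in range(10_000)]

# A wider spread hides each individual value better but makes estimates noisier.
noisy_ages = randomize(true_ages, 50.0, rng)

true_mean = sum(true_ages) / len(true_ages)
est_mean = sum(noisy_ages) / len(noisy_ages)

print(f"true mean      {true_mean:.2f}")
print(f"estimated mean {est_mean:.2f}")
print(f"one record: true {true_ages[0]}, released {noisy_ages[0]:.1f}")
```

The zero-mean noise averages out over many records, so the aggregate stays close to the truth while any single released value is far less informative.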

Page 18: Privacy in Today’s World: Solutions and Challenges

Comparison of Approaches

[Figure: the randomization approach and the cryptographic approach compared along three axes: inefficiency, privacy loss, and inaccuracy]

Page 20: Privacy in Today’s World: Solutions and Challenges

Privacy-Protecting Statistics [CIKRRW01]

• Parties communicate using cryptographic protocols designed so that:

– Client learns desired statistics, but learns nothing else about data (including individual values or partial computations for each database)

– Servers do not learn which fields are queried, or any information about other servers’ data

– Computation and communication are very efficient

CLIENT: wishes to compute statistics of servers’ data

SERVERS: each holds a large database

Page 21: Privacy in Today’s World: Solutions and Challenges

Non-Private and Inefficient Solutions

• Database sends client entire database (violates database privacy)

• For sample size m, use SPIR to learn m values (violates database privacy)

• Client sends selections to database, database does computation (violates client privacy)

• General secure multiparty computation (not efficient for large databases)

Page 22: Privacy in Today’s World: Solutions and Challenges

Secure Multiparty Computation

• Allows k players to privately compute a function f of their inputs.

• Overhead is polynomial in size of inputs and complexity of f [Yao, GMW, BGW, CCD, ...]

[Diagram: players P1, P2, ..., Pk]
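As a concrete illustration, one of the simplest instances of this idea is a secure sum built on additive secret sharing. The sketch below is not any of the cited protocols: it handles only the function f = sum, assumes all players follow the protocol honestly, and simulates the k players inside one process. The modulus and the names `share` and `secure_sum` are assumptions made for the example.

```python
import secrets

MODULUS = 2**61 - 1  # assumption: any modulus larger than the true sum works


def share(value, k):
    """Split value into k random-looking shares that add up to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(k - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(inputs):
    """Simulate k players computing the sum of their private inputs."""
    k = len(inputs)
    # dealt[i][j] is the share that player i sends to player j.
    dealt = [share(x, k) for x in inputs]
    # Player j adds up the shares it received and publishes only that total.
    partial = [sum(dealt[i][j] for i in range(k)) % MODULUS for j in range(k)]
    # The published partial totals add up to the sum of all inputs.
    return sum(partial) % MODULUS


total = secure_sum([12, 30, 7])
print(total)
```

Each individual share is uniformly random, so no single message reveals a player's input; only the final sum is disclosed.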

Page 23: Privacy in Today’s World: Solutions and Challenges

Symmetric Private Information Retrieval

• Allows client with input $i$ to interact with database server with input $x = (x_1, \ldots, x_n)$ to learn (only) $x_i$

• Overhead is polylogarithmic in size of database x [CMS,GIKM]

Client input: $i$. Server input: $x = (x_1, \ldots, x_n)$. Client learns $x_i$.

Page 24: Privacy in Today’s World: Solutions and Challenges

Homomorphic Encryption

• Certain computations on encrypted messages correspond to other computations on the cleartext messages.

• For additive homomorphic encryption,

– $E(m_1) \cdot E(m_2) = E(m_1 + m_2)$

– also implies $E(m)^x = E(mx)$

• Paillier encryption is an example.
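The two additive properties can be checked with a toy version of Paillier encryption. This is a minimal sketch under stated assumptions, not the implementation used in the talk: the primes are two small Mersenne primes chosen for readability, the generator is fixed to N + 1, and the key sizes are far too small for real use.

```python
import math
import secrets

# Toy key (assumption): two known Mersenne primes, much too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # decryption constant when the generator is N + 1


def encrypt(m):
    """E(m) = (1 + m*N) * r^N mod N^2 for a fresh random r coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    """D(c) = L(c^LAM mod N^2) * MU mod N, where L(u) = (u - 1) // N."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


m1, m2, x = 1200, 34, 5
sum_ct = encrypt(m1) * encrypt(m2) % N2  # multiply ciphertexts
scaled_ct = pow(encrypt(m1), x, N2)      # raise a ciphertext to a power

print(decrypt(sum_ct))     # plaintexts were added
print(decrypt(scaled_ct))  # plaintext was multiplied by x
```

Multiplying ciphertexts adds the plaintexts, and raising a ciphertext to the power x multiplies the plaintext by x, all without the party doing the arithmetic holding the decryption key.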

Page 25: Privacy in Today’s World: Solutions and Challenges

Privacy-Protecting Statistics Protocol

• To learn mean and variance: enough to learn sum and sum of squares.

• Server stores $x_1, x_2, \ldots, x_n$ and $z_1, z_2, \ldots, z_n$, where $z_i = x_i^2$, and responds to queries from both

• An efficient protocol for sum gives an efficient protocol for mean and variance
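The reduction from mean and variance to two sums is simple arithmetic, sketched below. The function name `mean_and_variance` and the sample values are hypothetical, and the variance shown is the population variance.

```python
def mean_and_variance(total, total_sq, n):
    """Recover mean and population variance from the sum and the sum of squares."""
    mean = total / n
    variance = total_sq / n - mean**2  # E[x^2] - (E[x])^2
    return mean, variance


xs = [2, 4, 4, 4, 5, 5, 7, 9]
zs = [x * x for x in xs]  # the squared values the server would also store

mean, variance = mean_and_variance(sum(xs), sum(zs), len(xs))
print(mean, variance)
```

The client therefore needs only two private sum queries, one over the values and one over their squares.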

Page 26: Privacy in Today’s World: Solutions and Challenges

Weighted Sum

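Client wants to compute selected linear combination of m items: $\sum_{j=1}^{m} w_j x_{i_j}$

Client and server use homomorphic encryption E, D.

Client: sets $\alpha_i = w_j$ if $i = i_j$, and $\alpha_i = 0$ otherwise; sends $E(\alpha_1), \ldots, E(\alpha_n)$

Server: computes $v = \prod_{i=1}^{n} E(\alpha_i)^{x_i} = E\left(\sum_{i=1}^{n} \alpha_i x_i\right)$

Client: decrypts $v$ to obtain $\sum_{i=1}^{n} \alpha_i x_i = \sum_{j=1}^{m} w_j x_{i_j}$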
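The weighted-sum exchange can be sketched end to end with a toy Paillier scheme. This is an illustration under assumptions, not the authors' implementation: the client encrypts a length-n vector holding the weight w_j at each selected position and 0 elsewhere, the server raises each ciphertext to its database value and multiplies the results, and the client decrypts. The key is built from small Mersenne primes and is not secure; `client_query` and `server_answer` are hypothetical names.

```python
import math
import secrets

# Toy Paillier key held by the client (assumption: demo-sized, not secure).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)


def encrypt(m):
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N


def client_query(n, selection):
    """selection maps a 0-based index to its weight; all other weights are 0."""
    return [encrypt(selection.get(i, 0)) for i in range(n)]


def server_answer(database, query):
    """Multiply E(weight_i)^(x_i) over all i; the server never sees the weights."""
    v = 1
    for x_i, enc_weight in zip(database, query):
        v = v * pow(enc_weight, x_i, N2) % N2
    return v


database = [10, 20, 30, 40, 50]  # server's private values
selection = {1: 3, 4: 2}         # client wants 3*x[1] + 2*x[4]

v = server_answer(database, client_query(len(database), selection))
result = decrypt(v)
print(result)
```

The query has one ciphertext per database entry, which is the linear communication cost that the next slide sets out to improve on.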

Page 27: Privacy in Today’s World: Solutions and Challenges

Efficiency

• Linear communication and computation (feasible in many cases)

• If n is large and m is small, would like to do better

Page 28: Privacy in Today’s World: Solutions and Challenges

Selective Private Function Evaluation

• Allows client to privately compute a function f over m inputs $x_{i_1}, \ldots, x_{i_m}$

• client learns only $f(x_{i_1}, \ldots, x_{i_m})$

• server does not learn $i_1, \ldots, i_m$

Unlike general secure multiparty computation, we want communication complexity to depend on m, not n (more accurately: polynomial in m, polylogarithmic in n).

Page 29: Privacy in Today’s World: Solutions and Challenges

Security Properties

• Correctness: If client and server follow the protocol, client’s output is correct.

• Client privacy: malicious server does not learn client’s input selection.

• Database privacy:

– weak: malicious client learns no more than output of some m-input function g

– strong: malicious client learns no more than output of specified function f

Page 30: Privacy in Today’s World: Solutions and Challenges

Solutions based on MPC

• Input selection phase:

– server obtains blinded version of each $x_{i_j}$

• Function evaluation phase:

– client and server use MPC to compute f on the m blinded items

Page 31: Privacy in Today’s World: Solutions and Challenges

Input selection phase

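Server: homomorphic encryption $D, E$; computes encrypted database $E(x_1), \ldots, E(x_n)$

Client and server: run SPIR(m,n) over the encrypted database; client retrieves $E(x_{i_1}), \ldots, E(x_{i_m})$

Client: picks random $c_1, \ldots, c_m$, computes $E(x_{i_j} + c_j)$ for each $j$, and sends these to the server

Server: decrypts received values: $s_j = x_{i_j} + c_j$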

Page 32: Privacy in Today’s World: Solutions and Challenges

Function Evaluation Phase

• Client has $c = (c_1, \ldots, c_m)$

• Server has $s = (s_1, \ldots, s_m)$, where $s_j = x_{i_j} + c_j$

Use MPC to compute: $g(c, s) = f(s - c) = f(x_{i_1}, \ldots, x_{i_m})$

• Total communication cost polylogarithmic in n, polynomial in m, | f |
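The blinding arithmetic behind the two phases can be traced in a short sketch. It makes strong simplifying assumptions that are labeled in the comments: plain list indexing stands in for SPIR (so it hides nothing from the server), and computing f(s - c) in the clear stands in for the MPC step. Only the homomorphic blinding and unblinding are shown faithfully, using a toy Paillier key that is not secure. The database contents and the choice of f are hypothetical.

```python
import math
import secrets

# Toy Paillier key held by the server (assumption: demo-sized, not secure).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)


def encrypt(m):
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N


def f(values):
    """Hypothetical function the client wants to evaluate on its selection."""
    return max(values)


database = [17, 4, 23, 8]  # server's private data
indices = [0, 3]           # client's private selection

# Input selection phase.
encrypted_db = [encrypt(x) for x in database]         # server
retrieved = [encrypted_db[i] for i in indices]        # stand-in for SPIR (NOT private)
c = [secrets.randbelow(N) for _ in indices]           # client's random blinds
blinded = [ct * encrypt(cj) % N2 for ct, cj in zip(retrieved, c)]  # E(x + c)
s = [decrypt(b) for b in blinded]                     # server learns only x + c mod N

# Function evaluation phase: stand-in for an MPC computing g(c, s) = f(s - c).
result = f([(sj - cj) % N for sj, cj in zip(s, c)])
print(result)
```

Because each blind is uniformly random modulo N, the decrypted values the server sees are uniformly distributed and reveal nothing about which items were selected.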

Page 33: Privacy in Today’s World: Solutions and Challenges

Distributed Databases

• Same approach works to compute function over a distributed database.

– Input selection phase done in parallel with each database server

– Function evaluation phase done as single MPC

– Database privacy means only final outcome is revealed to client.

Page 34: Privacy in Today’s World: Solutions and Challenges

Performance

Protocol   Complexity                              Security
1          m·SPIR(n,1,k) + O(k|f|)                 Strong
2          m·SPIR(n,1,1) + MPC(m,|f|)              Weak
3          SPIR(n,m,log n) + MPC(m,|f|) + km^2     Weak
4          SPIR(n,m,k) + MPC(m,|f|)                Honest client only

Experiments are under way to understand whether these methods are efficient in real-world settings.

Page 35: Privacy in Today’s World: Solutions and Challenges

Initial Experimental Results

• Initial implementation of linear computation and communication solution [H. Subramaniam & Z. Yang]

– implementation in Java and C++

– uses Paillier encryption

– uses synthetic data, with client and server as separate processes on the same machine (a two-year-old Toshiba laptop).

Page 36: Privacy in Today’s World: Solutions and Challenges

Initial Experimental Results

[Figure: total time in minutes (0 to 30) plotted against database size (0 to 100,000)]

Page 37: Privacy in Today’s World: Solutions and Challenges

Initial Experimental Results

[Figure: total time and encryption time in minutes (0 to 30) plotted against database size (0 to 100,000)]

Page 38: Privacy in Today’s World: Solutions and Challenges

Conclusions

• Privacy is in danger, but some important progress has been made.

• Important challenges ahead:

– Usable privacy solutions (efficiency and user interface)

– Sensor data

– Better use of hybrid approach: decide what can safely be disclosed, what needs moderate protection, and use cryptographic protocols to protect most critical information.

– Mathematical/formal models to understand and compare different solutions.

Page 39: Privacy in Today’s World: Solutions and Challenges

Research Directions

• Investigate integration of cryptographic approach and randomization approach:

– seek to maintain strong privacy and accuracy of cryptographic approach, ...

– while benefitting from improved efficiency of randomization approach

• Understand mathematically what the resulting privacy and/or accuracy compromises are.

• Technology, policy, and education must work together.