Claudia Diaz (K.U.Leuven) 1
Privacy and anonymity
Claudia Diaz
Katholieke Universiteit Leuven
Dept. Electrical Engineering – ESAT/COSIC
UPC Seminar, Barcelona, November 24, 2008
Introducing myself...
Introducing COSIC (1/2)
• Group at the Dept. of Electrical Engineering of the Katholieke Universiteit Leuven
• COSIC: COmputer Security and Industrial Cryptography
• Professors: Bart Preneel, Ingrid Verbauwhede and Vincent Rijmen
• 5 post-docs (more arriving soon)
• 40 PhD students (more arriving soon)
• 5-15 visitors and external associated researchers at any given time
• Very international: 15-25 nationalities, ~60% non-Belgians
Introducing COSIC (2/2)
A few words on the scope of the talk
• New field, still in development
• Aim: give an idea of the problems, concepts and solutions for anonymity
Outline
• Privacy and anonymity
• Anonymous communications
• Anonymity metrics
• Social networks
Solove on “I have nothing to hide”
“the problem with the ‘nothing to hide’ argument is its underlying assumption that privacy is about hiding bad things.”
“Society involves a great deal of friction, and we are constantly clashing with each other. Part of what makes a society a good place in which to live is the extent to which it allows people freedom from the intrusiveness of others. A society without privacy protection would be suffocating, and it might not be a place in which most would want to live.”
Diffie and Landau on Internet eavesdropping
• Governments are expanding their surveillance powers to protect against crime and terrorism. BUT:
  • Secrecy, lack of transparency, and diminished safeguards may easily lead to abuses
  • “Will the government’s monitoring tools be any more secure than the network they are trying to protect? If not, we run the risk that the surveillance facilities will be subverted or actually used against the U.S.A.”
• They conclude: “Communication is fundamental to our species; private communication is fundamental to both our national security and our democracy. Our challenge is to maintain this privacy in the midst of new communications technologies and serious national security threats. But it is critical to make choices that preserve privacy, communications security and the ability to innovate. Otherwise, all hope of having a free society will vanish.”
Privacy properties from a technical point of view
• Anonymity: hiding the link between identity and action
• Unlinkability: hiding the link between two or more actions / identities
• Unobservability: hiding user activity
• Pseudonymity: one-time pseudonyms / persistent pseudonyms
• Plausible deniability: not possible to prove that the user knows / has done something
The concept of Privacy [Solove]
Privacy threats we are trying to protect against (out of 16 identified by Solove):
• Surveillance: monitoring of electronic transactions
  • Preventive properties: anonymity, unobservability
• Interrogation: forcing people to disclose information
  • Preventive property: plausible deniability
• Aggregation: combining several sources of information
  • Preventive property: unlinkability
• Identification: connecting data to individuals
  • Preventive properties: anonymity and unlinkability
Anonymity – Data and Communication Layers
[Figure: Alice and Bob, each with an application (App) layer and a communication (Com) layer, connected over IP]
Classical Security Model
[Figure: Alice communicates with Bob; Eve is a passive / active adversary]
• Confidentiality
• Integrity
• Authentication
• Non-repudiation
• Availability
Anonymity – Concept and Model
[Figure: a set of Alices communicating with a set of Bobs]
Anonymity Adversary
• Passive/Active
• Partial/Global
• Internal/External
Recipient?
Third Parties?
Anonymity Adversary
The adversary will:
• Try to find who is sending messages to whom.
• Observe
  • All links (Global Passive Adversary)
  • Some links
• Modify, delay, delete or inject messages.
• Control some nodes in the network.
The adversary's limitations:
• Cannot break cryptographic primitives.
• Cannot see inside nodes he does not control.
Soft privacy enhancing technologies
• Hard privacy
  • Focus on data minimization
  • Adversarial data holder / service provider
• Soft privacy
  • Policies, access control, liability, right to correct information
  • Adversary: 3rd parties, corrupt insider in honest SP, errors
  • BUT user has already lost control of her data
(Slide taken from G. Danezis)
Other privacy-enhancing technologies
• Anonymous credentials / e-cash / ZK protocols
• Steganography / covert communication
• Censorship resistance techniques
• Anonymous publication
• Private information retrieval (PIR)
• Private search
• K-anonymity
• Location privacy
Outline (next: Anonymous communications)
Concept of Mix: collect inputs
Router that hides the correspondence between inputs and outputs
Concept of Mix: mix and flush
Functionality of Mixes
Mixes modify:
• The appearance of messages
  • Encryption / Decryption
    $\text{Sender} \rightarrow \text{Mix}_1 : \{\text{Mix}_2, \{\text{Rec}, \text{msg}\}_{K_{\text{Mix}_2}}\}_{K_{\text{Mix}_1}}$
  • Padding / Compression
  • Substitution of information (e.g., IP)
• The flow of messages
  • Reordering
  • Delaying (real-time requirements!)
  • Dummy traffic (cost of traffic!)
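The layered message format on this slide (Sender → Mix1 : {Mix2, {Rec, msg}KMix2}KMix1) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the keys are shared byte strings, and the "cipher" is a toy XOR keystream built from SHA-256 that stands in for the real public-key encryption. It is not secure and is not how any deployed mix encrypts.

```python
import hashlib
import json


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (NOT secure): XOR data with a SHA-256 counter keystream.

    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


def wrap(key: bytes, next_hop: str, payload: bytes) -> bytes:
    """Build one layer {next_hop, payload} and encrypt it under key."""
    layer = json.dumps({"next": next_hop, "body": payload.hex()}).encode()
    return keystream_xor(key, layer)


def unwrap(key: bytes, blob: bytes):
    """Decrypt one layer and return (next_hop, inner payload)."""
    layer = json.loads(keystream_xor(key, blob).decode())
    return layer["next"], bytes.fromhex(layer["body"])


# Hypothetical keys for the two mixes.
K_MIX1 = b"key-of-mix1"
K_MIX2 = b"key-of-mix2"

# The sender builds the onion from the inside out.
inner = wrap(K_MIX2, "Rec", b"msg")
onion = wrap(K_MIX1, "Mix2", inner)

# Mix1 peels its layer and learns only the next hop.
hop1, rest = unwrap(K_MIX1, onion)
# Mix2 peels its layer and learns the recipient and the message.
hop2, message = unwrap(K_MIX2, rest)
```

Each mix sees only the address of the next hop, and the bytes change at every hop, which is the "appearance" modification listed above. Size is not hidden in this sketch; that is the job of padding.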
Pool Mixes
Based on the mix proposed by Chaum in 1981. One round:
1. Collect N inputs
2. Shuffle
3. Flush (Forward)
• Pool selection algorithm
  • No pool / Static pool / Dynamic pool
  • Influences the performance and anonymity provided by the mix
• Flushing condition
  • Time / Threshold
  • Deterministic / Random
Example of pool mix
Deterministic threshold static pool mix
Pool = 2, Threshold = 4
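A minimal sketch of such a mix, using the slide's parameters (Pool = 2, Threshold = 4). The slide does not spell out the exact firing rule, so this assumes one common convention: the mix fires once it holds pool + threshold messages, forwards threshold messages chosen at random, and keeps the remaining pool messages for later rounds. The class and method names are hypothetical.

```python
import random


class ThresholdPoolMix:
    """Deterministic threshold mix with a static pool (illustrative only)."""

    def __init__(self, pool_size=2, threshold=4, rng=None):
        self.pool_size = pool_size
        self.threshold = threshold
        self.rng = rng if rng is not None else random.Random()
        self.buffer = []

    def receive(self, message):
        """Store a message; return the flushed batch if the mix fires, else []."""
        self.buffer.append(message)
        # Assumed flushing condition: pool + threshold messages are present.
        if len(self.buffer) < self.pool_size + self.threshold:
            return []
        # Shuffle, forward `threshold` messages, keep the rest in the pool.
        self.rng.shuffle(self.buffer)
        batch = self.buffer[:self.threshold]
        self.buffer = self.buffer[self.threshold:]
        return batch


mix = ThresholdPoolMix(pool_size=2, threshold=4, rng=random.Random(0))
rounds = []
for i in range(14):
    flushed = mix.receive(f"m{i}")
    if flushed:
        rounds.append(flushed)
```

Because two messages always stay behind, a message can remain in the mix for an unbounded number of rounds. This is what makes the anonymity set of a pool mix potentially larger than a single batch, at the cost of unpredictable delay.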
Stop-and-Go Mix
• Proposed by Kesdogan in 1998
• Reordering strategy based on delaying (M/M/∞)
• Delays generated by the user from an exponential distribution
• Timestamping to prevent active attacks
  • Trusted Time Service
• Anonymity estimates based on the assumption of Poisson incoming traffic
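The delaying strategy can be sketched as follows. This is an illustration under assumptions, not the protocol itself: it models a single mix, draws one exponential delay per message (which in the real design the sender chooses and embeds in the message), generates Poisson arrivals with an arbitrary rate, and omits timestamps entirely. The function name and the numeric parameters are hypothetical.

```python
import random


def sg_mix_departures(arrivals, mean_delay, rng):
    """Delay each message independently by an exponential time.

    arrivals: list of (arrival_time, message_id).
    Returns (departure_time, message_id) pairs sorted by departure time.
    """
    departures = [
        (t + rng.expovariate(1.0 / mean_delay), msg) for t, msg in arrivals
    ]
    return sorted(departures)


rng = random.Random(1)

# Poisson incoming traffic: exponential inter-arrival times (rate is assumed).
t = 0.0
arrivals = []
for msg in range(10):
    t += rng.expovariate(2.0)
    arrivals.append((t, msg))

departures = sg_mix_departures(arrivals, mean_delay=3.0, rng=rng)
order_out = [msg for _, msg in departures]
```

Since every message leaves on its own schedule, there are no rounds or batches: the mix behaves like a queue with infinitely many servers, and reordering comes only from the randomness of the delays.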
Network topology
Mixes are combined in networks in order to:
• Distribute trust
• Improve availability
[Figures: cascade; fully connected network; restricted route network]
Cascades vs Free Route topologies
• Flexibility of routing
• Surface of attack
  • Advantage free routes
• Availability
  • Advantage free routes
• Intersection attacks
  • Advantage cascades (anonymity set smaller but no partitioning possible)
• Trust
  • Advantage free routes (more choices available to user)
Peer-to-peer vs client-server architectures
• Symmetric vs asymmetric systems
• Surface of attack
  • Advantage peer-to-peer
• Liability issues
  • Advantage client-server
• Resources / incentives / quality of service
  • Advantage client-server
• Availability
  • Advantage peer-to-peer
• Sybil attacks
  • Advantage? Depending on admission controls (for peers/servers)
Deployed systems I
• Anon.penet.fi (Helsingius 1993)
  • Simple proxy, substituted email headers
  • Kept table of correspondences nym-email
  • Brought down by legal attack in 1996
• Type I Cypherpunk remailers (Hughes, Finney 1996)
  • No tables (routing info in msgs themselves)
  • PGP encryption (no attacks based on content); attacks based on size are possible
  • Chains of mixes (distribution of trust)
  • Reusable reply blocks (source of insecurity)
Deployed systems II
• Mixmaster (Cottrell, evolving since 1995)
  • Fixed size (padding / dividing large messages)
  • Integrity protection measures
  • Multiple paths for better reliability
  • No replies
• Mixminion (Danezis, 2003)
  • SURBs (Single-Use Reply Blocks)
  • Packet format: detection of tagging attacks (all-or-nothing)
  • Forward security: trail of keys, updated with one-way functions
Low-latency applications
• Stream-based instead of message-based communication
  • Web browsing, interactive applications, voice over IP, etc.
• Bi-directional circuits
  • Ephemeral session keys, onion-encrypted (forward secrecy)
• Real-time requirements
  • Delaying not an option
• Proposed systems
  • C-S: Onion Routing, ISDN, Web Mixes
  • P2P: Crowds, P5 (broadcast), Herbivore (DC-nets)
• Implemented systems: TOR, JAP
Onion encryption
[Figure: routers R1, R2, R3 and destination D; onion layers labelled D, R3, R2, R1]
Onion Routing
[Figure: routers R1, R2, R3 and destination D; onion layers labelled D, R3, R2, R1]
TOR – adversary model
• Anonymizing HTTP traffic is not trivial
  • Difficult to conceal traffic pattern
  • Difficult to pad
    • Lots of padding: scalability / cost problem
    • Little padding: not enough to conceal pattern
• Vulnerable to strong adversaries (entry+exit)
• Fingerprinting attacks
  • Adversary observes only user side
• Internet exchanges
Dummy Traffic
• Fake messages/traffic introduced to confuse the attacker
• Indistinguishable from real traffic
• Dummies improve anonymity by making traffic analysis more difficult
• Necessary for unobservability
• Expensive
(Some) Attacks on mix-based systems
• Passive attacks
  • Long-term intersection attacks (statistical disclosure)
    • Persistent communication patterns
    • Extract social network information
  • Traffic correlation / confirmation
  • Fingerprinting
  • Source separation
  • Epistemic attacks (route selection)
• Active attacks
  • N-1 attacks
  • Sybil
  • Tagging
  • Replay
  • DoS
Long-Term Intersection Attacks
Family of attacks with many variants:
• Disclosure attack (Agrawal, Kesdogan)
• Hitting set attack (Kesdogan)
• Statistical disclosure attack (Danezis, Serjantov)
• Extensions to SDA (Dingledine and Mathewson)
• Two-Sided SDA (Danezis, Diaz, Troncoso)
• Receiver-bound cover traffic (Mallesh, Wright)
Assumption:
• Alice has persistent communication relationships (she communicates repeatedly with her friends)
Method:
• Combine many observations (looking at who receives when Alice sends)
Intuition:
• If we observe rounds in which Alice sends, her likely recipients will appear frequently
Result:
• We can create a vector that expresses Alice’s sending probabilities (a sending profile)
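The method and intuition above can be made concrete with a toy simulation in the spirit of the statistical disclosure attack. All modelling choices are assumptions for illustration: a threshold mix that flushes fixed-size batches, other senders who pick recipients uniformly at random, Alice sending at most one message per round to a uniformly chosen friend, and an adversary who sees in which rounds Alice sends and who receives in every round. The function names and parameters are hypothetical.

```python
import random
from collections import Counter


def simulate_rounds(n_rounds, n_users, batch, friends, p_alice, rng):
    """Return a list of (alice_sends, recipients) observations, one per round."""
    rounds = []
    for _ in range(n_rounds):
        alice_sends = rng.random() < p_alice
        # Background traffic: uniform recipients.
        recipients = [rng.randrange(n_users) for _ in range(batch - 1)]
        if alice_sends:
            recipients.append(rng.choice(friends))
        else:
            recipients.append(rng.randrange(n_users))
        rounds.append((alice_sends, recipients))
    return rounds


def sda_profile(rounds, n_users, batch):
    """Estimate Alice's sending profile from the observed rounds.

    Rounds with Alice mix her profile v with the background u:
        observed = (1/b) * v + ((b-1)/b) * u
    so v is estimated as b * observed - (b-1) * u.
    """
    with_alice, without = Counter(), Counter()
    n_with = n_without = 0
    for alice_sends, recipients in rounds:
        if alice_sends:
            with_alice.update(recipients)
            n_with += 1
        else:
            without.update(recipients)
            n_without += 1
    profile = []
    for r in range(n_users):
        observed = with_alice[r] / (n_with * batch)
        background = without[r] / (n_without * batch)
        profile.append(batch * observed - (batch - 1) * background)
    return profile


rng = random.Random(7)
friends = [3, 17, 42]
rounds = simulate_rounds(5000, n_users=50, batch=10, friends=friends,
                         p_alice=0.5, rng=rng)
profile = sda_profile(rounds, n_users=50, batch=10)
guessed = sorted(range(50), key=lambda r: profile[r], reverse=True)[:3]
```

With this many rounds the three largest entries of the estimated profile should be Alice's friends, each with an estimated probability near 1/3, while the entries for everyone else stay close to zero. No single round reveals anything; the leak comes from the persistence of Alice's behaviour.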
Outline (next: Anonymity metrics)
Definition [PfiHan2000]
• First clear definition of anonymity (2000)
• Anonymity is the state of being not identifiable within a set of subjects, the anonymity set.
• The anonymity set is the set of all possible subjects who might cause an action or be addressed.
• Anonymity depends on:
  • The number of subjects in the anonymity set
  • The probability distribution of each subject in the anonymity set being the target
Example: computing anonymity metrics for a pool mix
[Figure: pool mix with outputs to Recipient R1, Recipient R2, Recipient R3, Recipient R4, ...]
Probability of recipient $R_i$: $p_i = 2^{-i}$ (so $p_1 = 2^{-1}$, $p_2 = 2^{-2}$, $p_3 = 2^{-3}$, $p_4 = 2^{-4}$, ...)
Potentially infinite subjects in the recipient anonymity set
Entropy: information-theoretic anonymity metrics [DSCP02, SD02]
$$H = -\sum_{i=1}^{N} p_i \log_2 p_i$$
• Entropy: measure of the amount of information required on average to describe the random variable
• Measure of the uncertainty of a random variable
• Increases with N and with uniformity of distribution
• Distribution with entropy H is equivalent to a uniform distribution with $2^H$ subjects
• Other information theoretic metrics: min-entropy, max-entropy, Rényi entropy, relative entropy, mutual information, ...
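As a worked illustration of the entropy metric, the short sketch below computes H for a uniform distribution and for the pool-mix distribution p_i = 2^-i from the earlier example slide. The function names are hypothetical, and the infinite recipient set is truncated to its first 30 terms, so the probabilities sum to slightly less than 1.

```python
import math


def entropy_bits(probs):
    """Shannon entropy H = -sum(p_i * log2(p_i)), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def effective_set_size(probs):
    """Size 2^H of the uniform anonymity set with the same entropy."""
    return 2 ** entropy_bits(probs)


# Four equally likely subjects: the maximum entropy for N = 4.
uniform = [0.25] * 4
h_uniform = entropy_bits(uniform)

# Pool-mix example: p_i = 2^-i, truncated to the first 30 recipients.
pool = [2.0 ** -i for i in range(1, 31)]
h_pool = entropy_bits(pool)
```

Both values come out at (almost exactly) 2 bits: the sum of i · 2^-i converges to 2. So although the pool mix has a potentially infinite anonymity set, it offers about the same uncertainty as a uniform choice among 4 recipients, which is why set size alone is a poor metric.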
Combining traffic analysis with social network knowledge
• “On the Impact of Social Network Profiling on Anonymity” [DTS-PETS08]
• Use of Bayesian inference to combine different sources of information (SN knowledge, text mining)
• Results:
  • “Combined” anonymity decreases with network growth (if only one source of information is considered, then growing the network does not decrease anonymity)
  • Anonymity degradation as more profiles become known
• ERRORS: bad quality SN profile information
  • Small profile errors lead to large errors in the results
  • If the adversary is not completely confident of the SN info → cannot have any confidence in the results
Combinatorial approaches
• Edman et al.
  • Consider deanonymization for a system as a whole (instead of individual users)
  • Find perfect matching inputs/outputs
  • Perfect anonymity for t messages: t! equiprobable combinations
• [GTDPV-WPES08]
  • Edman et al.’s metric overestimates anonymity if users send/receive more than one message
  • Generalization to users sending/receiving more than one message
  • Divide and conquer algorithm to compute the metric
  • Upper and lower bounds on anonymity (easy to compute)
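The system-wide view can be sketched with a small brute-force example. It assumes the simplest form of the combinatorial metric: the adversary knows only which input/output pairings are feasible (a 0/1 matrix), every feasible perfect matching is equally likely, and anonymity is the logarithm of the number of feasible matchings normalised by log(t!). The function names are hypothetical, and enumeration of all permutations is only practical for very small t.

```python
import math
from itertools import permutations


def count_perfect_matchings(feasible):
    """Count input/output assignments compatible with a 0/1 feasibility matrix."""
    t = len(feasible)
    return sum(
        all(feasible[i][perm[i]] for i in range(t))
        for perm in permutations(range(t))
    )


def combinatorial_anonymity(feasible):
    """log(#feasible matchings) / log(t!): 1 is perfect, 0 is no anonymity."""
    t = len(feasible)
    if t < 2:
        return 0.0
    matchings = count_perfect_matchings(feasible)
    if matchings == 0:
        raise ValueError("the true matching must be feasible")
    return math.log(matchings) / math.log(math.factorial(t))


# Every input may correspond to every output: t! = 6 matchings.
full = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
# The adversary has pinned down every message: one matching.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Message 3 is exposed, messages 1 and 2 are still confused: two matchings.
partial = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]

d_full = combinatorial_anonymity(full)
d_identity = combinatorial_anonymity(identity)
d_partial = combinatorial_anonymity(partial)
```

The sketch treats each message as coming from a distinct user. As the slide notes, this overestimates anonymity when users send or receive several messages, because different matchings can then correspond to the same user-level relationships.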
Outline (next: Social networks)
A social network approach
• Most anonymity techniques focus on channels and one-to-one relationships
• Mailing lists, twikis, blogs, online social networks:
  • Many-to-many communication (communities)
  • Shared spaces and collective content (group photo albums, forums)
  • Moreover, information about third parties (not in the community) is also leaked (pictures, stories, references in posts)
• This scenario presents new privacy challenges
“The Economics of Mass Surveillance” [DanWit06]
Model: [Figure: graph linking clubs and people]
Assumptions:
• If one of the club participants is under surveillance, all information shared in this club and the membership list of the club become known to Eve
• Eve has a limited budget (number of people she can put under surveillance)
Target selection:
• How best to choose those to be put under surveillance to maximize returns in terms of observed clubs, people and their relationships?
• How does the lack of information, due to the use of an anonymizing network, affect the effectiveness of such target selection?
Data: mail archives of political network 2003-2006
• Clubs = mailing lists (373)
• People = email addresses posting more than 5 times to those mailing lists (2338)
• Links = number of posts of person over 3 years (3879)
With and without anonymizing network
Without:
• Eve can observe all traffic flow (not the content) and thus construct the social graph (links people-clubs)
• Eve chooses to put under surveillance the nodes with the highest degrees, excluding all spaces already under surveillance
• Result: surveilling 8% of the people is enough to control 100% of the clubs
With:
• Eve can only observe the aggregated amount of traffic generated by people
• Eve puts under surveillance the nodes that generate the most volume
• Results: 50% of links known when surveilling 5% of people. To get 80% of links an adversary needs to observe 30% of people (vs. 3%)
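The target selection without an anonymizing network can be sketched as a greedy procedure: repeatedly pick the person who belongs to the most clubs not yet observed. This is one reading of "highest degrees, excluding all spaces already under surveillance", not necessarily the exact procedure of [DanWit06]. The function name is hypothetical, and the tiny membership table is invented for illustration; it is not the mail-archive data set.

```python
def greedy_targets(memberships, budget):
    """Pick up to `budget` people, each time the one adding the most new clubs.

    memberships: dict mapping person -> set of clubs.
    Returns (list of chosen people, set of observed clubs).
    """
    observed = set()
    targets = []
    remaining = dict(memberships)
    for _ in range(budget):
        best = max(
            remaining,
            key=lambda person: len(remaining[person] - observed),
            default=None,
        )
        # Stop early if nobody adds a club that is not already observed.
        if best is None or not (remaining[best] - observed):
            break
        targets.append(best)
        observed |= remaining.pop(best)
    return targets, observed


# Hypothetical toy network: 5 people, 5 clubs.
memberships = {
    "ana": {"c1", "c2", "c3"},
    "bo": {"c3", "c4"},
    "cy": {"c1"},
    "di": {"c4", "c5"},
    "ed": {"c2"},
}
targets, observed = greedy_targets(memberships, budget=2)
```

In this toy network two of the five people ("ana", then "di") are enough to observe all five clubs. With an anonymizing network Eve cannot see the membership sets and has to rank people by traffic volume alone, which is why she needs a far larger budget in the results above.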
The Economics of Covert Community Detection and Hiding [Nag08]
• Adversary tries to uncover communities and learn community membership
• Counter-surveillance techniques that can be deployed by covert groups (e.g., topological rewiring)
• “Our study confirms results from previous work showing that invasion of privacy of most people is cheap and easy”
• With certain counter-surveillance strategies: “70% of the covert community going undetected even when 99% of the network is under direct surveillance.”
Conclusions
• Privacy / anonymity is an area of research getting increasing attention
  • Technical, legal, sociological, psychological, economic, political, philosophical aspects
• Economics of privacy:
  • Crypto: little overhead → lots of security
  • Anonymity: lots of overhead → a little bit of security
  • Privacy invasion:
    • Is profitable for businesses
    • Increases the power of Government
• High-latency applications (email):
  • Problems with persistent user behavior
• Low-latency applications:
  • Insecure towards strong adversaries
• Anonymous communications are fragile. If you want to propose a new system:
  • Check the literature
  • Check known attacks
(Some) Open Questions
• Do individuals care enough to pay the price of privacy? Will they in the future? What is privacy anyway?
• Will privacy technology be implemented to re-establish the tradeoffs and power balances of the pre-information era? Or will society and individuals redefine privacy and power tradeoffs and adapt their behaviour to accommodate a reality of privacy invasion and surveillance?
• Anonymity metrics:
  • Do we really understand what anonymity is?
  • Which is the best way to measure anonymity? Which are the “adequate levels”?
  • Current metrics are very limited in application (e.g., how to measure privacy losses due to personal data becoming public?)
• Real user behavioral models
• Models for adversary knowledge
• Solutions for Low-Latency Anonymous Communication?
• Solutions for Social Networks?
Thank you!
Recommended bibliography on the subject:
http://www.freehaven.net/anonbib/