LH* RE : A Scalable Distributed Data Structure with Recoverable Encryption Keys 1 (Work in Progress, Jan 09) ( Provisional Patent Appl.) Sushil Jajodia Witold Litwin Thomas Schwarz George Mason U. U. Paris Dauphine Santa Clara U.


LH*RE : A Scalable Distributed Data Structure with

Recoverable Encryption Keys

(Work in Progress, Jan 09)( Provisional Patent Appl.)

Sushil Jajodia Witold Litwin Thomas Schwarz

George Mason U. U. Paris Dauphine Santa Clara U.


Overview

• A new data structure
• A Scalable Distributed Data Structure
  – LH* family
• Client-side encryption
  – Using one or many symmetric encryption keys
  – Protects the privacy of client data stored on unknown servers
    • Hence moderately trusted by the client


Overview

• Recoverable encryption keys
  – Safely backed up in the file
  – Recoverable on behalf of the client
  – Recoverable without the client, on behalf of some Authority
• Revocable keys
  – Same as above
• Scalable file parameters
  – Preserving the assurance


Overview

• Applications on:
  – SDDS
  – P2P
  – Clouds
  – Grids
• Enterprise data
• Medical data
• Social networks


Overview

• Basic threat model
  – Client site is safe
  – LH* coordinator site is safe
  – Data hosting organization as a whole is safe (trusted)
  – Network is safe while a key is backed up or recovered
  – No malicious intruder
• To decrypt some records, an intruder then needs to:
  – Break an encryption key, or
  – Break into at least k servers
    • k is a client-defined parameter


Overview

• The servers to break into for a specific record can be anywhere in the file
  – At locations unknown to the intruder
  – Changing with splits
  – The intruder may need to break into all the servers
• The effort of breaking into some specific k servers may
  – Still not suffice to decrypt any record
    • Most often
  – Suffice only for a few records
    • When the client uses many encryption keys


Overview

• LH*RE data record manipulation costs no more messaging than in an LH* file

• Key recovery cost is about that of an LH* scan
  – Possibly 2M messages for M servers, in one or several rounds

• Storage overhead due to encryption is negligible

• In practice, an LH*RE file should be safe


Overview

• LH*RE could be useful for:
  – Organizations with multiple clients & servers
    • Typical case today
  – Clients of remote storage services
    • P2P, Grid, Cloud … computing
    • Amazon, Google, MS, IBM…
• Distributed systems need client-side encryption and key recoverability
  – Neither is yet well handled in practice


Generic LH*

• Scalable distributed hash data structure
• Data are stored in buckets on server sites numbered 0, 1, 2…
• Applications are at client sites
  – A peer site may be both client & server
• Data are in records with primary keys
• A record can be inserted, updated, deleted, searched or scanned
• The address m of the record with key C is m = LH (C)


Generic LH*

• Overflowing inserts generate splits, moving data into new buckets (on new sites)
  – Splits are ordered: 0; 0, 1; 0, 1, 2, 3; 0, 1, …, $2^j - 1$; 0, …
• LH (C) changes dynamically
• The client may not know the actual file state
• It uses only its private image of the file state for addressing
• Addressing errors may result
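
The addressing rule and the split order can be sketched as follows. This is a minimal sketch assuming the classic linear-hashing rule with a single initial bucket, where the file state is the pair (i, n) of file level and split pointer. The names `lh_address` and `split_order` are illustrative and do not come from the slides.

```python
def lh_address(C: int, i: int, n: int) -> int:
    """Bucket address of key C for file level i and split pointer n."""
    a = C % (2 ** i)
    if a < n:  # bucket a has already split at this level
        a = C % (2 ** (i + 1))
    return a


def split_order(count: int) -> list[int]:
    """Bucket numbers split by the first `count` splits, starting from one bucket."""
    i, n, order = 0, 0, []
    for _ in range(count):
        order.append(n)
        n += 1
        if n == 2 ** i:  # every bucket of this level has split
            i, n = i + 1, 0
    return order


print(split_order(8))        # [0, 0, 1, 0, 1, 2, 3, 0]
print(lh_address(27, 3, 6))  # 11: address in a 14-bucket file (i = 3, n = 6)
print(lh_address(27, 2, 0))  # 3: address computed from a stale client image
```

A client still holding the image (2, 0) would send key 27 to bucket 3 instead of bucket 11, which is the kind of addressing error that the servers then resolve by forwarding.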


Generic LH*

• Any addressing error is resolved by the servers in at most two forwarding messages
  – Only one for LH*RS P2P
• Every forwarding adjusts the client image
• Addressing errors do not repeat
• Altogether, LH* is the fastest SDDS (P2P, Grid, Cloud...) addressing scheme.


LH*RE

• Coordinator may have additional capabilities
  – Certifying the address of every client
  – Maintaining PKI over the file
    • If the network is not safe
    • For client identity checking
  – …
• Records are LH* records with an additional client identity field I
• Key-based addressing is as for LH*


LH*RE

• File starts with at least K buckets
  – K is a file parameter
  – Basically, K is a power of 2
• Data in every record are encrypted by the client
  – Through some good symmetric-key encryption method
    • Much faster than known public-key schemes
• Primary keys and I are not encrypted


Encryption/Decryption

• Client uses a cached table T (t) with N encryption keys Et
• Some hash h (C) chooses t for R (C)
  – E.g., t = h (C) = C mod N
• Client encrypts/decrypts the non-key data field D in R (C) using Et into the D’ field
  – Using strong encryption
    • AES
    • PGP
    • …


Encryption/Decryption

• Client forms the encrypted record R’ (C) = (C, I, t, D’)
  – I is a provable client identity
  – Or any information that a future requestor must provide to access R’
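
The key selection t = C mod N and the record layout (C, I, t, D’) can be sketched as follows. This is an illustration under stated assumptions: `toy_cipher` is a hypothetical stand-in for a real symmetric cipher such as AES and is not secure, and the names `EncryptedRecord`, `encrypt_record` and `decrypt_record` are invented for this example.

```python
import hashlib
from typing import NamedTuple


class EncryptedRecord(NamedTuple):
    C: int         # primary key, stored in the clear
    I: str         # client identity, stored in the clear
    t: int         # index of the encryption key in T
    D_enc: bytes   # D', the encrypted data field


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in for AES: XOR with a SHA-256 keystream. NOT secure, illustration only."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def encrypt_record(C: int, I: str, D: bytes, T: list[bytes]) -> EncryptedRecord:
    t = C % len(T)  # t = h(C) = C mod N
    return EncryptedRecord(C, I, t, toy_cipher(T[t], D))


def decrypt_record(rec: EncryptedRecord, T: list[bytes]) -> bytes:
    # The toy cipher is its own inverse, so decryption reuses it.
    return toy_cipher(T[rec.t], rec.D_enc)


T = [b"key-0", b"key-1", b"key-2"]  # client's cached key table, N = 3
rec = encrypt_record(7, "client-1", b"blood type: A+", T)
print(rec.t)                   # 1
print(decrypt_record(rec, T))  # b'blood type: A+'
```

Only the D field is transformed; C, I and t stay readable by the servers, which is what keeps key-based addressing unchanged.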


Encryption/Decryption

• The client manipulates the encrypted record R’ (C) basically as for LH*
  – Key-based search, insert, delete and update
• However, the scan operation over the non-key field no longer works
  – Cannot search by content
  – That is the basic purpose of LH*RE


Encryption Key Encoding

• Client encodes each encryption key E
  – Using secret sharing with k ≤ K shares
• k − 1 shares are different white noises $N_1, \dots, N_{k-1}$
  – There is a new set of shares for every encryption key
    • Higher assurance than if all keys used the same set of noises
    • Such an approach remains a possibility nevertheless
      – Not addressed in what follows, unless stated otherwise


Encryption Key Encoding

• The k-th share value is

    $E' = N_1 \oplus \dots \oplus N_{k-1} \oplus E$

  – $\oplus$ denotes XOR
• Each share becomes a share record

    $S_j = (C_j, t, I, N_j)$ for $j = 1, \dots, k-1$

    $S_k = (C_k, t, I, E')$
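
The XOR-based encoding above can be sketched as follows. This is a minimal sketch, assuming keys are byte strings and that the noises come from the operating system's random source; the names `encode_key` and `decode_key` are illustrative.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode_key(E: bytes, k: int) -> list[bytes]:
    """Split key E into k shares: k - 1 random noises plus E' = N1 ^ ... ^ Nk-1 ^ E."""
    if k < 2:
        raise ValueError("secret sharing needs at least 2 shares")
    noises = [secrets.token_bytes(len(E)) for _ in range(k - 1)]
    E_prime = reduce(xor_bytes, noises, E)
    return noises + [E_prime]


def decode_key(shares: list[bytes]) -> bytes:
    """XOR of all k shares gives back E; any fewer shares reveal nothing about E."""
    return reduce(xor_bytes, shares)


E = secrets.token_bytes(16)      # e.g. a 128-bit symmetric key
shares = encode_key(E, k=4)
assert decode_key(shares) == E
```

Each share value would then be wrapped into a share record together with its own primary key, the key index t and the client identity I.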


Encryption Key Encoding

• Client chooses each key Cj by some hash LHK, defined as follows:
  – LHK hashes Nj or E’ on the initial buckets 0, 1, …, K − 1
  – For any j > 1 and any l < j: LHK (Cj ) ≠ LHK (Cl )
    • Here Cl is a previously generated key for the E being encoded
  – Every Cj is unique in the file
    • General constraint on an LH* file
      – Could be relaxed
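
One way to satisfy these constraints is rejection sampling, sketched below. The sketch assumes, hypothetically, that LHK (C) = C mod K and that candidate keys are random 32-bit integers; the slides only require that the k addresses differ and that every Cj is unique in the file. The name `choose_share_keys` is illustrative.

```python
import secrets


def choose_share_keys(k: int, K: int, used_keys: set[int], key_bits: int = 32) -> list[int]:
    """Pick k primary keys that hash to k different initial buckets 0..K-1."""
    if k > K:
        raise ValueError("need k <= K to place all shares in different buckets")
    chosen: list[int] = []
    taken_buckets: set[int] = set()
    while len(chosen) < k:
        C = secrets.randbits(key_bits)
        bucket = C % K  # assumed form of LHK(C)
        if C in used_keys or bucket in taken_buckets:
            continue  # reject: key already in the file, or bucket already holds a share
        chosen.append(C)
        taken_buckets.add(bucket)
        used_keys.add(C)
    return chosen


keys = choose_share_keys(k=4, K=8, used_keys=set())
print(sorted(C % 8 for C in keys))  # four different initial bucket numbers
```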


Encryption Key Encoding

• Client sends each Sj for storage
  – As usual if the network is safe
  – Using any reasonable protocol for safe transmission otherwise
    • SSL…
• Otherwise, a snooper could capture all the shares and decode an encryption key
• Forwarding does not need this procedure
• Neither does data record manipulation


Encryption Key Encoding

• Main property
  – All share records of E that the client sends out for storage end up at different servers
    • Even if they are forwarded
• Regardless of future splits and merges, they always remain at different servers
  – Despite the migrations during the splits
• Proof: details omitted here
• Basis: in LH*, no split may migrate records from different buckets into the same bucket


Encryption Key Encoding

• Example
  – File extends over servers (buckets) 0, 1, 2, …, 12, 13
  – Shares of some key end up in servers 0, 3, 6, 11
  – Coming splits may only move these shares to servers distant by $2^3$, $2^4$, $2^5$…, respectively:
    • 6 → 14, 22…
    • 0 → 16, 32…
    • 3 → 19, 35…
    • 11 → 27…
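
The example can be checked with a short simulation. This is a sketch assuming the classic linear-hashing address rule with one initial bucket; the four share keys are hypothetical values chosen so that they land in buckets 0, 3, 6 and 11 of a 14-bucket file.

```python
from itertools import combinations


def lh_address(C: int, M: int) -> int:
    """Address of key C in a linear-hash file of M buckets (one initial bucket)."""
    i = M.bit_length() - 1   # file level: 2**i <= M < 2**(i + 1)
    n = M - 2 ** i           # split pointer
    a = C % (2 ** i)
    return C % (2 ** (i + 1)) if a < n else a


share_keys = [16, 35, 22, 27]  # hypothetical keys Cj of the four share records

print([lh_address(C, 14) for C in share_keys])  # [0, 3, 6, 11]
print([lh_address(C, 64) for C in share_keys])  # [16, 35, 22, 27]

# The shares never meet in one bucket while the file grows from 14 to 256 buckets.
for M in range(14, 257):
    addrs = [lh_address(C, M) for C in share_keys]
    assert all(x != y for x, y in combinations(addrs, 2))
```

Every move adds a power of two to the bucket number (0 → 16, 3 → 35, 6 → 22, 11 → 27 here), which matches the pattern listed on the slide.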


Encryption Key Recovery

• Concerns all the encryption keys of some client I’
• Requestor can be the client itself
  – Having lost T for any reason
• Requestor can be a trusted authority A
  – In case of disappearance of I’
    • Dismissal of an employee
    • Death or incapacity of a patient
    • …
• A then requests the recovery on behalf of a new client I”


Encryption Key Recovery

• Requestor basically does not know k and N
• It then requests an LH-like scan with deterministic termination
  – Searching, for some N’, for any share record where I = I’ and t ≤ N’
    • Choice of N’ is arbitrary
      – Basically, it should be large enough to be > N
      – Alternatively, the client may use it to prevent flooding by the incoming replies


Encryption Key Recovery

• If the requestor knows N and k, probabilistic termination suffices
  – Recovery may be cheaper
• In practice, probabilistic termination should suffice with high probability
  – Why?


Encryption Key Recovery

• The requestor could be fake
  – E.g., monkey in the middle
• Each server receiving the scan request S therefore verifies the identity of the requestor
  – E.g., it checks the IP address of the client with the coordinator
    • Unless it caches the legal addresses
    • Or they are an integral part of the I-fields
  – Or it verifies the signature through PKI
  – …


Encryption Key Recovery

• Direct requests from the servers to the coordinator generate 2M messages for M servers
  – Heavy load for the coordinator
• An alternative is to aggregate the requests at the servers
  – Sending fewer of them to the coordinator
  – Even only a single one
  – As below


Encryption Key Recovery

• Every server having a child waits for the request from it
• Every child requests the confirmation from its father
• Except for server 0, every server requests the confirmation from its father
  – By the structure of LH*, all these requests end up at server 0
  – Server 0 forwards the request to the coordinator
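
The father/child structure can be sketched as follows. The sketch assumes that the father of bucket a is the bucket whose split created it, that is, a with its highest set bit cleared; this matches the split moves in the earlier example (6 → 14, 0 → 16) but is not spelled out on the slides. The function names are illustrative.

```python
def father(a: int) -> int:
    """Bucket whose split created bucket a (a > 0): a with its top bit cleared."""
    if a <= 0:
        raise ValueError("bucket 0 has no father")
    return a - (1 << (a.bit_length() - 1))


def path_to_server_0(a: int) -> list[int]:
    """Servers visited by a confirmation request that starts at server a."""
    path = [a]
    while path[-1] != 0:
        path.append(father(path[-1]))
    return path


M = 14  # servers 0..13
print(path_to_server_0(13))                              # [13, 5, 1, 0]
print(sorted(a for a in range(1, M) if father(a) == 0))  # [1, 2, 4, 8]
```

Under this assumption each server other than 0 sends one request to its father, and only server 0 contacts the coordinator.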


Encryption Key Recovery

• The coordinator gets a single message
  – Regardless of the number of servers
• Its reply propagates downward similarly
• Notice that the scheme works assuming no malicious action at the servers
  – As we do unless we state otherwise
• Otherwise, e.g., server 0 could send a fake OK
  – Big trouble could follow


Encryption Key Recovery

• Once the server gets the OK, it starts the actual bucket scan
• It sends all the records found to I’ or I’’
  – If the network is not safe, it uses SSL or the like
    • A snooper could collect the shares otherwise
• If it finds no records, it sends an Ack of having received S instead


Encryption Key Recovery

• The client
  – Matches the records with the same t
  – Recovers the t-th key
    • By XORing all the shares sharing t
    • Deterministic termination guarantees that there are k such shares
  – Sets N = tmax, where tmax is the maximal t received
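
The client-side matching step can be sketched as follows. This is a minimal sketch assuming share records arrive as (C, t, I, value) tuples with byte-string values and that t is numbered from 1; the name `recover_keys` and the sample data are illustrative.

```python
from collections import defaultdict
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def recover_keys(share_records):
    """Group share records by t, XOR each group, and return (T, N)."""
    by_t = defaultdict(list)
    for _C, t, _I, value in share_records:
        by_t[t].append(value)
    T = {t: reduce(xor_bytes, values) for t, values in by_t.items()}
    return T, max(T)  # N = tmax, the largest key index received


received = [
    (101, 1, "client-1", b"\x01\x02"),
    (202, 2, "client-1", b"\x0f"),
    (303, 1, "client-1", b"\x10\x20"),
    (404, 2, "client-1", b"\xf0"),
    (505, 1, "client-1", b"\xff\x00"),
]
T, N = recover_keys(received)
print(T)  # {1: b'\xee"', 2: b'\xff'}
print(N)  # 2
```

The result is only correct once all k shares of each key have arrived, which is what deterministic termination of the scan is meant to guarantee.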


Encryption Key Revocation

• Revocation consists of changing the encryption key for every data record of a client

• May happen when
  – Client’s T fell into the wrong hands
  – Client’s right to use the data abruptly expired
    • Termination of employment
    • …


Encryption Scalability

• More encryption keys for a larger file
  – To offset assurance deterioration
    • Here: the number of keys that remain undisclosed if a key gets disclosed
• It suffices to append new keys to T and extend the hash function
• Existing encryption is not affected


Encoding Scalability

• More shares per key for a larger file
  – To offset assurance deterioration

• To set k = k + 1, it suffices, for every key index t, to:
  – Create a new noise share Nk
  – Read any one share record Sj of the t-th key
  – Set $N_j := N_j \oplus N_k$
  – Store the updated Sj
  – Create and store the new share record Sk = (Ck , t, I, Nk)
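
The share-adding steps above can be sketched on the share values alone. This is a minimal sketch assuming the share values of one key are held as a list of byte strings; the name `add_share` and the choice of updating the first share are illustrative, since any one existing share would do.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def add_share(shares: list[bytes]) -> list[bytes]:
    """Return k + 1 shares with the same XOR as the k input shares."""
    new_noise = secrets.token_bytes(len(shares[0]))   # the new noise Nk
    updated = list(shares)
    updated[0] = xor_bytes(updated[0], new_noise)     # Nj := Nj xor Nk for one chosen j
    updated.append(new_noise)                         # value of the new share record Sk
    return updated


old = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
new = add_share(old)
assert len(new) == len(old) + 1
assert reduce(xor_bytes, new) == reduce(xor_bytes, old)  # the encoded key is unchanged
```

The new noise cancels out in the overall XOR, so the encryption key itself never has to be reconstructed during the re-encoding.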


Encoding Scalability

• The process may be carried out by scanning successive buckets 0, 1…
  – Requesting from new buckets only share records whose t was not dealt with yet
  – Until the entire T is re-encoded


Performance: Messaging Cost

• Same as for LH* for data record manipulation
• Plus kN+ messages to back up T
• Basically, about 4M messages for the key recovery scan over M servers
  – In about log M rounds
• Can be (much) fewer messages with probabilistic termination or client address caching at the servers


Processing Cost

• Processing overhead concerns
  – Mainly, the (symmetric) encryption/decryption
    • Depends on the encryption scheme used
  – From time to time, especially initially
    • Key generation & encoding
  – Sporadically
    • Key recovery
    • Key revocation
• This analysis is an open issue at present


Storage Overhead

• Should be O (kN) on the servers
• Encryption keys, and thus share records, should usually be small compared to data records
• Same for the other LH*RE-specific fields within each data record
• Storage overhead on the servers should usually be negligible
• Client storage for T should be O (N)
  – Easily OK for even millions of encryption keys in a typical RAM


Encryption Strength

• Attack 1: Any single-server intrusion
  – By an intruder or the administrator
    • Accidentally or willingly
• Impossible to decode any encryption key
• One has to break the encryption keys of the data records of interest
  – Practically impossible for good encryption
  – Difficulty compounds when the client uses multiple encryption keys
• LH*RE data on a server are safe in this sense


Encryption Strength

• Attack 2: Multiple-server intrusion to decrypt a specific data record
• To decode E of any data record of interest, the intruder has to break into at least k servers
  – Those holding the shares of E
• Otherwise, brute force is the only way
• If M > k, breaking into k or more servers does not guarantee success with a specific record
  – See the example later on in this talk


Encryption Strength

• The shares searched for may be anywhere in the file
• No share has any information about the location of the other shares
• The intruder may need to break into every server
• If M = k, breaking into k servers suffices for success
• Hence it is safer to start the file with K > k


Encryption Strength

• Attack 3: Intrusion into any k or more servers to decrypt any data records
• The decoding of some encryption keys, hence the disclosure of some data, is possible
  – But not certain
• The likelihood and consequences depend on the file state and parameters
• Assurance analysis may be the tool to find out more


Encryption Assurance

• Assuming it is impossible to break the encryption keys by brute force,
• What if an intruder breaks into l servers?
• Assurance analysis measures
  – Confidence that no disclosure happens
  – Extent of disclosure otherwise


Encryption Assurance

• Basic measures
  – Probability a that no record gets disclosed
  – Expected fraction d of the file that gets disclosed
  – Expected fraction that remains undisclosed
  – Number of records that are disclosed
    • or remain undisclosed


Encryption Assurance

• If l < k, then a = 1
• If l ≥ k, then a depends on the number of servers M, on N and on the bucket size b at each server
  – Basically, the larger N or M and the smaller b, the higher the assurance
• In-depth analysis remains to be done


Example

• k = 4, 1 encryption key, 16 servers
• Assurance a against intrusion into k servers?
• Usual randomness
  – Servers are equally likely to be intruded

    $a = 1 - \frac{4}{16} \cdot \frac{3}{15} \cdot \frac{2}{14} \cdot \frac{1}{13} = 1 - \frac{1}{1820} \approx 0.9995$

• Expected disclosure if the key is decoded: d = ¼ of the file (the records on the 4 intruded servers)
• Remains undisclosed: 1 – d = ¾ of the file


Example

• Use of 2 encryption keys

    $a(1) \approx 1 - 2/1820 \approx 0.999$

    $a(2) = 1 - (2/1820)^2 > 0.999999$

    $a = 1 - 2/1820 - (2/1820)^2 \approx 0.999$

• Expected disclosure d ≈ 1/8 of the file
• Now what about using 10 keys?

    $a \approx 0.99$, $d \approx 1/40$

• And what about 100 keys?
• And what if the file becomes bigger?
  – e.g. M 128
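
The assurance figures in the two examples can be recomputed with a few lines. This is a sketch under stated assumptions: the k shares of each key sit on k distinct servers chosen uniformly at random, the intruder breaks into l servers chosen uniformly at random, and different keys are placed independently. The function names are illustrative, and the exact product form used here is a modelling choice rather than the slides' own formula.

```python
from math import comb


def p_key_disclosed(M: int, k: int, l: int) -> float:
    """Probability that all k share servers of one key are among l intruded servers."""
    return comb(l, k) / comb(M, k)


def assurance(M: int, k: int, l: int, n_keys: int) -> float:
    """Probability that no key is disclosed, assuming independent share placement."""
    return (1 - p_key_disclosed(M, k, l)) ** n_keys


print(comb(16, 4))                        # 1820
print(round(assurance(16, 4, 4, 1), 4))   # 0.9995
print(round(assurance(16, 4, 4, 2), 3))   # 0.999
print(round(assurance(16, 4, 4, 10), 2))  # 0.99
print(assurance(16, 4, 3, 10))            # 1.0, since l < k
```

The same functions can be evaluated for 100 keys or for a larger file by changing `n_keys` and `M`.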


Conclusion

• New data structure
• Lets the file be scalable and distributed
• Lets data records be client-side encrypted
• Lets encryption keys be recoverable and revocable
• Negligible messaging, processing and storage overhead
• Future work should focus on experiments & assurance analysis


Future work

• Experiments
• Assurance analysis
• Applications
• Variants
  – Server caches client addresses
  – Probabilistic termination for key recovery
  – …
• Larger threat model
  – Malicious intruder
    • Destroying or corrupting the shares


Thank you for

Your Attention