Secure and efficient data sharing on encrypted cloud relational databases

Page 1: Secure and efficient data sharing on encrypted cloud relational databases

Secure and efficient data sharing on encrypted cloud relational databases

Page 2: Secure and efficient data sharing on encrypted cloud relational databases

Introduction

• Relational cloud databases are widely adopted

User <-> Service provider (SP)

Store data on cloud:

Item_ID  Cost  Wholesale_price
1076     10    20
3308     15    50

Get back a data item:

Item_ID  Cost  Wholesale_price
1076     10    20

Page 3: Secure and efficient data sharing on encrypted cloud relational databases

Encryption for security

• Due to security concerns, data is encrypted before being stored on the cloud

User <-> Service provider (SP)

Store data on cloud (encrypted at SP):

Item_ID  Cost    Wholesale_price
Egask5   A42fgs  2S46Dg
asD3j64  139ASs  Dd3fj2

Get back a data item (decrypted by the user):

Item_ID  Cost  Wholesale_price
1076     10    20

The key is kept by the user, not by SP!

Page 4: Secure and efficient data sharing on encrypted cloud relational databases

The problem of data sharing

Parties: Alice, SP, Bob

Alice's data (stored at SP):

Item_ID  Cost    Wholesale_price
Egask5   A42fgs  2S46Dg
asD3j64  139ASs  Dd3fj2

Alice: "Bob is my business partner. I want to let him know the wholesale price of some of my selected products."

Requirements:
1. Shared data should be revealed only to Bob (not to SP).
2. Unshared data should remain unknown to both Bob and SP.
3. The cost to Alice should be low (while the cost to Bob and SP should be affordable).

Page 5: Secure and efficient data sharing on encrypted cloud relational databases

Application of data sharing

1. Alice is a company user of SP. Alice hires Bob, a data analytics expert, to perform analysis. Alice has to share some of her data with Bob.

2. Alice and Bob are two business partners. They share some data to gain advantages, e.g., more market information.

Page 6: Secure and efficient data sharing on encrypted cloud relational databases

Naïve solution of data sharing (e.g., CryptDB, TrustedDB)

• Encryption: Use an existing general encryption function, e.g., RSA with padding, to encrypt all data: c = E(p, k)
  – Ciphertext: c
  – Plaintext: p
  – (Public) Key: k
  – Encryption function: E

Item_ID Cost Wholesale_price

Egask5 A42fgs 2S46Dg

asD3j64 139ASs Dd3fj2

Page 7: Secure and efficient data sharing on encrypted cloud relational databases

Naïve solution of data sharing - cont

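Parties: Alice, SP, Bob

Alice's data at SP:

Item_ID  Cost    Wholesale_price
Egask5   A42fgs  2S46Dg
asD3j64  139ASs  Dd3fj2

1. Alice decides to share the Wholesale_price of item "Egask5" with Bob.
2. Alice sends Bob a copy of the key.
3. On request, SP sends Bob the shared item. Access control is enforced to prevent Bob from seeing unauthorized items.

Shared item received by Bob:

Wholesale_price
2S46Dg

This solution is not secure! Bob holds Alice's key, so any other ciphertext he obtains (e.g., by colluding with SP) decrypts to the plain data:

Item_ID  Cost  Wholesale_price
1076     10    20
3308     15    50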

Page 8: Secure and efficient data sharing on encrypted cloud relational databases

Another naïve solution

Parties: Alice, SP, Bob

Alice's data at SP:

Item_ID  Cost    Wholesale_price
Egask5   A42fgs  2S46Dg
asD3j64  139ASs  Dd3fj2

1. Alice downloads the items to be shared (Wholesale_price = 2S46Dg) and decrypts them (Wholesale_price = 20).
2. Alice sends Bob the plain data (Wholesale_price = 20).
3. Bob either stores the data on his own or inserts it into the cloud as new tuples.

High processing cost to Alice

Page 9: Secure and efficient data sharing on encrypted cloud relational databases

Problem definition

• Data: relational data
  – A table R contains
    • T: a set of tuples
    • C: a set of columns (attributes)
  – Each tuple t has exactly m values, where m is the number of columns
    • (NULL is also a value)
• Format of data for sharing:
  – CS: a subset of C
  – TS: a subset of T
  – Just like the result of a query

Example table R:

A   B   C
a1  b1  c1
a2  b2  c2

T = {t1, t2}
C = {"A", "B", "C"}
t1 = {a1, b1, c1}
t2 = {a2, b2, c2}

Data to share:

B   C
b2  c2

TS = {t2}
CS = {"B", "C"}
Shared part of t2 = {b2, c2}
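The sharing format above can be illustrated with a small sketch. It only shows how TS and CS select cells, with no encryption involved; the function name `shared_view` and the dictionary layout are assumptions made for this example.

```python
# Table R from the slide: column list C and tuples T (plain values, no encryption).
C = ["A", "B", "C"]
T = {"t1": ["a1", "b1", "c1"], "t2": ["a2", "b2", "c2"]}

def shared_view(T, C, TS, CS):
    """Return the cells selected by the tuple subset TS and column subset CS."""
    idx = [C.index(col) for col in CS]          # positions of the shared columns
    return {t: [T[t][i] for i in idx] for t in TS}

# Share columns B and C of tuple t2, as in the slide's example.
view = shared_view(T, C, TS=["t2"], CS=["B", "C"])
```

The result has the same shape as a query answer: one row per shared tuple, restricted to the shared columns.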

Page 10: Secure and efficient data sharing on encrypted cloud relational databases

Models

• 3 parties: Alice, Bob, SP
  – Relationship: refer to the introduction
• Attack model
  – Bob and SP are semi-honest and colluding
    • Bob and SP function as normal
    • An attacker observes everything seen by Bob and SP
    • Requirement:
      – The attacker cannot learn any plain data of Alice except for the data shared with Bob

Page 11: Secure and efficient data sharing on encrypted cloud relational databases

Solution framework

• The solution includes:
  – An encryption method (KeyGen, Enc, Dec)
  – A sharing method (Share, SDec)

Parties: Alice, Bob, SP

Alice:
1. k = KeyGen()
2. c = Enc(p, k); p = Dec(c, k)
3. H = Share(CS, TS, k)

Bob:
4. p = SDec(c, H)

Encrypted table at SP:

A    B    C
ca1  cb1  cc1
ca2  cb2  cc2

Page 12: Secure and efficient data sharing on encrypted cloud relational databases

Our solution: Relational-based encryption (RBE)

• Problem of using general encryption, e.g., RSA
  – The same key is required to decrypt all encrypted values
  – In order to let Bob decrypt one particular data item, the decryption key must be sent to Bob
  – An overpowered Bob can now decrypt any data encrypted by Alice

Page 13: Secure and efficient data sharing on encrypted cloud relational databases

Relational-based encryption (RBE)

• Idea: What if each individual data item is encrypted with its own unique value key?

Plain values + Value key table -> Encrypted values

Plain values:

A   B   C
a1  b1  c1
a2  b2  c2

Value key table:

A    B    C
ka1  kb1  kc1
ka2  kb2  kc2

Encrypted values:

A    B    C
ca1  cb1  cc1
ca2  cb2  cc2

To share b1: give kb1 to Bob. Bob can only decrypt cb1; other values are safe since Bob does not have the other value keys.

However, Alice has to remember all value keys, which incurs a high storage cost.

Page 14: Secure and efficient data sharing on encrypted cloud relational databases

Key abstraction

• Each cell can be located by column identifier and row identifier

• Each tuple has a tuple secret rid; each column has a column secret cid

• Use a one-way hash function
  – k = h(rid, cid)

• Storage cost at Alice: O(mn) => O(m+n)
  – m: number of columns
  – n: number of tuples

Value key table:

    A    B    C
t1  ka1  kb1  kc1
t2  ka2  kb2  kc2

Lookup examples: (t1, A) -> ka1; (t2, C) -> kc2
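A minimal sketch of the key abstraction, assuming SHA-256 as the one-way hash h and 16-byte random secrets (the slides fix neither choice). The helper name `value_key` and the length-prefixed encoding of (rid, cid) are illustrative assumptions.

```python
import hashlib
import secrets

def value_key(rid: bytes, cid: bytes) -> bytes:
    """k = h(rid, cid), with SHA-256 standing in for the one-way hash h."""
    # Length-prefix rid so that different (rid, cid) pairs cannot collide by concatenation.
    return hashlib.sha256(len(rid).to_bytes(4, "big") + rid + cid).digest()

# Alice stores only m column secrets and n tuple secrets: O(m + n) values.
cids = {c: secrets.token_bytes(16) for c in ("A", "B", "C")}   # m = 3
rids = {t: secrets.token_bytes(16) for t in ("t1", "t2")}      # n = 2

# All m*n value keys can be re-derived on demand instead of being stored.
keys = {(t, c): value_key(rids[t], cids[c]) for t in rids for c in cids}
```

Here Alice keeps 5 secrets yet can regenerate all 6 value keys, which is the O(mn) => O(m+n) reduction stated above.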

Page 15: Secure and efficient data sharing on encrypted cloud relational databases

Towards O(1) storage cost at Alice

• Use an existing encryption function
  – E: encryption function
  – D: decryption function

• Tuple secrets and column secrets are encrypted and stored at SP

         A        B        C
         E(cidA)  E(cidB)  E(cidC)
E(rid1)  ca1      cb1      cc1
E(rid2)  ca2      cb2      cc2

Page 16: Secure and efficient data sharing on encrypted cloud relational databases

Encryption/decryption process

• Alice first gets back E(cid) and E(rid) of the value to be encrypted/decrypted
  – Decrypt them to get cid and rid
  – Derive the value key of the cell and encrypt/decrypt the cell
• Although the encryption/decryption cost may seem higher now, RBE is more efficient for relational data (more details after the math details)

Page 17: Secure and efficient data sharing on encrypted cloud relational databases

Details in math

• KeyGen
  – Just the same key generation as the underlying encryption scheme
• Enc
  – Tuple ti = <p1, p2, …, pm>
  – Obtain cid and rid
  – ci = pi XOR h(rid XOR cid)
    • h: one-way hash

Page 18: Secure and efficient data sharing on encrypted cloud relational databases

Details in math

• Dec
  – Encrypted tuple t’i = <c1, c2, …, cm>
  – Obtain cid and rid
  – pi = ci XOR h(rid XOR cid)
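The Enc and Dec steps can be sketched as follows. This is an illustration under assumptions: SHAKE-256 stands in for the one-way hash h so the pad can match the plaintext length, rid and cid are equal-length byte strings, and the demo secrets are derived from fixed labels only to keep the example reproducible (real secrets would be random).

```python
import hashlib

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

def pad(rid: bytes, cid: bytes, length: int) -> bytes:
    """h(rid XOR cid), with SHAKE-256 as an assumed stand-in for h."""
    return hashlib.shake_256(xor_bytes(rid, cid)).digest(length)

def enc(p: bytes, rid: bytes, cid: bytes) -> bytes:
    """ci = pi XOR h(rid XOR cid)"""
    return xor_bytes(p, pad(rid, cid, len(p)))

def dec(c: bytes, rid: bytes, cid: bytes) -> bytes:
    """pi = ci XOR h(rid XOR cid)"""
    return xor_bytes(c, pad(rid, cid, len(c)))

# Demo secrets (hypothetical; derived from labels for reproducibility only).
rid1 = hashlib.sha256(b"demo tuple secret t1").digest()[:16]
cid_price = hashlib.sha256(b"demo column secret Wholesale_price").digest()[:16]

plain = b"Wholesale_price=20"
cipher = enc(plain, rid1, cid_price)
recovered = dec(cipher, rid1, cid_price)
```

Encryption and decryption are the same XOR with the same pad, so a cell can only be opened by someone who can recompute h(rid XOR cid) for that cell.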

Page 19: Secure and efficient data sharing on encrypted cloud relational databases

Correctness of encryption

• pi = ci XOR h(rid XOR cid) --- (1)

• ci = pi XOR h(rid XOR cid) --- (2)

• Substitute (2) into the RHS of (1):
  ci XOR h(rid XOR cid)
  = pi XOR h(rid XOR cid) XOR h(rid XOR cid)
  = pi

Page 20: Secure and efficient data sharing on encrypted cloud relational databases

Security

• Encrypted data is stored at the cloud. Is it safe?

• ci = pi XOR h(rid XOR cid)
  – One-time pad: p XOR k
    • Note: the same key cannot be used to encrypt two or more data items!
    • A one-time pad is perfectly secure: it is not breakable unless the key is leaked
  – One-way hash function: not reversible
    • Knowing the hash value does not reveal the input to the hash (rid XOR cid), an important feature for guarding against CPA-style attacks
  – Overall: as secure as the hash function

• There are many highly secure one-way hash functions, including the encryption functions of different encryption schemes

Page 21: Secure and efficient data sharing on encrypted cloud relational databases

Security - cont

• On the other hand, cid and rid can be derived from CN (column name) and E(rid, k)

• Since they are kept as encrypted values of the underlying encryption function (E, D), their security is the same as that of the underlying scheme

Page 22: Secure and efficient data sharing on encrypted cloud relational databases

Efficiency

• Decrypting a query result with n tuples and m columns
  – Traditional method, e.g., RSA
    • mn decryptions
  – In our scheme
    • m+n decryptions, mn hashes, 2mn XOR operations

• Cost of decryption >> hash >> XOR
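To make the operation counts above concrete, here is a small calculation. The function names and the example sizes (m = 10, n = 1000) are chosen for illustration and do not come from the slides.

```python
def traditional_cost(m: int, n: int) -> dict:
    """Every cell needs one decryption of the underlying scheme."""
    return {"decryptions": m * n}

def rbe_cost(m: int, n: int) -> dict:
    """m column secrets + n tuple secrets are decrypted; each cell costs 1 hash and 2 XORs."""
    return {"decryptions": m + n, "hashes": m * n, "xors": 2 * m * n}

# Example: a query result with 10 columns and 1000 tuples.
trad = traditional_cost(10, 1000)
rbe = rbe_cost(10, 1000)
```

For this example the expensive decryptions drop from 10000 to 1010, and the remaining per-cell work consists only of hashes and XORs, which the slide ranks as much cheaper.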

Page 23: Secure and efficient data sharing on encrypted cloud relational databases

Data sharing

• Input: TS, CS

– Alice sends the rid of each tuple in TS to Bob

– Alice sends the cid of each column in CS to Bob

• H = <HT, HC> = Share(TS, CS, k)
  – HT = {rid | rid of t and t in TS}
  – HC = {cid | cid of c and c in CS}

• Decryption: SDec(c, H)
  – Find the corresponding rid and cid of c
    • pi = ci XOR h(rid XOR cid)
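A sketch of Share and SDec for the basic scheme, under the same assumptions as before: SHAKE-256 stands in for h, and the secrets are derived from labels purely so the example is reproducible. The names `share`, `sdec`, and `encrypted` are illustrative, and the key argument k is omitted because this sketch keeps the secrets in memory instead of decrypting them from SP.

```python
import hashlib

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cell_pad(rid: bytes, cid: bytes, length: int) -> bytes:
    """h(rid XOR cid), with SHAKE-256 as an assumed stand-in for h."""
    return hashlib.shake_256(xor_bytes(rid, cid)).digest(length)

def secret(label: str) -> bytes:
    # Demo only: real tuple/column secrets would be random, not label-derived.
    return hashlib.sha256(label.encode()).digest()[:16]

rids = {"t1": secret("rid t1"), "t2": secret("rid t2")}
cids = {c: secret("cid " + c) for c in ("A", "B", "C")}
plain = {("t1", "A"): b"a1", ("t1", "B"): b"b1", ("t1", "C"): b"c1",
         ("t2", "A"): b"a2", ("t2", "B"): b"b2", ("t2", "C"): b"c2"}

# Encrypted table as stored at SP.
encrypted = {(t, c): xor_bytes(v, cell_pad(rids[t], cids[c], len(v)))
             for (t, c), v in plain.items()}

def share(TS, CS):
    """Alice: H = <HT, HC>, the rids of shared tuples and cids of shared columns."""
    return ({t: rids[t] for t in TS}, {c: cids[c] for c in CS})

def sdec(cell, H):
    """Bob: decrypt one cell, which needs both its rid and its cid from the hint."""
    HT, HC = H
    t, c = cell
    if t not in HT or c not in HC:
        raise KeyError("cell was not shared")
    data = encrypted[cell]
    return xor_bytes(data, cell_pad(HT[t], HC[c], len(data)))

H = share(TS=["t2"], CS=["B", "C"])
```

With this hint Bob can open (t2, B) and (t2, C); for any other cell he lacks either the rid or the cid, so he cannot recompute the pad.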

Page 24: Secure and efficient data sharing on encrypted cloud relational databases

Security

• Revealing some values of cid and rid

  – Cells that are not related to the shared rid and cid values: of course secure
  – Cells for which the cid is known but not the rid: secure?

Page 25: Secure and efficient data sharing on encrypted cloud relational databases

Secure

• ci = pi XOR h(rid XOR cid)
  – The hash input is unknown because rid or cid is unknown
  – The hash value is therefore unknown

• Note: the above already assumes that Bob and SP are colluding
  – Otherwise, Bob has no access to the encrypted values of other data

Page 26: Secure and efficient data sharing on encrypted cloud relational databases

Problem of multiple sharing

• User collusion
• A user retrieves different shared versions at different times

[Figure: 1st sharing and 2nd sharing. Additional information can be observed by combining both sharing instances.]

Page 27: Secure and efficient data sharing on encrypted cloud relational databases

Advanced solution: Elliptic curve cryptography (ECC)

• Introduction
  – ECC: operations are defined on a finite set of 2D points

[Figure: example curves y^2 mod p = x^3 - x mod p and y^2 mod p = x^3 - x + 1 mod p]

p: system parameter

Page 28: Secure and efficient data sharing on encrypted cloud relational databases

Operations on ECC

• “Addition”

• Scalar multiplication
  – kP = P + P + … + P (k times)

[Figure: point doubling on the curve, showing P, -2P, and 2P]

Page 29: Secure and efficient data sharing on encrypted cloud relational databases

Operations on ECC

• Order of curve
  – Number of points on the curve
  – Let n be the order of the curve
    • (n+1)P = P for all P

• Curve with prime order, i.e., n is prime
  – There is an integer k s.t. kP = Q for any points P, Q (P != 0)

• Elliptic curve discrete logarithm problem (ECDLP)
  – Given P, Q, it is hard to find k s.t. kP = Q

• Pairing function e:
  – e(aP, bQ) = e(P, Q)^(ab)
  – Security: Bilinear Diffie-Hellman (BDH) assumption
    • Given P, aP, bP, cP, it is hard to find e(P, P)^(abc)
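The addition and scalar-multiplication operations can be tried out on a toy curve. The sketch below uses y^2 = x^3 - x + 1 over a small prime field; the modulus p = 97 is an arbitrary choice for illustration and is far too small for real security. The pairing function e is not implemented here. Python 3.8+ is assumed for modular inverses via `pow(x, -1, p)`.

```python
P_MOD = 97          # toy system parameter p (illustrative only)
A, B = -1, 1        # curve y^2 = x^3 - x + 1 (mod p)
INF = None          # point at infinity, the identity of "addition"

def on_curve(pt) -> bool:
    if pt is INF:
        return True
    x, y = pt
    return (y * y - (x ** 3 + A * x + B)) % P_MOD == 0

def add(p1, p2):
    """Point "addition" using the chord-and-tangent rule."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF                      # P + (-P) = infinity
    if p1 == p2:                        # tangent slope for doubling
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:                               # chord slope for distinct points
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def mul(k: int, pt):
    """Scalar multiplication kP = P + P + ... + P by double-and-add."""
    result, addend = INF, pt
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result

# Enumerate all affine points; the order n also counts the point at infinity.
points = [(x, y) for x in range(P_MOD) for y in range(P_MOD) if on_curve((x, y))]
n = len(points) + 1
G = (0, 1)          # 1^2 = 0^3 - 0 + 1, so (0, 1) lies on the curve
```

With n computed this way, `mul(n + 1, P) == P` holds for every point, matching the property (n+1)P = P stated above.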

Page 30: Secure and efficient data sharing on encrypted cloud relational databases

Improvement over our sharing scheme

• Recall:
  – Encryption: ci = pi XOR h(rid, cid)
  – Decryption: pi = ci XOR h(rid, cid)
  – Share: return all concerned rid and cid

• Define h(rid, cid) = e(rid P, cid Q)
  – P, Q are private (even if they are public, it is fine)

Page 31: Secure and efficient data sharing on encrypted cloud relational databases

Sharing

• Protocol Share
  – Alice generates a random r
  – Return
    • {(r^-1 * rid) P}
    • {(r * cid) Q}

Page 32: Secure and efficient data sharing on encrypted cloud relational databases

Bob’s decryption

• Protocol SDec
  – Bob has
    • X = (r^-1 * rid) P
    • Y = (r * cid) Q
  – Computing g(X, Y) = h2(e(X, Y))
    • = h2(e((r^-1 * rid) P, (r * cid) Q))
    • = h2(e(rid P, cid Q))

Recall: h(rid, cid) = h2(e(rid P, cid Q))
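The step from the second line to the third in Bob's computation follows from the bilinearity of e. Written out, assuming r^-1 denotes the inverse of r modulo the (prime) order of the curve:

```latex
\begin{aligned}
e(X, Y) &= e\big((r^{-1}\cdot rid)\,P,\ (r\cdot cid)\,Q\big) \\
        &= e(P, Q)^{(r^{-1}\cdot rid)(r\cdot cid)} \\
        &= e(P, Q)^{rid\cdot cid} \\
        &= e(rid\,P,\ cid\,Q)
\end{aligned}
```

The random factor r cancels only when a row hint and a column hint from the same sharing instance are paired, which is why Bob recovers h(rid, cid) without ever seeing rid or cid themselves.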

Page 33: Secure and efficient data sharing on encrypted cloud relational databases

Security in multiple sharing

• Focus on columns; the case for rows is similar

1st sharing: r1 cidA Q, r1 cidB Q, r1 cidC Q
2nd sharing: r2 cidB Q, r2 cidC Q

The values of rid and cid are contained in different sharing instances. Is it a concern?

Page 34: Secure and efficient data sharing on encrypted cloud relational databases

Question: is it secure?

• If we can find e(rid2 P, cidA P), we can solve the BDH problem (let Q = P for now)
  – Given P, aP, bP, cP, find e(P, P)^(abc)

• In our case
  – a = cidA
  – b = r2^-1 * rid2
  – c = r2
  – Generate random unrelated parameters rid1, cidB, r1

Shared values seen by the attacker:
  – Columns, 1st sharing: r1 cidA P, r1 cidB P
  – Columns, 2nd sharing: r2 cidB P
  – Rows: r1^-1 * rid1 P, r2^-1 * rid2 P

The same values written with the BDH inputs A = aP, B = bP, C = cP:
  – r1 A, r1 cidB P, r1^-1 * rid1 P, B, cidB C

Any combination of values of a, b, c can be expressed in this way

Page 35: Secure and efficient data sharing on encrypted cloud relational databases

Security in multiple rows, columns?

[Figure: the construction from the previous slide extended with further rows (r1^-1 * ridi P) and a further column (r2 cidC P, written as cidC C)]

Our security proof is for the general case

Page 36: Secure and efficient data sharing on encrypted cloud relational databases

Selecting tuples for sharing

• A fundamental problem is how the user defines which data to share with a particular party

• Select tuples by the user's free choice
  – Requires at least linear cost (in the number of tuples)

• Another option
  – Define by query

Page 37: Secure and efficient data sharing on encrypted cloud relational databases

Pre-computation for sharing by query

Parties: Alice, SP (holding the shared DB), Bob

1. Alice prepares index-like pre-computed information and gives it to SP.
2. Alice issues a query Q to define the data to be shared with Bob.
3. R is returned to Alice; R is related to the query answer and the index.
4. A hint H is generated based on R and given to Bob.
5. Bob can observe the shared data with the hint and the index at SP.

Page 38: Secure and efficient data sharing on encrypted cloud relational databases

Solution framework

• The solution includes:
  – An encryption method (KeyGen, Enc, Dec, BuildTree)
  – A sharing method (SQuery, Share, SDec)

Parties: Alice, Bob, SP

1. k = KeyGen()
2. c = Enc(p, k); p = Dec(c, k)
3. Δ = BuildTree()
4. Φ = SQuery(q)
5. H = Share(CS, Φ, k)
6. p = SDec(c, Δ, H)

Encrypted table at SP:

A    B    C
ca1  cb1  cc1
ca2  cb2  cc2

Page 39: Secure and efficient data sharing on encrypted cloud relational databases

Extending basic scheme

• Encrypted tuple secrets in a tree

Root node:   Ei(k14, k18)   Ei(k58, k18)   Es(k18)
Inner node:  Ei(k12, k14)   Ei(k34, k14)   Es(k14)
Leaf node:   Ei(rid1, k12)  Ei(rid2, k12)  Es(k12)
Leaf level:  t1 t2 t3 t4 t5 t6 t7 t8

Keys for Es are kept at Alice only

Page 40: Secure and efficient data sharing on encrypted cloud relational databases

Computing the answer of a query

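• SQuery(q)

Root node:   Ei(k14, k18)   Ei(k58, k18)   Es(k18)
Inner node:  Ei(k12, k14)   Ei(k34, k14)   Es(k14)
Leaf node:   Ei(rid1, k12)  Ei(rid2, k12)  Es(k12)
Leaf level:  t1 t2 t3 t4 t5 t6 t7 t8

[Figure labels: "Answers" marks the tuples matching q; "Returned to Alice" marks the tree entries sent back to Alice]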

Page 41: Secure and efficient data sharing on encrypted cloud relational databases

Share (CS, Φ, k)

• Φ = {Es(k14)}

• H = <HT, HC> = Share(CS, Φ, k)
  – HT = {k14}

– HC = {cid | cid of c and c in CS}

Page 42: Secure and efficient data sharing on encrypted cloud relational databases

Computing the answer of a query

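• Bob’s knowledge: k14

Root node:   Ei(k14, k18)   Ei(k58, k18)   Es(k18)
Inner node:  Ei(k12, k14)   Ei(k34, k14)   Es(k14)
Leaf node:   Ei(rid1, k12)  Ei(rid2, k12)  Es(k12)
Leaf level:  t1 t2 t3 t4 t5 t6 t7 t8

With k14, the inner node opens to: k12, k34, Es(k14)
With k12, the leaf node opens to: rid1, rid2, Es(k12)
Result: tuple secrets of t1 to t4
The rest of the tree remains unknown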

Page 43: Secure and efficient data sharing on encrypted cloud relational databases

Advantage of using index

• Without an index, the cost to Alice must be at least linear in the number of tuples in the sharing domain
  – Now, it is linear in the number of nodes returned in the tree, which is usually much smaller

Page 44: Secure and efficient data sharing on encrypted cloud relational databases

Indexing scheme for multi-sharing scenario

• Use a different function to generate the value key

[Figure: hash tree; each arrow points from a node to its parent]

Root:        h18
Level 3:     h14  h58
Level 2:     h12  h34  h56  h78
Leaf level:  t1/h1  t2/h2  t3/h3  t4/h4  t5/h5  t6/h6  t7/h7  t8/h8

For t1: ci = pi XOR h1(h12(h14(h18(cid))))

Page 45: Secure and efficient data sharing on encrypted cloud relational databases

Computing the answer of a query

• SQuery(q)
• Φ = {h14 ο h18}

[Figure: hash tree; each arrow points from a node to its parent; "Answers" marks the tuples matching q]

Root:        h18
Level 3:     h14  h58
Level 2:     h12  h34  h56  h78
Leaf level:  t1/h1  t2/h2  t3/h3  t4/h4  t5/h5  t6/h6  t7/h7  t8/h8

Page 46: Secure and efficient data sharing on encrypted cloud relational databases

Share (CS, Φ, k)

• Φ = {h14 ο h18}

• H = Share(CS, Φ, k)
  – H = {h14(h18(cid)) | cid of c and c in CS}

Page 47: Secure and efficient data sharing on encrypted cloud relational databases

Computing the answer of a query

• Bob’s knowledge
  – x = h14(h18(cid))

[Figure: hash tree; each arrow points from a node to its parent]

Root:        h18
Level 3:     h14  h58
Level 2:     h12  h34  h56  h78
Leaf level:  t1/h1  t2/h2  t3/h3  t4/h4  t5/h5  t6/h6  t7/h7  t8/h8

Value key of t1 = h1(h12(x))

• One-way hash: Bob can’t go up the tree, so he can’t see other tuples
• x is specific to this column, not to any other column
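The hash-chain derivation can be sketched as below. The slides do not say how the per-node functions h1, h12, h14, ... are built; this sketch assumes each one is SHA-256 with the node label mixed in, i.e. h_label(x) = SHA-256(label | x), and all function names are illustrative.

```python
import hashlib

def node_hash(label: str, x: bytes) -> bytes:
    """Assumed per-node one-way function: SHA-256 over the node label and the input."""
    return hashlib.sha256(label.encode() + b"|" + x).digest()

def path_labels(i: int) -> list:
    """Labels from the root down to tuple t_i (1 <= i <= 8) in the 8-leaf tree."""
    half = "h14" if i <= 4 else "h58"
    lo = i if i % 2 else i - 1                 # first tuple of the pair containing t_i
    return ["h18", half, f"h{lo}{lo + 1}", f"h{i}"]

def apply_chain(labels, x: bytes) -> bytes:
    for label in labels:
        x = node_hash(label, x)
    return x

def value_key(i: int, cid: bytes) -> bytes:
    """Alice: e.g. for t1, h1(h12(h14(h18(cid))))."""
    return apply_chain(path_labels(i), cid)

def share_h14(cid: bytes) -> bytes:
    """Alice's hint for the subtree under h14: x = h14(h18(cid))."""
    return apply_chain(["h18", "h14"], cid)

def bob_value_key(i: int, x: bytes) -> bytes:
    """Bob continues the chain below h14, e.g. for t1: h1(h12(x))."""
    return apply_chain(path_labels(i)[2:], x)

cid_b = b"demo column secret B"   # hypothetical column secret
x = share_h14(cid_b)
```

From x Bob can walk down to t1..t4 for this column. Reaching t5..t8 would require h18(cid), which he cannot obtain by inverting h14, and a different column uses a different cid and therefore a different x.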

Page 48: Secure and efficient data sharing on encrypted cloud relational databases

Developed schemes

Scheme  Secure against user-SP collusion?  Secure in multiple sharing?  Cost
Basic   Yes                                Partial                      O(m+n) - Very low
Multi   Yes                                Yes                          O(m+n) - Low

With pre-computation

Scheme  Alice’s cost  Secure in multiple sharing?
Basic   O(m + u)      Partial
Multi   O(mu)         Yes

u: number of nodes
m: number of columns
n: number of tuples

Page 49: Secure and efficient data sharing on encrypted cloud relational databases

Related work

• Privacy-preserving data integration, e.g., DMKD 04
  – A user issues a query that is to be answered by an untrusted platform across multiple data sources
  – Different model

• Access control by ABE (attribute-based encryption), e.g., ASIACCS 10
  – Each data item is associated with an access structure. Each user is associated with certain access attributes. Only a user whose access attributes satisfy the access structure of the data can decrypt the data.

Page 50: Secure and efficient data sharing on encrypted cloud relational databases

Access control

• Example:
  – A file requires “IT staff” OR (“Marketing” AND “Manager”)
  – Alan is <“IT Staff”, “Junior”, “Full Time”> - OK
  – Betty is <“Part time”, “Marketing”, “Manager”> - OK
  – Cathy is <“Full time”, “Sales”, “Manager”> - No

• Features
  – Attribute revocation and ciphertext revocation: SP takes almost all of the workload
    • Attribute revocation: user permission changes, e.g., Betty becomes <“Full time”,…>
    • Ciphertext revocation: file permission changes

• Drawback in our case: requires a pre-defined set of access attributes
  – Ad hoc sharing instances?
    • Need to add a new attribute, say “ABC company”, which requires re-encryption of the entire database by the data owner
  – Side note: this method is attracting a good amount of attention in the crypto area.
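The access-structure example above can be checked with a few lines of code. This sketch only evaluates the boolean policy and performs no encryption, so it is not an ABE implementation; the nested-tuple policy format and the case-insensitive comparison (the slide writes both "IT staff" and "IT Staff") are assumptions made for the example.

```python
def norm(attr: str) -> str:
    """Compare attributes case-insensitively."""
    return attr.strip().lower()

def satisfies(policy, attrs) -> bool:
    """Evaluate a policy tree: a string is an attribute, a tuple is (op, child, ...)."""
    have = {norm(a) for a in attrs}

    def ev(node) -> bool:
        if isinstance(node, str):
            return norm(node) in have
        op, *children = node
        results = [ev(child) for child in children]
        return all(results) if op == "AND" else any(results)

    return ev(policy)

# "IT staff" OR ("Marketing" AND "Manager")
POLICY = ("OR", "IT staff", ("AND", "Marketing", "Manager"))

alan = ["IT Staff", "Junior", "Full Time"]
betty = ["Part time", "Marketing", "Manager"]
cathy = ["Full time", "Sales", "Manager"]
results = {name: satisfies(POLICY, attrs)
           for name, attrs in (("Alan", alan), ("Betty", betty), ("Cathy", cathy))}
```

The outcome matches the slide: Alan and Betty satisfy the access structure, Cathy does not. An ad hoc sharing instance would need a new attribute in the policy vocabulary, which is the drawback noted above.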

Page 51: Secure and efficient data sharing on encrypted cloud relational databases

Backup

Page 52: Secure and efficient data sharing on encrypted cloud relational databases

Root node:   Ei(k14, k18)  Ei(k58, k18)  Es(k18)
Inner node:  Ei(k12, k14)  Ei(k34, k14)  Es(k14)
Leaf node:   Ei(ϒ1, k12)   Ei(ϒ2, k12)   Es(k12)
Leaf level:  t1 t2 t3 t4 t5 t6 t7 t8

Page 53: Secure and efficient data sharing on encrypted cloud relational databases

[Figure: hash tree; each arrow points from a node to its parent]

Root:        h18
Level 3:     h14  h58
Level 2:     h12  h34  h56  h78
Leaf level:  t1/h1  t2/h2  t3/h3  t4/h4  t5/h5  t6/h6  t7/h7  t8/h8