Secure Query Processing in an Untrusted (Cloud) Environment


Agenda

• Introduction to the model
• Overview of different approaches

Model

Data owner / users <-> database service provider (SP)

• The data owner stores its data in the SP's database
• The SP provides a professional database service
  – Backup
  – Performance
  – …
• Users get data back from the SP for their own use, e.g. "Find Alice’s record"

EmpID  HourlyRate  WorkingHour
2      40          36

Introduction: security concern

• Data owner / users: trusted party, holding sensitive data
• Database service provider (SP): untrusted party

Objectives of our work:
(1) Protect sensitive data from being seen by untrusted parties (including the SP)
(2) Users still enjoy the database service from the SP

• March 2009: Google Docs allowed unintended access to some private documents
• June 2013: a Facebook bug leaked the contact info of 6 million users

Secure database system - overview

• Encrypt data before sending it to the SP

Plain data at the data owner (DO) / user:

EmpID  HourlyRate  WorkingHour
1      30          23
2      40          36

Encrypted data at the service provider (SP):

EmpID  HourlyRate  WorkingHour
79826  334164104   547322019
57309  23749300    489226453

• The user issues Q: SELECT * WHERE HourlyRate * WorkingHour > 900
• The SP receives Q’: a transformed query, with some ‘trapdoors’ to help the SP compute the answer
• The SP returns the matching encrypted row (57309, 23749300, 489226453), which the user decrypts to (2, 40, 36)

Approaches to solve the problem

• Hardware-based solutions
  – TrustedDB [SIGMOD 2011], Cipherbase [SIGMOD 2013]
• Homomorphic-encryption-based solutions
  – CryptDB [CACM 2012], MONOMI [PVLDB 2013]
• Secure Multiparty Computation (SMC) approach
  – ShareMind [PAISI 2012]
• Our solution
• Secure indexing approaches
  – Orthogonal to the above solutions; can be integrated into any of them
  – Domain partitioning [SIGMOD 2002]

Before we discuss different approaches…

• The SP is assumed to be more powerful than the users.
• Users are trusted. They can see plain data.
  – A baseline solution exists
    • Users retrieve the entire encrypted database
    • Decrypt it, then do whatever they want
  – Problems of the baseline solution
    • High communication cost and high processing cost at users
• What the different approaches are trying to do
  – Delegate the query processing job to the SP
    • Utilize the power of the SP
  – Users obtain the (encrypted) query answers only
    • Low communication cost and low processing cost at users
  – We can always fall back to the baseline method!

Hardware-based solutions

• Use of a secure co-processor
• Can store a key on it
  – No party can observe the key stored on it
• Provides an API for cryptographic actions using the stored key
• Tamper resistance
  – The device cannot be hacked through physical intrusion

Use of secure co-processor

• Secure co-processor(s) is/are installed at the SP side
• Users encrypt the data using a secret key; the encrypted data is stored in the SP's database
• The key is sent to the secure co-processor through a secure channel
  – Note: the key is known to the users and the secure co-processor only
• A user issues a query, e.g. "Find Alice’s record"
• The secure co-processor decrypts the records one by one and processes the query
• The result is returned as the answer
  – Can be encrypted or plain. In this example, just return yes/no
• The user decrypts the answer

Optimization strategies

• Add more secure co-processors for parallel processing
• Compute the part of the query that does not involve encrypted data on the DBMS first
  – Example: SELECT * FROM T WHERE Price > 10 AND Order_Date < “22 Feb 2014”
    • If Price is encrypted while Order_Date is not, the DBMS first processes the predicate Order_Date < “22 Feb 2014”
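The second strategy can be sketched as a simple partition of the AND-ed predicates. This is a minimal sketch, not the implementation of any of the cited systems: the tuple representation of a predicate, the `ENCRYPTED_COLUMNS` set, and the function name `split_predicates` are assumptions made for illustration.

```python
from datetime import date

# Hypothetical schema knowledge: which columns of T are stored encrypted.
ENCRYPTED_COLUMNS = {"Price"}

def split_predicates(predicates, encrypted_columns):
    """Partition AND-ed predicates (column, operator, constant) into those the
    plain DBMS can evaluate and those that need the secure co-processor."""
    plain = [p for p in predicates if p[0] not in encrypted_columns]
    secure = [p for p in predicates if p[0] in encrypted_columns]
    return plain, secure

query_predicates = [
    ("Price", ">", 10),
    ("Order_Date", "<", date(2014, 2, 22)),
]

plain, secure = split_predicates(query_predicates, ENCRYPTED_COLUMNS)
print("evaluated by the DBMS first:", plain)
print("left for the secure co-processor:", secure)
```

Only the rows that survive the plain predicates then have to be decrypted and checked inside the co-processor.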

Pros and cons

• Pros
  – Strong security protection as long as the secure co-processor is not compromised
  – Can process any query
• Cons
  – Requires special hardware
  – Expensive

[Figure: prices in USD; data obtained on 7 Feb 2014]

Homomorphic-encryption-based solutions

• Homomorphic encryption
  – A special type of encryption which allows certain types of operations (on plain values) to be executed on encrypted values
    • Let E be an encryption function
  – Homomorphic property: E(f(x, y)) = g(E(x), E(y))
  – Examples
    • RSA
      – E(a) * E(b) = E(a * b)
    • OPES [SIGMOD 04]
      – E(a) > E(b) if and only if a > b
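The RSA property above can be checked with textbook (unpadded) RSA. The following is a toy sketch: the primes 61 and 53 and the exponent 17 are illustrative assumptions, far too small to be secure, and the modular inverse via `pow(e, -1, phi)` assumes Python 3.8 or later.

```python
# Textbook RSA with toy parameters (assumed for illustration only).
p, q = 61, 53
n = p * q                      # public modulus, 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, coprime to phi
d = pow(e, -1, phi)            # private exponent (Python 3.8+)

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

a, b = 40, 36
# Multiply the ciphertexts only; no decryption is needed for this step.
product_cipher = (encrypt(a) * encrypt(b)) % n
print(decrypt(product_cipher))  # equals (a * b) % n
```

The identity holds only modulo n, so the product a * b must stay below n for the decrypted value to equal the true product.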

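Using homomorphic encryptions

Plain data at users (HourlyRate and WorkingHour are sensitive):

EmpID  HourlyRate  WorkingHour
1      50          23
2      30          36

The SP stores two encrypted versions of the table, one encrypted by OPES and one by RSA:

EID  HR     WH
1    ka6fj  h3a45
2    d2s2a  Anm24

EID  HR     WH
1    Hj%3   45877
2    Ks12#  AA244

Query: SELECT HR*WH WHERE HR > 35

• The user sends E(35) by OPES (ejAAS) to the SP
• The SP evaluates HR > 35 by comparing (>, <) the OPES-encrypted values
• The SP computes HR*WH on the RSA-encrypted values and returns the encrypted product (z%^#5)
• The user decrypts HR*WH and obtains 1150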

Pros and cons

• Pros
  – Low overheads in query processing at the SP
    • Example: just need multiplication on RSA-encrypted data, without encryption or decryption
• Cons
  – Multiple encrypted versions of the same data may be needed
  – Does not support composition of operations
    • Without data interoperability
    • Example: cannot compute HR*WH > 6000

Secure Multiparty Computation (SMC) approach

Plain data at users:

EID  HR  WH
1    50  23
2    30  36

Secret sharing: v = v1 + v2 + v3 mod 100. Each of SP #1, SP #2, and SP #3 stores one table of shares:

EID  HR  WH      EID  HR  WH      EID  HR  WH
1    60  28      1    40  56      1    50  39
2    31  18      2    0   5       2    99  13

• Each SP can’t derive the plain value v by having one share vi only

Query: SELECT EID, WH WHERE HR > 35

• By exchanging some information (may involve multiple rounds), the result can be computed securely
• Only EID 1 (HR = 50) satisfies the predicate; each SP returns its share of WH to the users

EID  WH      EID  WH      EID  WH
1    28      1    56      1    39

• The users combine the shares: 28 + 56 + 39 mod 100 = 23

EID  WH
1    23
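The secret-sharing rule v = v1 + v2 + v3 mod 100 can be sketched in a few lines. The share values below are the ones shown on the slide; the helper names `make_shares` and `reconstruct` are assumptions for illustration, and the secure comparison protocol between the SPs is not modelled here.

```python
import random

MOD = 100  # modulus used in the slide's example

def make_shares(value, parties=3):
    """Split value into additive shares that sum to value mod MOD."""
    shares = [random.randrange(MOD) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    """Recombine all shares; any single share alone reveals nothing about v."""
    return sum(shares) % MOD

# Share tables from the slide, keyed by EID (one share per SP).
hr_shares = {1: [60, 40, 50], 2: [31, 0, 99]}
wh_shares = {1: [28, 56, 39], 2: [18, 5, 13]}

hr = {eid: reconstruct(s) for eid, s in hr_shares.items()}
wh = {eid: reconstruct(s) for eid, s in wh_shares.items()}
print(hr, wh)
```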

Pros and cons

• Pros
  – Theoretically supports any computation
  – Usually low processing cost at SPs
    • Most protocols do not need cryptographic operations
• Cons
  – High communication costs between SPs
    • Multiple rounds of communication
  – The SPs must not collude
  – 3 times the cost due to 3 SPs

Our solution

• 2-party secret sharing: users hold the keys, the SP holds the encrypted data
• Storage cost at the user is linear in the schema size (number of tables and number of columns)
• Query: SELECT EID, WH WHERE HR > 35
  – The user sends some hints (derived from the keys) for the SP to process the query
  – Message size depends on the keys (small)
• The SP returns the encrypted results

Key features of our design

• Low processing cost at users
  – Operate on keys only
  – Make use of the SP’s processing power for query processing
• Allows composition of operations
  – Example: evaluate Quantity * Price + Fixed_cost
    • First compute A = Quantity * Price
    • Then compute Ans = A + Fixed_cost
  – Data interoperability

Key features of our design

• Allows operations between plain and encrypted data
  – Encrypting everything is not recommended
    • Overheads in processing on encrypted data
  – Queries may involve both plain and encrypted data
  – Example: SELECT * WHERE Num_Stock * Stock_Price > 5000
    • Num_Stock is encrypted; Stock_Price and the constant are not.

What can our system do?

• SQL structure

SELECT T1.Price * T2.Quantity
FROM Inventory AS T1 INNER JOIN SaleOrder AS T2 ON T1.itemID = T2.itemID
WHERE T1.Stock * T1.Price < 10000

  – SELECT clause: projection with numeric operations; can be expressions composed of addition and multiplication
  – JOIN clause: equi-join
  – WHERE clause: predicate(s) to filter result tuples; supports AND/OR/NOT; supports expressions
• On integer type data

More operations

• INSERT/UPDATE
  – Example: UPDATE T1 SET Salary = Salary * 1.05 WHERE PeerScore + ManagerScore > 30
• Basic aggregate functions: COUNT/SUM/AVERAGE
  – Example: SELECT SUM(HR*WH) FROM T WHERE Age < 30
• Annotations on the examples
  – The computed value can be an expression
  – The WHERE clause is handled just like selection

Limitations

• Incurs a high processing cost at the SP, due to massive cryptographic operations
• Still under development
  – Currently focuses on integer type data
  – Query plan optimization

END.

ADDITIONAL MATERIALS

SMC Example: addition protocol

Operation: z = x + y, with secret sharing v = v1 + v2 + v3 mod n

• SP #i holds the shares xi and yi of the columns x and y
• Each SP #i uses its own value ri and computes
  – s1 = x1 + y1 - r1
  – s2 = x2 + y2 - r2
  – s3 = x3 + y3 - r3
• After exchanging the si, each SP obtains its share of z
  – SP #1: z1 = s3 + r1
  – SP #2: z2 = s1 + r2
  – SP #3: z3 = s2 + r3
• z1 + z2 + z3 = x + y
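The addition protocol can be simulated in one process to check that the new shares sum to x + y. This is a minimal sketch: it assumes each ri is drawn uniformly at random by SP #i and that SP #i receives the value s from its neighbour, which matches the pairing s3 + r1, s1 + r2, s2 + r3 on the slide; the function name `add_on_shares` is hypothetical.

```python
import random

def add_on_shares(x_shares, y_shares, n):
    """Return fresh shares z1, z2, z3 with z1 + z2 + z3 = x + y (mod n)."""
    # Each SP #i picks its own mask r_i (assumed uniformly random).
    r = [random.randrange(n) for _ in range(3)]
    # Each SP #i computes s_i = x_i + y_i - r_i locally.
    s = [(x_shares[i] + y_shares[i] - r[i]) % n for i in range(3)]
    # SP #1 gets s3, SP #2 gets s1, SP #3 gets s2; each adds its own r_i.
    return [(s[i - 1] + r[i]) % n for i in range(3)]

n = 100
x_shares = [60, 40, 50]   # shares of x = 50
y_shares = [28, 56, 39]   # shares of y = 23
z_shares = add_on_shares(x_shares, y_shares, n)
print(sum(z_shares) % n)  # 73 = 50 + 23
```

The masks cancel in the sum, so no SP learns x, y, or z from the single s value it receives.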

Our solution

• 2-party secret sharing between users and the SP

Shares at users (a table of pseudo-random numbers):

Row-id  X    Y
r1      x1a  y1a
r2      x2a  y2a
…       …    …

Shares at SP:

Row-id  X    Y
E(r1)   x1b  y1b
E(r2)   x2b  y2b
…       …    …

• Without knowing the shares at users, the SP can’t recover the plain data
• Row-ids are encrypted by some existing encryption method
• Storing the table of shares incurs a high storage overhead to users
• Instead, users keep a column key for each column (ckX for X, ckY for Y); the shares for rows r1, r2, … are derived from the keys

The actual storage at both sides

Secret sharing: v = v1 · v2 mod n, with n = 35

Plain data:

A  B
2  3
4  1

Users only remember the column keys (each contains two values): A<2, 2>, B<1, 3>

Users' shares (item keys):

Row-id  A   B
1       8   8
2       32  29

Stored at the SP:

Row-id  A   B
E(1)    9   31
E(2)    22  29

Operation on our encrypted data

Operation: C = A + B

• Users hold the column keys A<2, 2> and B<1, 3>; the result column has the key C<4, 5>
• Some ‘hints’ are sent to the SP to help the SP compute the operation
• The SP computes Ce = A’ + B’

Row-id  A   B   Ce
E(1)    9   31  20
E(2)    22  29  5

• Similar to SMC, there is some communication between the user and the SP
• But the communication is uni-directional (only user -> SP)

Retrieving the data

• SELECT C WHERE A * B + D > 20
• At the user: table schema and column keys A<…>, B<…>, C<…>, D<…>
• At the SP: encrypted values

Row-id  A  B  C  D
E(1)    …  …  …  …
E(2)    …  …  …  …

• The SP finds the answers

Row-id  Match?
E(1)    No
E(2)    Yes
E(6)    No
E(4)    No
…       …

• Projection on C only; the encrypted answer is sent back to the user, and the row-ids must be there

Row-id  C
E(2)    3
E(16)   12
…       …

Decrypting the result

• SELECT C WHERE A * B + D > 20
• At the user: table schema and column keys A<…>, B<…>, C<…>, D<…>
• v = v1 · v2 mod n, with n = 35

Encrypted answers:

Row-id  C
E(2)    3
E(16)   12
…       …

The user computes its own item keys:

Row-id  C
2       31
16      17
…       …

Decrypt:

C
23
29
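The decryption step is a single modular multiplication per returned value. A small sketch using the numbers on this slide; the item keys 31 and 17 are taken as given because the column key of C is elided on the slide, and the row-ids are assumed to have been decrypted already.

```python
N = 35  # public modulus from the slide

# Encrypted answers from the SP, keyed by the (already decrypted) row-id.
encrypted_answers = {2: 3, 16: 12}
# Item keys the user computes for column C, as listed on the slide.
item_keys = {2: 31, 16: 17}

# v = v1 * v2 mod n: item key times encrypted value gives the plain value.
plain_answers = {
    row_id: (item_keys[row_id] * value) % N
    for row_id, value in encrypted_answers.items()
}
print(plain_answers)
```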

Without data interoperability

• RSA: E1(x) * E1(y) = E1(x*y)
  – Supports multiplication over encrypted data
• OPES: E2(a) > E2(b) if and only if a > b
  – Supports comparison over encrypted data
• How to compute x*y > b over encrypted data?
  – The SP can compute E1(a) = E1(x) * E1(y) = E1(x*y), but the comparison needs E2(a) and E2(b)
  – The two schemes operate on different spaces
  – The user has to decrypt E1(a), then encrypt it as E2(a)

With data interoperability

• How to compute x+y > b over encrypted data?
  – Addition on E(x) and E(y) yields E(a) = E(x+y)
  – The comparison E(a) > E(b) is evaluated on the same encrypted representation
• Other examples: (x1 - x2)^2 + (y1 - y2)^2 can be computed using addition and multiplication only

Secure item key generator

• INPUT: row key r, column key <m, x>
  – All are kept private
• System parameters: n, g
  – Selected by the DO; n is public, g is not
• Generation function: $v_k = m \cdot g^{xr} \bmod n$
• Security:
  – Extension of the RSA function
  – Even if an attacker observes several item keys, it is computationally hard to derive the secret parameters and hence other item keys
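The generation function can be tried out with the toy parameters n = 35 and g = 2 and the column keys A<2, 2> and B<1, 3> that the illustrations use. In this sketch the `encrypt` and `decrypt` helpers are assumptions derived from the sharing rule v = v1 · v2 mod n (item key times stored value); a real deployment would use a large n, and `pow(k, -1, n)` assumes Python 3.8 or later.

```python
N, G = 35, 2  # toy system parameters from the illustrations; not secure

def item_key(row, column_key, g=G, n=N):
    """v_k = m * g^(x*r) mod n for row key r and column key <m, x>."""
    m, x = column_key
    return (m * pow(g, x * row, n)) % n

def encrypt(value, row, column_key, n=N):
    """Value stored at the SP: v_e = v * v_k^(-1) mod n (assumed from v = v_k * v_e)."""
    return (value * pow(item_key(row, column_key), -1, n)) % n

def decrypt(stored, row, column_key, n=N):
    """Recover v = v_k * v_e mod n."""
    return (item_key(row, column_key) * stored) % n

KEY_A, KEY_B = (2, 2), (1, 3)
keys_a = [item_key(r, KEY_A) for r in (1, 2)]
keys_b = [item_key(r, KEY_B) for r in (1, 2)]
stored_a = [encrypt(v, r, KEY_A) for r, v in ((1, 2), (2, 4))]
stored_b = [encrypt(v, r, KEY_B) for r, v in ((1, 3), (2, 1))]
print(keys_a, keys_b, stored_a, stored_b)
```

The printed item keys and stored values match the tables shown for the user side and the SP side.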

Illustration 1: Multiplication of 2 columns

System parameters: n = 35, g = 2

Plain data:

Row  A  B  C = AB
1    2  3  6
2    4  1  4

• DO: table schema and column keys A<2, 2>, B<1, 3>
• SP: encrypted values Ae, Be
• DO computes the column key of C as $\langle m_a m_b,\ x_a + x_b \rangle$, i.e. C<2, 5>
• SP computes $C_e = A_e B_e$

Row  Ae  Be  Ce
1    9   31  34
2    22  29  8

• Result: item keys of C, which the DO uses to decrypt Ce

Row  C
1    29
2    18
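The numbers in this illustration can be reproduced directly. A minimal sketch with the slide's toy parameters (n = 35, g = 2); the variable and function names are assumptions for illustration.

```python
N, G = 35, 2  # toy parameters from the slide

def item_key(row, column_key):
    """v_k = m * g^(x*r) mod n for column key <m, x>."""
    m, x = column_key
    return (m * pow(G, x * row, N)) % N

key_a, key_b = (2, 2), (1, 3)
a_enc = {1: 9, 2: 22}    # Ae stored at the SP
b_enc = {1: 31, 2: 29}   # Be stored at the SP

# SP side: multiply the encrypted values row by row.
c_enc = {r: (a_enc[r] * b_enc[r]) % N for r in a_enc}

# DO side: derive the column key of C from the keys of A and B.
key_c = ((key_a[0] * key_b[0]) % N, key_a[1] + key_b[1])

# DO side: decrypt with the item keys of C.
c_plain = {r: (item_key(r, key_c) * c_enc[r]) % N for r in c_enc}
print(key_c, c_enc, c_plain)
```

The SP never sees a key, and the DO's work is limited to combining two column keys.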

Proof of correctness

• At the DO: column keys $A\langle m_a, x_a \rangle$, $B\langle m_b, x_b \rangle$, and $C\langle m_a m_b,\ x_a + x_b \rangle$
• At the SP: row E(r) stores $a'$ and $b'$ in Ae and Be, and $a'b'$ in Ce
• We have (all modulo n)
  – $a = m_a g^{r x_a} a'$
  – $b = m_b g^{r x_b} b'$
• Decryption on C:
  $m_a m_b\, g^{r(x_a + x_b)} (a'b') = (m_a g^{r x_a} a')(m_b g^{r x_b} b') = ab$

Illustration 2: Addition

• C = A + B
  – Example: SELECT * WHERE salary + bonus > 40,000
• Preparation stage
  – We add a constant column S to the plain database
  – S is encrypted, i.e., the DO keeps a column key of S and the SP keeps a column of encrypted values

A  B        A  B  S
2  3   ->   2  3  1
4  1        4  1  1

Plain data (C = A + B):

A  B  S  C
2  3  1  5
4  1  1  5

• Storage at both sides
  – DO: column keys A<2, 2>, B<1, 3>, S<11, 13>
  – SP: encrypted values

Row-id  Ae  Be  Se
E(1)    9   31  8
E(2)    22  29  4

• DO gives hints to SP (the result column has the key C<4, 5>)
  – $p_A = x_s^{-1} (x_c - x_a) \bmod \Phi(n) = 13^{-1} \cdot (5 - 2) \bmod 24 = 15$
  – $p_B = x_s^{-1} (x_c - x_b) \bmod \Phi(n) = 13^{-1} \cdot (5 - 3) \bmod 24 = 2$
  – $q_A = m_a\, m_s^{p_A}\, m_c^{-1} \bmod n = 2 \cdot 11^{15} \cdot 4^{-1} \bmod 35 = 18$
  – $q_B = m_b\, m_s^{p_B}\, m_c^{-1} \bmod n = 1 \cdot 11^{2} \cdot 4^{-1} \bmod 35 = 4$
• SP computes the encrypted answers
  – $A' = q_A A_e S_e^{p_A}$, $B' = q_B B_e S_e^{p_B}$, $C_e = A' + B'$

Row-id  A'  B'  Ce
E(1)    29  26  20
E(2)    4   1   5

• Item keys of C at the DO

Row key  C
1        23
2        1
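The whole addition example can be recomputed from the column keys. A minimal sketch with the slide's toy parameters n = 35, g = 2 and Φ(n) = 24; the function names `item_key` and `hints` are assumptions for illustration, and `pow(x, -1, m)` assumes Python 3.8 or later.

```python
N, G, PHI = 35, 2, 24  # toy parameters: n = 5 * 7, so phi(n) = 4 * 6 = 24

def item_key(row, column_key):
    """v_k = m * g^(x*r) mod n for column key <m, x>."""
    m, x = column_key
    return (m * pow(G, x * row, N)) % N

key_a, key_b, key_s, key_c = (2, 2), (1, 3), (11, 13), (4, 5)

def hints(column_key):
    """DO side: hint pair (p, q) that re-keys one input column to the key of C."""
    m, x = column_key
    p = pow(key_s[1], -1, PHI) * (key_c[1] - x) % PHI
    q = m * pow(key_s[0], p, N) * pow(key_c[0], -1, N) % N
    return p, q

p_a, q_a = hints(key_a)
p_b, q_b = hints(key_b)

# SP side: encrypted columns and the hint-driven computation of Ce = A' + B'.
a_enc, b_enc, s_enc = {1: 9, 2: 22}, {1: 31, 2: 29}, {1: 8, 2: 4}
c_enc = {
    r: (q_a * a_enc[r] * pow(s_enc[r], p_a, N)
        + q_b * b_enc[r] * pow(s_enc[r], p_b, N)) % N
    for r in a_enc
}

# DO side: decrypt with the item keys of C.
c_plain = {r: (item_key(r, key_c) * c_enc[r]) % N for r in c_enc}
print((p_a, q_a, p_b, q_b), c_enc, c_plain)
```

The hints depend only on column keys, so one set of four numbers serves every row of the table.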

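Proof of correctness

• At the DO: column keys $A\langle m_a, x_a \rangle$, $B\langle m_b, x_b \rangle$, $S\langle m_s, x_s \rangle$, $C\langle m_c, x_c \rangle$
• At the SP: row E(r) stores $a'$, $b'$, $s'$ in Ae, Be, Se, and $c'$ in Ce
• We have (all modulo n)
  – $a = m_a g^{r x_a} a'$
  – $b = m_b g^{r x_b} b'$
  – $1 = m_s g^{r x_s} s'$, hence $s' = m_s^{-1} g^{-r x_s}$
• Hints
  – $p_A = x_s^{-1} (x_c - x_a) \bmod \Phi(n)$, $p_B = x_s^{-1} (x_c - x_b) \bmod \Phi(n)$
  – $q_A = m_a\, m_s^{p_A}\, m_c^{-1} \bmod n$, $q_B = m_b\, m_s^{p_B}\, m_c^{-1} \bmod n$
• Following the procedure ($A' = q_A A_e S_e^{p_A}$, $B' = q_B B_e S_e^{p_B}$, $C_e = A' + B'$), we have
  $c' = q_A\, a'\, s'^{p_A} + q_B\, b'\, s'^{p_B}$
  $c' = (m_a m_s^{p_A} m_c^{-1})\, a'\, s'^{p_A} + (m_b m_s^{p_B} m_c^{-1})\, b'\, s'^{p_B}$
• Since $s'^{p_A} = (m_s^{-1} g^{-r x_s})^{p_A} = m_s^{-p_A} g^{-r x_s p_A}$ (and likewise for $p_B$),
  $c' = (m_a m_c^{-1})\, a'\, g^{-r x_s p_A} + (m_b m_c^{-1})\, b'\, g^{-r x_s p_B}$
• Substituting $x_s p_A = x_c - x_a$ and $x_s p_B = x_c - x_b$ (modulo $\Phi(n)$),
  $c' = (m_a m_c^{-1})\, a'\, g^{-r(x_c - x_a)} + (m_b m_c^{-1})\, b'\, g^{-r(x_c - x_b)}$
  $c' = m_c^{-1} g^{-r x_c} (m_a g^{r x_a} a' + m_b g^{r x_b} b')$
  $c' = m_c^{-1} g^{-r x_c} (a + b)$
• Decryption on $c'$:
  $m_c\, g^{r x_c}\, c' = a + b$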