Secure Cloud Database with Sense of Security
Introduction
• Cloud computing
– IT as a service from a third-party service provider
• Security in cloud environment
– Adversary corrupts the service provider?
– Goal: protect sensitive data
Related Work
• Encryption Approach
– NetDB2, IBM (Outsourced database)
– Relational Cloud, CryptDB (MIT, CIDR 2011)
– TrustedDB using secure hardware (VLDB 2011 demo, Radu Sion)
• Secure Multi-Party Computation Approach
– ShareMind
NetDB2
DB
Tuple 1: xxx, yyy
Tuple 2: aaa, bbb
Encrypted DB (value-level encryption)
Tuple 1: !a4, a3g
Tuple 2: L%j, m*K
SELECT * WHERE value = 'xxx'  =>  SELECT * WHERE value = '!a4'
+ Partition information
Partition: P1: < 'm'; otherwise P2
Tuple 1: P2, P2
Tuple 2: P1, P1
SELECT * WHERE value < 'xxx'  =>  SELECT * WHERE value in [P1, P2]
Simple deterministic encryption
CryptDB
• Onion encryption: multiple layers of encryption applied to one data item
Original data: 10
E1(10) = A*65h (OPES: numeric comparisons)
E2(A*65h) = BB647 (deterministic encryption: equality can be done)
E3(BB647) = %j@9G (non-deterministic encryption: no computation is feasible)
If the user wants more computation power, decrypt to the desired level (one way!)
ShareMind
• Key: Secret sharing + recursive processing
DB = A + B + C
Service Provider 1 holds A
Service Provider 2 holds B
Service Provider 3 holds C
Query => the service providers return D, E, F
D + E + F = Result
Comparisons of the two approaches
• Encryption-based methods
– Provide limited computation capabilities
– Security strength depends on the encryption function
• For example, deterministic encryption may allow a frequency analysis attack
– 'Male', 'Female' => '%k9)2', 'Ah475'
– 'Ah475' x 21; '%k9)2' x 5 in DB group
• MPC-based methods
– More generic operators
– Requires multiple trusted parties
• ShareMind cannot guard against collusions
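The frequency analysis attack mentioned above can be sketched in a few lines. This is only an illustration: the ciphertext counts come from the slide, while the helper name `guess_plaintexts` and the attacker's background knowledge (that 'Female' is the more common value in this group) are assumptions made for the example.

```python
from collections import Counter

def guess_plaintexts(ciphertext_column, plaintexts_by_frequency):
    """Match ciphertexts to plaintexts by frequency rank (hypothetical helper)."""
    ranked = [c for c, _ in Counter(ciphertext_column).most_common()]
    return dict(zip(ranked, plaintexts_by_frequency))

# Ciphertext counts from the slide: 'Ah475' x 21, '%k9)2' x 5.
column = ["Ah475"] * 21 + ["%k9)2"] * 5

# Assumed attacker knowledge: 'Female' is more frequent than 'Male' in this group.
mapping = guess_plaintexts(column, ["Female", "Male"])
print(mapping)
```

Because deterministic encryption maps equal plaintexts to equal ciphertexts, the counts alone are enough to recover the mapping without any key.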
MPC-based is the solution?
Model: DB is split into shares A, B, C held by SP1, SP2, SP3
MPC-based: What if all service providers collude?
Updated model: DB is split into shares A, B, C held by the owner, SP1 and SP2
• Owner has to join in MPC operations
• Owner's cost (storage and computation) is not less than hosting the DB on its own
• 2 SPs? Not cost-effective
Research problem
• Owner keeps a small share A (small storage)
• Without A, SP cannot recover DB (similar security strength as MPC)
• Owner has minimal involvement in MPC (low cost)
Desired model: DB is split into shares A and B; the owner keeps A, the SP keeps B
Secure multiparty computation
Background
Secret sharing (around 1980)
Secret: 10
Shares: 4 and 6, held by Alice and Bob
6 + 4 = 10
What is the secret value?
Alice’s share would be 5? 20? -3?
The secret is recovered only when the two parties exchange their shares
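A minimal sketch of the two-party additive sharing shown above. The modulus `MODULUS` is an assumption (the slide does not fix a share domain), and the names `split` and `recover` are illustrative.

```python
import secrets

MODULUS = 2**32  # assumed share domain; not specified on the slide

def split(secret):
    """Split a secret into two additive shares modulo MODULUS."""
    alice = secrets.randbelow(MODULUS)   # uniformly random share
    bob = (secret - alice) % MODULUS     # the share that completes the sum
    return alice, bob

def recover(alice, bob):
    """Recover the secret once both shares are exchanged."""
    return (alice + bob) % MODULUS

alice_share, bob_share = split(10)
print(recover(alice_share, bob_share))   # the secret, 10
print(recover(4, 6))                     # the slide's shares also give 10
```

Since Alice's share is uniformly random, Bob's share on its own is consistent with every possible secret, which is the point of the "5? 20? -3?" question on the slide.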
Secret sharing
• General case
Secret: s
Shares: s1, s2, …, sn
The secret can be divided among n parties, for any n
s = g(s1, s2, …, sn)
Examples of g:
– Sum of all shares (modular)
– Bitwise XOR of all shares
– Product, string concatenation, etc.
Security requirement: given k < n shares, it is hard to recover s
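As one instance of the general scheme, here is a sketch where g is the bitwise XOR of all shares. The function names and the 32-bit share width are assumptions for illustration; the secret should fit in that width for the shares to hide it.

```python
import secrets
from functools import reduce
from operator import xor

def xor_split(secret, n, bits=32):
    """Split `secret` into n shares whose XOR equals the secret."""
    shares = [secrets.randbits(bits) for _ in range(n - 1)]
    last = reduce(xor, shares, secret)   # secret XOR all random shares
    return shares + [last]

def xor_recover(shares):
    """g(s1, ..., sn): XOR of all shares."""
    return reduce(xor, shares, 0)

shares = xor_split(10, 5)
print(xor_recover(shares))   # 10
```

Any k < n of these shares are uniformly random bit strings, so they carry no information about s until the last share is added.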
Secure multiparty computation
Party 1 holds x1, Party 2 holds x2, …, Party n holds xn
r = f(x1, x2, …, xn)
Objective: every party obtains r = f(x1, x2, …, xn) but cannot observe any other information apart from its own data
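To make the objective concrete, the sketch below simulates one classic instance, a secure sum built on additive secret sharing. The choice of f as a sum, the modulus, and the single-process simulation of the parties are all assumptions for illustration; it also assumes parties follow the protocol honestly.

```python
import secrets

MODULUS = 2**61 - 1  # assumed share domain

def additive_shares(value, n):
    """Split `value` into n additive shares modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(inputs):
    """Simulate n parties computing f(x1, ..., xn) = x1 + ... + xn."""
    n = len(inputs)
    # dealt[i][j] is the share of party i's input that is sent to party j.
    dealt = [additive_shares(x, n) for x in inputs]
    # Party j adds up the shares it received and publishes only that partial sum.
    partial = [sum(dealt[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # Everyone adds the published partial sums to obtain r.
    return sum(partial) % MODULUS

print(secure_sum([3, 5, 7]))   # 15
```

Each party sees only random-looking shares and partial sums, yet all of them end up with the same r.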
To design a generic secure database
Before we proceed… clarifying the security
• Negative result
– Ideal security:
• Querying workflow: user issues query => service providers compute result and return to user
• Knowledge gained by service providers: NONE. Not even anything about query and result!
– A solution achieving ideal security is not more efficient than a non-outsourcing solution (not using cloud)
Knowledge gained by service provider
• Output space of a simple selection query: varies from no tuple to the entire database
– Even larger space if we consider joins
• Example knowledge gain
– If the output size is small, the service provider knows that the query does not select the entire table
• To hide the above information, each returned query result should be at least as large as the entire table
Security in secure database
• The service provider can observe
– Query content
• The tables that are related to the query
• Number of conditions, types of conditions, attributes that are related
• But not other info about the query
– Query answer
• The set of shares of tuples in some query answer
• But not other content
Example query
• SELECT Name
  FROM Employer
  WHERE Salary > 6000
• To one service provider, the transformed query may look like
  SELECT ATTRIBUTE_7
  FROM TABLE_A
  WHERE ATTRIBUTE_3 - X > 0
  WITH PARAM_X_1 = 1234
  WITH PARAM_CMP = 335
Some basic design
• To hide the database, we use secret sharing
DB = A + B
• In our case, we use multiplicative secret sharing
– To store value v, we have ab = v (mod D)
• D: domain size
• The shares are a, b
Owner keeps A; SP keeps B
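A minimal sketch of multiplicative secret sharing, ab = v (mod D). It assumes D is prime (here the hypothetical constant `P`) so that every nonzero share has a modular inverse; the slide only calls D the domain size. Note that v = 0 forces b = 0, a special case this sketch does not hide.

```python
import secrets

P = 2**61 - 1  # assumed prime domain size D

def mul_split(v):
    """Return shares (a, b) with a * b = v (mod P)."""
    a = secrets.randbelow(P - 1) + 1    # owner share, uniform in [1, P-1]
    b = v * pow(a, -1, P) % P           # SP share; needs Python 3.8+ for pow(a, -1, P)
    return a, b

def mul_recover(a, b):
    return a * b % P

a, b = mul_split(18)
print(mul_recover(a, b))   # 18
```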
2 types of operators
Type 1: secure operation
– Owner holds a, service provider holds b
– The result is also in the share format
– Majority of the operations should be of this type
Type 2: disclosing operation
– The result r is directly given to the SP
– Operation: whether the tuple is in the query result
– Type 2 can be done by Type 1, then sending share a to the party holding b
Share Compression
• The shares of the DB are generated randomly
• Who decides the random shares? Let's use a pseudo-random function
– Similar to RSA encryption
f(ID) = m·ID^k mod n
k, m: secret key; n: public key
Example with k = 2, m = 1:
Real values
ID  X
1   18
2   20
Share A (kept by owner)
ID  Share
1   1
2   4
Share B (kept by SP)
ID  Share
1   18
2   5
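The share compression idea can be sketched as follows: the owner stores only (k, m) and regenerates its share from the ID. The prime modulus `N` is an assumption made so that inverses always exist (the slide only says n is public and RSA-like), and the function names are illustrative.

```python
N = 2**61 - 1  # assumed prime public modulus n

def owner_share(row_id, k, m, n=N):
    """f(ID) = m * ID^k mod n, recomputed on demand from the secret key (k, m)."""
    return m * pow(row_id, k, n) % n

def sp_share(value, row_id, k, m, n=N):
    """Share stored at the SP so that owner_share * sp_share = value (mod n)."""
    return value * pow(owner_share(row_id, k, m, n), -1, n) % n

def recover(row_id, sp, k, m, n=N):
    return owner_share(row_id, k, m, n) * sp % n

# Values from the slide: X = 18 for ID 1, X = 20 for ID 2, key k = 2, m = 1.
column_x = {1: 18, 2: 20}
k, m = 2, 1
shares_b = {i: sp_share(v, i, k, m) for i, v in column_x.items()}
print(shares_b)   # {1: 18, 2: 5}
```

The SP shares 18 and 5 agree with the slide's table, and the owner never has to store the column of shares 1 and 4.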
Storage cost
• Linear in the number of columns
– Assuming the IDs are from 1 to t
• Just need to remember t
• Note on the random function:
– To make the input look random, we have f(ID) = m·h(ID)^k mod n
• h: any one-way hash
• The storage part is easy; how about computation?
Share A (kept by owner), generated by f(ID) = m·ID^k mod n
ID  Share
1   1
2   4
…   …
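A sketch of the hashed variant f(ID) = m·h(ID)^k mod n, showing that the owner can regenerate a whole column of shares from (k, m) and t alone. SHA-256 as h, the prime modulus `N`, and the key values are assumptions; the slide only asks for any one-way hash.

```python
import hashlib

N = 2**61 - 1  # assumed prime public modulus n

def h(row_id, n=N):
    """One-way hash of the ID, mapped into [1, n-1] so it is never zero."""
    digest = hashlib.sha256(str(row_id).encode()).digest()
    return int.from_bytes(digest, "big") % (n - 1) + 1

def f(row_id, k, m, n=N):
    """f(ID) = m * h(ID)^k mod n."""
    return m * pow(h(row_id, n), k, n) % n

def owner_column(t, k, m):
    """Regenerate the owner's shares for IDs 1..t; only t, k, m are stored."""
    return [f(i, k, m) for i in range(1, t + 1)]

column = owner_column(4, k=3, m=7)   # hypothetical key values
print(len(column))                   # 4
```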
How to do multiplication?
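• Column-column multiplication
– The two values are both in share format
A = a1a2, B = b1b2, C = (a1b1)(a2b2)
Example (ID = 2):
Real value: A = 10, B = 20, C = A x B = 200
Owner: A has key (k = 1, m = 5), share 10; B has key (k = 2, m = 1), share 4
SP: share of A = 1; share of B = 5
SP multiplies its shares: 1 x 5 = 5
Implied owner share of C: m1m2 x^k1 x^k2 = m1m2 x^(k1+k2), key (k = 3, m = 5), value 40
Resharing to the target key (k = 2, m = 1), owner share 4: resharing key (k = 1, m = 5), m·ID^k = 10; SP share becomes 5 x 10 = 50
Check: 4 x 50 = 200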
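The worked example above can be replayed in code. This is a sketch under assumptions: arithmetic is done modulo an assumed prime `P`, the resharing key is derived as (k1 + k2 - k3, m1·m2 / m3), and the function names are illustrative; the slide does not spell out which party evaluates the resharing factor.

```python
P = 2**61 - 1  # assumed prime modulus

def f(row_id, key):
    """Owner-side share m * ID^k mod P for key = (k, m)."""
    k, m = key
    return m * pow(row_id, k, P) % P

def reshare_key(key_a, key_b, key_c):
    """Key of the factor that moves the product from key_a * key_b to key_c."""
    (ka, ma), (kb, mb), (kc, mc) = key_a, key_b, key_c
    return (ka + kb - kc, ma * mb * pow(mc, -1, P) % P)

def sp_multiply(share_a, share_b, factor):
    """SP side: multiply its two shares, then apply the resharing factor."""
    return share_a * share_b * factor % P

row_id = 2
key_a, key_b, key_c = (1, 5), (2, 1), (2, 1)   # keys from the slide
r_key = reshare_key(key_a, key_b, key_c)        # (1, 5)
factor = f(row_id, r_key)                       # 10
c_sp = sp_multiply(1, 5, factor)                # 50
print(f(row_id, key_c) * c_sp % P)              # 200
```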
Recap: operations at the parties
Owner (keeps keys only): A (k = 1, m = 5), B (k = 2, m = 1), C (k = 2, m = 1)
SP (keeps shares):
A   B   C
1   5   50
2   8   …
10  9   …
…   …   …
Column-constant multiplication
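Example (ID = 2):
Real value: A = 10, constant B = 20, C = A x B = 200
Owner: A has key (k = 1, m = 1), m·ID^k = 2
SP: share of A = 5
After multiplying by the constant, the SP share 5 implicitly pairs with owner share 40, key (k = 1, m = 20)
Resharing to the target key (k = 2, m = 5), owner share 20: resharing key (k = -1, m = 4), m·ID^k = 2; SP share becomes 5 x 2 = 10
Check: 20 x 10 = 200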
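A sketch of the column-constant case with the slide's numbers. As before, the prime modulus `P` and the function names are assumptions; the constant is folded into the m component of the key, so the SP share only changes through the resharing factor.

```python
P = 2**61 - 1  # assumed prime modulus

def f(row_id, key):
    """m * ID^k mod P; a negative k uses the modular inverse (Python 3.8+)."""
    k, m = key
    return m * pow(row_id, k, P) % P

def const_mul_reshare_key(key_a, constant, key_c):
    """Key of the factor that moves (k_a, m_a * constant) to key_c."""
    (ka, ma), (kc, mc) = key_a, key_c
    return (ka - kc, ma * constant * pow(mc, -1, P) % P)

row_id = 2
key_a, key_c = (1, 1), (2, 5)                     # keys from the slide
r_key = const_mul_reshare_key(key_a, 20, key_c)   # (-1, 4)
factor = f(row_id, r_key)                         # 2
c_sp = 5 * factor % P                             # SP share of C: 10
print(f(row_id, key_c) * c_sp % P)                # 200
```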
Column-column addition
• A = a1a2
• B = b1b2
– C = A + B => a1a2 + b1b2
– Goal: C = c1c2 = a1a2 + b1b2
c2 = (a1 c1^-1) a2 + (b1 c1^-1) b2
Owner: a1, b1 (kept by owner)
SP: a2, b2
Column-column addition
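• c2 = (a1 c1^-1) a2 + (b1 c1^-1) b2
Example (ID = 2), f(ID) = m·ID^k:
Real value: A = 10, B = 20, C = A + B = 30
Owner: A has key (k = 1, m = 5), a1 = 10; B has key (k = 2, m = 1), b1 = 4
SP: a2 = 1, b2 = 5
Target key for C: (k = 2, m = 2), c1 = 8
Coefficient for A: key (k = -1, m = 2.5), a1 c1^-1 = 1.25
Coefficient for B: key (k = 0, m = 0.5), b1 c1^-1 = 0.5
c2 = 1.25 * 1 + 0.5 * 5 = 3.75
Check: c1c2 = 8 * 3.75 = 30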
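The addition example can be checked numerically. The sketch below uses exact rational arithmetic (`fractions.Fraction`) so that the slide's values 1.25, 0.5 and 3.75 appear as written; working over the rationals rather than modulo n is an assumption made only to mirror the slide.

```python
from fractions import Fraction

def f(row_id, key):
    """Owner-side share m * ID^k for key = (k, m), computed exactly."""
    k, m = key
    return Fraction(m) * Fraction(row_id) ** k

row_id = 2
key_a, key_b, key_c = (1, 5), (2, 1), (2, 2)     # keys from the slide
a1, b1, c1 = f(row_id, key_a), f(row_id, key_b), f(row_id, key_c)   # 10, 4, 8
a2, b2 = 1, 5                                    # SP shares of A and B

coef_a = a1 / c1                                 # 1.25, key (k = -1, m = 2.5)
coef_b = b1 / c1                                 # 0.5,  key (k = 0,  m = 0.5)
c2 = coef_a * a2 + coef_b * b2                   # new SP share: 3.75
print(float(c2), c1 * c2)                        # 3.75 30
```

The two coefficients are themselves of the form m·ID^k, which is why the slide can describe them with keys (k = -1, m = 2.5) and (k = 0, m = 0.5).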
Column-constant addition
• Add a constant to each tuple
– Becomes column-column addition
Column A: 10, 20, 30, 45
With an added column Z:
A   Z
10  1
20  1
30  1
45  1
Managing negative values
• A sign bit is used
– In two shares
– Again the owner keeps a function
• Additive function
• 0 represents positive, 1 represents negative
Value 1  Value 2  Sign bit of v1 x v2
0        0        0
0        1        1
1        0        1
1        1        0
XOR gate, i.e. addition (mod 2).
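Because the product's sign bit is the XOR of the two sign bits, each party can combine its own sign shares locally. The sketch below assumes the sign bit is split additively mod 2 with a random owner bit; on the slide the owner's share would come from its kept function instead, so the random bit and the function names are simplifications.

```python
import secrets

def share_bit(bit):
    """Split a sign bit (0 = positive, 1 = negative) into two XOR shares."""
    owner = secrets.randbits(1)
    return owner, bit ^ owner

def product_sign_shares(sign_a, sign_b):
    """Each party XORs its own shares of the two sign bits, with no interaction."""
    return sign_a[0] ^ sign_b[0], sign_a[1] ^ sign_b[1]

def recover_bit(shares):
    return shares[0] ^ shares[1]

# -10 x 20: sign bits 1 and 0 give a negative product (sign bit 1).
print(recover_bit(product_sign_shares(share_bit(1), share_bit(0))))   # 1
```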
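Multiplication with sign bit
Example (ID = 2):
Real value: A = -10, B = 20, C = A x B = -200
Owner: A has key (k = 1, m = 5), share 10; B has key (k = 2, m = 1), share 4
SP: share of A = 1; share of B = 5
SP multiplies its shares: 1 x 5 = 5
Implied owner share of C: m1m2 x^k1 x^k2 = m1m2 x^(k1+k2), key (k = 3, m = 5), value 40
Resharing to the target key (k = 2, m = 1), owner share 4: resharing key (k = 1, m = 5), m·ID^k = 10; SP share becomes 5 x 10 = 50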
Magnitude part: the same!
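Multiplication with sign bit
Example (ID = 2), n = 3:
Real value: A = -10, B = 20, C = A x B = -200
Owner: A's sign has key (k = 2, m = 1): 4 (1, +); B's sign has key (k = 1, m = 1): 4 (1, +)
SP: A's sign share: 0, -; B's sign share: 1, +
SP sign shares for C: 0, -; 1, +
Sign keys multiply like magnitude keys: m1m2 x^k1 x^k2 = m1m2 x^(k1+k2), key (k = 3, m = 1)
Sign values are reduced mod 3, then mod 2
Resharing to the target key (k = 2, m = 2): 8 (0, -); resharing key (k = 1, m = 2^-1 = 2): 1, +
0: no change; 1: change sign
Ans: 4 => 1, change!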
Addition with sign bit
• The math is the same
• A = a1a2
• B = b1b2
– C = A + B => a1a2 + b1b2
– Goal: C = c1c2 = a1a2 + b1b2
c2 = (a1 c1^-1) a2 + (b1 c1^-1) b2
Addition with sign bit
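• c2 = (a1 c1^-1) a2 + (b1 c1^-1) b2
Example (ID = 2), f(ID) = m·ID^k:
Real value: A = -10, B = 20, C = A + B = 10
Owner: A has key (k = 1, m = 5), value 10; B has key (k = 2, m = 1), value 4
Owner sign keys: (k = 1, m = 5): 20 (0, -); (k = 2, m = 2): 8 (0, -)
SP: values 1 and 5; signs 1, + and 0, -
Target key for C: (k = 2, m = 2), value 8; sign key (k = 1, m = 2): 4 (1, +)
Coefficient for A: key (k = -1, m = 2.5); coefficient for B: key (k = 0, m = 0.5)
c2 = -1.25 * 1 + (-0.5) * (-5) = 1.25
Sign keys (k = 1, m = 5) and (k = 2, m = 2): no change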
Comparison
• One type of comparison: A > 0
– A = a1a2
• Secret sharing with the sign bit
a1 > 0  a2 > 0  A > 0
T (1)   T (1)   T (1)
T (1)   F (0)   F (0)
F (0)   T (1)   F (0)
F (0)   F (0)   T (1)
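The truth table above says that A = a1a2 is positive exactly when the two shares have the same sign. A minimal sketch, assuming nonzero shares and using a hypothetical function name:

```python
def product_is_positive(a1, a2):
    """A = a1 * a2 > 0 iff a1 and a2 have the same sign (shares assumed nonzero)."""
    return (a1 > 0) == (a2 > 0)

for a1, a2 in [(2, 5), (2, -5), (-2, 5), (-2, -5)]:
    print(a1 > 0, a2 > 0, product_is_positive(a1, a2))
```

The four printed rows reproduce the table: only the sign flags are combined, never the magnitudes.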
Others
• Security
– Given a share, the attacker cannot get the private keys (k, m), i.e., other shares
• Reducible to RSA
• A reduced security strength can be achieved
– f(ID) = k·h(ID)
• No modular exponentiation => more efficient at the service provider side
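A sketch of the reduced-strength variant f(ID) = k·h(ID), which replaces the modular exponentiation with a single multiplication. SHA-256 as h, reduction modulo an assumed prime `N`, and the key value are assumptions for illustration; the slide does not state them.

```python
import hashlib

N = 2**61 - 1  # assumed prime modulus

def h(row_id):
    """One-way hash of the ID, mapped into [1, N-1]."""
    digest = hashlib.sha256(str(row_id).encode()).digest()
    return int.from_bytes(digest, "big") % (N - 1) + 1

def f_reduced(row_id, k):
    """f(ID) = k * h(ID) mod N: one multiplication, no exponentiation."""
    return k * h(row_id) % N

def sp_share(value, row_id, k):
    """SP share so that f_reduced(ID) * share = value (mod N); k must be nonzero mod N."""
    return value * pow(f_reduced(row_id, k), -1, N) % N

k = 12345                                  # hypothetical secret key
share = sp_share(20, 7, k)
print(f_reduced(7, k) * share % N)         # 20
```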