Don’t Reveal My Intension: Protecting User Privacy using Declarative Preferences during Distributed Query Processing
Nicholas L Farnan, Adam J Lee, Panos K Chrysanthis
University of Pittsburgh
Ting Yu
North Carolina State University
ESORICS, 14 Sept. 2011 2
Alice is Concerned her Employer Pollutes
SELECT *
FROM Plants, Supplies, Polluted_Waters
WHERE Supplies.type = 'solvent'
  AND Supplies.name = Polluted_Waters.pollutant
  AND Polluted_Waters.location = Plants.location
  AND Plants.id = Supplies.plant_id;
Our Goals for this Work
To empower users querying distributed database systems with declarative controls over their privacy that are flexible enough to allow for a balance between privacy and performance
Roadmap
● Overview of Distributed Query Processing
● Privacy Definitions
● Overview of Our Methodology
● Proposed SQL Extensions
● Overview of Related Work
● Conclusion and Ongoing Work
Distributed Query Processing
SELECT *
FROM Plants, Supplies, Polluted_Waters
WHERE Supplies.type = 'solvent'
  AND Supplies.name = Polluted_Waters.pollutant
  AND Polluted_Waters.location = Plants.location
  AND Plants.id = Supplies.plant_id;
[Figure: Alice (the Querier) and the servers Inventory, Facilities, and Pollution Watch, with a legend distinguishing untrusted from trusted sites]
How Does Optimization Affect Querier Privacy?
SELECT *
FROM Plants, Supplies, Polluted_Waters
WHERE Supplies.type = 'solvent'
  AND Supplies.name = Polluted_Waters.pollutant
  AND Polluted_Waters.location = Plants.location
  AND Plants.id = Supplies.plant_id;
Depending on the plan chosen, evaluating this query:
● Reveals sensitive information to ManuCo
● Reveals sensitive information to Pollution Watch
● Results in a large amount of network traffic
● Strikes a balance between privacy and performance
Formalizing this Intensional Knowledge
Given a globally-expanded query plan Q = <N, E>
We denote by κ_p(Q) ⊆ N ∪ E the intensional knowledge that principal p ∈ P has of the query encoded by the plan Q.
At a minimum, κ_p(Q) contains the set of all locally-expanded query plans for each node n ∈ N annotated for execution by the principal p, and further all edges leaving or entering such nodes.
[Figure: regions of the query plan labeled κ_Inventory, κ_Facilities, and κ_Pollution_Watch]
SELECT *
FROM Plants, Supplies, Polluted_Waters
WHERE Supplies.type = 'solvent'
  AND Supplies.name = Polluted_Waters.pollutant
  AND Polluted_Waters.location = Plants.location
  AND Plants.id = Supplies.plant_id;
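A minimal sketch of the lower bound on a principal's intensional knowledge described above, assuming a plan is stored as a set of nodes and a set of directed edges. The `Node` class, the `knowledge` function, and the three-node example plan with its principal assignments are hypothetical illustrations, not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    op: str            # relational algebra operation
    params: frozenset  # parameters to that operation
    principal: str     # principal annotated to execute the node


def knowledge(nodes, edges, p):
    """Minimal knowledge of p: nodes run by p plus every edge touching them."""
    own = {n for n in nodes if n.principal == p}
    touching = {(a, b) for (a, b) in edges if a in own or b in own}
    return own, touching


# Hypothetical plan fragment: two scans feeding a join run by the querier.
scan_sup = Node("scan", frozenset({"Supplies"}), "Inventory")
scan_pw = Node("scan", frozenset({"Polluted_Waters"}), "Pollution_Watch")
join = Node("join", frozenset({("name", "=", "pollutant")}), "Querier")
nodes = {scan_sup, scan_pw, join}
edges = {(scan_sup, join), (scan_pw, join)}

own, touching = knowledge(nodes, edges, "Inventory")
```

In this sketch Inventory learns only its own scan and the edge carrying that scan's output; what a principal can infer beyond this minimum is left open, as in the definition.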
Our Approach
● Have users define intensional regions
● Specify constraints on those regions
● Construct a query plan that respects those constraints
Make sure all operations involving these conditions are evaluated by a trusted server!
A Formal Definition of Querier Privacy
Given an intensional region I,
And a set of colluding adversaries A ⊆ P,
A globally-expanded query plan Q is said to be (I, A)-private iff κ_A(Q) ⊭ I
Where ⊨ denotes an inference procedure for extracting intensional knowledge from a collection of query plans.
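The definition can be sketched as a check over a small plan. The slides leave the inference procedure ⊨ abstract, so `entails` below is only a stand-in assumption: it treats a region as revealed when any of its nodes appears in the adversaries' combined knowledge. The tuple encoding of nodes, the function names, and both example plans are hypothetical.

```python
def kappa(nodes, edges, principals):
    """Combined minimal knowledge of a set of colluding principals."""
    own = {n for n in nodes if n[2] in principals}
    touching = {e for e in edges if e[0] in own or e[1] in own}
    return own | touching


def entails(known, region):
    """Stand-in for the inference procedure (an assumption, not the talk's):
    the region is revealed if any of its nodes is directly known."""
    return any(n in known for n in region)


def is_private(nodes, edges, region, adversaries):
    """(I, A)-privacy: the adversaries' knowledge must not entail the region."""
    return not entails(kappa(nodes, edges, adversaries), region)


# Nodes are (op, params, principal) tuples.
scan = ("scan", "Supplies", "Inventory")
sel_at_querier = ("select", "type = 'solvent'", "Querier")
sel_at_inventory = ("select", "type = 'solvent'", "Inventory")

plan_a = ({scan, sel_at_querier}, {(scan, sel_at_querier)})
plan_b = ({scan, sel_at_inventory}, {(scan, sel_at_inventory)})

private_a = is_private(*plan_a, region={sel_at_querier}, adversaries={"Inventory"})
private_b = is_private(*plan_b, region={sel_at_inventory}, adversaries={"Inventory"})
```

Under this simplified inference, evaluating the selection at the querier keeps it hidden from Inventory, while pushing it down to Inventory does not. A stronger inference procedure could classify more plans as non-private.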
Representing Query Plan Nodes
<op, params, p>
● op - Relational algebra operation
● params - Parameters to that operation
● p - Principal where the operation will be executed
Matching Against Query Tree Nodes
<scan, *, *>
<*, {(pollutant, =, name), (location, =, location)}, *>
<*, {('solvent')}, *>
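A rough sketch of how descriptors like the three above might be matched against plan nodes. The slides do not spell out the matching rules, so this assumes that `*` matches anything and that a descriptor parameter matches when all of its components appear in some parameter of the node. The function names and the example nodes are hypothetical.

```python
WILDCARD = "*"


def item_matches(d_item, params):
    """A descriptor item matches if some node parameter mentions all its parts."""
    return any(set(d_item) <= set(p) for p in params)


def matches(descriptor, node):
    """Match an <op, params, p> descriptor against an (op, params, p) node."""
    d_op, d_params, d_principal = descriptor
    op, params, principal = node
    if d_op != WILDCARD and d_op != op:
        return False
    if d_params != WILDCARD and not all(item_matches(i, params) for i in d_params):
        return False
    return d_principal == WILDCARD or d_principal == principal


# Hypothetical plan nodes for the example query.
scan = ("scan", [("Supplies",)], "Inventory")
sel = ("select", [("type", "=", "solvent")], "Inventory")
join = ("join", [("pollutant", "=", "name"), ("location", "=", "location")], "Querier")

d_scan = ("scan", WILDCARD, WILDCARD)
d_join = (WILDCARD, [("pollutant", "=", "name"), ("location", "=", "location")], WILDCARD)
d_const = (WILDCARD, [("solvent",)], WILDCARD)

matched = [n[0] for n in (scan, sel, join) if matches(d_const, n)]
```

Here `d_const` selects only the node that mentions the constant 'solvent', whichever operation or principal it belongs to.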
Constraining Dissemination of Intensional Regions
Node descriptors can contain free variables
Users author constraints on these free variables
<*, {(pollutant)}, $l>
$l = Querier
SELECT *
FROM Plants, Supplies, Polluted_Waters
WHERE Supplies.type = 'solvent'
  AND Supplies.name = Polluted_Waters.pollutant
  AND Polluted_Waters.location = Plants.location
  AND Plants.id = Supplies.plant_id
REQUIRING $l = Querier HOLDS OVER <*, {(pollutant)}, $l>;
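One way to read the REQUIRING clause above: bind the free variable $l to the principal of every node that mentions the pollutant attribute, then check the constraint on each binding. The sketch below follows that reading. The dictionary encoding of nodes, the helper names, and both candidate plans are hypothetical, and a real optimizer would apply such a check while enumerating plans rather than afterwards.

```python
def mentions(node, attribute):
    """True if any parameter tuple of the node contains the attribute."""
    return any(attribute in p for p in node["params"])


def requiring_holds(plan, attribute, required_principal):
    """Check `$l = required_principal HOLDS OVER <*, {(attribute)}, $l>`."""
    bindings = [n["principal"] for n in plan if mentions(n, attribute)]
    return all(l == required_principal for l in bindings)


# Two hypothetical plans that differ only in where the pollutant join runs.
plan_ok = [
    {"op": "scan", "params": [("Polluted_Waters",)], "principal": "Pollution_Watch"},
    {"op": "join", "params": [("name", "=", "pollutant")], "principal": "Querier"},
]
plan_bad = [
    {"op": "scan", "params": [("Polluted_Waters",)], "principal": "Pollution_Watch"},
    {"op": "join", "params": [("name", "=", "pollutant")], "principal": "Pollution_Watch"},
]

ok = requiring_holds(plan_ok, "pollutant", "Querier")
bad = requiring_holds(plan_bad, "pollutant", "Querier")
```

Only `plan_ok` satisfies the constraint, because its single node operating on pollutant is evaluated by the Querier.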
Extending SQL to Support Constraints
Balancing Privacy and Performance
W. Kießling. Foundations of preferences in database systems. VLDB, 2002.
All nodes operating on the pollutant attribute are evaluated by Querier &
( Query is estimated to take less than 2 minutes to run ⊗
All join operations are evaluated by Querier )
SELECT *
FROM Plants, Supplies, Polluted_Waters
WHERE Supplies.type = 'solvent'
  AND Supplies.name = Polluted_Waters.pollutant
  AND Polluted_Waters.location = Plants.location
  AND Plants.id = Supplies.plant_id
PREFERRING $l = Querier HOLDS OVER <*, {(pollutant)}, $l>
CASCADE LESSTHAN(runtime, 2) AND $l = Querier HOLDS OVER <join, *, $l>;
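The preference above combines a prioritized composition (&, written CASCADE) with a Pareto composition (⊗, written AND). A small sketch of how candidate plans could be compared under such a partial order follows. It assumes each plan has already been summarized by three hypothetical fields: whether pollutant nodes run at the Querier, an estimated runtime in minutes, and whether all joins run at the Querier. The function names and the three plans are illustrative only.

```python
def pareto_better(a, b):
    """a beats b if it is no worse on every score and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def prioritized_better(a, b):
    """Scores are (first, rest): `first` dominates, `rest` only breaks ties."""
    (a_first, a_rest), (b_first, b_rest) = a, b
    if a_first != b_first:
        return a_first > b_first
    return pareto_better(a_rest, b_rest)


def score(plan):
    """Encode: pollutant-at-Querier & (runtime under 2 minutes, joins-at-Querier)."""
    return (plan["pollutant_at_querier"],
            (plan["est_minutes"] < 2, plan["joins_at_querier"]))


all_local = {"pollutant_at_querier": True, "est_minutes": 5.0, "joins_at_querier": True}
balanced = {"pollutant_at_querier": True, "est_minutes": 1.5, "joins_at_querier": False}
fast = {"pollutant_at_querier": False, "est_minutes": 0.5, "joins_at_querier": False}

balanced_beats_fast = prioritized_better(score(balanced), score(fast))
balanced_beats_local = prioritized_better(score(balanced), score(all_local))
local_beats_balanced = prioritized_better(score(all_local), score(balanced))
```

Both plans that keep pollutant at the Querier beat `fast`, but neither of them beats the other: one wins on runtime, the other on join placement. That incomparability is what makes the order partial.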
Expressing Preferences in SQL
W. Kießling and G. Köstler. Preference SQL: Design, Implementation, Experiences. VLDB, 2002.
Related Work
● k-anonymity, l-diversity, t-closeness, differential privacy...
  ● All look at database privacy, though as a complement to our work
  ● Protect the privacy of those whose data is stored in the database
● Private Information Retrieval (PIR)
  ● Server support required for privacy to be achieved
  ● Our approach can utilize PIR techniques when they are available, applicable, and efficient
● Werner Kießling's work on partially ordered preferences
  ● Express preferences over query results
  ● We adapt his work to operate over query optimization
Conclusions and Ongoing Work
● How a query is evaluated in a distributed environment can drastically affect querier privacy
● We present a formalization of querier privacy, (I, A)-privacy, and further mechanisms for users to express their particular privacy preferences
● We have adapted Kießling's work on partially ordered preferences to query optimization as opposed to data retrieval
● We are currently modifying the PostgreSQL query optimizer to support (I, A)-privacy constraints.
Thank you. Questions?
This research was supported in part by the National Science Foundation under awards CCF–0916015, CNS–0964295, CNS–1017229, CNS–0914946, CNS–0747247, and CDI OIA–1028162; and by the K. C. Wong Education Foundation.