Introduction to Inference Control in Statistical Databases

Statistical database:

Goal [DEN83]: Provide frequencies, averages, and other statistics about persons.

Challenge: Preserving privacy [DEN83].

Interaction between client and database server [JAG07]:
▪ The client may not want the server to know what information it is querying.
▪ The server would like to ensure that the client does not learn sensitive information about the database.

Indirect access [DEN83]:

Indirect access is the indirect disclosure of sensitive data.

So, what is sensitive data? It is determined by the policies of the system.

Example: There is only one female professor in the department. Her salary can be obtained by subtracting the total salary of the male professors from the total salary of all professors.
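The subtraction in this example can be sketched in a few lines of Python. The table, the names, and the salary figures are made up for illustration, and `total_salary` is a hypothetical stand-in for a SUM query interface.

```python
# Hypothetical toy table: (name, sex, salary). All values are invented.
professors = [
    ("A", "M", 90_000),
    ("B", "M", 85_000),
    ("C", "F", 95_000),
    ("D", "M", 80_000),
]

def total_salary(rows, predicate=lambda row: True):
    """Answer a SUM(salary) query over the rows that match predicate."""
    return sum(row[2] for row in rows if predicate(row))

# Two statistics that each look harmless on their own.
total_all = total_salary(professors)
total_male = total_salary(professors, lambda row: row[1] == "M")

# Their difference is the salary of the only female professor.
inferred_female_salary = total_all - total_male
print(inferred_female_salary)
```

Neither query names an individual, yet together they disclose one record.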

What causes indirect access?

Indirect access takes place via inference.

Partial vs. full inference.

Inference channels:
▪ Database dependencies and integrity constraints.
▪ Domain knowledge.
▪ Query correlation.

What can we do?

Recall the example: There is only one female professor in the department, so her salary can be obtained by subtracting the total salary of the male professors from the total salary of all professors.

Approaches [ADA89]:

Query restriction [DEN83]:
▪ Restricting query-set size.
▪ Controlling query overlap [JAG07].
▪ Auditing.

Perturbation [DEN83], which means changing the data:
▪ Output perturbation.
▪ Data perturbation.

Conceptual frameworks, such as the conceptual model, the lattice model, etc.

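Query-set-size control [DEN83]:

A statistic is released only if the size of its query set satisfies a given condition.

The condition is based on what counts as a sensitive statistic, which depends on the policy of the system.

The query-set size C should satisfy K < C < L - K, where L is the size of the database and K is a threshold set by the policy.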
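A minimal sketch of this check in Python, using the strict inequalities exactly as written on the slide. The function and parameter names are illustrative assumptions, not part of [DEN83].

```python
def query_set_size_allowed(query_set_size, database_size, k):
    """Return True if K < C < L - K holds for this query set.

    query_set_size is C, database_size is L, k is the policy threshold K.
    """
    return k < query_set_size < database_size - k

# Example with a hypothetical database of L = 100 records and K = 2.
print(query_set_size_allowed(5, 100, 2))   # a mid-sized query set
print(query_set_size_allowed(1, 100, 2))   # a single record
print(query_set_size_allowed(99, 100, 2))  # all records but one
```

The upper bound matters as much as the lower one: a query set that covers all records but one would reveal that one record by subtraction from the total.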

Query-set-size control [DEN83] (continued):

It is memoryless: each query is checked on its own, without regard to earlier queries.

Trackers can subvert it:
▪ Pad a small query set with enough extra records to put it in the allowable range.
▪ Subtract the effect of the padding.
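The padding idea can be illustrated with a toy example. Everything here is hypothetical: the records, the threshold K = 2, and the choice of the Math department as padding. It is a sketch of the general idea, not of a specific tracker construction from [DEN83].

```python
K = 2  # hypothetical policy threshold

# Hypothetical records: (department, sex, salary in thousands).
records = [
    ("CS", "F", 95), ("CS", "M", 90), ("CS", "M", 85),
    ("CS", "M", 80), ("CS", "M", 70), ("CS", "M", 75),
    ("Math", "M", 60), ("Math", "F", 65), ("Math", "M", 62), ("Math", "F", 68),
]

def restricted_sum(predicate):
    """SUM(salary) under query-set-size control; None means the query is refused."""
    query_set = [r for r in records if predicate(r)]
    if not (K < len(query_set) < len(records) - K):
        return None
    return sum(r[2] for r in query_set)

def is_target(r):
    # Matches exactly one record, so a direct query is refused.
    return r[0] == "CS" and r[1] == "F"

def is_padding(r):
    # Four records that are disjoint from the target.
    return r[0] == "Math"

direct = restricted_sum(is_target)                                # refused (size 1)
padded = restricted_sum(lambda r: is_target(r) or is_padding(r))  # allowed (size 5)
padding_only = restricted_sum(is_padding)                         # allowed (size 4)

# Subtracting the effect of the padding leaves the target's salary.
inferred = padded - padding_only
print(direct, inferred)
```

Both answered queries pass the size check, which is why a control that looks at one query at a time cannot stop this.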

Query-set-overlap control:

Restricts the number of overlapping entities among successive queries of a given user.

Drawbacks:
▪ Ineffective against several users cooperating to compromise the database.
▪ Statistics for both a set and its subset cannot be released.
▪ A profile must be kept for each user.
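A small sketch of overlap control, assuming query sets are available as sets of record identifiers. The class name, the `max_overlap` parameter, and the per-user history layout are illustrative assumptions.

```python
class OverlapController:
    """Refuses a query whose query set overlaps too much with an earlier one."""

    def __init__(self, max_overlap):
        self.max_overlap = max_overlap
        self.history = {}  # user id -> list of query sets already answered

    def allow(self, user, query_set):
        query_set = frozenset(query_set)
        previous = self.history.setdefault(user, [])
        # Compare against every query set this user has already been answered.
        if any(len(query_set & old) > self.max_overlap for old in previous):
            return False
        previous.append(query_set)
        return True

control = OverlapController(max_overlap=1)
print(control.allow("alice", {1, 2, 3}))  # first query
print(control.allow("alice", {3, 4, 5}))  # shares one record with the first
print(control.allow("alice", {1, 2}))     # subset of the first query
print(control.allow("bob", {1, 2}))       # same query from a different user
```

The last two calls show two of the listed drawbacks: a subset of an answered set is refused, while a second user can still obtain it. The `history` dictionary is the per-user profile that has to be stored.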

Auditing:

▪ Keeps up-to-date logs of all queries made by each user.
▪ Constantly checks for possible compromise whenever a new query is issued.

Drawbacks:
▪ CPU usage.
▪ Storage requirements.
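One way to picture the compromise check is the following sketch for SUM queries only. It assumes every query is a sum over a known set of record indices and treats a record as compromised when its value alone can be derived from the answered sums by linear combination. The class and method names are hypothetical, and this is one possible audit strategy, not the method of any cited paper.

```python
from fractions import Fraction

class SumQueryAuditor:
    """Logs answered SUM queries as rows of a reduced row-echelon matrix."""

    def __init__(self, n_records):
        self.n = n_records
        self.rows = []  # reduced row-echelon form of all answered queries

    def _reduce(self, rows):
        # Plain Gauss-Jordan elimination with exact rational arithmetic.
        rows = [r[:] for r in rows]
        pivot_row = 0
        for col in range(self.n):
            src = next((i for i in range(pivot_row, len(rows))
                        if rows[i][col] != 0), None)
            if src is None:
                continue
            rows[pivot_row], rows[src] = rows[src], rows[pivot_row]
            pivot = rows[pivot_row][col]
            rows[pivot_row] = [x / pivot for x in rows[pivot_row]]
            for i in range(len(rows)):
                if i != pivot_row and rows[i][col] != 0:
                    factor = rows[i][col]
                    rows[i] = [a - factor * b
                               for a, b in zip(rows[i], rows[pivot_row])]
            pivot_row += 1
        return rows[:pivot_row]  # drop all-zero rows

    def ask(self, query_set):
        """Return True and log the query if answering it exposes no single record."""
        new_row = [Fraction(1 if i in query_set else 0) for i in range(self.n)]
        candidate = self._reduce(self.rows + [new_row])
        # A reduced row with one nonzero entry pins down one record's value.
        if any(sum(1 for x in row if x != 0) == 1 for row in candidate):
            return False
        self.rows = candidate
        return True

auditor = SumQueryAuditor(n_records=4)
print(auditor.ask({0, 1, 2, 3}))  # sum over everyone
print(auditor.ask({0, 1, 2}))     # together with the first, isolates record 3
print(auditor.ask({0, 1}))        # does not isolate any single record
```

The elimination step on every new query and the growing log of past queries correspond to the CPU and storage costs listed as drawbacks.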

Perturbation:

Probability distribution: Replaces the database with another sample drawn from the same distribution.

Fixed data perturbation: The values of attributes in the database are perturbed once and for all.

Bias problem: Perturbation can bias quantities such as conditional means and frequencies.
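Fixed data perturbation can be sketched as follows. The uniform noise, the `noise_scale` and `seed` parameters, and the salary values are illustrative assumptions; the slides do not specify a noise distribution.

```python
import random

def perturb_once(values, noise_scale, seed):
    """Return a perturbed copy of values; meant to be called once, then stored."""
    rng = random.Random(seed)
    return [v + rng.uniform(-noise_scale, noise_scale) for v in values]

def mean(values):
    return sum(values) / len(values)

# Hypothetical original attribute values.
original = [50_000, 62_000, 58_000, 71_000]

# The perturbed copy is computed once and answers all later queries.
published = perturb_once(original, noise_scale=1_000, seed=0)

first_answer = mean(published)
second_answer = mean(published)
print(first_answer == second_answer)
```

Because the noise is fixed, asking the same query repeatedly returns the same answer, so the noise cannot be averaged away by repetition.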

Bias problem:

X: Original value of an attribute.

Y = X + a: Perturbed value.

Consider the set of entities that have perturbed value w. Matloff shows that E(X | Y = w) is not necessarily equal to w.
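A small exact calculation makes the bias concrete. The two distributions below are invented for illustration, and X and a are assumed to be independent; the code enumerates every (x, a) pair with x + a = w and computes E(X | Y = w) with exact fractions.

```python
from fractions import Fraction

# Hypothetical distribution of the original value X (skewed toward 40).
p_x = {40: Fraction(4, 5), 60: Fraction(1, 5)}

# Hypothetical zero-mean noise a, assumed independent of X.
p_a = {-10: Fraction(1, 3), 0: Fraction(1, 3), 10: Fraction(1, 3)}

def conditional_mean_of_x(w):
    """Compute E(X | Y = w) for Y = X + a by enumerating all (x, a) pairs."""
    numerator = Fraction(0)
    denominator = Fraction(0)
    for x, px in p_x.items():
        for a, pa in p_a.items():
            if x + a == w:
                numerator += x * px * pa
                denominator += px * pa
    return numerator / denominator

print(conditional_mean_of_x(50))
```

With these numbers, an entity whose perturbed value is 50 is four times as likely to have started at 40 as at 60, so the conditional mean of X falls below 50 even though the noise itself has mean zero.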

Conceptual model:

Main idea:
▪ A statistical database contains information about more than one population.
▪ The security-control component should be aware of the relationships between populations and their security issues.
▪ Users' knowledge should be taken into account.

Conceptual model (continued):

Population definition:
▪ Allowed statistical queries for each attribute of the population.
▪ History of changes.
▪ Relationships.

User knowledge construct:
▪ A process that keeps track of the properties of each user.
▪ Describes the user's knowledge from earlier queries as well as any supplementary knowledge.

Conceptual model (continued):

Constraint enforcer and checker: A process that enforces the security constraints.

Conclusion:

No single security-control method satisfies all objectives.

The choice of a security-control method depends on the application.

References:

[DEN83] D. E. Denning, J. Schlörer, "Inference Controls for Statistical Databases", Computer, vol. 16, no. 7, pp. 69-82, 1983.

[JAG07] G. Jagannathan, R. N. Wright, "Private Inference Control for Aggregate Database Queries", Proceedings of the 7th IEEE International Conference on Data Mining Workshops, pp. 711-716, 2007.

[FAR02] C. Farkas, S. Jajodia, "The Inference Problem: A Survey", SIGKDD Explor. Newsl., vol. 4, no. 2, pp. 6-11, 2002.

[ADA89] N. R. Adam, J. C. Wortmann, "Security-control Methods for Statistical Databases: A Comparative Study", ACM Computing Surveys, vol. 21, no. 4, pp. 515-556, 1989.