1
Introduction to Inference Control in Statistical Databases
2
Statistical database:
Goal [DEN83]: Provide frequencies, averages, and other statistics about groups of persons.
Challenge: Preserving privacy [DEN83].
Interaction between client and database server [JAG07]:
▪ The client may not want the server to know what information it is querying.
▪ The server would like to ensure that the client does not learn sensitive information about the database.
3
Indirect access [DEN83]:
Indirect disclosure of sensitive data.
So, what is sensitive data? It is determined by the policies of the system.
Example:
▪ There is only one female professor in the department.
▪ Her salary can be obtained by subtracting the total salary of the male professors from the total salary of all professors.
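The subtraction in this example can be sketched in a few lines. The table, names, and salary figures below are invented for illustration; only the idea of differencing two permitted aggregates comes from the slide.

```python
# Hypothetical salary table; names and figures are invented for illustration.
professors = [
    {"name": "A", "sex": "M", "salary": 90_000},
    {"name": "B", "sex": "M", "salary": 85_000},
    {"name": "C", "sex": "F", "salary": 97_000},
    {"name": "D", "sex": "M", "salary": 78_000},
]


def total_salary(rows, predicate=lambda r: True):
    """Aggregate query: the only kind of access the user is given."""
    return sum(r["salary"] for r in rows if predicate(r))


total_all = total_salary(professors)
total_male = total_salary(professors, lambda r: r["sex"] == "M")

# The difference of two permitted aggregates is one individual's salary.
inferred_female_salary = total_all - total_male
```

Neither query names an individual, yet together they disclose a single record.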
4
What causes indirect access?
Indirect access takes place via inference.
Partial vs. full inference.
Inference channels:
▪ Database dependencies and integrity constraints.
▪ Domain knowledge.
▪ Query correlation.
5
What can we do?
Example:
▪ There is only one female professor in the department.
▪ Her salary can be obtained by subtracting the total salary of the male professors from the total salary of all professors.
6
Approaches [ADA89]:
Query restriction [DEN83]:
▪ Restricting query-set size.
▪ Controlling query overlap [JAG07].
▪ Auditing.
Perturbation [DEN83] (changing the data):
▪ Output perturbation
▪ Data perturbation
Conceptual frameworks such as the conceptual model and the lattice model.
7
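Query-set-size control [DEN83]:
A statistic is released only if the size of the query set satisfies a specified condition.
▪ The condition is based on which statistics are sensitive.
▪ It depends on the policy of the system.
The query-set size C should satisfy K < C < L - K, where L is the size of the database and K is a threshold chosen by the policy.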
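A minimal sketch of this check, assuming strict inequalities exactly as written above. The function names and the ten-record toy database are hypothetical.

```python
def size_control_allows(query_set_size, db_size, k):
    """Return True if the statistic may be released under K < C < L - K."""
    return k < query_set_size < db_size - k


def restricted_count(rows, predicate, k):
    """Answer a COUNT query, or return None when the query set is too small or too large."""
    matching = [r for r in rows if predicate(r)]
    if not size_control_allows(len(matching), len(rows), k):
        return None  # query refused
    return len(matching)


# Toy database of ten record ids, with a hypothetical threshold K = 2.
rows = list(range(10))
too_small = restricted_count(rows, lambda r: r == 0, k=2)   # query set of size 1
too_large = restricted_count(rows, lambda r: r != 0, k=2)   # query set of size 9
allowed = restricted_count(rows, lambda r: r < 5, k=2)      # query set of size 5
```

The upper bound matters as much as the lower one: a query set of size L - 1, such as the male professors in the earlier example, is the complement of a single record and is refused for any K of at least 1.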
8
Query-set-size control [DEN83]:
It is memoryless: each query is checked in isolation.
Trackers can subvert it:
▪ Pad small query sets with enough extra records to put them in the allowable range.
▪ Subtract the effect of the padding.
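The two tracker steps can be sketched as follows. The records, the threshold K = 2, and the choice of the Sales department as padding are all assumptions made for illustration.

```python
K = 2  # hypothetical threshold

# Invented records: (department, salary). Exactly one person works in Legal.
records = [
    ("Sales", 50_000), ("Sales", 52_000), ("Sales", 48_000), ("Sales", 51_000),
    ("Eng", 70_000), ("Eng", 72_000), ("Eng", 69_000), ("Eng", 75_000),
    ("Eng", 71_000),
    ("Legal", 88_000),
]


def restricted_sum(predicate):
    """Sum of salaries, released only if K < |query set| < L - K."""
    matching = [salary for dept, salary in records if predicate(dept)]
    if not (K < len(matching) < len(records) - K):
        return None  # refused by query-set-size control
    return sum(matching)


# Asking about the target directly is refused: the query set has size 1.
direct = restricted_sum(lambda d: d == "Legal")

# Step 1: pad the target with the four Sales records (query set of size 5).
padded = restricted_sum(lambda d: d in ("Legal", "Sales"))
# Step 2: query the padding alone (size 4) and subtract its effect.
padding = restricted_sum(lambda d: d == "Sales")
inferred = padded - padding
```

Both padded queries pass the size check on their own, which is why a memoryless control cannot stop the attack.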
9
Query-set-overlap control:
Restricts the number of entities that successive queries of a given user may have in common.
Drawbacks:
▪ Ineffective against several users cooperating to compromise the database.
▪ Statistics for both a set and its subset cannot be released.
▪ A profile must be kept for each user.
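A minimal sketch of overlap control and of its drawbacks. The class name, the `max_overlap` parameter, and the representation of query sets as sets of record ids are assumptions for illustration.

```python
class OverlapController:
    """Refuse a query whose set shares too many records with an earlier query
    by the same user. One history list is kept per user (the 'user profile')."""

    def __init__(self, max_overlap):
        self.max_overlap = max_overlap
        self.history = {}  # user -> list of previously answered query sets

    def request(self, user, query_set):
        query_set = frozenset(query_set)
        past = self.history.setdefault(user, [])
        if any(len(query_set & old) > self.max_overlap for old in past):
            return False  # refused: overlaps too much with an answered query
        past.append(query_set)
        return True


ctl = OverlapController(max_overlap=1)
first = ctl.request("alice", {1, 2, 3, 4})   # answered
subset = ctl.request("alice", {1, 2, 3})     # refused: a subset overlaps in 3 records
colluder = ctl.request("bob", {1, 2, 3})     # answered: bob has no history
```

The last line shows the collusion drawback: alice and bob can pool their answers to difference out record 4, although neither history alone violates the limit.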
10
Auditing:
Keeping up-to-date logs of all queries made by each user.
Constantly checking for possible compromise whenever a new query is issued.
Drawbacks:
▪ CPU usage.
▪ Storage requirements.
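The slide does not say how the compromise check is done. One possible check, assumed here for SUM queries only, treats each answered query as a 0/1 vector over the records and refuses a new query if the logged vectors could then be combined to isolate a single record. All names below are hypothetical.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form over the rationals (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == len(m):
            break
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pr is None:
            continue  # no pivot in this column
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [x / pivot for x in m[pivot_row]]
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m


def discloses_individual(query_vectors):
    """True if some linear combination of the queries isolates one record,
    i.e. the reduced form contains a row with exactly one nonzero entry."""
    return any(sum(1 for x in row if x != 0) == 1 for row in rref(query_vectors))


class SumAuditor:
    def __init__(self, n_records):
        self.n = n_records
        self.log = {}  # user -> list of answered query vectors

    def request(self, user, record_ids):
        vec = [1 if i in record_ids else 0 for i in range(self.n)]
        past = self.log.setdefault(user, [])
        if discloses_individual(past + [vec]):
            return False  # answering would compromise a record
        past.append(vec)
        return True


auditor = SumAuditor(n_records=4)
a1 = auditor.request("u", {0, 1, 2, 3})  # answered
a2 = auditor.request("u", {0, 1, 2})     # refused: difference isolates record 3
a3 = auditor.request("u", {0, 1})        # answered: no single record is exposed
```

The log grows with every answered query and the elimination is rerun on each request, which makes the CPU and storage drawbacks concrete.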
12
Perturbation:
Probability distribution: replaces the database with another sample from the same distribution.
Fixed data perturbation: the values of attributes in the database are perturbed once and for all.
Bias problem: perturbation can bias quantities such as conditional means and frequencies.
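A minimal sketch of fixed data perturbation, assuming additive zero-mean Gaussian noise. The noise model, the seed, and the salary values are illustrative choices, not taken from the slides.

```python
import random


def perturb_once(values, noise_sd, seed):
    """Add zero-mean Gaussian noise to every value, one time only.
    The result replaces the original data for all later queries."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, noise_sd) for v in values]


def mean(xs):
    return sum(xs) / len(xs)


salaries = [50_000, 52_000, 48_000, 88_000]          # invented true values
published = perturb_once(salaries, noise_sd=1_000, seed=7)

# Every query runs against the same perturbed copy, so repeating a query
# returns the same answer and the noise cannot be averaged away.
first_answer = mean(published)
second_answer = mean(published)
```

Because the noise is drawn once, answers are consistent across queries, at the cost of the bias described above.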
13
Bias problem:
X: original value of an attribute.
Y = X + a: perturbed value, where a is the added noise.
Consider the set of entities whose perturbed value is w: Matloff shows that E(X | Y = w) is not necessarily equal to w.
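A worked instance of this bias, under an assumption that is not in the slides: X and the noise a are independent and normally distributed, with a having zero mean. The symbols \mu, \sigma_X^2 and \sigma_a^2 are introduced here for illustration.

```latex
\begin{align*}
Y &= X + a, \qquad X \sim \mathcal{N}(\mu, \sigma_X^2), \quad
     a \sim \mathcal{N}(0, \sigma_a^2), \quad X \text{ independent of } a \\
\operatorname{Cov}(X, Y) &= \sigma_X^2, \qquad
\operatorname{Var}(Y) = \sigma_X^2 + \sigma_a^2 \\
E(X \mid Y = w) &= \mu + \frac{\sigma_X^2}{\sigma_X^2 + \sigma_a^2}\,(w - \mu)
\end{align*}
```

In this case the conditional mean is pulled from w toward \mu, and it equals w only when w = \mu or when no noise is added (\sigma_a^2 = 0).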
14
Conceptual model:
Main idea:
▪ A statistical database contains information about more than one population.
▪ The security-control component should be aware of the relationships between populations and their security issues.
▪ Users' knowledge should be taken into account.
15
Conceptual model:
Population definition:
▪ Allowed statistical queries for each attribute of the population.
▪ History of changes.
▪ Relationships.
User knowledge construct:
▪ A process that keeps track of the properties of each user.
▪ Describes the user's knowledge from earlier queries as well as any supplementary knowledge.
16
Conceptual model:
Constraint enforcer and checker: a process that enforces the security constraints.
17
Conclusion:
No single security-control method satisfies all objectives.
The choice of security-control method depends on the application.
18
References:
[DEN83] D. E. Denning, J. Schlörer, “Inference Controls for Statistical Databases”, IEEE Computer, vol. 16, no. 7, pp. 69-82, 1983.
[JAG07] G. Jagannathan, R. N. Wright, “Private Inference Control for Aggregate Database Queries”, Proceedings of the 7th IEEE International Conference on Data Mining Workshops, pp. 711-716, 2007.
[FAR02] C. Farkas, S. Jajodia, “The Inference Problem: A Survey”, SIGKDD Explorations Newsletter, vol. 4, no. 2, pp. 6-11, 2002.
[ADA89] N. R. Adam, J. C. Wortmann, “Security-Control Methods for Statistical Databases: A Comparative Study”, ACM Computing Surveys, vol. 21, no. 4, pp. 515-556, 1989.