On the Origin of Data
Daniel Deutch
Blavatnik School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences
Data Evolvement
• This is the era of Data.
  – Databases, text, blogs, social data, …
  – Huge volumes
• Evolving Through Automatic Tools
• Sent Between Applications and Users
Provenance
Data Provenance
• Understanding how and why data has evolved is of fundamental importance
  – For authentication
    • Both origin and propagators of data should be trustworthy
  – For access control
    • Confidentiality constraints interplay with the transformation
  – For hypothetical reasoning
    • What if we change a piece of data?
    • How can we optimally affect data evolvement?
Example
• Alice posted photos with David
• David is worried about Eve seeing his photos
[Figure: provenance expression of the form ( ) OR ( ( ( ) OR ( ) ) AND NOT ( ) ), with the operands shown as images]
Tracking Provenance
• The logic is already implemented (e.g. to decide what photos to show)
• We develop tools to “instrument” applications with provenance tracking.
• Simply maintaining an “activity log” is not good enough.
– We also want the possible “reasons” for activities
– E.g. “not blacklisted” is not an activity
• Instead we create formulas in generic algebraic constructions based on semirings
• We also develop tools that use the provenance information for analysis.
Generic Expression
[Figure: generic expression of the form ( ) OR ( ( ( ) OR ( ) ) AND NOT ( ) ), with the operands shown as images]
Trust:
False OR ( ( True OR True ) AND NOT False ) = True
Number of paths (if Alice and Eve are not friends):
0 + ( ( 1 + 1 ) × 1 ) = 2
Latency:
min ( ( 0:05 min 0:08 ) + 0:00 ) = 0:05
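The idea of evaluating one generic expression under several interpretations can be sketched in a few lines of Python. This is only an illustrative sketch, not the tools described in the talk: the variable names `a`, `b`, `c`, `not_d`, the `Semiring` class and the `evaluate` function are all hypothetical. It also makes two assumptions that the slide does not state: the negated operand is treated as a single pre-negated token with its own value, and the first operand is given the value infinity in the latency reading, since the slide omits it there.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container: how OR and AND are interpreted."""
    plus: Callable[[Any, Any], Any]   # interpretation of OR
    times: Callable[[Any, Any], Any]  # interpretation of AND


def evaluate(expr, semiring, env):
    """Evaluate a nested-tuple expression; strings are variable names."""
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    lhs = evaluate(left, semiring, env)
    rhs = evaluate(right, semiring, env)
    return semiring.plus(lhs, rhs) if op == "OR" else semiring.times(lhs, rhs)


# a OR ((b OR c) AND not_d); "not_d" is a pre-negated token (assumption).
EXPR = ("OR", "a", ("AND", ("OR", "b", "c"), "not_d"))

BOOLEAN = Semiring(plus=lambda x, y: x or y, times=lambda x, y: x and y)
COUNTING = Semiring(plus=lambda x, y: x + y, times=lambda x, y: x * y)
TROPICAL = Semiring(plus=min, times=lambda x, y: x + y)

# Trust: False OR ((True OR True) AND NOT False)
trust = evaluate(EXPR, BOOLEAN,
                 {"a": False, "b": True, "c": True, "not_d": True})

# Number of paths: 0 + ((1 + 1) x 1)
paths = evaluate(EXPR, COUNTING, {"a": 0, "b": 1, "c": 1, "not_d": 1})

# Latency: 5 and 8 stand for 0:05 and 0:08; "a" = infinity is an assumption.
latency = evaluate(EXPR, TROPICAL,
                   {"a": float("inf"), "b": 5, "c": 8, "not_d": 0})

print(trust, paths, latency)
```

The expression is built once and never changes; only the pair of operations and the values assigned to the variables differ between the three readings.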
Provenance for SQL Queries
• Amsterdamer, D., Tannen, Provenance for Aggregate Queries [PODS ’11]
• Amsterdamer, D., Tannen, On the Limitations of Provenance for Queries with Difference [TaPP ’11]
• D., Milo, Roy, Tannen, Circuits for Datalog Provenance [ICDT ’14]
• Amsterdamer, D., Green, Karvounarakis, Tannen, Semiring-based Provenance for SQL Queries (In preparation)
• D., Moskovitch, Provenance for Relational Updates (In preparation)
Emps
Dep.   Emp    Prov.
Eng.   Alice  S
Eng.   Bob    T
Sales  Carol  S

GoodEmps
Emp    Prov.
Alice  C
Bob    S
Carol  T

πDep(Emps ⨝ GoodEmps)
Dep.   Prov.
Eng.   S·C + T·S = S + T = S
Sales  S·T = T
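The provenance column of the query result above can be reproduced with a short Python sketch. The slide does not name the semiring it uses, so the interpretation here is an assumption: annotations are read as clearance levels ordered C < S < T, the product `·` (used by the join) takes the higher level, and the sum `+` (used by the projection) takes the lower one. This reading agrees with the values on the slide. The function names `times`, `plus` and `project_dep_of_join` are hypothetical.

```python
# Assumed ordering of the annotations: C < S < T.
LEVELS = {"C": 0, "S": 1, "T": 2}


def times(x, y):
    """Join: both tuples are needed, so keep the higher level (assumption)."""
    return x if LEVELS[x] >= LEVELS[y] else y


def plus(x, y):
    """Projection: either derivation suffices, so keep the lower level (assumption)."""
    return x if LEVELS[x] <= LEVELS[y] else y


# (Dep., Emp, Prov.) rows of Emps and Emp -> Prov. for GoodEmps, from the slide.
EMPS = [("Eng.", "Alice", "S"), ("Eng.", "Bob", "T"), ("Sales", "Carol", "S")]
GOOD_EMPS = {"Alice": "C", "Bob": "S", "Carol": "T"}


def project_dep_of_join(emps, good_emps):
    """Compute the projection on Dep. of Emps joined with GoodEmps, with provenance."""
    result = {}
    for dep, emp, prov in emps:
        if emp not in good_emps:
            continue  # no join partner, so no contribution
        joined = times(prov, good_emps[emp])
        # Tuples that project to the same department are combined with plus.
        result[dep] = joined if dep not in result else plus(result[dep], joined)
    return result


result = project_dep_of_join(EMPS, GOOD_EMPS)
print(result)
```

For Eng. the sketch computes S·C = S and T·S = T and then S + T = S; for Sales it computes S·T = T.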
Provenance for Social and Web Data
• Bienvenu, D., Suchanek, Provenance for Web 2.0 Data [Secure Data Management ’12]
• Abiteboul, Bienvenu, D., Deduction in the Presence of Distribution and Contradictions [WebDB ’12]
• Abiteboul, D., Vianu, Deduction with Contradictions in Datalog [ICDT ’14]
• Amarilli, D., Senellart, Provenance for Order-Aware Transformations (In preparation)
PROPOLIS: Provenance for Process Analysis
• D., Moskovitch, Tannen, PROPOLIS: Provisioned Analysis of Data-Centric Processes [VLDB ’13]
• D., Moskovitch, Tannen, A Provenance Framework for Data-Dependent Process Analysis (Submitted)
• D., Moskovitch, Provenance for Distributed Processes (In preparation)
Thank you!