
Incremental Reasoning on Streams and Rich Background Knowledge


DESCRIPTION

The presentation I gave at ESWC 2010 in Heraklion, Greece, June 1st, 2010

Citation preview

Page 1: Incremental Reasoning on Streams and Rich Background Knowledge

• For more information visit http://wiki.larkc.eu/UrbanComputing

Incremental Reasoning on Streams and Rich Background Knowledge

http://streamreasoning.org

Emanuele Della Valle
DEI - Politecnico di Milano
[email protected]
http://emanueledellavalle.org

Joint work with: Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, and Michael Grossniklaus

Page 2: Incremental Reasoning on Streams and Rich Background Knowledge

Emanuele Della Valle - visit http://streamreasoning.org

Agenda

• Motivation

• Background

• Stream Reasoning Concept

• Past Achievements

• Main Contribution

• Retrospective and Conclusions

ESWC 2010, Heraklion, Greece, June 1st, 2010

Page 3: Incremental Reasoning on Streams and Rich Background Knowledge

Motivation

It's a Streaming World! [IEEE-IS2009]

Sensor networks, …

traffic engineering, …

social networking, …

… and many others …

generate streams!

Page 4: Incremental Reasoning on Streams and Rich Background Knowledge

Motivation

Questions People are Asking

• Given this brand of turbine, what is the expected time to failure when the bearing starts to vibrate as currently detected?

• Is a traffic jam going to happen on this highway? And is it then convenient to reallocate travelers based upon the forecast?

• Who are the opinion makers? I.e., the users who are likely to influence the behavior of other users who follow them

Page 5: Incremental Reasoning on Streams and Rich Background Knowledge

Motivation

Problem Statement

• Making sense
  – in real time
  – of gigantic and inevitably noisy data streams
  – in order to support the decision process of extremely large numbers of concurrent users

Page 6: Incremental Reasoning on Streams and Rich Background Knowledge

Background

What are data streams anyway?

• Formally:
  – Data streams are unbounded sequences of time-varying data elements

• Less formally:
  – an (almost) “continuous” flow of information
  – with the recent information being more relevant, as it describes the current state of a dynamic system

[Figure: a data stream along a time axis]

Page 7: Incremental Reasoning on Streams and Rich Background Knowledge

Background

Can the Semantic Web Process Data Streams?

• The Semantic Web, the Web of Data, is doing fine
  – RDF, RDF Schema, SPARQL, OWL, DL
  – well understood theory
  – rapid increase in scalability

• BUT it pretends that the world is static, or at best changes at a low rate, both in change-volume and change-frequency
  – ontology versioning
  – belief revision
  – time stamps on named graphs

• It sticks to the traditional one-time semantics

Page 8: Incremental Reasoning on Streams and Rich Background Knowledge

Background

Continuous Semantics

• Processing data streams in the space of one-time semantics is difficult because of the very nature of the underlying data

• Innovative* assumption: continuous semantics!
  – streams can be consumed on the fly rather than being stored forever, and
  – queries are registered and continuously produce answers

* This innovation arose in the DB community in the ’90s

Page 9: Incremental Reasoning on Streams and Rich Background Knowledge

Background

Stream Processing

• Continuous queries are registered over streams that are observed through windows

[Figure: an input stream is observed through a window by a registered continuous query, which emits a stream of answers]
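As a rough illustration of the windowing idea on this slide, the Python sketch below keeps only the elements that entered within the last `width` time units and re-runs a trivial query on each arrival. The class `TimeWindow`, the function `count_query` and the sample data are hypothetical names chosen for this example; they are not part of any DSMS, and the sketch assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class TimeWindow:
    """Time-based sliding window over (item, timestamp) pairs (illustrative only)."""

    def __init__(self, width):
        self.width = width      # window length, in the same unit as the timestamps
        self.items = deque()    # pairs kept in arrival order

    def push(self, item, ts):
        """Add a new element, then evict the elements that left the window."""
        self.items.append((item, ts))
        self.evict(ts)

    def evict(self, now):
        # An element entering at time t leaves the window at time t + width.
        while self.items and self.items[0][1] + self.width <= now:
            self.items.popleft()

    def content(self):
        return [item for item, _ in self.items]


def count_query(window):
    """A trivial 'registered query': count the elements currently visible."""
    return len(window.content())


w = TimeWindow(width=10)
answers = []
for item, ts in [("a", 1), ("b", 4), ("c", 12), ("d", 15)]:
    w.push(item, ts)
    answers.append(count_query(w))  # one answer per arrival: a stream of answers
```

The query never sees the whole stream, only the window content, which is what lets the stream be consumed on the fly instead of being stored forever.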

Page 10: Incremental Reasoning on Streams and Rich Background Knowledge

Background

Key Optimization in Stream Processing

• When a continuous query is registered, generate a query execution plan

– New plan merged with existing plans

– Global scheduler for plan execution maximizing experience gathered with previous queries.

[Figure: execution plans of queries Q1 and Q2 sharing join (⋈) operators and states (State1 to State4) over Stream1, Stream2 and Stream3, coordinated by a Scheduler]

Page 11: Incremental Reasoning on Streams and Rich Background Knowledge

Background

Data Stream Management Systems (DSMS)

• Research Prototypes
  – Amazon/Cougar (Cornell) – sensors
  – Aurora (Brown/MIT) – sensor monitoring, dataflow
  – Gigascope (AT&T Labs) – network monitoring
  – Hancock (AT&T) – telecom streams
  – Niagara (OGI/Wisconsin) – Internet DBs & XML
  – OpenCQ (Georgia) – triggers, view maintenance
  – Stream (Stanford) – general-purpose DSMS
  – Stream Mill (UCLA) – power & extensibility
  – Tapestry (Xerox) – publish/subscribe filtering
  – Telegraph (Berkeley) – adaptive engine for sensors
  – Tribeca (Bellcore) – network monitoring

• High-tech startups
  – Streambase, Coral8, Apama, Truviso

• Major DBMS vendors are all adding stream extensions as well
  – Oracle http://www.oracle.com/technology/products/dataint/htdocs/streams_fo.html
  – DB2 http://www.eweek.com/c/a/Database/IBM-DB2-Turns-25-and-Prepares-for-New-Life/

Page 12: Incremental Reasoning on Streams and Rich Background Knowledge

Concept

Stream Reasoning [IEEE-IS2009,Dagstuhl2010]

• Idea origination
  – Can continuous semantics be ported to reasoning?
  – This is an unexplored yet high-impact research area!

• Stream Reasoning
  – Logical reasoning in real time on gigantic and inevitably noisy data streams in order to support the decision process of extremely large numbers of concurrent users.
    -- S. Ceri, E. Della Valle, F. van Harmelen and H. Stuckenschmidt, 2010

• Note: making sense of streams necessarily requires processing them against rich background knowledge

Page 13: Incremental Reasoning on Streams and Rich Background Knowledge

Concept

Research Challenges (selection) [IEEE-IS2009]

• Relation with data-stream systems
  – Just as RDF relates to database systems?

• Query languages for semantic streams
  – Just as SPARQL for RDF, but with continuous semantics?

• Reasoning on Streams
  – Efficient incremental updates of deductive closures?
  – How to combine streams and background knowledge?

• Distributed and parallel processing
  – Streams are parallel in nature

• Real-time constraints
  – A reasoning task must be completed before the answer becomes useless

Page 14: Incremental Reasoning on Streams and Rich Background Knowledge

Past Achievements

Explored Continuous Semantics for SeWeb

• We gave up one-time semantics in the Semantic Web and explored the benefits provided by continuous semantics when dealing with streams

• We investigated
  – RDF streams [WWW2009]
    • the natural extension of the RDF data model to the new continuous scenario, and
  – Continuous SPARQL (or simply C-SPARQL) [WWW2009, EDBT2010]
    • a syntactic and semantic extension of SPARQL for querying RDF streams

Page 15: Incremental Reasoning on Streams and Rich Background Knowledge

Past Achievements

RDF Stream

• RDF Stream Data Type
  – Ordered sequence of pairs, where each pair is made of an RDF triple and its timestamp t

    (< triple >, t)

• E.g.,

(<:ourmaninsa :isIn :Munich>, 2010-05-31T18:34:41)

(<:MadamMichelle :isIn :SouthAfrica >, 2010-05-31T18:24:28)

(<:Ayngelina :isIn :Nicaragua >, 2010-05-31T18:19:21)
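One possible way to model the RDF stream data type in code is sketched below in Python. The type names `Triple` and `TimestampedTriple` and the helper `rdf_stream` are hypothetical; ordering the pairs by ascending timestamp is an assumption of this sketch (the slide lists the most recent pair first).

```python
from datetime import datetime
from typing import Iterator, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TimestampedTriple(NamedTuple):
    triple: Triple
    timestamp: datetime


def rdf_stream(pairs) -> Iterator[TimestampedTriple]:
    """Yield (triple, timestamp) pairs ordered by non-decreasing timestamp."""
    # ISO 8601 strings of the same shape sort chronologically.
    for triple, ts in sorted(pairs, key=lambda p: p[1]):
        yield TimestampedTriple(Triple(*triple), datetime.fromisoformat(ts))


# The three pairs shown on the slide.
stream = list(rdf_stream([
    ((":ourmaninsa", ":isIn", ":Munich"), "2010-05-31T18:34:41"),
    ((":MadamMichelle", ":isIn", ":SouthAfrica"), "2010-05-31T18:24:28"),
    ((":Ayngelina", ":isIn", ":Nicaragua"), "2010-05-31T18:19:21"),
]))
```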

[Callout: “just arrived in”]

Page 16: Incremental Reasoning on Streams and Rich Background Knowledge

Past Achievements

An Example of C-SPARQL Query

Who has landed in USA in the last hour?

REGISTER QUERY WhoHasLandedInUSAinTheLastHour AS
PREFIX gno: <http://www.geonames.org/ontology#>
PREFIX c: <http://www.geonames.org/countries/#>
PREFIX : <http://example>
SELECT ?traveller ?place ?type
FROM <http://sws.geonames.org/nonExistingUSfeatureGraph>
FROM STREAM <http://someStreamGeneratedFromTwitter>
     [ RANGE 60m STEP 5m ]
WHERE {
  ?traveller :isIn ?place .
  ?place gno:inCountry c:US .
  ?place gno:featureCode ?type .
}
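To make the intended behaviour of the query above more concrete, here is a loose Python sketch of what a continuous evaluation could look like: every `STEP` the stream triples of the last `RANGE` are joined with a static background graph. This is not how the C-SPARQL engine is implemented. The `BACKGROUND` dictionary, its feature codes, the sample travellers and the choice of a half-open window `(now - RANGE, now]` are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

RANGE = timedelta(minutes=60)  # window width, from [ RANGE 60m STEP 5m ]
STEP = timedelta(minutes=5)    # how often the window slides

# Hypothetical background graph: place -> (country, feature code).
BACKGROUND = {
    ":JFK": (":US", ":AIRP"),
    ":Munich": (":DE", ":PPL"),
}


def evaluate(stream, now):
    """Answers over the stream triples whose timestamp lies in (now - RANGE, now]."""
    answers = []
    for (traveller, predicate, place), ts in stream:
        in_window = now - RANGE < ts <= now
        if in_window and predicate == ":isIn" and place in BACKGROUND:
            country, feature = BACKGROUND[place]
            if country == ":US":  # ?place gno:inCountry c:US
                answers.append((traveller, place, feature))
    return answers


def run(stream, start, evaluations):
    """Re-evaluate the registered query every STEP; returns (time, answers) pairs."""
    return [(start + i * STEP, evaluate(stream, start + i * STEP))
            for i in range(evaluations)]


t0 = datetime(2010, 5, 31, 18, 0)
stream = [
    ((":alice", ":isIn", ":JFK"), t0 + timedelta(minutes=2)),
    ((":bob", ":isIn", ":Munich"), t0 + timedelta(minutes=3)),
]
results = run(stream, t0, 14)
```

In this sketch `:alice` shows up in the answers from the evaluation at minute 5 until the one at minute 60, and drops out at minute 65 once her triple has left the window.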

Page 17: Incremental Reasoning on Streams and Rich Background Knowledge

Past Achievements

An Example of C-SPARQL Query Explained

Who has landed in USA in the last hour?

REGISTER QUERY WhoHasLandedInUSAinTheLastHour AS
PREFIX gno: <http://www.geonames.org/ontology#>
PREFIX c: <http://www.geonames.org/countries/#>
PREFIX : <http://example>
SELECT ?traveller ?place ?type
FROM <http://sws.geonames.org/nonExistingUSfeatureGraph>
FROM STREAM <http://someStreamGeneratedFromTwitter>
     [ RANGE 60m STEP 5m ]
WHERE {
  ?traveller :isIn ?place .
  ?place gno:inCountry c:US .
  ?place gno:featureCode ?type .
}

Callouts on the query:
  – REGISTER QUERY WhoHasLandedInUSAinTheLastHour AS: query registration (for continuous execution)
  – FROM STREAM <http://someStreamGeneratedFromTwitter>: FROM STREAM clause
  – [ RANGE 60m STEP 5m ]: WINDOW
  – WHERE clause: triples from a stream, combined with triples from an RDF graph

Page 18: Incremental Reasoning on Streams and Rich Background Knowledge

Past Achievements

C-SPARQL Engine Architecture

• We implemented a C-SPARQL engine based on the LarKC conceptual framework

[Figure: Select, Abstract, Reason pipeline. Labels: Streamed Input, RDF Streams, Window, Window Content, RDF Graphs, Answers, Streams, Performed by a DSMS]

Page 19: Incremental Reasoning on Streams and Rich Background Knowledge

Main Contribution

Achievements vs. Research Challenges

• Relation with data-stream systems
  – Notion of RDF stream [WWW2009]

• Query languages for semantic streams
  – C-SPARQL [WWW2009, EDBT2010]

• Reasoning on Streams
  – Efficient incremental updates of deductive closures
  – How to combine streams and background knowledge

• Distributed and parallel processing
  – Streams are parallel in nature

• Real-time constraints
  – A reasoning task must be completed before the answer becomes useless

[Callout on “Reasoning on Streams”: Contribution of this work]

Page 20: Incremental Reasoning on Streams and Rich Background Knowledge

Main Contribution

State-of-the-Art Approach [Ceri1994,Volz2005]

1. Overestimation of deletion: Overestimates deletions by computing all direct consequences of a deletion.

2. Rederivation: Prunes those estimated deletions for which alternative derivations (via some other facts in the program) exist.

3. Insertion: Adds the new derivations that are consequences of insertions to extensional predicates.
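The three steps above can be sketched in Python for one very small case: a single transitive relation stored as `(x, y)` pairs. This is a simplified illustration and not the rule rewriting of [Ceri1994] or [Volz2005]; the functions `step`, `closure`, `derivable` and `dred` are hypothetical names, and the sketch assumes every deleted pair was an explicit fact.

```python
def step(delta, facts):
    """Pairs derivable by one application of transitivity that uses
    at least one pair from delta (facts must include delta)."""
    out = set()
    for (a, b) in delta:
        for (c, d) in facts:
            if b == c:
                out.add((a, d))
            if d == a:
                out.add((c, b))
    return out


def closure(explicit):
    """Materialization from scratch (semi-naive fixpoint)."""
    facts, frontier = set(explicit), set(explicit)
    while frontier:
        frontier = step(frontier, facts) - facts
        facts |= frontier
    return facts


def derivable(fact, facts):
    """True if fact follows from two pairs of facts in one step."""
    a, d = fact
    return any(x == a and (y, d) in facts for (x, y) in facts)


def dred(explicit, materialized, deletions, insertions):
    # 1. Overestimation: everything with some derivation through a deleted pair.
    deleted, frontier = set(deletions), set(deletions)
    while frontier:
        frontier = step(frontier, materialized) - deleted
        deleted |= frontier
    remaining = materialized - deleted
    explicit = set(explicit) - set(deletions)
    # 2. Rederivation: put back over-deleted pairs that still have a derivation.
    changed = True
    while changed:
        changed = False
        for fact in deleted - remaining:
            if fact in explicit or derivable(fact, remaining):
                remaining.add(fact)
                changed = True
    # 3. Insertion: add the new pairs and their consequences.
    explicit |= set(insertions)
    frontier = set(insertions) - remaining
    remaining |= frontier
    while frontier:
        frontier = step(frontier, remaining) - remaining
        remaining |= frontier
    return explicit, remaining


explicit = {("A", "B"), ("B", "C"), ("C", "D")}
materialized = closure(explicit)
new_explicit, new_materialized = dred(
    explicit, materialized, deletions={("B", "C")}, insertions={("B", "D")})
```

In this run the overestimation step removes four pairs, none of them can be rederived, and the insertion step adds `("B", "D")` and `("A", "D")`, so the result matches recomputing the closure from scratch.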

Page 21: Incremental Reasoning on Streams and Rich Background Knowledge

Main Contribution

Our approach 1/2

• Assumption
  – Insertions and deletions are triples respectively entering and exiting the window
  – The window size is known

• Therefore
  – The time when each triple will expire is known and determined by the window size
    • E.g., if the window is 10s long, a triple entering at time t will exit at time t+10s

  – Note: all knowledge can be annotated with an expiration time
    • i.e., background knowledge is annotated with +∞

Page 22: Incremental Reasoning on Streams and Rich Background Knowledge

Main Contribution

Our approach 2/2

• The algorithm
  1. computes the entailments derived by the inserts,
  2. annotates each entailed triple with an expiration time, and
  3. eliminates from the current state all copies of derived triples except the one with the highest timestamp.
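A minimal Python sketch of these three steps follows, again for a single transitive relation over `(subject, object)` pairs. It is an illustration under stated assumptions and not the authors' implementation: the names `WINDOW`, `step` and `slide` are hypothetical, and the sketch assumes that a derived triple expires when its earliest-expiring premise does (the minimum of the premises' expiration times). Background knowledge could be modelled by preloading the state with `math.inf` as expiration time.

```python
import math

WINDOW = 10  # window length, as in the "10s" example


def step(delta, state):
    """One application of transitivity using at least one triple of delta.
    Both arguments map (subject, object) -> expiration time."""
    derived = {}
    for (a, b), e1 in delta.items():
        for (c, d), e2 in state.items():
            candidates = []
            if b == c:
                candidates.append((a, d))
            if d == a:
                candidates.append((c, b))
            for triple in candidates:
                expiry = min(e1, e2)  # assumption: earliest premise decides
                if expiry > derived.get(triple, -math.inf):
                    derived[triple] = expiry
    return derived


def slide(state, now, inserts):
    """Return the new materialization after the window slides to time `now`."""
    # Deletions need no reasoning: drop what has expired.
    state = {t: e for t, e in state.items() if e > now}
    # Insertions expire one window length after they enter.
    delta = {t: now + WINDOW for t in inserts
             if now + WINDOW > state.get(t, -math.inf)}
    while delta:
        state.update(delta)
        derived = step(delta, state)  # steps 1 and 2: entail and annotate
        # Step 3: keep only the copy with the highest expiration time.
        delta = {t: e for t, e in derived.items()
                 if e > state.get(t, -math.inf)}
    return state


s1 = slide({}, 1, [("A", "B")])
s2 = slide(s1, 2, [("B", "C")])
s3 = slide(s2, 11, [])
```

Because every triple carries its own expiration time, the deletion side of the update reduces to a filter, with no overestimation or rederivation pass.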

Page 23: Incremental Reasoning on Streams and Rich Background Knowledge

Main Contribution

Our Approach at Work

[Figure: worked example with columns TS, Triples in the Window, and Entailments in the Window, for timestamps 1, 2, 3, 4, 11, 12 and 13; nodes A to E are linked by edges annotated with expiration times [11] to [14]]

Page 24: Incremental Reasoning on Streams and Rich Background Knowledge

Main Contribution

Comparative Evaluation

• Hypothesis
  – Background knowledge does not change and it is materialized
  – Changes only take place in the window

• An experiment comparing the time required to compute a new materialization using
  – Re-computing from scratch (i.e., 1250 ms in our setting)
  – State-of-the-art incremental approach [Volz, 2005]
  – Our approach

• Results at increasing % of the triples updated

[Figure: time in ms (logarithmic axis, 10 to 10000) versus % of the materialization changed when the window slides (0% to 20%), for the series incremental-volz and incremental-stream]

Page 25: Incremental Reasoning on Streams and Rich Background Knowledge

Retrospective and Conclusions

Achievements vs. Research Challenges

• Relation with data-stream systems
  – Notion of RDF stream :-|

• Query languages for semantic streams
  – C-SPARQL :-D

• Reasoning on Streams
  – Efficient incremental updates of deductive closures
    • This paper :-) ... but much more work is needed!
  – How to combine streams and background knowledge
    • This paper :-| ... but a lot needs to be studied ...

• Distributed and parallel processing
  – Future work :-P

• Real-time constraints
  – Future work :-P

Page 26: Incremental Reasoning on Streams and Rich Background Knowledge

For more information visit http://www.larkc.eu/

References (selection)

• Vision
  [IEEE-IS2009] Emanuele Della Valle, Stefano Ceri, Frank van Harmelen, Dieter Fensel: It's a Streaming World! Reasoning upon Rapidly Changing Information. IEEE Intelligent Systems 24(6): 83-89 (2009)

• Continuous SPARQL (C-SPARQL)
  [EDBT2010] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri and Michael Grossniklaus: An Execution Environment for C-SPARQL Queries. EDBT 2010
  [WWW2009] Davide Francesco Barbieri, Daniele Braga, Stefano Ceri, Emanuele Della Valle, Michael Grossniklaus: C-SPARQL: SPARQL for continuous querying. WWW 2009: 1061-1062

• Stream Reasoning
  [Dagstuhl2010] Heiner Stuckenschmidt, Stefano Ceri, Emanuele Della Valle and Frank van Harmelen: Towards Expressive Stream Reasoning. Proceedings of the Dagstuhl Seminar on Semantic Aspects of Sensor Networks, 2010.

Page 27: Incremental Reasoning on Streams and Rich Background Knowledge

Thank You! Questions?

Much More to Come! Keep an eye on http://www.streamreasoning.org

Page 28: Incremental Reasoning on Streams and Rich Background Knowledge

Back-up Slides

The Entailment Regime That We Used

• In the current implementation we support RDF-S++
  – rdf:type
  – rdfs:subClassOf
  – rdfs:domain and rdfs:range
  – rdfs:subPropertyOf
  – owl:sameAs
  – owl:inverseOf
  – owl:TransitiveProperty
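To give a feel for what such an entailment regime derives, the Python sketch below runs naive forward chaining for a few of the listed constructs (rdfs:subClassOf, owl:inverseOf and owl:TransitiveProperty). It covers only part of the list, ignores efficiency, and is not the rule set used in the implementation described here; the function `entail` and the sample triples are hypothetical.

```python
def entail(triples):
    """Naive forward chaining over (subject, predicate, object) triples
    for a subset of the constructs listed on the slide."""
    facts = set(triples)
    while True:
        new = set()
        for (s, p, o) in facts:
            for (s2, p2, o2) in facts:
                # Instances of a subclass are instances of its superclass.
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and o == s2:
                    new.add((s, "rdf:type", o2))
                # rdfs:subClassOf is transitive.
                if p == p2 == "rdfs:subClassOf" and o == s2:
                    new.add((s, "rdfs:subClassOf", o2))
                # owl:inverseOf works in both directions.
                if p2 == "owl:inverseOf" and p == s2:
                    new.add((o, o2, s))
                if p2 == "owl:inverseOf" and p == o2:
                    new.add((o, s2, s))
                # Properties declared transitive chain through shared nodes.
                if (p == p2 and o == s2
                        and (p, "rdf:type", "owl:TransitiveProperty") in facts):
                    new.add((s, p, o2))
        if new <= facts:  # fixpoint reached: nothing new was derived
            return facts
        facts |= new


closure = entail({
    (":Milan", "rdf:type", ":City"),
    (":City", "rdfs:subClassOf", ":Place"),
    (":Place", "rdfs:subClassOf", ":Feature"),
    (":contains", "owl:inverseOf", ":isIn"),
    (":alice", ":isIn", ":Milan"),
})
```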

Page 29: Incremental Reasoning on Streams and Rich Background Knowledge

Back-up Slides

Volz 2005 rewriting rules

Page 30: Incremental Reasoning on Streams and Rich Background Knowledge

Back-up Slides

Example of maintenance program

Original Rule

Maintenance Program

Page 31: Incremental Reasoning on Streams and Rich Background Knowledge

Back-up Slides

Our rewriting rules

Page 32: Incremental Reasoning on Streams and Rich Background Knowledge

Back-up Slides

Example of maintenance program for streams

Original Rule

Maintenance Program

Page 33: Incremental Reasoning on Streams and Rich Background Knowledge

Back-up Slides

Simple Stream Reasoner Architecture

[Figure: architecture diagram. Permanent Space: RDF Store (Asserted Triples, Entailed Triples) and Hash Table (T of Asserted Triples, T of Entailed Triples). Working Space: RDF Store (Previous Materialization) and Hash Table (T of Previous Materialization). Components: Incremental Maintainer, Rule Engine, Built-ins. Further labels: P, P with superscripts Ins, + and -, and the corresponding T entries]

Page 34: Incremental Reasoning on Streams and Rich Background Knowledge

Achievements

Incremental Reasoning: State-of-the-Art

• Incremental Maintenance of Materialized Views
  – Stefano Ceri, Jennifer Widom: Deriving Incremental Production Rules for Deductive Data. Inf. Syst. 19(6): 467-490 (1994)
  – HA Kuno, EA Rundensteiner: Incremental Maintenance of Materialized Object-Oriented Views in MultiView: Strategies and Performance Evaluation. TKDE 1998
  – Raphael Volz, Steffen Staab, Boris Motik: Incrementally Maintaining Materializations of Ontologies Stored in Logic Databases. J. Data Semantics 2: 1-34 (2005)

• Incremental Rule-based Reasoning
  – F Fabret, M Regnier, E Simon: An Adaptive Algorithm for Incremental Evaluation of Production Rules in Databases. VLDB 1993
  – B. Berstel: Extending the RETE Algorithm for Event Management. TIME’02

• Incremental DL Reasoning
  – Cuenca-Grau et al: History Matters: Incremental Ontology Reasoning Using Modules. ISWC 2007.
  – Parsia et al: Towards incremental reasoning through updates in OWL-DL. Reasoning on the Web Workshop at WWW-2006