Data Models and Query Languages of Spatio-Temporal Information

Cindy Xinmin Chen
Computer Science Department, UCLA
February 28, 2001

The Problem

- Data models and query languages for spatio-temporal databases: many different approaches proposed
  - complexity of the technical problem
  - diversity of application requirements
- Implementation: extensions for spatio-temporal information
  - zero extensibility in relational DBMSs
  - object-relational (O-R) systems are better, but still have many limitations

Contribution of This Research

- Better data models and query languages for temporal and spatio-temporal information
- Multi-layered architecture for spatio-temporal extensions on O-R systems
- Support for further extensions and customization by end-users via user-defined spatio-temporal aggregates

Outline

- Temporal Data Models and Query Languages -- SQLT
- Spatio-Temporal Data Models and Query Languages -- SQLST
- Implementation of SQLST
- More Abstract Representation of Spatio-Temporal Data
- Conclusion

State of the Art

- More than 40 temporal data models according to [Jensen and Snodgrass 99]
- Interval-based approach [Lorentzos 97]
  - same representation at the conceptual and implementation levels, but requires interval coalescing after projection
- TSQL2's implicit time [Snodgrass 95]
  - temporal joins are specified without ever mentioning the time column in the WHERE or SELECT clauses of the query
- Point-based approach [Toman 98]

Interval-Based Time Model and Coalescing

A temporal relation contains prescription information:

    Prescription(Melanie, Dr. Jones, Proventil, 3mg, [19960101, 19960131])
    Prescription(Melanie, Dr. Jones, Prozac, 3mg, [19960201, 19960229])
    Prescription(Melanie, Dr. Bond, Prozac, 3mg, [19960301, 19960331])

Projection on Name and Physician:

    (Melanie, Dr. Jones, [19960101, 19960229])
    (Melanie, Dr. Bond, [19960301, 19960331])

Projection on Name and Drug:

    (Melanie, Proventil, [19960101, 19960131])
    (Melanie, Prozac, [19960201, 19960331])
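The coalescing step that the interval-based model needs after a projection can be sketched in Python as follows. This is only an illustration under stated assumptions: day granularity, closed intervals, and the helper names `project` and `coalesce` are hypothetical and do not come from the slides.

```python
from datetime import date, timedelta
from itertools import groupby

# The three Prescription tuples from the slide, with closed day intervals.
PRESCRIPTION = [
    {"name": "Melanie", "physician": "Dr. Jones", "drug": "Proventil",
     "start": date(1996, 1, 1), "end": date(1996, 1, 31)},
    {"name": "Melanie", "physician": "Dr. Jones", "drug": "Prozac",
     "start": date(1996, 2, 1), "end": date(1996, 2, 29)},
    {"name": "Melanie", "physician": "Dr. Bond", "drug": "Prozac",
     "start": date(1996, 3, 1), "end": date(1996, 3, 31)},
]

def project(rows, columns):
    """Keep only the named columns plus the interval (hypothetical helper)."""
    return [(tuple(r[c] for c in columns), r["start"], r["end"]) for r in rows]

def coalesce(rows):
    """Merge value-equivalent rows whose intervals overlap or are adjacent."""
    merged = []
    for key, group in groupby(sorted(rows), key=lambda r: r[0]):
        cur_start = cur_end = None
        for _, start, end in group:
            if cur_end is not None and start <= cur_end + timedelta(days=1):
                cur_end = max(cur_end, end)          # extend the open run
            else:
                if cur_end is not None:
                    merged.append((key, cur_start, cur_end))
                cur_start, cur_end = start, end      # start a new run
        merged.append((key, cur_start, cur_end))
    return merged

by_physician = coalesce(project(PRESCRIPTION, ("name", "physician")))
by_drug = coalesce(project(PRESCRIPTION, ("name", "drug")))
```

With the slide's data, the two Dr. Jones rows merge into one interval ending on 1996-02-29, and the two Prozac rows merge into one interval ending on 1996-03-31, matching the projections listed above.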

TSQL2

- Bitemporal Conceptual Data Model -- coalesced data model
- Two-dimensional time -- valid time and transaction time
- Implicit time model -- no coalescing
- Lack of universality

TSQL2 -- An Example

Schema definition:

    CREATE TABLE Prescription
        (Name CHAR(30), Physician CHAR(30), Drug CHAR(30), Dosage CHAR(30))
    AS VALID STATE DAY

Query 1: find the drugs Melanie took in 1996 and the time she took them.

    SELECT Drug
    VALID INTERSECT(VALID(Prescription), PERIOD('[1996]' AS DAY))
    FROM Prescription
    WHERE Name = 'Melanie'

Point-Based Model

- Expressive power [Toman 97]
- Use user-defined aggregates to express Allen's interval operators
- Universality:
  - uniformly applicable to SQL, QBE, and Datalog
  - uses the construct types of current query languages
  - no new constructs are introduced

SQLT: Schema Definition

Define the Prescription relation:

    CREATE TABLE Prescription
        (Name CHAR(30), Physician CHAR(30), Drug CHAR(30), Dosage CHAR(30), VTime DATE)

Temporal Selection and Join

Query 1': find the drugs Melanie took in 1996 and the time she took them.

    SELECT Drug, VTime
    FROM Prescription
    WHERE Name = 'Melanie'
      AND 19960101 <= VTime AND 19961231 >= VTime

Interval-Oriented Reasoning

Query 2: find the patients who have taken Proventil throughout the time they took Prozac.

    SELECT P1.Name
    FROM Prescription AS P1, Prescription AS P2
    WHERE P1.Name = P2.Name
      AND P1.Drug = 'Proventil' AND P2.Drug = 'Prozac'
    GROUP BY P1.Name
    HAVING DURING(P1.VTime, P2.VTime)

Interval-Oriented Reasoning (cont.)

Query 2 in QBE:

    Prescription | Name      | Physician | Drug      | Dosage | VTime
                 | P.G._name |           | Proventil |        | _vtime1
                 | _name     |           | Prozac    |        | _vtime2

    Conditions
    DURING(_vtime1, _vtime2)

Interval-Oriented Reasoning (cont.)

Query 2 in Datalog:

    query2(Name, during<VTime1, VTime2>) <-
        prescription(Name, _, "Proventil", _, VTime1),
        prescription(Name, _, "Prozac", _, VTime2).

Implementation of SQLT on DB2

- From point-based representation to interval-based representation
- Difficulty of supporting temporal data model and query language extensions on existing O-R systems
  - only user-defined functions (UDFs) are available
  - UDFs cannot access the database tables directly
  - UDFs are hard to develop and debug
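The move from a point-based to an interval-based representation can be illustrated with a small Python sketch. It assumes day granularity and closed intervals; the function names `pack_points` and `unpack_interval` are hypothetical and say nothing about how the DB2 implementation actually does it.

```python
from datetime import date, timedelta

def pack_points(days):
    """Turn a collection of day-granularity time points into maximal closed intervals."""
    intervals = []
    for day in sorted(set(days)):
        if intervals and day == intervals[-1][1] + timedelta(days=1):
            # The point directly follows the current interval: extend it.
            intervals[-1] = (intervals[-1][0], day)
        else:
            # Gap (or first point): open a new one-day interval.
            intervals.append((day, day))
    return intervals

def unpack_interval(start, end):
    """Opposite direction: expand a closed interval into its time points."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]

points = (unpack_interval(date(1996, 1, 1), date(1996, 1, 5))
          + unpack_interval(date(1996, 1, 11), date(1996, 1, 15)))
packed = pack_points(points)
```

Here ten individual days collapse into two stored intervals, which is the storage saving an interval-based physical layer offers under a point-based query model.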

Previous Work

- Constraint-based approach
  - Triangulation-based spatial objects + interval-based time [Chomicki 97]
  - Parametric rectangles + interval-based time [Cai 00]
  - Time as another dimension in space [Grumbach 98]
- Composite spatio-temporal data types: mpoint and mregion [Güting 00]
- Orthogonal space and time [Worboys 94]

Previous Work (cont.)

- Commercial DBMSs
  - no spatio-temporal extensions
  - only spatial DataBlades, Extenders, etc.
    - provide a predefined library of functions
    - offer no extensibility

Objective of SQLST

- Orthogonality, minimality, and extensibility
  - separated temporal and spatial information
  - minimal extensions to SQL
  - additional constructs can be built in SQLST

Design and Implementation of SQLST

- Define a minimal set of built-in primitives in a procedural language
- Use user-defined aggregates for further extension
- Data types:
  - Temporal data type -- time interval
  - Spatial data types -- points, lines (finite straight line segments), and counterclockwise directed triangles

Counterclockwise Directed Triangle

- A triangle is counterclockwise directed if its three vertices are oriented counterclockwise:

  $$\begin{vmatrix} V_{1x} & V_{1y} & 1 \\ V_{2x} & V_{2y} & 1 \\ V_{3x} & V_{3y} & 1 \end{vmatrix} > 0$$

- Makes the point-location problem easy: inside(point, triangle)

[Figure: triangle T with vertices V1, V2, V3, and points P and P']
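The orientation determinant and the point-location test it enables can be sketched in Python. The function names below are illustrative; treating the boundary as inside is an assumption, since the slide does not say how boundary points are handled.

```python
def orientation(a, b, c):
    """Value of the determinant |ax ay 1; bx by 1; cx cy 1|.

    Positive when a, b, c turn counterclockwise, negative when clockwise,
    zero when the three points are collinear.
    """
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def is_ccw(triangle):
    """True if the triangle's vertices are listed counterclockwise."""
    return orientation(*triangle) > 0

def inside(point, triangle):
    """Point-in-triangle test for a counterclockwise directed triangle.

    Assumption: points on an edge count as inside.
    """
    v1, v2, v3 = triangle
    return (orientation(v1, v2, point) >= 0
            and orientation(v2, v3, point) >= 0
            and orientation(v3, v1, point) >= 0)

t = ((2, 2), (6, 2), (2, 6))
```

Because the vertex order is fixed, `inside` needs only three sign checks against the directed edges, with no case analysis on the triangle's orientation.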

Application Example

Cyclone statistics for the Northern Hemisphere from the NSF Arctic System Science Research Program

    ID      Trajectory (x1, y1, x2, y2)   Pressure   Start Time   End Time
    960001  (1146, 1034, 1303, 1775)      1004       1996-05-01   1996-05-02
    960001  (1303, 1775, 1664, 1779)      995        1996-05-02   1996-05-03
    960001  (1664, 1779, 1957, 1018)      991        1996-05-03   1996-05-04

[Figure: cyclone trajectory with positions labeled day1, day2, day3, day4]

SQLST: Schema Definition

Define the Cyclone relation:

    CREATE TABLE Cyclone
        (ID INT, Trajectory LINE, Pressure REAL, Tstart DATE, Tend DATE)

Define the Island relation:

    CREATE TABLE Island (Name CHAR(30), Region TRIANGLE)

Spatio-Temporal Queries

Query 3: find all cyclones whose high-pressure stage (pressure > 1000 mb) has lasted more than 3 days.

    SELECT ID
    FROM Cyclone
    WHERE Pressure > 1000
    GROUP BY ID
    HAVING DURATION(Tstart, Tend) > 3

Spatio-Temporal Queries (cont.)

Query 4: find the cyclones whose trajectory has been enclosed by the island Misfortune.

    SELECT ID
    FROM Cyclone, Island
    WHERE Name = 'Misfortune'
    GROUP BY ID
    HAVING CONTAIN(Region, Trajectory)

Approach

- Define a minimal set of ADTs built in C++
- Use user-defined aggregates to define new spatio-temporal primitives
- Allow end-users to extend and customize the system for their application

Built-in Spatial Functions

- length(line)
- area(triangle)
- center_of_mass(triangle)
- distance(point, point)
- distance(point, line)
- intersect(line, line)
- intersect(line, triangle)
- intersect(triangle, triangle)
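To make a few of these primitives concrete, here is a hedged Python sketch of `length`, `area`, `center_of_mass`, and the point-to-line distance. The slides state that the built-ins are written in C++ and do not show their code, so the formulas below are the standard textbook ones rather than the author's implementation; `point_line_distance` is a hypothetical name used because Python has no overloading on argument types.

```python
import math

def length(line):
    """Length of a finite straight line segment ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = line
    return math.hypot(x2 - x1, y2 - y1)

def area(triangle):
    """Area of a triangle given as three (x, y) vertices."""
    (x1, y1), (x2, y2), (x3, y3) = triangle
    return abs((x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)) / 2

def center_of_mass(triangle):
    """Centroid of a triangle: the mean of its vertices."""
    xs, ys = zip(*triangle)
    return (sum(xs) / 3, sum(ys) / 3)

def point_line_distance(point, line):
    """Distance from a point to the nearest point of a finite segment."""
    (px, py), ((x1, y1), (x2, y2)) = point, line
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:              # degenerate segment
        return math.hypot(px - x1, py - y1)
    # Parameter of the orthogonal projection, clamped to the segment.
    t = ((px - x1) * dx + (py - y1) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (x1 + t * dx), py - (y1 + t * dy))
```

Clamping the projection parameter matters because the document's lines are finite segments, so the nearest point may be an endpoint.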

User-Defined Aggregates (UDAs)

- UDAs provide a more general and powerful mechanism for DB extensions
  - ease of use
  - no impedance mismatch of data types and programming paradigms
  - DB advantages -- scalability, data independence, optimizability, etc.

Aggregate eXtension Language (AXL) [Wang 00]

- Stream-oriented processing
- Three functions expressed in SQL
  - INITIALIZE: gives an initial value to the aggregate
  - ITERATE: computes the intermediate aggregate value for each new record
  - TERMINATE: returns the final value computed for the aggregate
- Local tables
  - state
  - return
- Built on the Berkeley DB storage manager
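The INITIALIZE / ITERATE / TERMINATE protocol can be mimicked in Python to show how a stream of records drives an aggregate under GROUP BY. This is a loose analogy, not AXL: the class `MaxPressure`, the driver `aggregate_by`, and the use of an attribute and a list in place of the `state` and `return` tables are all assumptions made for illustration.

```python
class MaxPressure:
    """Toy aggregate: highest pressure seen within one group."""

    def initialize(self, pressure):
        self.state = pressure                    # first record of the group

    def iterate(self, pressure):
        self.state = max(self.state, pressure)   # each subsequent record

    def terminate(self):
        return [self.state]                      # rows for the `return` table


def aggregate_by(rows, key, value, uda_class):
    """Single pass over the input stream, one aggregate instance per group."""
    groups = {}
    for row in rows:
        k = row[key]
        if k not in groups:
            groups[k] = uda_class()
            groups[k].initialize(row[value])
        else:
            groups[k].iterate(row[value])
    return {k: uda.terminate() for k, uda in groups.items()}


cyclone = [
    {"ID": 960001, "Pressure": 1004},
    {"ID": 960001, "Pressure": 995},
    {"ID": 960001, "Pressure": 991},
]
result = aggregate_by(cyclone, "ID", "Pressure", MaxPressure)
```

Returning a list from `terminate` mirrors the fact that an AXL aggregate inserts rows into `return`, so it may yield zero, one, or several results.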

Duration

Calculates the total length of the time intervals

    Cyclone(960001, _, _, 19960101, 19960105)
    Cyclone(960001, _, _, 19960111, 19960115)
    Cyclone(960001, _, _, 19960121, 19960125)

Total: 15 days

Duration (cont.)

    AGGREGATE DURATION(Tstart DATE, Tend DATE) : INT
    {
        TABLE state (i INT);
        INITIALIZE : {
            INSERT INTO state VALUES(Tend - Tstart + 1);
        }
        ITERATE : {
            UPDATE state SET i = i + (Tend - Tstart + 1);
        }
        TERMINATE : {
            INSERT INTO return SELECT i FROM state;
        }
    }
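For readers without an AXL system at hand, the same computation can be traced in Python. The sketch below assumes closed intervals at day granularity, so each interval contributes `Tend - Tstart + 1` days as in the AXL code; the function name `duration` and the use of `datetime.date` are choices made here, not part of the original.

```python
from datetime import date

def duration(intervals):
    """Sum the lengths, in days, of closed (tstart, tend) intervals."""
    state = None
    for tstart, tend in intervals:
        days = (tend - tstart).days + 1
        if state is None:
            state = days          # INITIALIZE: first tuple of the group
        else:
            state += days         # ITERATE: every later tuple
    return state                  # TERMINATE: None for an empty group

cyclone_960001 = [
    (date(1996, 1, 1), date(1996, 1, 5)),
    (date(1996, 1, 11), date(1996, 1, 15)),
    (date(1996, 1, 21), date(1996, 1, 25)),
]
total = duration(cyclone_960001)
```

Like the AXL version, this simply adds interval lengths, so overlapping input intervals would be counted more than once.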

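Contain

Tests if one object contains another
- returns 1 if true; returns nothing otherwise

$$\mathrm{contain}(O_1, O_2) \iff \forall\ \text{triangle } t_2 \in O_2,\ \forall\ \text{vertex } v \text{ of } t_2,\ \exists\ \text{triangle } t_1 \in O_1 : v \text{ inside } t_1$$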

Contain (cont.)

    AGGREGATE CONTAIN(Object1 TRIANGLE, Object2 TRIANGLE) : INT
    {
        TABLE state (b INT) AS VALUES(1);
        TABLE triangles(Object TRIANGLE);
        TABLE points(Vertex POINT);
        INITIALIZE : ITERATE : {
            INSERT INTO triangles VALUES(Object1);
            INSERT INTO points VALUES(Object2.Vertex);
        }
        TERMINATE : {
            UPDATE state SET b = 0
            WHERE NOT EXIST (SELECT Vertex FROM points, triangles
                             WHERE inside(Vertex, Object) = 1);
            INSERT INTO return SELECT b FROM state WHERE b = 1;
        }
    }
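The vertex-based definition of contain (every vertex of every triangle of the second object lies inside some triangle of the first) can be written directly in Python. This sketch is an assumption-laden illustration: objects are modeled as lists of counterclockwise triangles, boundary points count as inside, and the result is a boolean rather than the "1 or nothing" convention of the aggregate.

```python
def orientation(a, b, c):
    """Positive when a, b, c turn counterclockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def inside(point, triangle):
    """Point-in-triangle test for a counterclockwise triangle (edges included)."""
    v1, v2, v3 = triangle
    return (orientation(v1, v2, point) >= 0
            and orientation(v2, v3, point) >= 0
            and orientation(v3, v1, point) >= 0)

def contain(object1, object2):
    """True if every vertex of object2 lies inside some triangle of object1."""
    return all(
        any(inside(v, t1) for t1 in object1)
        for t2 in object2
        for v in t2
    )

# The square S from the concrete model, as two counterclockwise triangles.
square = [((2, 2), (6, 2), (2, 6)), ((2, 6), (6, 2), (6, 6))]
small = [((3, 3), (5, 5), (3, 5))]   # straddles both triangles of the square
```

The sketch checks each vertex individually, as the quantified definition requires, instead of running one existence test over the whole join of points and triangles.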

Other UDAs

- Overlap
  - tests if any edges of two objects intersect
- Edge_Distance
  - calculates the minimum distance from the vertices of one object to the edges of the other object
- Moving_Distance
  - calculates the distance an object has traveled continuously

Key Issue: Performance

- Size of data set:
  - Cyclone table -- 200,000 tuples
  - Island table -- 1,000 tuples
- Cases compared
  - AXL using indexes
  - AXL not using indexes
  - C++ using indexes
  - C++ not using indexes
- Index Tstart on Cyclone table and Name on Island table

Performance -- Duration

Query 5: find the duration of the cyclones that occurred in June 1996.

    SELECT DURATION(Tstart, Tend)
    FROM Cyclone
    WHERE 19960601 <= Tstart AND 19960630 >= Tstart
    GROUP BY ID

Performance -- Duration (cont.)

[Performance results figure not preserved]

Performance -- Contain

Query 6: find the cyclones that occurred in June 1996 and have been enclosed by the region of the island Misfortune.

    SELECT ID
    FROM Cyclone, Island
    WHERE 19960601 <= Tstart AND 19960630 >= Tstart
      AND Name = 'Misfortune'
    GROUP BY ID
    HAVING CONTAIN(Region, Trajectory)

Performance -- Contain (cont.)

[Performance results figure not preserved]

Abstract Model

- Objective: flexibility
  - users can decide which level of abstraction they want
  - may have more than two layers
- Data types:
  - temporal data type -- time instants
  - spatial data types -- points, lines, and polygons

A Spatio-Temporal Object

The concrete model -- space triangles and time intervals:

    (S, ((2,2),(6,2),(2,6)), [1,10])
    (S, ((2,6),(6,2),(6,6)), [1,10])

A more abstract representation -- sequence of snapshots:

    (S, [(2,2),(2,6),(6,6),(6,2)], 1)
    (S, [(2,2),(2,6),(6,6),(6,2)], 2)
    ......
    (S, [(2,2),(2,6),(6,6),(6,2)], 10)

[Figure: object S drawn on a grid with axis ticks 2, 4, 6, 8 and 2, 4, 6, for 1 <= t <= 10]

Schema Definition

The Cyclone relation:

    CREATE TABLE Cyclone (ID INT, Position POINT, Pressure REAL, Time DATE)

The Island relation:

    CREATE TABLE Island (Name CHAR(30), Extent POLYGON)

Mapping

- UDA -- map: (Point, Time Instant) -> (Line, Time Interval)
- Table function -- decompose: (Polygon) -> (Triangle)
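The two mappings between the abstract and the concrete layer can be sketched in Python. Everything here is an assumption for illustration: `map_points` connects consecutive observations of one object into line segments with time intervals, and `decompose` uses a simple fan triangulation that is only valid for convex polygons. The slides do not describe the actual algorithms.

```python
def map_points(observations):
    """(position, time) pairs of one object -> list of (line, tstart, tend)."""
    ordered = sorted(observations, key=lambda o: o[1])
    return [((p1, p2), t1, t2)
            for (p1, t1), (p2, t2) in zip(ordered, ordered[1:])]

def signed_area2(polygon):
    """Twice the signed area (shoelace formula); positive for CCW order."""
    return sum(x1 * y2 - x2 * y1
               for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]))

def decompose(polygon):
    """Fan-triangulate a convex polygon into counterclockwise triangles."""
    if signed_area2(polygon) < 0:
        polygon = polygon[::-1]          # normalize to counterclockwise order
    anchor = polygon[0]
    return [(anchor, polygon[i], polygon[i + 1])
            for i in range(1, len(polygon) - 1)]

# Cyclone positions implied by the trajectory table; ISO date strings sort correctly.
observations = [
    ((1146, 1034), "1996-05-01"), ((1303, 1775), "1996-05-02"),
    ((1664, 1779), "1996-05-03"), ((1957, 1018), "1996-05-04"),
]
segments = map_points(observations)
triangles = decompose([(2, 2), (2, 6), (6, 6), (6, 2)])
```

On the square from the snapshot representation, this fan triangulation happens to produce the same two triangles as the concrete model, up to the starting vertex of each triangle.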

Spatio-Temporal Queries

Query 3': find all cyclones whose high-pressure stage (pressure > 1000 mb) has lasted more than 3 days.

    SELECT NEW.ID
    FROM (SELECT ID, MAP(Position, Time)
          FROM Cyclone
          WHERE Pressure > 1000
          GROUP BY ID) AS NEW(ID, Trajectory, Tstart, Tend)
    GROUP BY NEW.ID
    HAVING DURATION(NEW.Tstart, NEW.Tend) > 3

Spatio-Temporal Queries (cont.)

Query 4': find the cyclones whose trajectory has been enclosed by the island Misfortune.

    SELECT NEW.ID
    FROM (SELECT ID, MAP(Position, Time), T.Region
          FROM Cyclone, Island, TABLE(decompose(Extent)) AS T
          WHERE Name = 'Misfortune'
          GROUP BY ID, T.Region) AS NEW(ID, Trajectory, Tstart, Tend, Region)
    GROUP BY NEW.ID
    HAVING CONTAIN(NEW.Region, NEW.Trajectory)

Conclusion

- Better data models and query languages for temporal and spatio-temporal information
- Multi-layered architecture for spatio-temporal extensions on O-R systems
- Support for further extensions and customization by end-users via user-defined spatio-temporal aggregates