Fjording the Stream: An Architecture for Queries over Streaming Sensor Data. Samuel Madden, Michael J. Franklin, University of California, Berkeley. Proceedings of the 18th International Conference on Data Engineering (ICDE’02)


Fjording the Stream: An Architecture for Queries over Streaming Sensor Data

Samuel Madden, Michael J. Franklin

University of California, Berkeley

Proceedings of the 18th International Conference on Data Engineering (ICDE’02)

Outline

• Introduction
• Architecture
• Main characteristics
• Experiments
• Conclusions

Introduction

• Fjord (also fiord): a long, narrow inlet of the sea between high cliffs, as in Norway
• Fjord: “Framework in Java for Operators on Remote Data streams”

Sensor infrastructure

• Cooperators: Berkeley Highway Lab (BHL), California Department of Transportation (CalTrans)
• Location: Bay Area freeways
• Objective: monitoring traffic conditions

Sensor limitations

• Push-based data: waiting for queries wastes power
• Power: sensors run on a 100mAh battery
  • CPU: 3.5 hours
  • TRM-1000 radio: 14MB
• Tradeoff: it is often worth spending many CPU cycles to conserve just a few bytes of radio traffic.
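The tradeoff can be made concrete with a rough calculation. The sketch below assumes the slide's figures mean that one battery powers either 3.5 hours of CPU time or 14MB of radio transmission, and that 1MB is 1,000,000 bytes; both readings are assumptions, and the class and method names are illustrative only.

```java
// Back-of-the-envelope comparison of CPU time and radio traffic per battery.
// Assumption: one battery powers either 3.5 hours of CPU time or 14 MB of
// radio transmission, with 1 MB taken as 1,000,000 bytes.
final class PowerTradeoff {
    static final double CPU_SECONDS_PER_BATTERY = 3.5 * 3600.0;
    static final double RADIO_BYTES_PER_BATTERY = 14.0 * 1_000_000.0;

    /** CPU seconds that drain as much battery as transmitting one byte. */
    static double cpuSecondsPerRadioByte() {
        return CPU_SECONDS_PER_BATTERY / RADIO_BYTES_PER_BATTERY;
    }

    public static void main(String[] args) {
        System.out.printf("1 radio byte costs about as much as %.2f ms of CPU time%n",
                cpuSecondsPerRadioByte() * 1000.0);
    }
}
```

Under these assumptions a single transmitted byte costs roughly as much battery as 0.9 ms of computation, which is why local processing that shrinks a message can pay for itself.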

Issues in data stream systems

• Operators
  • Aware of the infinite nature of streams
  • Modified versions of AVERAGE, COUNT, SORT, JOIN
  • Hash join: A. Wilschut, P. Apers. Dataflow query execution in a parallel main-memory environment.
  • Blocking operators (e.g., average): specify a subset of the stream for them to operate over
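One common way to give a blocking operator such as AVERAGE a finite subset of the stream is a sliding window over the most recent tuples. The following is a minimal sketch of that idea, not Fjord's actual operator; the class name `WindowedAverage` and the count-based window are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding-window AVERAGE; not taken from the Fjord code base.
final class WindowedAverage {
    private final int windowSize;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    WindowedAverage(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    /** Adds one reading and returns the average of the most recent readings. */
    double add(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            sum -= window.removeFirst();   // evict the oldest reading
        }
        return sum / window.size();
    }
}
```

Because each call returns a result immediately, the operator never has to wait for the end of an unbounded stream.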

• Query plan optimization: not discussed

Architecture

Architecture (1/2)

Components

• Operators: each has a set of input queues and a set of output queues
• Queues: each has one input operator and one output operator
• Sensor proxy
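The relationship between operators and queues can be sketched as follows. All names here (`TupleQueue`, `Operator`, `Plan`, `SpeedFilter`) are hypothetical, since the slides do not show Fjord's actual classes; the point is only that each queue links exactly one producing operator to one consuming operator.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; the slides do not show Fjord's real classes.
final class TupleQueue<T> {
    private final ArrayDeque<T> buffer = new ArrayDeque<>();
    void put(T tuple) { buffer.addLast(tuple); }
    /** Returns the next tuple, or null when the queue is empty. */
    T get() { return buffer.pollFirst(); }
}

abstract class Operator<T> {
    protected final List<TupleQueue<T>> inputs = new ArrayList<>();
    protected final List<TupleQueue<T>> outputs = new ArrayList<>();
    /** Consumes whatever input is available and writes results to the outputs. */
    abstract void process();
}

final class Plan {
    /** Creates one queue per edge, so it has one producer and one consumer. */
    static <T> TupleQueue<T> connect(Operator<T> producer, Operator<T> consumer) {
        TupleQueue<T> queue = new TupleQueue<>();
        producer.outputs.add(queue);
        consumer.inputs.add(queue);
        return queue;
    }
}

/** Example operator: forwards only readings above a threshold. */
final class SpeedFilter extends Operator<Double> {
    private final double minSpeed;
    SpeedFilter(double minSpeed) { this.minSpeed = minSpeed; }

    @Override
    void process() {
        for (TupleQueue<Double> in : inputs) {
            Double tuple;
            while ((tuple = in.get()) != null) {
                if (tuple > minSpeed) {
                    for (TupleQueue<Double> out : outputs) {
                        out.put(tuple);
                    }
                }
            }
        }
    }
}
```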

Architecture (2/2)

• Strategy: a state-based execution model, rather than placing each pushing operator in its own thread
• Advantages: better control over priority, lower overhead

[Figure: state transition. An operator takes its current state and an input, and produces an output and a new state.]
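A state-based execution model can be sketched as a scheduler that repeatedly asks each operator to perform one transition on a single thread. This is an illustrative sketch only; the interface `StateMachineOperator`, the integer state encoding, and the round-robin policy are assumptions, not details taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a state-based scheduler; names are illustrative.
interface StateMachineOperator {
    /** Performs one small unit of work and returns the next state. */
    int transition(int currentState);
}

final class RoundRobinScheduler {
    private final List<StateMachineOperator> operators = new ArrayList<>();
    private final List<Integer> states = new ArrayList<>();

    void register(StateMachineOperator op, int initialState) {
        operators.add(op);
        states.add(initialState);
    }

    /** Gives every operator exactly one transition, all on the calling thread. */
    void runOnePass() {
        for (int i = 0; i < operators.size(); i++) {
            int next = operators.get(i).transition(states.get(i));
            states.set(i, next);
        }
    }

    int stateOf(int index) {
        return states.get(index);
    }
}
```

Because the scheduler decides which operator runs next, priorities could be changed by altering the visiting order, and no per-operator thread or context switch is needed.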

Main characteristics (1/2)

Integrating streaming data with disk-based data

• Example: relations between average speeds and traffic incidents
• Means: using queues as data sources

Combining multiple queries into a single plan

• Reason: several queries need data from the same sensors; duplication wastes bandwidth and power
• Means: using the sensor proxy

Main characteristics (2/2)

Integrating streaming data with disk-based data

[Figure: push and pull queues between an input operator and an output operator. Labels: put, get, transition.]

Code snippet
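The original snippet is not preserved in this transcript. As a stand-in, here is a minimal sketch of how push and pull queues might differ, assuming that a push queue simply reports "no data yet" when empty while a pull queue asks its input operator to produce a tuple on demand. The names `StreamQueue`, `PushQueue`, and `PullQueue`, and the use of `Supplier` to stand for the input operator, are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical queue types; not the snippet from the original slides.
interface StreamQueue<T> {
    void put(T tuple);
    T get();   // returns null when no tuple is available
}

/** Push queue: the producer calls put() as data arrives; get() never waits. */
final class PushQueue<T> implements StreamQueue<T> {
    private final ArrayDeque<T> buffer = new ArrayDeque<>();

    public void put(T tuple) { buffer.addLast(tuple); }

    public T get() { return buffer.pollFirst(); }   // null means "no data yet"
}

/** Pull queue: get() asks the input operator for a tuple when the buffer is empty. */
final class PullQueue<T> implements StreamQueue<T> {
    private final ArrayDeque<T> buffer = new ArrayDeque<>();
    private final Supplier<T> inputOperator;   // e.g. a scan over a disk-based table

    PullQueue(Supplier<T> inputOperator) { this.inputOperator = inputOperator; }

    public void put(T tuple) { buffer.addLast(tuple); }

    public T get() {
        if (buffer.isEmpty()) {
            T produced = inputOperator.get();   // request one tuple from upstream
            if (produced != null) {
                buffer.addLast(produced);
            }
        }
        return buffer.pollFirst();
    }
}
```

With both kinds behind one interface, an operator such as a join can read a sensor stream through a push queue and a stored relation through a pull queue without caring which is which.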

Sensor proxy

Functions

• Adjust the sample rate of the sensors, based on user demand
• Direct the sensor to aggregate samples in predefined ways
• Let user queries share the same tuple data
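Two of these functions, rate adjustment and tuple sharing, can be sketched in a few lines. The sketch assumes the proxy sets the sensor to the highest rate any registered query requests and hands every sample to every registered query; the class `SensorProxy` and its methods are hypothetical and much simpler than a real proxy would be.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sensor proxy sketch; not Fjord's actual API.
final class SensorProxy {
    private final Map<String, Double> requestedRatesHz = new HashMap<>();
    private final Map<String, List<Double>> queryBuffers = new HashMap<>();

    void register(String queryId, double rateHz) {
        requestedRatesHz.put(queryId, rateHz);
        queryBuffers.putIfAbsent(queryId, new ArrayList<>());
    }

    void unregister(String queryId) {
        requestedRatesHz.remove(queryId);
        queryBuffers.remove(queryId);
    }

    /** Rate to program into the sensor: the highest rate any query asked for. */
    double sensorRateHz() {
        double max = 0.0;
        for (double rate : requestedRatesHz.values()) {
            max = Math.max(max, rate);
        }
        return max;
    }

    /** One sample from the sensor is shared by all registered queries. */
    void onSample(double reading) {
        for (List<Double> buffer : queryBuffers.values()) {
            buffer.add(reading);
        }
    }

    List<Double> bufferOf(String queryId) {
        return queryBuffers.getOrDefault(queryId, new ArrayList<>());
    }
}
```

A fuller version would downsample for queries that asked for a lower rate; this sketch omits that step to keep the sharing idea visible.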

Experiments

Traffic queries

Fjord

Performance

Output queues become slower when there are more than a few thousand elements on them.

Scalability

Simulations

Speed, length of a vehicle

Speed

Length
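The formulas from these slides are not preserved here. A plausible reconstruction, offered only as an assumption, is that two sensors sit a known distance apart, that t0 and t1 are the times a vehicle reaches and leaves the first sensor, and that t2 is the time it reaches the second. The sketch below computes speed and length on that basis, ignoring the sensor's own width; the names and units (meters, seconds) are illustrative.

```java
// Hypothetical reconstruction; the slides' own formulas are not available.
// Assumptions: t0/t1 = vehicle reaches/leaves sensor 1, t2 = vehicle reaches
// sensor 2, times in seconds, distances in meters, sensor width ignored.
final class VehicleEstimator {
    /** Speed from the spacing between the sensors and the two arrival times. */
    static double speed(double sensorSpacing, double t0, double t2) {
        if (t2 <= t0) {
            throw new IllegalArgumentException("t2 must be later than t0");
        }
        return sensorSpacing / (t2 - t0);
    }

    /** Length from the speed and the time the first sensor stayed covered. */
    static double length(double speed, double t0, double t1) {
        if (t1 < t0) {
            throw new IllegalArgumentException("t1 must not be earlier than t0");
        }
        return speed * (t1 - t0);
    }
}
```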

Sensor parameters

Power consumption

Scenario The sensor

1. reads from its A-to-D input

2. transmits the sample

3. sleeps until the next sample period arrives

Power consumption

Scenario Sensors

observe when a car passes over them

transmit either { t0, t2 } or { t1, t3 }

relay only a few samples per second

Power consumption

Scenario The sensors

Only relay a count of the number of vehicles that passed in the previous second
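The per-second count in this last scenario is a simple aggregation that could run on the sensor side. The sketch below assumes detections are available as timestamps in seconds and groups them into whole-second buckets; the class `VehicleCounter` is hypothetical.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sensor-side aggregation: one count per second instead of raw samples.
final class VehicleCounter {
    /** Maps each whole second to the number of detections that fell within it. */
    static SortedMap<Long, Integer> countsPerSecond(double[] detectionTimesSeconds) {
        SortedMap<Long, Integer> counts = new TreeMap<>();
        for (double t : detectionTimesSeconds) {
            long second = (long) Math.floor(t);
            counts.merge(second, 1, Integer::sum);
        }
        return counts;
    }
}
```

Seconds with no detections are simply absent from the map; a sensor could relay a zero for those or stay silent, depending on what the query needs.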

Conclusions

• Fjord addresses the low-level infrastructure issues in sensor stream query processing
• Fjord combines proxies, non-blocking operators, and conventional query plans
• Sensor proxies serve as intermediaries between sensors and query plans