Aurora – system architecture
Pawel Jurczyk
Currently used DB systems
• Classical DBMS:
– Passive repository storing data (HADP – human-active, DBMS-passive model)
– Only the current state of the data is important
– Data is synchronized; queries have exact answers (no support for approximation)
• Monitoring applications are difficult to implement in a traditional DBMS
– Triggers do not scale past a few triggers per table
– Problems with getting the required data from historical time series
– Development of dedicated middleware is expensive
• Conclusion: these systems are ill suited for applications that alert a human when an abnormal situation occurs (such applications expect the DAHP model – DBMS-active, human-passive)
Aurora – main assumptions
• Data comes from various, uniquely identified data sources (data streams)
• Each incoming tuple is timestamped
• Aurora is expected to process incoming streams
• Tuples are transferred through a loop-free, directed graph
• Outputs from the system are presented to applications
• Aurora maintains historical storage
Aurora system overview
[Figure: Aurora system overview; labels: Input data streams, Queries, Storage, Output data, Applications]
• Any box can filter a stream (select operation)
• A box can compute stream aggregates by applying an aggregate function across a window of values in the stream
• The output of any box can be an input for several other boxes (split operation)
• Each box can gather tuples from many inputs (union operation)
Aurora query model
• Each CP (connection point) and view should have a persistence specification (e.g. "keep data for 2 hr")
• Each output is associated with a QoS specification (which helps to allocate the processing elements along the path)
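[Figure: example Aurora network of boxes b1 to b7 feeding applications (Appl); connection points backed by Storage S1, S2 and S3 with persistence specifications ("Keep 2 hr"); three paths labeled Continuous query, View and Ad-hoc query; QoS specs attached to the outputs]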
Queries in Aurora
• Continuous queries
– The query continuously processes tuples
– Output tuples are delivered to an application
• Ad-hoc queries
– The system processes data and delivers answers starting from the earliest time stored in the connection point
– The semantics are the same as for a continuous query that started execution at time t_now minus the persistence specification
– The query continues until explicit termination
• Views
– Similar to materialized or partially materialized views in classical DB systems
– An application may connect to the end of this path whenever there is a need
Connection points
• Support for dynamic modification of the network
• Support for data caching (persistence specification) – helpful for ad-hoc queries
• A connection point with no upstream input can be used as a stored data set (as in a classical DBMS)
• Tuples from a connection point can be pushed through the system (e.g. when the connection point is "materialized" and its stored tuples are passed as a stream to the downstream nodes)
• Alternatively, a downstream node can pull the data (helpful when executing filtering or joining operations)
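The persistence specification of a connection point can be illustrated with a minimal Python sketch. It is not Aurora's implementation: the class name `ConnectionPoint`, its methods, and the use of an in-memory deque (the slides describe B-tree storage) are assumptions made for illustration only.

```python
from collections import deque


class ConnectionPoint:
    """Hypothetical connection point with a "keep N seconds" persistence spec."""

    def __init__(self, keep_seconds):
        self.keep_seconds = keep_seconds
        # (timestamp, value) pairs in arrival order; assumes timestamps never decrease
        self.history = deque()

    def push(self, timestamp, value):
        """Store an incoming tuple and expire tuples older than the spec allows."""
        self.history.append((timestamp, value))
        cutoff = timestamp - self.keep_seconds
        while self.history and self.history[0][0] < cutoff:
            self.history.popleft()

    def replay(self):
        """Return the stored tuples, oldest first, as an ad-hoc query would see them."""
        return list(self.history)


# "Keep 2 hr" expressed in seconds
cp = ConnectionPoint(keep_seconds=2 * 3600)
for ts, value in [(0, "a"), (3600, "b"), (7200, "c"), (9000, "d")]:
    cp.push(ts, value)

replayed = cp.replay()
```

After the last push the cutoff is 9000 - 7200 = 1800 seconds, so the tuple stamped 0 has been expired and an ad-hoc query attached here would start from the tuple stamped 3600.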
Optimization in Aurora – problems
• The network changes frequently over time
• The optimizer has to deal with a large number of boxes
• The system operates in a data flow mode
• Optimization addresses different needs than in a classical DBMS
Optimization of continuous queries
• Optimization is done at run time
• Aurora starts by executing the unoptimized network
• Optimization is performed step by step on portions of the network (subnetworks)
• First, hold all input messages for the selected subnetwork and drain it of messages
• Then, optimize the selected subnetwork
– Insert projections (get rid of unneeded tuple attributes as soon as possible)
– Combine boxes (e.g. projection with filtering)
– Reorder boxes (e.g. filtering can be pushed down the query tree through a join)
• Finally, stop holding input messages
• The optimizer cycles periodically through all subnetworks (it is a background task)
Optimization of continuous queries - details
• Each box has:
– c(b) – execution cost
– s(b) – selectivity: the expected number of output tuples per input tuple
• Amount of processing per input tuple for two successive boxes, bi followed by bj (according to the situation in the figure): c(bi) + c(bj)*s(bi)
• Boxes are in the right order if: (1-s(bj))/c(bj) < (1-s(bi))/c(bi)
• Let's check the condition above for bi (c=4, s=5) and bj (c=1, s=0.5):
– (1 – 0.5)/1 < (1 – 5)/4, i.e. 0.5 < -4/4 = -1: FALSE
– The condition is not satisfied, so we should change the order of the boxes
[Figure: query network over streams S(A) and T(B, C) with boxes Filter (A>2, A<4), Filter (B>2, B<4), Join (A=B) and Filter (C > 0); annotated boxes bi: c=4, s=5 and bj: c=1, s=0.5]
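The ordering test on this slide can be checked numerically. The following Python sketch uses only the slide's definitions of c(b) and s(b) and the figure's values for bi and bj; the names `Box`, `pair_cost`, `rank` and `in_right_order` are hypothetical helpers, not part of Aurora.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    name: str
    cost: float         # c(b): execution cost per input tuple
    selectivity: float  # s(b): expected output tuples per input tuple


def pair_cost(first: Box, second: Box) -> float:
    """Processing per input tuple when `first` runs before `second`."""
    return first.cost + second.cost * first.selectivity


def rank(box: Box) -> float:
    """The quantity (1 - s(b)) / c(b) compared in the ordering condition."""
    return (1 - box.selectivity) / box.cost


def in_right_order(first: Box, second: Box) -> bool:
    """Slide condition for `first` preceding `second`: rank(second) < rank(first)."""
    return rank(second) < rank(first)


bi = Box("bi", cost=4, selectivity=5)
bj = Box("bj", cost=1, selectivity=0.5)

current_ok = in_right_order(bi, bj)   # bi before bj, as in the figure
swapped_ok = in_right_order(bj, bi)   # bj before bi
cost_current = pair_cost(bi, bj)      # 4 + 1 * 5
cost_swapped = pair_cost(bj, bi)      # 1 + 4 * 0.5
```

With the figure's values, running bi first costs 9 per input tuple while running bj first costs 3, which agrees with the condition's verdict that the boxes should be swapped.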
Optimization of ad-hoc queries
• Each ad-hoc query is attached to a connection point – it runs on all the historical data stored in that connection point
• The connection point keeps historical data in a B-tree
• The 'historical part' of the ad-hoc query (the successors of the connection point) is examined first
– Filter boxes that are compatible with the B-tree storage key can use an indexed lookup
– Joins can use merge-sort or an indexed lookup – the cheapest one is chosen
• The rest of the query is optimized in the same way as continuous queries
Run-time architecture
[Figure: run-time architecture; labels: Inputs, Router, Outputs, Scheduler, Box Processors, Storage manager (box queues Q1, Q2, …, Qi, …, Qj, …, Qn; Buffer Manager; Persistent Storage), QoS Monitor, Load Shedder]
Run-time components
• Router
– Routes tuples in the system
– Forwards them either to outputs or to the storage manager
• Storage manager
– Responsible for maintaining the box queues and managing the buffer
• Scheduler
– Decides which box will be processed
• Box processor
– Executes the appropriate operation
– Forwards output to the router
• QoS monitor
– Observes outputs and activates the load shedder
• Load shedder
– Sheds load until performance reaches an acceptable level
QoS
• Optimization is based on the attempt to maximize the perceived QoS for the outputs
• Basically, QoS is a function of:
– Response times (production of output tuples)
– Tuple drops
– Values produced (importance of produced values)
• The administrator specifies QoS graphs for each output based on one or more of the functions mentioned above
• Other types of QoS functions can be defined too
• The administrator defines headroom for the system (the percentage of computing resources that can be used by Aurora)
QoS graphs
• Graphs are expected to be normalized
• Graphs should allow a properly sized network to operate with all outputs in a 'good zone'
• Graphs should be convex (the value-based graph is an exception)
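[Figure: three QoS graphs, each with QoS on the vertical axis from 0 to 1, plotted against Delay, % tuples delivered, and Output value; a 'good zone' is marked]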
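A normalized QoS graph can be represented as a piecewise-linear function. The Python sketch below is an illustration only: the helper `make_qos_graph` and the breakpoints (QoS stays at 1 up to a delay of 1 second, then falls linearly to 0 at 3 seconds) are assumptions, not values from the slides.

```python
from bisect import bisect_right


def make_qos_graph(points):
    """Build a piecewise-linear QoS function from (x, qos) breakpoints sorted by x."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    def qos(x):
        # Clamp outside the defined range
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        # Linear interpolation between the two surrounding breakpoints
        i = bisect_right(xs, x)
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return qos


# Hypothetical delay-based graph: full QoS up to 1 s, zero QoS from 3 s on
delay_qos = make_qos_graph([(0.0, 1.0), (1.0, 1.0), (3.0, 0.0)])

inside_good_zone = delay_qos(0.5)
degraded = delay_qos(2.0)
too_late = delay_qos(5.0)
```

In this sketch the flat segment at QoS 1 plays the role of the 'good zone'; a tuple delivered after 2 seconds gets half the utility, and anything later than 3 seconds gets none.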
Aurora Storage Manager (ASM) – Queues management
• Windowed operations (e.g. aggregations) require historical collection of tuples
• Tuples may accumulate in various places when network is saturated
• There is one queue at the output of each box; this queue is shared by all successor boxes
• Queues are stored in memory and on disk
• Queues may change length
• The scheduler and the ASM share information on scheduling priority and on the percentage of each queue held in main memory
[Figure: queue organization; a queue along a time axis with the positions of boxes b1 and b2 and the processed tuples marked]
Aurora Storage Manager (ASM) – Connection point management
• If the amount of historical data needed in the CP is less than the maximal window size of the successor boxes, no extra storage is needed
• Historical data is organized in B-trees based on the storage key (default: timestamp)
• Periodically, all tuples older than the history requirement are removed from the B-tree
• B-trees are stored in the space allocated by the ASM
Scheduling in Aurora
• The scheduler (and Aurora) aims to reduce the overall tuple execution cost
• It exploits two nonlinearities in tuple processing
– Interbox nonlinearity:
  • Minimize tuple thrashing (if buffer space is not sufficient, tuples have to be shuttled between memory and disk)
  • Avoid copying data from output to buffer (the ASM can be bypassed when one box is scheduled right after another)
– Intrabox nonlinearity:
  • The cost of tuple processing may decrease as the number of available tuples in the queue increases (avoiding context switching, better optimization)
Scheduling in Aurora
• Aurora's approach: (1) have as many tuples as possible in the queues, (2) process them at once – train scheduling, and (3) pass them to subsequent boxes without going to disk – superbox scheduling
• Two goals: (1) minimize the number of I/O operations and (2) minimize the number of box calls per tuple
• How does it work?
– An output is selected for execution
– The first downstream box with its queue in memory is found
– Then upstream boxes are considered: as many upstream boxes with non-empty queues in memory as possible are found
– The resulting sequence of boxes can be scheduled one after another
– The storage manager is notified to keep all the queues of the selected boxes in memory during execution
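The second goal, fewer box calls per tuple, can be illustrated with a toy Python sketch. It only counts invocations of in-memory functions and does not model Aurora's scheduler, queues or storage manager; `CountingBox`, `make_chain` and both run functions are hypothetical names.

```python
from typing import Callable, List


class CountingBox:
    """A box that counts how often it is invoked (hypothetical helper)."""

    def __init__(self, name: str, fn: Callable[[List[int]], List[int]]):
        self.name = name
        self.fn = fn
        self.calls = 0

    def __call__(self, batch: List[int]) -> List[int]:
        self.calls += 1
        return self.fn(batch)


def make_chain() -> List[CountingBox]:
    """Two successive boxes: a filter followed by a map."""
    return [
        CountingBox("keep_even", lambda b: [t for t in b if t % 2 == 0]),
        CountingBox("double", lambda b: [2 * t for t in b]),
    ]


def run_tuple_at_a_time(boxes: List[CountingBox], tuples: List[int]) -> List[int]:
    """Push each tuple through the chain on its own: one box call per tuple per box."""
    out: List[int] = []
    for t in tuples:
        batch = [t]
        for box in boxes:
            batch = box(batch)
            if not batch:  # tuple was filtered out, nothing left to forward
                break
        out.extend(batch)
    return out


def run_train(boxes: List[CountingBox], tuples: List[int]) -> List[int]:
    """Push the whole queue through box after box: one box call per box."""
    batch = list(tuples)
    for box in boxes:
        batch = box(batch)
    return batch


tuples = list(range(10))

chain_single = make_chain()
out_single = run_tuple_at_a_time(chain_single, tuples)
calls_single = sum(box.calls for box in chain_single)

chain_train = make_chain()
out_train = run_train(chain_train, tuples)
calls_train = sum(box.calls for box in chain_train)
```

Both runs produce the same output, but the tuple-at-a-time run makes 15 box calls (10 for the filter, 5 for the map) while the train run makes 2. Passing the batch directly from one box to the next, as `run_train` does, corresponds loosely to the superbox idea of avoiding a round trip through storage.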
Priorities assignment in Scheduler
• The waiting delay of tuples (a part of the latency of each output) is a function of scheduling
• The goal of the scheduler: to assign priorities to box outputs so as to maximize the overall QoS
• The scheduler's approach has two aspects:
– state-based analysis, which assigns priorities to outputs and picks the output with the highest utility for scheduling
– feedback-based analysis, which observes the overall system and increases the priorities of outputs that are not doing well
Scheduler – execution overhead
[Figure: bar chart of time (ms, axis from 0 to 300) split into execution costs and scheduling overhead for three strategies: Tuple at a time, Trains, Superboxes]
Prediction of overload situations
• Static analysis
– The goal: determine whether the hardware running the network is sized correctly
– Each box has a processing cost c(b) and a selectivity s(b)
– Each input has a tuple production rate r(d)
– The analysis starts from each data source and continues downstream
– The system is stable when: 1/c(bi) ≥ r(di)
– The output rate from bi is: min(1/c(bi), r(di)) * s(bi)
– Iterating the steps above gives the output data rate and the computational requirements for each box
– The required computational resources can then be predicted
Prediction of overload situations
[Figure: network over streams S(A, B, C) and T(B, C) with input rates rs = 100 t/s and rt = 100 t/s; boxes Filter (A>2, A<4), Filter (B>2, B<4), Join (A=B) and Filter (C > 0); box parameters b1: c=0.05s, s=0.1; b2: c=0.05s, s=0.1; b3: c=0.1s, s=5; b4: c=0.05s, s=0.5]
• b1: (1/0.05) t/s ≥ 100 t/s, i.e. 20 t/s ≥ 100 t/s (not true!)
• Output stream: min((1/0.05) t/s, 100 t/s) * 0.1 = 2 t/s
• b2: (1/0.05) t/s ≥ 100 t/s, i.e. 20 t/s ≥ 100 t/s (not true!)
• Output stream: min((1/0.05) t/s, 100 t/s) * 0.1 = 2 t/s
• b3: (1/0.1) t/s ≥ (2 + 2) t/s, i.e. 10 t/s ≥ 4 t/s (true)
• Output stream: min((1/0.1) t/s, 4 t/s) * 5 = 20 t/s
• b4: (1/0.05) t/s ≥ 20 t/s, i.e. 20 t/s ≥ 20 t/s (true)
• Output stream: min((1/0.05) t/s, 20 t/s) * 0.5 = 10 t/s
• Needed computation: 100 t/s + 100 t/s + 2 t/s + 2 t/s + 20 t/s + 10 t/s = 234 t/s
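The per-box steps of the static analysis can be reproduced with a short Python sketch. It applies the two formulas from the slides (stability: 1/c(b) ≥ r, output rate: min(1/c(b), r) * s(b)) to the box parameters of the example; the names `Box` and `analyze` and the hard-wired topology (b1 and b2 feed b3, b3 feeds b4) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    name: str
    cost: float         # c(b): seconds of processing per input tuple
    selectivity: float  # s(b): expected output tuples per input tuple


def analyze(box: Box, input_rate: float) -> Tuple[bool, float]:
    """Return (stable, output_rate) for a box fed at input_rate tuples/s."""
    capacity = 1.0 / box.cost                   # tuples/s the box can process
    stable = capacity >= input_rate             # 1/c(b) >= r
    output_rate = min(capacity, input_rate) * box.selectivity
    return stable, output_rate


b1 = Box("b1", cost=0.05, selectivity=0.1)
b2 = Box("b2", cost=0.05, selectivity=0.1)
b3 = Box("b3", cost=0.1, selectivity=5)
b4 = Box("b4", cost=0.05, selectivity=0.5)

# Start at the data sources (100 t/s each) and continue downstream
stable1, out1 = analyze(b1, 100.0)
stable2, out2 = analyze(b2, 100.0)
stable3, out3 = analyze(b3, out1 + out2)  # b3 receives both filtered streams
stable4, out4 = analyze(b4, out3)

for name, stable, out in [("b1", stable1, out1), ("b2", stable2, out2),
                          ("b3", stable3, out3), ("b4", stable4, out4)]:
    print(f"{name}: stable={stable}, output rate={out:g} t/s")
```

The sketch flags b1 and b2 as the boxes that cannot keep up with their inputs, matching the "not true!" lines of the example, while b3 and b4 are stable at the reduced rates they actually receive.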
Prediction of overload situations
• Run-time analysis
– Helps to deal with input rate spikes
– Uses delay-based QoS information
– If many tuples are outside the 'good zone', an overload is likely
Load shedding
• Reaction to overload
• The load shedding process relies on QoS information
• Load shedding by dropping tuples
– Drop is a system-level operator that randomly drops tuples from a stream at a specified rate
– The drop box is located as far upstream as possible
– Result of static analysis
  • Tuples are dropped on network branches that terminate in more tolerant outputs
  • Algorithm: (1) choose the output with the smallest negative slope in its tuple-drops graph, (2) move horizontally along this curve until there is another output with a smaller negative slope at this point, (3) this horizontal difference indicates the tuple drop rate for the output
– Result of dynamic analysis
  • Similar algorithm to the previous one
  • Delay-based graphs can be used
  • Dropping of tuples on branches that terminate in higher priority outputs (otherwise it would be ineffective)
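The drop operator described above, which randomly discards tuples at a specified rate, can be sketched as a Python generator. The function name `drop` and its parameters are hypothetical; where to place the box and which rate to use are decided by the analysis in the slides and are not modeled here.

```python
import random
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def drop(stream: Iterable[T], rate: float,
         rng: Optional[random.Random] = None) -> Iterator[T]:
    """Yield tuples from `stream`, discarding each one with probability `rate`."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = rng or random.Random()
    for item in stream:
        # random() is in [0, 1), so rate=0 keeps everything and rate=1 drops everything
        if rng.random() >= rate:
            yield item


tuples = list(range(1000))
kept_all = list(drop(tuples, 0.0))
kept_none = list(drop(tuples, 1.0))
kept_some = list(drop(tuples, 0.3, rng=random.Random(42)))  # seeded for repeatability
```

Surviving tuples keep their original order, so downstream boxes see a thinned version of the same stream.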
Load shedding
• Load shedding by filtering tuples
– Idea: remove less important tuples rather than randomly chosen ones
– It uses value-based QoS information
– A histogram is prepared containing the frequency with which value ranges have been observed
– The utility of each interval can then be calculated (multiply the frequency by the value-based QoS function value)
– Backward interval propagation: Aurora picks the interval with the lowest utility and prepares a predicate for it that is used in a filter box
– Forward interval propagation: estimation of a proper filter predicate and checking it by trial and error
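The utility calculation behind filter-based shedding can be sketched in Python as follows. This is an illustration under stated assumptions: the histogram, the value-based QoS function and the choice to evaluate QoS at each interval's midpoint are all hypothetical, and the helper names are not Aurora's.

```python
from typing import Callable, Dict, Tuple

Interval = Tuple[float, float]  # half-open value range [lo, hi)


def interval_utilities(histogram: Dict[Interval, float],
                       value_qos: Callable[[float], float]) -> Dict[Interval, float]:
    """Utility of each interval = observed frequency * value-based QoS.

    QoS is evaluated at the interval midpoint (an assumption of this sketch).
    """
    return {
        (lo, hi): freq * value_qos((lo + hi) / 2)
        for (lo, hi), freq in histogram.items()
    }


def lowest_utility_interval(histogram: Dict[Interval, float],
                            value_qos: Callable[[float], float]) -> Interval:
    """Pick the interval whose tuples contribute the least utility."""
    utilities = interval_utilities(histogram, value_qos)
    return min(utilities, key=utilities.get)


def shedding_predicate(interval: Interval) -> Callable[[float], bool]:
    """Filter predicate that keeps every value outside the chosen interval."""
    lo, hi = interval
    return lambda value: not (lo <= value < hi)


# Hypothetical observed frequencies of output values per range
histogram = {(0, 10): 0.5, (10, 20): 0.3, (20, 30): 0.2}


def value_qos(value: float) -> float:
    """Hypothetical value-based QoS: small values matter much less."""
    return 0.2 if value < 10 else 1.0


victim = lowest_utility_interval(histogram, value_qos)
keep = shedding_predicate(victim)
shed_stream = [v for v in [3, 12, 25, 7, 18] if keep(v)]
```

Here the interval [0, 10) is the most frequent but has the lowest utility, so the generated predicate filters out exactly those values and sheds the most load for the least loss in QoS.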