View
214
Download
0
Embed Size (px)
Citation preview
Scalable Trigger Processing
Discussion of publication byEric N. Hanson et al
Int Conf Data Engineering 1999
CS561
Motivation
Triggers popular for: Integrity constraint checking Alerting, logging, etc.
Commercial database systems Limited triggering capabilities 1 trigger/update-type on table; or at best 100.
But : Current technology doesn’t scale well And, internet and web-based applications
may need millions of triggers.
An Example Trigger
Example “stock ticker notification”: Stock holding: 100*IBM Query: Inform an agent whenever the price of the stock
holding crosses $10,000
Create Trigger stock-watch from quotes q
on update(q.price)when q.name=‘IBM’ and 100*q.price > 10,000do raise event ThresholdCrossed(100*q.price).
Note: We may need 1,000 or millions of such triggers Web interface may allow users to create such triggers
Problem Definition
Given: Relational DB, Trigger statements, Data Stream Find: Triggers corresponding to each stream item Objective: Scalable trigger processing system
Assumptions: Number of distinct structures of trigger expressions is relatively
small All trigger expression structures small enough to fit in main
memory
The Problem, once more.
Requires millions of triggers (on huge data). Steps for trigger processing
Event monitoringCondition evaluationExecuting triggered action
Response time for database operations critical !
Related Work
Range Predicates, Marking-Based
[Hans96b, Ston90] (large memory, complicated
storage)
AI[Forg82,Mira87](smaller rule set)
Parallel Processing[Gupt89,Hell98]
IndexingECA Model(not scalable)
Overall Driving Idea
If large number of triggers are created, then many have the same format.
Triggers share same expression signature except that parameters substituted.
Group predicates from trigger conditions based on expression signatures into equivalence classes
Store them in efficient main memory data structures
Components
TriggerMan Datablade (lives inside Informix) Data Sources
Local/remote tables/streams; must capture updates and transmit to TriggerMan (place in a queue)
TriggerMan Client applications Create /drop triggers, etc.
TriggerMan Driver Periodically involve TmanTest() fn to perform condition testing
and action execution. TriggerMan console
Direct user interaction interface for trigger creation, system shutdown, etc.
TriggerMan Syntax
Trigger syntax
create trigger <triggerName> [in setName][optionalFlags]from fromList[on eventSpec][when condition][group by attributeList][having groupCondition]do action
Example : Salary Increases
Update Fred’s salary when Bob’s salary is updated
create trigger updateFred
from emp
on update (emp.salary)
when emp.name = ’Bob’
do execSQL ’update emp set salary=:NEW.emp.salary where emp.name=’’Fred’’’
Example : Real Estate Database
“If new house added which is in neighborhood that salesperson Iris reprensents then notify her”
House (hno,address,price,nno,spno)Salesperson (spno,name,phone)Represents (spno,nno)Neighborhood (nno,name,location)
create trigger IrisHouseAlerton insert to housefrom salesperson s, house h, represents rwhen s.name = ‘Iris’ and s.spno=r.spno and r.nno=h.nnodo raise event NewHouseInIrisNeighborhood(h.hno, h.address)
Trigger Condition Structure
Expression signature
Expression signature consists ofData source IDOperation code, e.g. insert, delete, etc.Generalized Expression (parameterized)
=
Emp.name CONSTANT
FROM: Data src: empON: Event : updateWHEN: boolean exp.
Condition structure (contd)
Steps to obtain canonical representation of WHEN clause Translate expression to CNF Group each conjunct by data source they refer to
Selection Predicate will be of form :
(C11 OR C12 OR ..) AND ... AND (Ck1 OR …),
where each Cij refers to same tuple variable. Each conjunct refers to zero, one, or more data sources Group conjuncts by set of sources they refer to
If one data source, then selection predicate If two data sources, then JOIN predicate
Triggers for stock ticker notification
Create trigger T1 from stock when stock.ticker = ‘GOOG’ and stock.value < 500 do notify_person(P1)
Create trigger T2 from stock when stock.ticker = ‘MSFT’ and stock.value < 30 do notify_person(P2)
Create trigger T3 from stock when stock.ticker = ‘ORCL’ and stock.value < 20 do notify_person(P3)
Create trigger T4 from stock when stock.ticker = ‘GOOG’ do notify_person(P4)
Expression Signature
Idea: Common structures in condition of triggers
Expression Signature: E1: stock.ticker = const1 and stock.value < const2
Expression Signature:
E2: stock.ticker = const3
Expression signature defines equivalence class of all instantiations of expression with different constants
T4: stock.ticker = ‘GOOG’
T1: stock.ticker = ‘GOOG’ and stock.value < 500T2: stock.ticker = ‘MSFT’ and stock.value < 30T3: stock.ticker = ‘ORCL’ and stock.value < 20
What to do now
Only a few distinct expression signatures, build data structures to represent them explicitly (in memory)
Create constant tables that store all different constants, and link them to their expression signature
Main Structures
A-treat NetworkNetwork for trigger condition testing
For a trigger to fire, all conditions must be true
Expression SignatureCommon structure in a trigger
E1: stock.ticker = const1 and stock.value < const2
Constant TablesConstants for each expression signature
A-Treat Network to represent a trigger
For each trigger condition stock.ticker = const1 and stock.value < const2
Root
stock.ticker = const1 stock.value < const2
alpha-node alpha-node
predicates
Node 1 Node 2
Condition Testing
A-Treat network is a discrimination network for trigger condition testing.
For a predicate to be satisfied, all its conjuncts should be true.
This is checked using A-Treat network.
A-Treat network (Hanson 1992)
Define rule SalesClerk
If emp.sal>30,000
And emp.dno=dept.dno
And dept.name=“sales”
And emp.jno=job.jno
And job.title=“clerk”
Then Action
Expression Signature TableEx.
ID
Data Source
Signature
Description
Constant Table
Number of Constants
Constant Organization
E1 stock … const_e1 2 Main Memory
E2 stock … const_e2 1 Main memory
E1: stock.ticker = const1 and stock.value < const2E2: stock.ticker = const3
Constant Tables Tables of constants in trigger conditions
Ex. ID Trigger ID Constant 1 Constant 2 Next Node Rest
E1 T1 GOOG 500 Node 2
E1 T2 MSFT 30 Node 2
E1 T3 ORCL 20 Node 2
T1: stock.ticker = ‘GOOG’ and stock.value < 500T2: stock.ticker = ‘MSFT’ and stock.value < 30T3: stock.ticker = ‘ORCL’ and stock.value < 20
Ex. ID Trigger ID Constant 1 Next Node Rest
E2 T4 GOOG Null
Const_e2
T4: stock.ticker = ‘GOOG’
Const_e1
Tables
Primary tables trigger_set (tsID, name, comments, creation_date, isEnabled) Trigger (triggerID, tsID, name, comments, trigger_text,
creation_date, isEnabled, …)
Trigger cache in main memory for recently accessed triggers.
Predicate Index
Tables expression_signature(sigID, dataSrcID, signatureDesc,
constTableName, constantSetSize, constantSetOrganization) const_tableN(exprID, triggerID, nextNetworkNode, const1, … constK,
restOfPredicate)
Root of predicate index linked to data source predicate indices Each data source contains an expression signature list Each expression signature links to its constant table. Index expressions on most selective conjunct (rest on fly).
Processing Trigger Definition
Parse the trigger and validate it Convert the when clause to conjunctive normal
form Group the conjuncts by the distinct sets of tuple
variables they refer to Form a trigger condition graph, that is, undirected
graph with node for each tuple variable and edge for join predicates.
Build A-Treat network
Processing trigger definition (2)
For each selection predicate If predicate with same signature not seen before
Add signature of predicate to list And, add signature to expression_signature table If signature has a constant placeholder in it, create a
constant table for the signature. Add constants
Else if predicate has constants, add a row to the constant table for the expression
Alternate Organizations
Storage for the expression signature’s equivalence class: Main memory lists Main memory index Non-indexed database table Indexed database table
For each expression signature, choose a structure depending on number of triggers.
efficiency
Scalability
Processing update descriptors
On getting an update descriptor (token) (data src ID, operator code, old/new tuple)
Locate data source predicate index from root of predicate index.
For each expression signature, find constant matching the token using index.
Check additional predicate clauses against the token. When all predicate clauses of a trigger have matched,
pin the trigger in main memory Bring in A-treat network representing that trigger to
process aremaining part of trigger, like join, etc. If trigger condition is satisfied, execute action.
Processing an Update
Root
Update Stock (ticker=GOOG, value=495)
Index ofstock.ticker=const1
E1 E2
const_e1 const_e2
Other source Predicate index…
Trigger ID Constant 1 Constant 2 Next Node
T1 GOOG 500 Node 2
T2 MSFT 30 Node 2
T3 ORCL 20 Node 2
E1: stock.ticker = const1 and stock.value < const2
const_e1
Concurrency
Identified elements that can be parallelized Token-level
Multiple tokens processed in parallel Condition-level
Multiple selection conditions tested concurrently Rule-action-level
Multiple rule actions fired at the same time Data-level
Set of data values in the network processed in parallel
Conclusion : Overall Key Points
If a large number of triggers are created, many of them have almost the same format
Group triggers with same structure together into expression signature equivalence classes
Number of distinct signatures is small enough to fit into main memory (index)
Develop a selection predicate index structures Architecture to build a scalable trigger system.