Lucas Jellema
Oracle OpenWorld 2014, San Francisco, CA, USA
How Fast Data Is Turned into Fast Information and Timely Action
2
Overview
• What is [special about] Fast Data?
– Continuous, Volume|Velocity|Variety, Real Time
• Challenges
– Volatile, non persistent
– Data => Information, Conclusion, Alert, Recommendation, Action
• Strategies
– Smart gathering
– Discard – filter, aggregate, pattern (and also look for missing events!)
– Promote (process, enrich)
– Visualize
• Technology/Tools
• Demonstration/Cases
4
Fast Data
• Tweet
• Feed
• Beat
• Signal
• Measurement
• Message
• Notification
• Tick
• Pulse
5
New theme (that brings it all together)
6
Some event producing devices
7
Most of these events…
9
10
Fast Data Processing
Fast Data
Smart Processing
• Information
• Conclusion
• Alert
• Recommendation
• Action
11
Fast Data Processing: Multi-stage cleansing & aggregation
12
Typical Flow and Additional Challenge…
[Chart: Business Value (vertical axis) against TIME (horizontal axis), with the milestones Business event, Data captured, Analysis completed and Action taken; additional challenge: Fragmented event entities]
13
The V-factor
Volume
Velocity
Variety
VALUE
14
Key strategy
• Discard – as early as possible (close to the source)
– Ignore irrelevant events
– Filter out unneeded attributes
– Take samples instead of the entire stream
– Aggregate: merge multiple events into one
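The discard steps above can be sketched in a few lines. This is only an illustration in plain Python, not OEP or CQL code; the event dictionaries, the "type" attribute and the function names are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Set


def discard_early(events: Iterable[dict], relevant_types: Set[str],
                  keep_attributes: List[str], sample_every: int = 1) -> Iterator[dict]:
    """Filter, project and sample a stream of event dictionaries."""
    seen = 0
    for event in events:
        if event.get("type") not in relevant_types:
            continue  # ignore irrelevant events
        seen += 1
        if seen % sample_every != 0:
            continue  # sampling: keep only every n-th relevant event
        # filter out unneeded attributes
        yield {key: event[key] for key in keep_attributes if key in event}


def aggregate_counts(events: Iterable[dict], key: str) -> Dict[object, int]:
    """Merge many events into one summary: a count per value of `key`."""
    return dict(Counter(event[key] for event in events))
```

Running the filter as close to the source as possible means every later stage only sees the reduced stream.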
15
Fast Data Processing: Oracle Event Processor
[Diagram: Fast Data => Oracle Event Processor => Smart Processing]
Inbound channels: RMI, File, REST, HTTP Channel, JMS, Database, Custom (Java), SOA Suite EDN, Coherence, JMX, QuickFix (financial)
Outbound channels: RMI, File, REST, HTTP Channel, JMS, Business Rule, Database, Custom (Java), SOA Suite EDN, Coherence, JMX
16
Oracle Event Processor
• Lightweight, real-time (sub-sub-second), in-memory, continuous query engine
– Available in embedded form – with corresponding licence
• Interacts with many different channels – inbound and outbound
• Has internal caches to enrich events and temporarily retain events
• Uses CQL to:
– Filter, aggregate, enrich and detect patterns (including missing events)
[Diagram: Oracle Event Processing Application, packaged as an OSGi Bundle/Spring Application Context: events => Input Adapters => Channels => CQL Processors => Channels => Output Adapters, with DB and Coherence available to the CQL Processors]
• Input adapters connect to data sources
• Channels help control the flow of data and can be tuned for optimal performance
• Databases and Coherence caches can be referenced directly in CQL processors
• CQL processors contain correlation, aggregation and pattern matching business logic
• Output adapters send data and alerts to downstream systems and business processes
18
Fast Data Processing: Fusion Middleware Tooling
[Diagram: Fast Data => Oracle Event Processor, alongside Coherence and SOA Suite 12c EDN => Smart Processing]
Inbound channels: RMI, File, REST, HTTP Channel, JMS, Database, JMX, Custom (Java)
Outbound channels: RMI, File, REST, HTTP Channel, JMS, Business Rule, JMX, Database, Custom (Java)
19
Fast Data Processing: Fusion Middleware Tooling
[Diagram: Fast Data => Smart Processing => the outcomes below]
• Information
• Conclusion
• Alert
• Recommendation
• Action
Tooling shown: OEP, BAM, ADF, Coherence, SOA Suite, EDN, BPM Suite, BPEL, Task, BI, RTD, ODI, GoldenGate, NoSQL
20
Fast Data Example
[Diagram: Fast Data event stream (14,0 16,1 14,1 16,1 16,0 13,1 14,0 16,0 13,1 13,0 14,1 16,0 14,1 13,0 14,1 16,0 13,1 14,0) => Oracle Event Processor => Smart Processing]
21
Demonstration: Live Tennis
• Tennis Tournament
• Many matches played in parallel
• The data that is produced:
– At a rate of up to 10 events/minute
– Each event: Match Id, Player [who scored]
– Sample: 14,0 16,1 14,1 16,1 16,0 13,1 14,0
22
Demonstration: Live Tennis
• The information, conclusions & actions we are looking for:
– Scoreboard per game, set, match
– Match start and completion (action: inform next players for that court)
– Interrupted match (action: go and check out the reason for the interruption)
[Diagram: Fast Data => Smart Processing => Scoreboard; Match start and completion; Interrupted match]
23
OEP application to process fast tennis data
• Preparation
– Define event definitions
– Create local, in memory cache with static, enriching data
– Gather (in this case generate) tennis data through adapter
– Create Event Sink to consume all findings and publish to console
TennisMatchEvent: matchId, player
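As a rough illustration of these preparation steps (not the OEP artifacts themselves, which are Java and Spring based), the same four pieces can be sketched in plain Python. The player names in the cache, the generator and the sink class are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, Iterator, List, Sequence, Tuple


@dataclass(frozen=True)
class TennisMatchEvent:
    """Event definition: one rally point."""
    match_id: int  # matchId on the slide
    player: int    # 0 or 1: the player who scored


# Local in-memory cache with static, enriching data (names are made up).
MATCH_CACHE: Dict[int, Tuple[str, str]] = {
    13: ("Player A", "Player B"),
    14: ("Player C", "Player D"),
    16: ("Player E", "Player F"),
}


def generate_events(match_ids: Sequence[int], count: int, seed: int = 0) -> Iterator[TennisMatchEvent]:
    """Stand-in for the input adapter: generates random rally points."""
    rng = random.Random(seed)
    for _ in range(count):
        yield TennisMatchEvent(rng.choice(match_ids), rng.randint(0, 1))


def enrich(event: TennisMatchEvent, cache: Dict[int, Tuple[str, str]]) -> str:
    """Look up the scoring player's name in the cache."""
    return f"match {event.match_id}: point for {cache[event.match_id][event.player]}"


class ConsoleSink:
    """Event sink: consumes all findings and publishes them to the console."""

    def __init__(self) -> None:
        self.findings: List[str] = []

    def on_event(self, finding: str) -> None:
        self.findings.append(finding)
        print(finding)
```

A run would wire these together: for each generated event, call enrich and pass the result to the sink.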
24
Simple Time-slice Aggregation
• Produce aggregates once every 30 seconds
– Count number of matches going on currently (meaning: in the last 30 seconds)
– Calculate average time per rally (over the last 30 seconds)
– Count total number of points played (over the last 30 seconds)
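A minimal sketch of such a time-slice aggregation, in Python rather than CQL. It assumes each event carries a timestamp in seconds (the slide's event only has matchId and player) and takes "time per rally" to be the gap between two consecutive points of the same match within the slice. It processes a finished list of events in batch, whereas OEP would evaluate the window continuously.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def time_slice_aggregates(events: Iterable[Tuple[float, int, int]],
                          slice_seconds: float = 30.0) -> List[dict]:
    """events: (timestamp, match_id, player) tuples, ordered by timestamp."""
    slices: Dict[int, List[Tuple[float, int]]] = defaultdict(list)
    for timestamp, match_id, _player in events:
        slices[int(timestamp // slice_seconds)].append((timestamp, match_id))

    results = []
    for index in sorted(slices):
        last_seen: Dict[int, float] = {}  # match_id -> time of its previous point
        gaps: List[float] = []
        for timestamp, match_id in slices[index]:
            if match_id in last_seen:
                gaps.append(timestamp - last_seen[match_id])
            last_seen[match_id] = timestamp
        results.append({
            "slice_start": index * slice_seconds,
            "active_matches": len(last_seen),
            "points_played": len(slices[index]),
            "avg_rally_seconds": sum(gaps) / len(gaps) if gaps else None,
        })
    return results
```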
25
Simple Time-slice Aggregation combined with all-time findings
• Produce aggregates once every 30 seconds – partially over last 30 seconds and partially over ‘all time’
– Count number of matches going on currently (meaning: in the last 30 seconds)
– Calculate average time per rally and count total number of points played (all time)
– Longest rally played in the tournament
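One way to combine a sliding 30-second view with all-time findings is to keep running totals and take a snapshot on demand. The sketch below is plain Python, not CQL; the timestamps and the class and method names are assumptions, and a rally's duration is again approximated by the gap between consecutive points of the same match.

```python
from typing import Dict


class TournamentStats:
    """Running all-time totals plus a 'currently active' count."""

    def __init__(self) -> None:
        self.total_points = 0
        self.gap_count = 0
        self.gap_sum = 0.0
        self.longest_rally = 0.0
        self._last_point: Dict[int, float] = {}  # match_id -> time of latest point

    def on_point(self, timestamp: float, match_id: int) -> None:
        self.total_points += 1
        previous = self._last_point.get(match_id)
        if previous is not None:
            gap = timestamp - previous
            self.gap_count += 1
            self.gap_sum += gap
            self.longest_rally = max(self.longest_rally, gap)
        self._last_point[match_id] = timestamp

    def snapshot(self, now: float, window: float = 30.0) -> dict:
        """Findings to publish: active matches over the window, the rest all-time."""
        active = sum(1 for ts in self._last_point.values() if now - ts < window)
        average = self.gap_sum / self.gap_count if self.gap_count else None
        return {
            "active_matches": active,
            "total_points": self.total_points,
            "avg_rally_seconds": average,
            "longest_rally_seconds": self.longest_rally,
        }
```

Calling snapshot once every 30 seconds gives the mixed output described above without retaining individual events.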
26
Simple Time-slice Aggregation combined with all-time findings
27
Match Level events
28
Rallies to games
- The first player to have won at least 4 points
- and to have won two or more points more than his opponent
TennisMatchEvent: matchId, player
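A sketch of this first aggregation stage in Python (the OEP version would be a CQL processor). It assumes standard tennis scoring, where a game goes to the first player with at least four points and a lead of two, and it ignores tie-breaks. The function name and tuple layout are made up for the example.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def games_from_points(points: Iterable[Tuple[int, int]]) -> Iterator[Tuple[int, int]]:
    """points: (match_id, player) per rally point.

    Yields (match_id, winner) each time a game is completed.
    """
    score: Dict[int, List[int]] = {}  # match_id -> points in the current game
    for match_id, player in points:
        tally = score.setdefault(match_id, [0, 0])
        tally[player] += 1
        mine, theirs = tally[player], tally[1 - player]
        if mine >= 4 and mine - theirs >= 2:
            yield (match_id, player)
            score[match_id] = [0, 0]  # start counting the next game
```

Many fine-grained point events go in, far fewer coarse-grained "game won" events come out.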
29
Games to Sets
- The first player to have won more than 5 games
- and to have won two or more games more than his opponent
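The second stage follows the same shape, now consuming "game won" events. A Python sketch under the rule as stated (more than five games and a lead of two, with no tie-break); names are again illustrative only.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def sets_from_games(games: Iterable[Tuple[int, int]]) -> Iterator[Tuple[int, int]]:
    """games: (match_id, winner) per completed game.

    Yields (match_id, winner) each time a set is completed.
    """
    score: Dict[int, List[int]] = {}  # match_id -> games in the current set
    for match_id, winner in games:
        tally = score.setdefault(match_id, [0, 0])
        tally[winner] += 1
        mine, theirs = tally[winner], tally[1 - winner]
        if mine > 5 and mine - theirs >= 2:
            yield (match_id, winner)
            score[match_id] = [0, 0]  # start counting the next set
```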
30
Detect interrupted matches by ‘finding’ missing events
• When a match is interrupted, obviously no more ‘rally point events’ are produced
• Detecting the absence of these events for a match [that has begun] is equivalent to detecting an interruption of the match
– Unless the match is complete because someone won
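Detecting a missing event comes down to remembering when each match last produced a point and checking that against a timeout. A small Python sketch follows; the 120-second timeout is an assumed value (the slides do not state one), and in CQL this would typically be a pattern match with a duration clause rather than a polling function.

```python
from typing import Dict, List, Set


def interrupted_matches(last_point_at: Dict[int, float], finished: Set[int],
                        now: float, timeout_seconds: float = 120.0) -> List[int]:
    """Return ids of matches that have begun, are not finished,
    and have produced no rally point for longer than the timeout."""
    return sorted(
        match_id
        for match_id, timestamp in last_point_at.items()
        if match_id not in finished and now - timestamp > timeout_seconds
    )
```

Only matches that already appear in last_point_at are considered, which covers the "that has begun" condition.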
31
Detect interrupted matches by ‘finding’ missing events
32
Complete EPN diagram for Tennis Tournament Processor
• A single OEP application that consumes fine grained rally point events and performs three-stage aggregation and enrichment
[EPN diagram: TennisMatchEvent (matchId, player) as input; derived events: New Match, Match Finish, Interrupted Match, Set Won, Game Won]
33
Demonstration: Car Parks Management
[Diagram: Fast Data => Smart Processing => the findings below]
• Available lots
• Average parking time
• Tow-candidates (abandoned cars)
34
Car Parks Management
[Diagram: Fast Data => Smart Processing => the findings below]
• Available lots
• Average parking time
• Tow-candidates (abandoned cars)
Events in the diagram:
• Car Entry: CarParkId, Licence Plate
• Car Exit: CarParkId, Licence Plate
• CarparkCapacity Alert: CarParkId, % full
• AbandonedCar: CarParkId, Licence Plate
• CarStayDone: CarParkId, Licence Plate, Duration
• CarParkStatus: CarParkId, #cars, Avg stay
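To show how the entry and exit events could be correlated into the derived events, here is a rough Python sketch. The capacities, the 90% alert threshold and the 48-hour abandonment limit are assumptions, as is the tuple layout; the event names follow the diagram, with "CarparkCapacity Alert" written as one identifier.

```python
from typing import Dict, List, Optional, Tuple


class CarParkMonitor:
    def __init__(self, capacity: Dict[str, int], alert_fraction: float = 0.9,
                 abandoned_after: float = 48 * 3600.0) -> None:
        self.capacity = capacity                          # carParkId -> number of lots
        self.alert_fraction = alert_fraction              # assumed alert threshold
        self.abandoned_after = abandoned_after            # assumed limit, in seconds
        self.parked: Dict[Tuple[str, str], float] = {}    # (carParkId, plate) -> entry time
        self.stays: Dict[str, List[float]] = {}           # carParkId -> completed durations

    def _count(self, car_park_id: str) -> int:
        return sum(1 for park, _plate in self.parked if park == car_park_id)

    def on_entry(self, ts: float, car_park_id: str, plate: str) -> List[tuple]:
        self.parked[(car_park_id, plate)] = ts
        full = self._count(car_park_id) / self.capacity[car_park_id]
        if full >= self.alert_fraction:
            return [("CarparkCapacityAlert", car_park_id, round(100 * full))]
        return []

    def on_exit(self, ts: float, car_park_id: str, plate: str) -> List[tuple]:
        entered = self.parked.pop((car_park_id, plate), None)
        if entered is None:
            return []  # exit without a known entry: nothing to correlate
        duration = ts - entered
        self.stays.setdefault(car_park_id, []).append(duration)
        return [("CarStayDone", car_park_id, plate, duration)]

    def status(self, car_park_id: str) -> tuple:
        durations = self.stays.get(car_park_id, [])
        average: Optional[float] = sum(durations) / len(durations) if durations else None
        return ("CarParkStatus", car_park_id, self._count(car_park_id), average)

    def abandoned(self, now: float) -> List[tuple]:
        return [("AbandonedCar", park, plate)
                for (park, plate), ts in self.parked.items()
                if now - ts > self.abandoned_after]
```

AbandonedCar is another missing-event case: an entry that is not followed by an exit within the limit.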
35
Credit Card Theft Detection
• Several situations in the past
– Credit card is stolen in the main terminal building
– Several purchases are made in shops on the way from that area to the main exit
• Purchases of between $200 and $500
• Purchases made within 5 minutes of each other
• Sometimes the purchases are not made entirely along the direct route to the exit
[Diagram: route from the Main Terminal to the EXIT]
36
Credit Card Theft Detection
• To catch the perpetrator
– Consume the credit card purchase event stream for airport shops
– Spot situations where three or more purchases of $200-$500 are made within 5 minutes of each other and roughly in the terminal => exit physical order
– Publish an event to alert security staff
• To watch for any further purchases with that credit card
• To inform shop staff about that credit card
• To send staff to the exit to try and apprehend the thief
(perhaps based on the shopping bags he is carrying from the shops he bought stuff at)
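The pattern to spot can be sketched as follows, in Python rather than as a CQL pattern match. It assumes every shop has a position number that increases from the main terminal towards the exit (a hypothetical attribute), and it simplifies "roughly in order" to "never moving back towards the terminal".

```python
from typing import Dict, Iterable, List, Tuple


def suspicious_cards(purchases: Iterable[Tuple[float, str, float, int]],
                     min_amount: float = 200, max_amount: float = 500,
                     max_gap_seconds: float = 300, min_purchases: int = 3) -> List[str]:
    """purchases: (timestamp, card, amount, shop_position), ordered by timestamp.

    Returns the cards for which an alert would be published.
    """
    runs: Dict[str, List[Tuple[float, int]]] = {}  # card -> current chain of purchases
    alerts: List[str] = []
    for timestamp, card, amount, position in purchases:
        if not (min_amount <= amount <= max_amount):
            continue  # outside the amount range: discard early
        run = runs.get(card, [])
        if run:
            last_timestamp, last_position = run[-1]
            if timestamp - last_timestamp > max_gap_seconds or position < last_position:
                run = []  # too late or wrong direction: start a new chain
        run = run + [(timestamp, position)]
        runs[card] = run
        if len(run) == min_purchases:
            alerts.append(card)  # alert once, when the chain reaches the threshold
    return alerts
```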
37
Catch me if you can
[Diagram: route from the Main Terminal to the EXIT]
38
Catch me if you can
[Diagram: route from the Main Terminal to the EXIT, with purchases of $440, $300, $380 and $250 marked]
39
Toilet Cleanliness
• Every toilet facility has a customer satisfaction station
– Every 30 seconds, someone can indicate (1-4) their satisfaction
• All entries are collected – signals (toiletId, rating)
• When the rating < 3.3 (average over last 5 signals) – the cleaning staff has to act
• When the rating < 3 and the previous signal for that toilet is longer ago than 5 minutes, then action is also required
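Both rules can be expressed over a small amount of per-toilet state. The Python sketch below makes two assumptions the slide leaves open: the average is only evaluated once five signals are available, and "the rating < 3" in the second rule refers to the individual signal just received.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class ToiletMonitor:
    def __init__(self, window: int = 5, avg_threshold: float = 3.3,
                 low_rating: int = 3, quiet_seconds: float = 300.0) -> None:
        self.avg_threshold = avg_threshold
        self.low_rating = low_rating
        self.quiet_seconds = quiet_seconds
        # toiletId -> the last `window` ratings
        self.ratings: Dict[str, Deque[int]] = defaultdict(lambda: deque(maxlen=window))
        self.last_signal: Dict[str, float] = {}  # toiletId -> time of previous signal

    def on_signal(self, timestamp: float, toilet_id: str, rating: int) -> bool:
        """Return True when the cleaning staff has to act."""
        recent = self.ratings[toilet_id]
        recent.append(rating)
        previous = self.last_signal.get(toilet_id)
        self.last_signal[toilet_id] = timestamp

        # Rule 1: average over the last 5 signals below 3.3
        low_average = (len(recent) == recent.maxlen
                       and sum(recent) / len(recent) < self.avg_threshold)
        # Rule 2: a low rating after more than 5 minutes without signals
        low_after_quiet = (rating < self.low_rating
                           and previous is not None
                           and timestamp - previous > self.quiet_seconds)
        return low_average or low_after_quiet
```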
40
Human consumers
• Slow at data processing
• Not electronically connected
• Visually oriented (1 picture > 1000 words)
• Frequently (though perhaps decreasingly so) the actor or decision maker
• Interact along human communication channels
• Use visualization to present findings, conclusions, recommended actions
– And as a second tier of fast data processing:
Highlight (filter), aggregate, patterns, extrapolate/interpolate, missing elements
• Sometimes take over from humans and just take action
41
Audience Challenge
42
Audience Challenge – 1/2
43
Audience Challenge – 2/2
44
Visualize and Aggregate
45
Trends and Extrapolation
The OEP to Human interface
[Diagram labels: sources (JMS, HTTP, JMX, File, DB, RSS, EDN, …) => OEP => EDN and JMS => SOA Suite and BAM => @, Alerts (rules), Reports, HTTP => Consumers, Web Application]
47
Real Time – from Event to Task: OEP => SOA Suite 12c EDN
[Diagram: Fast Data => Oracle Event Processor (Smart Processing) => event => SOA Suite 12c EDN => event => BPEL, Task, BPMN, Mediator]
48
Real Time – from event to UI
[Diagram: Fast Data => Oracle Event Processor (Smart Processing) => event => WebLogic JMS => msg => WebSocket Server => msg]
49
Real Time – from event to UI: Business Activity Monitoring
[Diagram: Fast Data => Oracle Event Processor (Smart Processing) => event => WebLogic JMS => msg => BAM => msg]
50
Summary
• Fast Data (events): Vast, Continuous, Velocity, Variety
– Wanted: Near real time conclusions, recommendations, alerts, actions
• Strategy:
– Discard – as early as possible (Filter, Aggregate)
– Enrich, Pattern Match, Missing Events, Retain, Publish higher-level, more coarse-grained business events
– Repeat this cycle multiple times (such as rally point, game, set, match)
• Technology for Fast Data processing: Oracle Event Processor & CQL
– Interacts with JMS, EDN, RMI, HTTP (/REST), JMX, Database, Coherence
• To assist humans in Fast Data and Information Processing: Visualization
– Filter, Aggregate, Enrich, Pattern Match (1 picture > 1000 words)
– Technology: BAM (Dashboard and Rule processing), ADF Data Visualization
– Also: turn findings into actions using Human Task, BPEL and BPM via the SOA Suite 12c Event Delivery Network (EDN)