Rules Mining Custom Analytics

  • 8/3/2019 Rules Mining Custom Analytics

    1/20

    InfoSphere Streams

    Name: Jaskiran Bhatia

    Title: Country Manager, Information Management

    Rules! Mining! Custom Analytics!

    ...in microseconds

    ... & these opportunities are everywhere

    Stock market

    Impact of weather on securities prices

    Analyze market data at ultra-low latencies

    Transportation

    Intelligent traffic Management

    Health & Life Sciences

    Neonatal ICU monitoring

    Epidemic early warning system

    Remote healthcare monitoring

    ... & these opportunities are everywhere

    e-Science

    Space weather prediction

    Detection of transient events

    Synchrotron atomic research

    Telephony

    CDR processing

    Social analysis

    Churn prediction

    Geomapping

    Law Enforcement, Defense & Cyber Security

    Real-time multimodal surveillance

    Situational awareness

    Cyber security detection

    Traffic Control System in City of Stockholm

    Data sources

    GPS from 1000s of taxis

    Loop sensors

    Speed of traffic flow

    Density of traffic (cars per second)

    CCTV video inside tunnels

    Real Time Weather data

    Output

    Travel time forecasts

    Via SMS

    Now, in 30 minutes, 1 hour, 2 hours, etc.

    Integrate with existing system

    Traffic Management for Sustainability and Efficiency

    Multimodal Data Streams

    GPS

    Cell-phones (location tracking)

    Public Transport (bus, docking)

    Pollution measurements

    Weather Conditions (including road conditions)

    Optical traffic flow detectors

    Travel time data based on plate recognition

    Induction loop detector data

    Accidents in network as they are being recorded

    Road closures (road work, etc.)

    Still pictures from road cameras

    Real Time Traffic Monitoring & Information

    (Multimodal) Travel Planner

    GPS Data Streams

    Real Time Transformation Logic

    Real Time Geo Mapping

    Real Time Speed & Heading Estimation

    Real Time Aggregates & Statistics

    Data Warehouse

    Web Server

    Google Earth

    Offline statistical analysis

    Interactive visualization

    Storage adapters

    Only 4 x86 blade servers to process 250,000 GPS probes per second, maps of 630,000 line segments

    Matching map artifact

    Estimated path

    GPS probe

    Estimated speed & heading

    Real Time Geo Mapping & Speed Estimation
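A minimal sketch of how speed and heading might be estimated from two consecutive GPS probes of the same vehicle. The slides do not describe the actual method used in the Stockholm system; the haversine distance and initial-bearing formulas below are a standard choice swapped in for illustration, and the function name, argument layout and sample coordinates are hypothetical. Map matching (snapping a probe to one of the map's line segments) is not shown.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def speed_and_heading(lat1, lon1, t1, lat2, lon2, t2):
    """Estimate speed (m/s) and heading (degrees clockwise from north)
    between two GPS probes given as (latitude, longitude, time in seconds).

    Illustrative only: assumes both probes come from the same vehicle
    and that t2 is later than t1.
    """
    dt = t2 - t1
    if dt <= 0:
        raise ValueError("second probe must be later than the first")

    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)

    # Haversine great-circle distance between the two probes.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    distance_m = 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))

    # Initial bearing from the first probe towards the second.
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    heading_deg = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0

    return distance_m / dt, heading_deg


# Hypothetical probes ten seconds apart, moving due north.
speed, heading = speed_and_heading(59.3293, 18.0686, 0.0,
                                   59.3302, 18.0686, 10.0)
print(f"speed={speed:.1f} m/s, heading={heading:.0f} deg")
```

In a streaming setting the same calculation would run per vehicle over a sliding pair of probes, which is why it parallelizes well across servers.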

    Recommendations Based on Hurricane Forecast

    Web Zero platform

    Capture weather sensor data, analyse hurricane predicted path

    Estimate impact on portfolios

    Compute portfolio market indicators (low latency)

    Make recommendations and notify

    Capture market data (high volume)

    System S platform

    DHTML result rendering

    Real-time projections of hurricane path

    Dynamically updated risk assessment for assets in projected path

    Correlate combined risk and trade VWAP to determine buy/sell recommendations

    World's fastest options trading prototype

    Identify and execute trades

    Process over 5M events per second with average latency of 150 microseconds

    Expand to incorporate content feeds, news text, audio, video, to establish greater context for better decisions

    "TD Bank Financial Group worked with IBM Research to develop a first-of-a-kind architecture capable of consuming, analyzing and acting on real-time market data while maintaining sub-millisecond response times even under extreme data loads" - CIO, TD Bank

    Equities Trading Starter Application

    Modular design

    Components are plug-replaceable: extend these or substitute your own

    Demonstrates how trading strategies may be swapped out at runtime, without stopping the rest of the application

    TradingStrategy module looks for opportunities that have specific quality values and trends

    OpportunityFinder module looks for opportunities and computes quality metrics

    SimpleVWAPCalculator module computes a running volume-weighted average price metric
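To make the running volume-weighted average price (VWAP) concrete, here is a minimal Python sketch. It is not the SimpleVWAPCalculator module from the starter application, whose source the slides do not show; the class name `RunningVWAP` and its methods are hypothetical. It assumes the common definition VWAP = sum(price x volume) / sum(volume), updated once per trade.

```python
class RunningVWAP:
    """Running volume-weighted average price over all trades seen so far.

    Hypothetical illustration; only two accumulators are kept, so each
    update is O(1) regardless of how many trades have arrived.
    """

    def __init__(self):
        self.cum_price_volume = 0.0  # running sum of price * volume
        self.cum_volume = 0.0        # running sum of volume

    def update(self, price, volume):
        """Fold one trade into the running totals and return the new VWAP."""
        if volume < 0:
            raise ValueError("volume must be non-negative")
        self.cum_price_volume += price * volume
        self.cum_volume += volume
        return self.value

    @property
    def value(self):
        """Current VWAP, or None before any volume has traded."""
        if self.cum_volume == 0:
            return None
        return self.cum_price_volume / self.cum_volume


vwap = RunningVWAP()
vwap.update(10.0, 100)         # first trade: VWAP is simply its price
print(vwap.update(12.0, 300))  # (10*100 + 12*300) / 400 = 11.5
```

A production version would more likely compute VWAP per ticker symbol over a sliding time window rather than over the whole history.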

    Predictive analytics using InfoSphere Streams in a neonatal ICU helps detect life-threatening conditions up to 24 hours earlier

    Real-time analytics and correlations on physiological data streams: blood pressure, temperature, EKG, blood oxygen saturation, etc.

    Early detection of the onset of potentially life-threatening conditions, up to 24 hours earlier than current medical practices

    Early intervention leads to lower patient morbidity and better long-term outcomes

    Technology also enables physicians to verify new clinical hypotheses

    http://www.uoit.ca/EN/index.html

    Law Enforcement and Security: Federal Government

    Streams of information including video surveillance, wire taps, communications, call records, etc.

    Millions of streams per second with low density of critical data

    Identify patterns and relationships among vast information sources

    "The US Government has been working with IBM Research since 2003 on a radical new approach to data analysis that enables high-speed, scalable and complex analytics of heterogeneous data streams in motion. The project has been so successful that US Government will deploy additional installations to enable other agencies to achieve greater success in various future projects" - US Government

    SPSS Modeler to Build Model, Streams to Detect Quickly

    Characterization of Motive

    Build rulesets (profiles) of various cause categories

    Utilizing crime scene information such as ...

    Crime reports when entered

    REAL-TIME ANALYTIC PROCESSING

    Data stored for future auditing and evidence requirements

    Data from 911 calls, satellite feeds, imagery from city traffic cameras

    Streams defines the geospatial location of the call by running powerful analytics in real time using satellite communication link and draws in city camera feeds from around the area

    Real-time support for 911 dispatcher and field personnel

    Government and Law Enforcement: e911 Support

    Sharpe Engineering and US Navy Phase 1 SBIR Research

    Navy Sensors + Advanced Analytics + InfoSphere Streams = Sharpe Engineering Command & Control

    Identify/build sample problem

    Preliminary sizing

    Est. development, deployment and operation costs

    Possible use cases

    Maritime commerce & anti-piracy

    Unmanned surface vehicles

    Disaster relief

    Cyber security

    http://www.sharpe.com/

    http://www.onr.navy.mil/en/Science-Technology/Departments/Code-31/All-Programs/311-Mathematics-Computers-Research/Command-Control.aspx

    Temporal Anomalies, Event / Destination Correlations, Partial periodicity

    From Data Analytics To Smarter Cyber Security

    IDS

    Human annotations

    InfoSphere Warehouse

    Firewall

    IPS/ADS

    Sensors

    DNS

    ID & NAC

    App/DB

    Live Data

    Logs

    Unsupervised Supervised

    Channel Profiles

    Botnet Models

    ADS Models

    Statistical Models

    Security Events

    Event Normalization

    Historical data summaries & evidence

    Data Repository: Log aggregation / normalization

    Temporal Analysis

    Security Analytics, e.g., Botnet Analytics

    Entity Analytics (GNR, ...)

    Self-tuning

    Feedback

    Manual Tuning

    True Positives/False Positives

    Forecasting Space Weather at LOFAR Outrigger in Scandinavia (LOIS)

    Triaxial Antenna + InfoSphere Streams + Swedish Institute of Space Physics = Space Weather prediction regarding impact on satellites and electric grids

    Solar Flares

    Radio signal input and data preparation

    Signal detection and noise filtering

    Strength and 3D directional analysis

    Telco Moving to Agile, Real-time Processes & Analytics

    Information Management Evolution

    2006, 2007/8, 2010

    Business Value

    Corporate Visibility

    Data Infrastructure Optimization

    Marketing Campaign & Service Analytics

    Large Indian wireless telco. 100+ million customers. >10% annual growth. Expanding operations abroad and growing to provide real-time services and 3G capabilities to customers.

    Reduce Complexity Manage Risk Reduce Cost

    Reduce data latency from 6-12 hours to seconds

    Improvement in data processing throughput

    Implemented fault-tolerant and flexible solution

    Consolidated existing integration systems by 50%

    Streamlined development & maintenance of data services

    Single, real-time data feed for Fraud, BI & Revenue Assurance systems

    Enterprise Data Warehouse, Data Marts & Reporting

    Cross-sell/Up-sell, Reduced Activation Time

    Operational Efficiency

    Customer Analytics and Business Process Management

    Real-time CDR processing

    Streams provides tight integration with existing Information / Analytics Infrastructure for Call Detail Record Processing

    Cognos

    Spreadsheets

    Applications

    InfoServer

    Data Marts

    SOA Web Service

    Fin Planning

    Mashups

    InfoSphere Warehouse

    InfoSphere Streams

    DB2

    ERP, CRM and Other Data Sources

    Real Time User Analytics

    Analytic Models

    Pre-processed Data

    CDRs

    Real-Time Charging System for a Telco in Japan

    1. The Pain Point

    Real-time charging can increase revenue/profit

    But:

    20 million users

    Dramatic call and IP traffic growth

    SMS grows at 30% annually

    Even hourly summaries were hard to do

    2. The Solution

    InfoSphere Streams with solidDB processes:

    55K CDRs per second on 1 octo-core node: 10 million in 200 seconds

    160K CDRs per second on 3 octo-core nodes: 10 million in 60 seconds

    Nearly linear growth

    Architectural pattern to prevent data loss

    Demonstrated high software productivity
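The "nearly linear growth" claim can be checked from the quoted throughput figures. The sketch below is plain arithmetic on the numbers in this slide; the function names are hypothetical. The computed batch times (about 182 s and 62.5 s) are close to the slide's 200 s and 60 s, which are presumably rounded.

```python
def seconds_to_process(records, rate_per_second):
    """Time in seconds to process a batch at a sustained rate."""
    return records / rate_per_second


def scaling_efficiency(base_rate, base_nodes, scaled_rate, scaled_nodes):
    """Per-node throughput after scaling, relative to the baseline.

    1.0 means perfectly linear scaling; lower values mean some
    throughput is lost as nodes are added.
    """
    return (scaled_rate / scaled_nodes) / (base_rate / base_nodes)


BATCH = 10_000_000  # CDRs, as quoted on the slide

one_node = seconds_to_process(BATCH, 55_000)      # about 182 s
three_nodes = seconds_to_process(BATCH, 160_000)  # 62.5 s
efficiency = scaling_efficiency(55_000, 1, 160_000, 3)

print(f"1 node:  {one_node:.1f} s")
print(f"3 nodes: {three_nodes:.1f} s")
print(f"scaling efficiency: {efficiency:.1%}")  # about 97%
```

Three nodes deliver roughly 2.9 times the single-node rate, which is consistent with the slide's description of nearly linear growth.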

    3. The Happy Ending

    Telco is extending the pilot; in production in 1H 2011

    Platform to create new real-time billing system