Analytics in Your Enterprise Dakshitha Ratnayake Lead Solutions Engineer



What is Analytics?

• Organizations have more data than ever at their disposal.

• Actually deriving meaningful insights from that data, and converting knowledge into action, is easier said than done.

• There’s no single technology that encompasses big data analytics.

• Several types of technology work together to help organizations get the most value from their information.

Big Data Analytics

Real-World Applications

Portfolio analysis and prediction of the impact of global events on financial markets.

Customer experience management and network capacity planning and optimization.

Music recommendations based on user data.

Predict what customers want to see before they know it themselves.

Identify songs and predict the popular artists and genres that will get attention in the coming years.

Monitor financial market activity and catch illegal insider trading.

Track patient vital signs using sensor data.

Reduce claims costs through better fraud detection.

Detect and prevent cyber-attacks and criminal activity.

Predict trends and lay down preparation plans to meet future demand.

Measure player efficiency and defensive effectiveness.

Source - http://upxacademy.com/2016/05/31/big-data-and-analytics-use-cases-in-8-industries/

WSO2 Analytics Platform

• A single platform to address all analytics styles

• We deliver:

  • Batch Analytics

  • Real-time Analytics

  • Interactive Analytics

  • Predictive Analytics

• WSO2 Analytics Platform uniquely combines the above styles to turn data from IoT, mobile and Web apps into actionable insights.

WSO2 Analytics Platform


“Publish once, process any way you like.”

Data Analysis

WSO2 Analytics Platform

• High-level, SQL-like query languages

• Client applications are agnostic of the analytics components

• Common set of receivers/publishers for all analytics types

• Common format for events

• Leverage leading open source projects (e.g. Storm and Spark) and contribute back (e.g. Siddhi).

Analytics Strategy

• Open Source

• Rich, extensible, SQL-like configuration language

• Rich set of data connectors, which can be easily extended

• Events only need to be published once from applications to the platform, and can then be consumed by the batch or the real-time pipeline.

• Part of the overall WSO2 platform

Key Differentiators

Data Collection and Publishing

Collecting Data

Collecting Data: Example

// Point the data agent at its configuration file
AgentHolder.setConfigPath(getDataAgentConfigPath());

// Initialize the data publisher
DataPublisher dataPublisher = new DataPublisher(url, username, password);

// Generate the stream ID for the stream to which the event will be published
String streamId = DataBridgeCommonsUtils.generateStreamId(HTTPD_LOG_STREAM, VERSION);

// Create the event (stream ID, timestamp, meta data, correlation data, payload data) and publish it
Event event = new Event(streamId, System.currentTimeMillis(), new Object[]{"external"}, null, new Object[]{aLog});
dataPublisher.publish(event);

As a prerequisite, the streams must be defined in the receiver server (WSO2 DAS/CEP).

• Events are the lifeline of WSO2 CEP/DAS.

• WSO2 CEP and DAS not only process data as events, but also interact with external systems using events.

• An event is a unit of data.

• An event stream is a sequence of events of a particular type.

• The type of the events in a stream is defined by an event stream definition.

Events, Streams and Event Stream Definitions
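The relationship between an event, an event stream and an event stream definition can be sketched in plain Java. This is an illustration only: the `StreamDefinition` and `SimpleEvent` classes, the `temperature_stream` name and the "name:version" stream ID format are assumptions made for this sketch, not the WSO2 data bridge API.

```java
import java.util.Arrays;
import java.util.List;

public class EventStreamSketch {

    /** Hypothetical stream definition: a name, a version and the ordered attribute types. */
    public static final class StreamDefinition {
        public final String name;
        public final String version;
        public final List<Class<?>> attributeTypes;

        public StreamDefinition(String name, String version, List<Class<?>> attributeTypes) {
            this.name = name;
            this.version = version;
            this.attributeTypes = attributeTypes;
        }

        /** Stream ID in "name:version" form (format assumed for this sketch). */
        public String streamId() {
            return name + ":" + version;
        }

        /** An event conforms if it targets this stream and has one value of the right type per attribute. */
        public boolean accepts(SimpleEvent event) {
            if (!streamId().equals(event.streamId)
                    || event.payload.length != attributeTypes.size()) {
                return false;
            }
            for (int i = 0; i < event.payload.length; i++) {
                if (!attributeTypes.get(i).isInstance(event.payload[i])) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Hypothetical event: one unit of data that belongs to a stream. */
    public static final class SimpleEvent {
        public final String streamId;
        public final long timestamp;
        public final Object[] payload;

        public SimpleEvent(String streamId, long timestamp, Object[] payload) {
            this.streamId = streamId;
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    public static void main(String[] args) {
        StreamDefinition definition = new StreamDefinition(
                "temperature_stream", "1.0.0",
                Arrays.<Class<?>>asList(String.class, Double.class));

        SimpleEvent good = new SimpleEvent(definition.streamId(),
                System.currentTimeMillis(), new Object[]{"sensor-1", 21.5});
        SimpleEvent bad = new SimpleEvent(definition.streamId(),
                System.currentTimeMillis(), new Object[]{"sensor-1", "hot"});

        System.out.println("good event accepted: " + definition.accepts(good));
        System.out.println("bad event accepted: " + definition.accepts(bad));
    }
}
```

A stream is then simply the sequence of events that a given definition accepts, which is why the definition has to exist on the receiver before anything is published to it.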

Publishing Data


Data Analysis

Batch Analytics


Generating insight by processing large amounts of stored data

● KPI Statistics

○ Application Statistics Monitoring

○ Network / Service Statistics

○ Sensor Data Aggregation

● Solving Optimization Problems

○ Urban Planning

○ Revenue Distribution Analysis

Source: www.e-deal.com

• Batch analytics reads data from disk (or some other storage) and processes it record by record

• “MapReduce” is the most widely used technology for batch analytics

  - Apache Hadoop

  - Apache Spark (up to 30x faster and much more flexible)

• Analytics: min, max, average, correlation, histograms; the data may be joined or grouped in many ways

• Key Performance Indicators (KPIs), e.g. profit per square foot for retail

• Presented as a Dashboard
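As a rough illustration of the bullets above, the following plain-Java sketch reads a small set of stored records, groups them by store, and computes min/max/average statistics plus a profit-per-square-foot KPI. The `StoreRecord` class and the sample figures are hypothetical; a real deployment would run this kind of aggregation on Hadoop or Spark rather than in a single JVM.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class BatchKpiSketch {

    /** Hypothetical stored record: one profit entry for a store with a given floor area. */
    public static final class StoreRecord {
        public final String store;
        public final double profit;
        public final double squareFeet;

        public StoreRecord(String store, double profit, double squareFeet) {
            this.store = store;
            this.profit = profit;
            this.squareFeet = squareFeet;
        }
    }

    /** Groups the records by store and computes min, max, average and sum of profit. */
    public static Map<String, DoubleSummaryStatistics> profitStatsByStore(List<StoreRecord> records) {
        return records.stream().collect(Collectors.groupingBy(
                (StoreRecord r) -> r.store,
                TreeMap::new,
                Collectors.summarizingDouble((StoreRecord r) -> r.profit)));
    }

    /** KPI: total profit divided by floor area, per store. */
    public static Map<String, Double> profitPerSquareFoot(List<StoreRecord> records) {
        Map<String, Double> totals = new TreeMap<>();
        Map<String, Double> areas = new TreeMap<>();
        for (StoreRecord r : records) {              // process record by record
            totals.merge(r.store, r.profit, Double::sum);
            areas.put(r.store, r.squareFeet);
        }
        Map<String, Double> kpi = new TreeMap<>();
        for (Map.Entry<String, Double> e : totals.entrySet()) {
            kpi.put(e.getKey(), e.getValue() / areas.get(e.getKey()));
        }
        return kpi;
    }

    public static void main(String[] args) {
        List<StoreRecord> records = Arrays.asList(
                new StoreRecord("A", 100.0, 200.0),
                new StoreRecord("A", 300.0, 200.0),
                new StoreRecord("B", 150.0, 100.0));

        profitStatsByStore(records).forEach((store, stats) ->
                System.out.println(store + " min=" + stats.getMin()
                        + " max=" + stats.getMax() + " avg=" + stats.getAverage()));
        System.out.println("profit per square foot: " + profitPerSquareFoot(records));
    }
}
```

The resulting per-store maps are the kind of summarized output that would typically feed a dashboard.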

Batch Analytics

• Powered by Apache Spark

• Up to 30x higher performance than Hadoop

• Script-based analytics powered by Spark SQL

• Persist data in a database (RDBMS or non-RDBMS), process it using Spark queries, and persist the analyzed data in an RDBMS

WSO2 Data Analytics Server

Batch Analytics With DAS

WSO2 DAS In Action: API Statistics

DAS In Action: API Statistics

DAS In Action: HTTP Monitoring

Real-Time Analytics


Making sense of fast-moving data

● Sports

○ Real-time Analysis of Player Performance

○ Real-time Match Analysis

● Geo-Spatial

○ Traffic Monitoring and Alerting

○ Geo-fencing

● Finance

○ Stock Market Monitoring

● Anomaly Detection

○ Fraud Detection

○ Network Intrusion Detection

○ Server Health Monitoring

Source: www.promojam.com

• For some use cases, the value of insights degrades very quickly with time.

• We need technology that can produce outputs fast.

  • Static queries that need very fast output (alerts, real-time control)

  • Dynamic and interactive queries (data exploration)

Real-Time Analytics

• WSO2 CEP facilitates:

  • Real-time event detection

  • Correlation

  • Notifications/alerts, visualization tools

• Siddhi is a high-performance stream processing engine

• WSO2 CEP is configured using the Siddhi query language

• Suited for complex queries involving time windows, as well as pattern and sequence detection

• CEP queries can be changed dynamically at runtime using templates.
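To make the idea of a time-window query concrete, here is a minimal plain-Java sketch of a sliding time window that keeps a running average and flags an alert when it crosses a threshold. It is not Siddhi and does not use any WSO2 API; the `SlidingAverage` class, the one-second window and the threshold value are assumptions chosen for illustration. In WSO2 CEP the same logic would be declared as a Siddhi query instead of being hand-coded.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TimeWindowSketch {

    /** Keeps the events of the last windowMillis milliseconds and reports their average. */
    public static final class SlidingAverage {

        private static final class Sample {
            final long timestamp;
            final double value;

            Sample(long timestamp, double value) {
                this.timestamp = timestamp;
                this.value = value;
            }
        }

        private final long windowMillis;
        private final Deque<Sample> window = new ArrayDeque<>();
        private double sum;

        public SlidingAverage(long windowMillis) {
            if (windowMillis <= 0) {
                throw new IllegalArgumentException("windowMillis must be positive");
            }
            this.windowMillis = windowMillis;
        }

        /** Adds an event (timestamps must be non-decreasing) and returns the average over the window. */
        public double onEvent(long timestampMillis, double value) {
            window.addLast(new Sample(timestampMillis, value));
            sum += value;
            // Expire events that have fallen out of the time window.
            // The event just added is always inside the window, so the deque never empties.
            while (window.peekFirst().timestamp <= timestampMillis - windowMillis) {
                sum -= window.removeFirst().value;
            }
            return sum / window.size();
        }
    }

    /** Alert condition: the windowed average exceeds the threshold. */
    public static boolean isAlert(double average, double threshold) {
        return average > threshold;
    }

    public static void main(String[] args) {
        SlidingAverage cpuLoad = new SlidingAverage(1000);   // 1-second window
        long[] times = {0, 500, 1200};
        double[] values = {10, 20, 90};
        for (int i = 0; i < times.length; i++) {
            double avg = cpuLoad.onEvent(times[i], values[i]);
            System.out.println("t=" + times[i] + " avg=" + avg + " alert=" + isAlert(avg, 50));
        }
    }
}
```

Each event is processed as it arrives and the window state is updated incrementally, which is what lets this style of query produce output fast enough for alerting.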

WSO2 Complex Event Processor

Real-Time Analytics With WSO2 CEP

Real-time Analytics In Action


Interactive Analytics


Near Real-time Indexed Data Search

● Log Analysis

○ Application / System Logs

● Activity Monitoring

○ Tracking Message Flows

● Fraud Detection

○ Executing queries to look up related data in a detected fraud situation

● HL7 Data Exploration

○ ESB HL7 Transport Interfaced with DAS

Source: befoundonline.com

• The best way to explore data is by asking ad hoc questions

• Interactive analytics (search) lets you query the system and receive fast results (<10s)

• Shows data in context (e.g. by grouping events from the same transaction together)

• Built using Lucene-based indexes.
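The reason indexed search can answer ad hoc questions quickly is that lookups go through an inverted index instead of scanning every record. The sketch below shows that idea in plain Java; it does not use Lucene, and the `LogSearchSketch` class, its tokenization rule and the sample log lines are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class LogSearchSketch {

    private final List<String> lines = new ArrayList<>();
    // Inverted index: term -> IDs of the lines that contain it.
    private final Map<String, Set<Integer>> index = new HashMap<>();

    /** Stores a log line and indexes each lower-cased token in it. */
    public void add(String line) {
        int id = lines.size();
        lines.add(line);
        for (String token : line.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                index.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
    }

    /** Returns the lines that contain every term of the query, in insertion order. */
    public List<String> search(String query) {
        Set<Integer> hits = null;
        for (String term : query.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) {
                continue;
            }
            Set<Integer> postings = index.getOrDefault(term, Collections.<Integer>emptySet());
            if (hits == null) {
                hits = new TreeSet<>(postings);
            } else {
                hits.retainAll(postings);   // AND semantics across terms
            }
        }
        List<String> result = new ArrayList<>();
        if (hits != null) {
            for (int id : hits) {
                result.add(lines.get(id));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LogSearchSketch logs = new LogSearchSketch();
        logs.add("ERROR payment tx=42 timeout");
        logs.add("INFO payment tx=42 started");
        logs.add("ERROR login tx=7 denied");

        // Group everything that belongs to the same transaction.
        System.out.println(logs.search("tx 42"));
        System.out.println(logs.search("error payment"));
    }
}
```

Searching for a transaction ID returns all lines from that transaction together, which is one simple way of showing data in context.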

Interactive Analytics with WSO2 DAS

Interactive Analytics In Action: WSO2 DAS

Predictive Analytics


Analyze Existing Data to Predict Future Events

● Next Value Prediction

○ Sales Forecasts

○ Electricity Loads

● Classification

○ Product Categorization

○ Customer Segmentation

● Anomaly Detection

○ Fraud Detection

○ Preventive Maintenance

● Other

○ Handwriting recognition

• Machine learning

  • Takes in a lot of examples and builds a program that matches those examples.

• Specifically, that program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

• We call that program a “model”.

• Many machine learning tools are available:

  • R (statistical language)

  • scikit-learn (Python)

  • Apache Spark’s MLlib and Apache Mahout (Java)
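A very small example of "taking in examples and building a model" is fitting a straight line by ordinary least squares and using it for next value prediction. The sketch below is plain Java, not Spark MLlib or WSO2 Machine Learner; the `LinearModel` class and the sample sales figures are hypothetical.

```java
public class NextValueSketch {

    /** A fitted "model": the line y = slope * x + intercept. */
    public static final class LinearModel {
        public final double slope;
        public final double intercept;

        public LinearModel(double slope, double intercept) {
            this.slope = slope;
            this.intercept = intercept;
        }

        public double predict(double x) {
            return slope * x + intercept;
        }
    }

    /** Learns a line from example (x, y) pairs using ordinary least squares. */
    public static LinearModel fit(double[] xs, double[] ys) {
        if (xs.length != ys.length || xs.length < 2) {
            throw new IllegalArgumentException("need at least two (x, y) examples");
        }
        int n = xs.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += xs[i];
            meanY += ys[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0;   // sum of (x - meanX) * (y - meanY)
        double sxx = 0;   // sum of (x - meanX)^2
        for (int i = 0; i < n; i++) {
            sxy += (xs[i] - meanX) * (ys[i] - meanY);
            sxx += (xs[i] - meanX) * (xs[i] - meanX);
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("all x values are identical");
        }
        double slope = sxy / sxx;
        return new LinearModel(slope, meanY - slope * meanX);
    }

    public static void main(String[] args) {
        // Hypothetical monthly sales for months 1 to 4.
        double[] months = {1, 2, 3, 4};
        double[] sales = {10, 12, 14, 16};

        LinearModel model = fit(months, sales);
        System.out.println("forecast for month 5: " + model.predict(5));
    }
}
```

In the terms used above, the examples are the experience E, forecasting is the task T, and the squared error that least squares minimizes plays the role of the performance measure P.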

Predictive Analytics

• Powered by Apache Spark MLlib

• Analyze data using machine learning algorithms

• Build machine learning models

• Compare and manage generated machine learning models

• Predict using the built models

Predictive Analytics with WSO2 Machine Learner

Predictive Analytics With WSO2 ML

Predictive Analytics In Action: WSO2 ML

Home-Grown Solutions

WSO2 Solutions Based on the Analytics Platform

● WSO2 Fraud Detection Solution

○ Built for detecting credit card fraud

○ The rules are extensible with customized Siddhi execution plans for any type of fraud detection

○ Currently uses Real-time and Interactive Analytics features

● WSO2 Log Analytics Solution

○ Distributed indexing and searching of any type of logs stored in the system

○ Notifications support with Real-time event processing features

○ Application / Server health prediction with Machine Learning

○ Uses Interactive + Real-time Analytics + Machine Learning features

Source: www.retrospective.centeractive.com

Source: multichannelmerchant.com

Deployment

Minimum HA Deployment for DAS

2-node deployment

Use an RDBMS to store data

If you need to scale higher, use HBase/Cassandra

Minimum HA Deployment for CEP

Minimum 2 nodes

Maximum throughput equals the throughput of 1 node

Minimum HA Deployment for ML

Minimum 1 node

Questions?