Making Big Data Analytics with Hadoop fast & easy (webinar slides)

DESCRIPTION

Looking to analyze your Big Data assets to unlock real business benefits today, but sick of all the theories, hype and hoopla? View these slides from Actian and Yellowfin’s "Big Data Analytics with Hadoop" webinar to discover how we’re making Big Data Analytics fast and easy. Hold on as we go from data in Hadoop to dashboard in just 40 minutes. Learn how to combine Hadoop with the most advanced Big Data technologies, and the world’s easiest BI solution, to quickly generate real business value from Big Data Analytics. Watch as we use live call detail record (CDR) data stored in Hadoop – quickly connecting, preparing, optimizing and analyzing this data in a tangible real-world use case from the telecommunications industry – to easily deliver actionable insights to anyone, anywhere, anytime.

To learn more about Yellowfin, and to try its intuitive Business Intelligence platform today, go here: http://www.yellowfinbi.com

To learn more about Actian, and its next generation suite of Big Data technologies, go here: http://www.actian.com/

Making Big Data Analytics Fast and Easy

Using Actian, Yellowfin and Hadoop

December 16, 2013

John Ryan, Marketing Manager APAC, Actian Corporation

Ryan Templeton, Snr Solutions Architect, Actian Corporation

Ivan Seow, Snr Technical Consultant, Yellowfin

Take Action on Big Data

•  Fastest Data Prep Engine
•  Fastest Hadoop Loader
•  Fastest Single Node Database
•  Fastest MPP Database
•  Huge library of Analytical Functions

Making BI Easy

•  Ranked #1 BI Vendor: Dresner Global BI Study 2012 & 13
•  #1 Dashboard Vendor: BARC BI Survey 12
•  #1 Enterprise Reporting Vendor: BARC BI Survey 13
•  Gartner: ‘Vendor to Consider’

Today’s Agenda

1.  Big Data Analytics with Hadoop
2.  Making Analytics in Hadoop Fast & Easy
3.  Customer Example (Telecom)
4.  Demo: From Data to Dashboard
    •  Making Hadoop Fast and Easy
    •  Making BI Fast and Easy
5.  Summary

Confidential © 2012 Actian Corporation

Big Data Analytics With Hadoop

Expect to have HDFS in production: 73%

Big Data Source for Analytics Most Likely to Benefit from Hadoop: 71%

Both figures are based on 263 respondents (TDWI Best Practices Report, Q2 2013).

Why is analytics inside Hadoop so hard and slow?

•  HDFS is a file system, not a database
•  Queries not standard SQL, only resemble SQL
•  Need a Data Scientist
•  MapReduce inefficient for analytic queries

Making Big Data with Hadoop Fast and Easy With Actian and Yellowfin

Actian Big Data Analytic Platform

Accelerating Big Data 2.0

Connect, Prepare, Optimize, Analyze

[Platform diagram. Labels: DATA VALUE, Enterprise, Business Intelligence, Applications, DW, Big Data Storage]

Advanced technology platform:

Industry leading:
•  Scale
•  Performance
•  Complexity
•  Cost (price/performance)
•  Time to Value

Multiple deployment options:
•  On-premise
•  Cloud
•  Hybrid
•  Embedded

Industry Leading Performance

Process Hadoop Data Faster

[Chart: Dataflow vs. Pig (MapReduce), DBT-3 @ 1TB, run times]

Analyze Data Faster

[Chart: database benchmarks, TPC-H QphH @ 1TB (non-clustered)]

Today’s demonstration

•  Connect Hadoop
•  Transform Data
•  Parallel Load
•  Fast Database Queries
•  Fast Analysis

Components: Actian Dataflow, Actian Vector, Yellowfin BI (BI visualization layer)

Telecom Example: Storing CDR Log Files inside Hadoop

Customer Use Case

•  Tier two telecom provider
•  Planning for large growth with minimal staff impact
•  Business demands deeper insights

IT Challenges

•  Collect, manage, process CDR data in Hadoop
•  Users are domain experts, not data scientists
•  Swamped with data: network switch dumps 200 MB/min during peak times. Hundreds of thousands of records per drop. 170 columns.
•  Too hard to analyze: raw data must first be distilled and enriched to gain insight

What the business was asking for

•  Fastest time to decision: speed up processing by an order of magnitude
•  Increased granularity of analysis: without increasing processing times or bogging down the backend
•  Proactive analysis, not reactive: enable trend analysis and predictive capabilities
•  Answer real business questions: e.g. visual insight for near real-time customer and vendor performance, determine routing performance optimization, etc.
•  Scale for future growth: extensible for future capabilities and scalable growth

Specific Business Questions - CDR Analysis

•  Answer Service Rate (ASR & Adjusted ASR)
    •  Calls completed vs. route attempts (vendor performance)
    •  Calls completed vs. call attempts (customer satisfaction)
•  Opportunity Monitor
    •  Calculate profit/loss per call due to routing path chosen
•  Post Dial Delay (PDD)
    •  Annoying delay until path through network selected
•  Analysis of near real time quality measures
    •  Call duration, jitter and packet loss
•  Trends and correlations of above metrics
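As a rough illustration of the ASR and PDD questions above, the following Python sketch computes the two completion ratios and an average post dial delay from a handful of made-up CDR rows. The field names (`call_id`, `vendor`, `answered`, `pdd_ms`) and all sample values are assumptions for illustration only; the slides do not show the actual CDR layout or define Adjusted ASR, so neither is attempted here. Averaging PDD over answered attempts only is likewise just one possible choice.

```python
from collections import defaultdict

# Hypothetical, simplified CDR rows: one row per route attempt.
# Several attempts (via different vendors) may belong to the same call.
ATTEMPTS = [
    {"call_id": 1, "vendor": "A", "answered": False, "pdd_ms": 4200},
    {"call_id": 1, "vendor": "B", "answered": True,  "pdd_ms": 1800},
    {"call_id": 2, "vendor": "A", "answered": True,  "pdd_ms": 1500},
    {"call_id": 3, "vendor": "A", "answered": False, "pdd_ms": 5000},
    {"call_id": 3, "vendor": "B", "answered": False, "pdd_ms": 4700},
]

def asr_by_vendor(attempts):
    """Calls completed vs. route attempts, per vendor (vendor performance)."""
    completed = defaultdict(int)
    total = defaultdict(int)
    for row in attempts:
        total[row["vendor"]] += 1
        completed[row["vendor"]] += row["answered"]
    return {vendor: completed[vendor] / total[vendor] for vendor in total}

def asr_by_call(attempts):
    """Calls completed vs. call attempts; each call counts once."""
    answered_by_call = defaultdict(bool)
    for row in attempts:
        answered_by_call[row["call_id"]] |= row["answered"]
    return sum(answered_by_call.values()) / len(answered_by_call)

def mean_pdd_ms(attempts):
    """Average post dial delay over answered attempts only (an assumption)."""
    delays = [row["pdd_ms"] for row in attempts if row["answered"]]
    return sum(delays) / len(delays)
```

With this sample data, vendor A completes one of three route attempts and vendor B one of two, while two of the three calls are eventually answered; the gap between the per-vendor and per-call ratios is what separates the vendor-performance view from the customer-satisfaction view.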

CDR Workflow Overview

Stages: CONNECT, TRANSFORM, PARALLEL DATA LOAD

Workflow callouts:
•  Filter data
•  Logical functions
•  Split flow for separate processing rules
•  Meta-node encapsulates processing
•  Extract failed routing attempts
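The filter and split steps named in this workflow can be sketched in plain Python. This is not the Actian Dataflow API; the helper names (`filter_rows`, `split_flow`), the `status` and `duration_s` fields, and the sample records are all hypothetical, chosen only to show how one flow becomes two so that failed routing attempts can follow their own processing rules.

```python
def filter_rows(rows, predicate):
    """Keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def split_flow(rows, predicate):
    """Split one flow into two: rows matching the predicate, and the rest."""
    matched, rest = [], []
    for row in rows:
        (matched if predicate(row) else rest).append(row)
    return matched, rest

# Hypothetical records; "status" and "duration_s" are illustrative field names.
rows = [
    {"call_id": 1, "status": "ANSWERED", "duration_s": 62},
    {"call_id": 2, "status": "ROUTE_FAILED", "duration_s": 0},
    {"call_id": 3, "status": "ANSWERED", "duration_s": 5},
    {"call_id": 4, "status": None, "duration_s": 0},  # malformed row
]

# Filter data: drop rows that cannot be interpreted.
valid = filter_rows(rows, lambda row: row["status"] is not None)

# Split flow: failed routing attempts go one way, everything else another.
failed, completed = split_flow(valid, lambda row: row["status"] == "ROUTE_FAILED")
```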

Data processing – Execution Plan

Compiled to a set of physical graphs

Phase 1 (four parallel pipelines): Reader → FilterRows → DeriveFields → Group(partial)

Phase 2 (four parallel pipelines): Repartition → Group(final) → Writer
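The two-phase shape of this plan (partial grouping on each input partition, a repartition by key, then a final grouping) can be sketched in plain Python. This is a minimal sketch, not Actian Dataflow code: the function names, the count-per-area-code aggregation, and the sample data are assumptions made for illustration.

```python
from collections import Counter

def phase1(partition, keep, derive_key):
    """Reader -> FilterRows -> DeriveFields -> Group(partial) on one partition."""
    partial = Counter()
    for row in partition:
        if keep(row):
            partial[derive_key(row)] += 1
    return partial

def repartition(partials, n):
    """Route every partial count for a given key to the same target partition."""
    targets = [[] for _ in range(n)]
    for partial in partials:
        for key, count in partial.items():
            targets[hash(key) % n].append((key, count))
    return targets

def phase2(target):
    """Group(final): merge the partial counts that arrived for each key."""
    final = Counter()
    for key, count in target:
        final[key] += count
    return final

# Hypothetical input: (area_code, answered) pairs spread over 4 partitions.
partitions = [
    [("212", True), ("415", False), ("212", True)],
    [("415", True), ("212", False)],
    [("646", True), ("415", True)],
    [("212", True)],
]

partials = [
    phase1(p, keep=lambda row: row[1], derive_key=lambda row: row[0])
    for p in partitions
]
results = [phase2(t) for t in repartition(partials, n=4)]  # one result per writer
merged = sum(results, Counter())
```

Grouping partially before the repartition means only one small record per key and partition crosses between phases, rather than every raw row; because each key hashes to exactly one target, the final groups can be written independently.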

Demo: Making Big Data Analytics Fast and Easy

Customer Take Aways – Actionable Insights

FAST: processing streaming CDR data in seconds

Customer Take Aways – Analysis

Deeper Analysis: visibility at the Area Code and Exchange level

Customer Take Aways – Cost Savings

20,000 updates made to routing tables during first week of collecting data

Customer Take Aways – Scalability

8.9 Billion rows of data collected during first 6 months

Solution Architecture

[Architecture diagram. Components and labels:]

•  End Users: Desktop & Mobile Devices
•  Yellowfin BI: Dashboard, Ad Hoc, Statistics, Data Mining, Analytics
•  Hadoop: Collection
•  Paraccel Dataflow: Extraction, Cleansing, Enrichment, Aggregation, Analysis, Mining
•  Vectorwise: very fast reporting database
•  Clustered Execution, Parallel Loading
•  OSS/BSS
•  Data Retention

Summary – Take Action on Big Data

Accelerating Big Data 2.0

Connect, Prepare, Optimize, Analyze

[Platform diagram. Labels: DATA VALUE, Enterprise, Business Intelligence, Applications, DW, Big Data Storage]

Advanced technology platform:

Industry leading:
•  Scale
•  Performance
•  Complexity
•  Cost (price/performance)
•  Time to Value

Multiple deployment options:
•  On-premise
•  Cloud
•  Hybrid
•  Embedded

Questions

Ivan Seow: Ivan.Seow@Yellowfin.BI

John Ryan: John.Ryan@actian.com

Ryan Templeton: Ryan.Templeton@actian.com

Actian: www.actian.com

Yellowfin: www.Yellowfin.bi