How Verizon is Solving Big Data Problems with Interactive BI IBRAHIM ITANI Leader of Big Data Architecture and Technology Verizon SANJAY KUMAR General Manager, Telecom Hortonworks SANCHA NORRIS Director, Product Marketing Kyvos Insights

Kyvos Verizon HDP webinar Finalv2


How Verizon is Solving Big Data Problems with Interactive BI

IBRAHIM ITANI

Leader of Big Data Architecture and Technology, Verizon

SANJAY KUMAR

General Manager, Telecom, Hortonworks

SANCHA NORRIS

Director, Product Marketing, Kyvos Insights

Webinar Housekeeping

• All attendees on mute

• View in full screen

• Send in questions via chat (Q&A) box

• Copy of webinar recording will be available

Ibrahim Itani

Executive Leader of Big Data Architecture and Technology

Sanjay Kumar

General Manager, Telecom

Sancha Norris

Director, Product Marketing

Speakers

Hortonworks

5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Service Providers Areas of Focus

Customer Experience & Marketing
- Enhance End-to-end Experience of Customer
- Become Trusted Partner to Customer
- Awareness of customer’s needs when and where needed

New Business & Consumer Services
- New Digital & Infrastructure Services
- Data Monetization
- M2M, IoT, Analytics-as-a-service

Network Optimization
- Move to Software Driven Networks
- Leverage Network Data Assets
- Self optimizing and provisioning

[Diagram labels: Customer, Network, Service]


Tomorrow: A Data-Centric Model

Old Model: App-Centric

Limitations:
• Multiple copies of data
• Difficult cross-system integration
• Upper limit on data volumes before harming performance

New Model: Data-Centric

Advantages:
• One version of the data
• No need for cross-app integration
• System scales linearly

[Diagram: APP-CENTRIC (App 1 through App 6 side by side) versus DATA-CENTRIC (App 1 through App 6 around a Central Data Lake)]

App-centric will break down at 10x, 100x, 1000x… Need to shift to data-centric


Blind Spots Block Your Ability to Use All the Data

[Diagram labels: GROUP 1, GROUP 2, GROUP 3, GROUP 4, Internet of Anything]

Fragmented data-at-rest increases the cost of insight

Data-in-motion streams through your blind spots


The Future of Data: Actionable Intelligence

[Diagram labels: Data in Motion, Data at Rest, Storage, GROUP 1, GROUP 2, GROUP 3, GROUP 4, Internet of Anything]

Combining Data-in-Motion and Data-at-Rest powers Actionable Intelligence


The Target: Data-Centric Operations

Data sources:
• Clickstream
• Web & Social
• Geolocation
• Event Records
• Call Logs
• Files, emails
• Server Logs
• Existing OSS/BSS/DWH (App 1 through App 6)

Ingestion:
• Streaming: Network Probes, Click Stream, Sensor, Location
• Batch: Call Detail Records
• On-Line: Customer Sentiment
• Unstructured: Text, Pictures, Video, Voice2Text

Platform:
• Streaming Ingestion (NiFi)
• Central Data Lake (Hadoop)
• YARN

DELIVERY:
• IoT Solutions
• Cyber Security
• Network Planning
• Optimization
• Contextual Marketing
• 360 Customer Awareness
• Personal Data Analysis & Customer Insight Services to Customers & Partners


Communications Industry Use Cases – Data At Rest

Categories: Customer Experience & Marketing; Network Optimization & Field Services

360 Customer Household View
View of all family members across omni-channel, with demographics inferred from social media, around customer experience
Benefit: 60% improvement in customer experience NPS

Customer Churn Reduction
Track dissatisfaction in near-real time to see what customers encounter on the web site that drives them to call contact centers in frustration, and initiate change to resolve it
Benefit: 35% reduction in customer churn

Field: Predictive Maintenance (Truck)
Predict, to the day, when batteries & other truck equipment will need replacement, before they fail
Benefit: Tens of millions in cost avoidance on fleet services

Target Advertising
Look-alike audience profile models across 5000+ parameters and attributes for precision-targeted advertising
Benefit: Double advertising revenue in 1st year

Customer Next Best Action
Develop detailed customer churn models through sentiment from social media to drive next best action at retail stores & across channels
Benefit: Improved customer retail store & omni-channel experience

Network Optimization
Develop network over-subscription models based on complete network event details for a 13-month period for an accurate view of demand
Benefit: Tens/hundreds of millions in cost avoidance on network build

Network Decommissioning
Ingestion of network events correlated with customer demand to manage circuit & equipment decommissioning without customer impact
Benefit: Up to 20% reduction in network operating costs


Communications Industry Use Cases – Data In Motion

Categories: Customer Experience & Marketing; Network Optimization & Field Services

Customer Location & Context Based Service
Triangulate customer location with customer-specific context for targeted promotions, advertising & customer experience management
Benefit: 10x greater chance of customer offer acceptance

Customer Sentiment
Track social media along with all other customer touch points for a complete view of customer sentiment and a computed net promoter score
Benefit: Improve customer experience and reduce churn as it happens

Network Yield Optimization
Prioritize data so that lower-priority traffic is delivered on available network capacity at no additional network cost
Benefit: Ability to provide low-cost IoT connectivity

Customer Next Best Action
Drive next best action by streaming data into analytics systems like Apache Storm or Apache Spark Streaming for event-triggered actions
Benefit: Enhance customer experience through context-driven actions

Device Analytics
Edge device analytics through MiNiFi agents prioritizing and delivering data groups with device-level event correlation
Benefit: Analytics at the source device

Set-top Box Ingest for Customer Experience
Streaming billions of events directly from set-top boxes to understand customer behavior & network performance
• Customer Behavior: Subscriber experience, navigation paths, clicking patterns and viewing habits
• Network Performance: Network traffic and congestion in real time for enhancing the viewing experience of customers
Benefit: New RT models for content programming & advertising
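The "event-triggered actions" idea in the Customer Next Best Action use case can be illustrated with a small, framework-free sketch. This is not Storm or Spark Streaming code and not Verizon's implementation; the event shape `(timestamp, customer_id, kind)`, the `"error"` event kind, the threshold, and the action name are all hypothetical choices made for illustration. It assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque

def next_best_actions(events, threshold=3, window_seconds=600):
    """Yield (customer_id, action) when a customer produces `threshold`
    error events within a sliding window of `window_seconds`.

    `events` is an iterable of (timestamp, customer_id, kind) tuples,
    assumed to be ordered by timestamp (hypothetical event shape).
    """
    recent = defaultdict(deque)  # customer_id -> timestamps of recent errors
    for ts, customer_id, kind in events:
        if kind != "error":
            continue  # only error events count toward the trigger
        window = recent[customer_id]
        window.append(ts)
        # Drop errors that have fallen out of the sliding window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            yield customer_id, "offer_proactive_support"
            window.clear()  # avoid re-triggering on the same burst

stream = [
    (0, "a", "error"),
    (10, "a", "page_view"),
    (20, "a", "error"),
    (30, "b", "error"),
    (40, "a", "error"),
]
triggered = list(next_best_actions(stream))
```

In a real deployment the same per-customer windowing logic would typically live inside a streaming engine's keyed, windowed operator rather than in a Python generator.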


Modern Data Apps: Custom or Off the Shelf

• Real-Time Cyber Security protects systems with superior threat detection
• Smart Manufacturing dramatically improves yields by managing more variables in greater detail
• Connected, Autonomous Cars drive themselves and improve road safety
• Future Farming optimizes soil, seeds and equipment to measured conditions on each square foot
• Automatic Recommendation Engines match products to preferences in milliseconds

[Diagram labels: Data at Rest, Data in Motion, Actionable Intelligence, Modern Data Applications, Hortonworks DataFlow, Hortonworks Data Platform]

Verizon

Delivering the promise of the digital world.

• Annual Revenue: $131.6B in 2015 revenues
• Dividends: $8.5B paid in 2015
• Fortune Rank: 13 as of 2016
• Employees: 162K worldwide
• Network: 98% US wireless coverage
• Broadband: 100% fiber optic network
• Video & Advertising: 150M hours of video streamed monthly
• Internet of Things: 4K+ developers hosted on ThingSpace
• Security: 200+ data centers in 24 countries

Agenda

• Big data business challenges

• Verizon’s big data architecture

• What is OLAP and why is it needed?

• Traditional vs Modern OLAP

• How modern OLAP achieves interactive BI

Progress in Big Data Analytics

• 2002 First OLAP cube
• 2010 Hadoop Trials
• 2013 Hadoop in Production
• 2014 Big Data Streaming
• 2015 OLAP on Hadoop
• 2016 Hybrid Architecture

A challenging big data landscape

• Increased data volume
• Varied data sources
• Real-time aspect of data
• Data access and reuse
• Self service needs
• Cost and hidden system debt

Dealing with the complexity of big data

Challenges | Technology Solution
Increased data volumes | Hadoop Hortonworks distribution
Varied data sources | Unified Data Models and standardized KPI stores
Real time aspect of data | Kafka/Storm/NiFi/StreamAnalytix
Data access and reuse | Hybrid architecture
Self service needs | Pre-Connected data models/OLAP on Hadoop
Cost and hidden system debt | Data locality and compute

Our expanded architecture

[Architecture diagram; recoverable labels follow]

• Touchpoints: Sales Channels, Digital Channels, Self-Serve Channels, Operations, Customer Contact, Vendors/Partners
• Touchpoint groupings: Internal Touchpoints, Usages and Interactions, External Touchpoints
• Data Exchange
• Data layers: Structured / Unstructured / Data in Motion; Structured / Data at Rest
• Understand and Visualize: Cross Channel Analytics, Path Analysis, Discovery
• Model, Score, and Act: Predictive Analytics, Machine Learning, Prescriptive Models
• Legacy Data Warehouse
• OLAP on Hadoop: Kyvos Engine

Why the need for OLAP?

• Speed of responses for multi-dimensional queries

• Iterative queries for data discovery

• The need to drill down for more details

• Access to historical side-by-side comparative analysis

What is OLAP?

OLAP is a hierarchical multidimensional data model

• Analyze multidimensional data interactively from multiple perspectives

• Dimensions are the qualitative axes of analysis; Measures are the quantitative values analyzed along them

• Cube Operations: Rolling up, Drilling down, Slice & Dice, and Pivot

• Cube Query Language: Multi Dimensional eXpressions (MDX)

Example cube with TIME, PRODUCT, and CUSTOMER dimensions (PRODUCT by TIME face shown):

PRODUCT | Jan05 | Feb05 | Mar05 | Apr05 | Yr05
LCD Monitor | 365 | 269 | 295 | 377 | 1306
Digital Camera | 234 | 465 | 255 | 678 | 1632
40G Drive | 164 | 135 | 153 | 145 | 597
Game Console | 132 | 144 | 111 | 555 | 942
All Products | 895 | 1013 | 814 | 1755 | 4477

CUSTOMER axis: Fred Smith, Joe Green, Abner Kennedy, Abigail Rudy, All Customers
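The cube operations named on this slide can be sketched in a few lines of plain Python over the slide's own numbers. This is an illustrative toy, not how an OLAP engine stores or queries a cube; it assumes the five number rows on the slide map to the products in the order they are listed, and the function names are invented for this sketch.

```python
# Monthly measure values per product, copied from the slide's cube face.
# Assumption: rows follow the slide's product order, columns run Jan05..Apr05.
MONTHS = ["Jan05", "Feb05", "Mar05", "Apr05"]
FACTS = {
    "LCD Monitor":    [365, 269, 295, 377],
    "Digital Camera": [234, 465, 255, 678],
    "40G Drive":      [164, 135, 153, 145],
    "Game Console":   [132, 144, 111, 555],
}

def roll_up_time(facts):
    """Roll up TIME: collapse months into one total per product."""
    return {product: sum(values) for product, values in facts.items()}

def roll_up_product(facts):
    """Roll up PRODUCT: collapse products into an 'All Products' total per month."""
    return {month: sum(values[i] for values in facts.values())
            for i, month in enumerate(MONTHS)}

def slice_month(facts, month):
    """Slice: fix TIME to a single month and keep the PRODUCT axis."""
    i = MONTHS.index(month)
    return {product: values[i] for product, values in facts.items()}

def pivot(facts):
    """Pivot: swap the axes so months become the outer key."""
    return {month: slice_month(facts, month) for month in MONTHS}

by_product = roll_up_time(FACTS)       # matches the slide's Yr05 column
by_month = roll_up_product(FACTS)      # matches the slide's All Products row
grand_total = sum(by_product.values())
```

Drilling down is the reverse of rolling up: starting from `by_product["LCD Monitor"]` and returning to the four monthly values in `FACTS["LCD Monitor"]`.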

Limitations of traditional OLAP

Very rigid and not scalable

• Massive increase in data volumes (size of cubes)

• Explosion of cardinality (granularity) from added dimensions

• Processing is not linearly scalable, forcing a move towards fixed-column reporting

• Data movement and frequency of updates impacting Service Level Agreements

• Requires expert developers

[Diagram: TRADITIONAL OLAP ON SQL. Labels: BI Tools, ODBC, SQL MDX Query Engine, Cube Processing, Data Movement, Star Schema, Data Repository, OLAP Cube]
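The "explosion from added dimensions" limitation can be made concrete with a little arithmetic. A cube over n flat dimensions has 2^n possible group-by combinations (cuboids), so every added dimension doubles the number of aggregations a fully materialized cube has to cover. The sketch below ignores hierarchies within a dimension, which would increase the count further; the function names are illustrative.

```python
from itertools import combinations
from math import prod

def cuboids(dimensions):
    """Enumerate every group-by combination (cuboid) of the given dimensions."""
    return [combo
            for r in range(len(dimensions) + 1)
            for combo in combinations(dimensions, r)]

def cuboid_count(n_dimensions):
    """Number of cuboids for n flat dimensions (no hierarchies): 2**n."""
    return 2 ** n_dimensions

def base_cell_upper_bound(cardinalities):
    """Upper bound on cells in the finest-grained cuboid: product of cardinalities."""
    return prod(cardinalities)

dims = ["TIME", "PRODUCT", "CUSTOMER"]
print(len(cuboids(dims)))                 # 8
print(cuboid_count(29))                   # 536870912
print(base_cell_upper_bound([4, 4, 4]))   # 64
```

With 29 dimensions, the count used in the results later in this deck, there are over half a billion possible group-by combinations, which is why processing cost grows so quickly as dimensions are added.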

Need for a Modern OLAP

More flexible, scalable, and performant

• Convenient and easy access to data

• Single source of truth (one cube vs many)

• More flexibility to model data

• Empower users to explore data at scale

• Scalable with Hadoop ecosystem

• No data movement

• No new infrastructure

[Diagram: MODERN OLAP ON HADOOP. Labels: BI Tools, REST API/MDX/XMLA/ODBC, REST Server, Query Engine, Cube Build Engine (MapReduce), Star Schema, OLAP Cube, Hadoop]

Traditional vs. Modern OLAP

[Diagram: TRADITIONAL OLAP ON SQL. Labels: BI Tools, ODBC, SQL MDX Query Engine, Cube Processing, Data Movement, Star Schema, Data Repository, OLAP Cube, Cube Server]

[Diagram: MODERN OLAP ON HADOOP. Labels: BI Tools, REST API/MDX/XMLA/ODBC, REST Server, Query Engine, Cube Build Engine (MapReduce), Star Schema, OLAP Cube, Hadoop]

Options for OLAP on Hadoop Tools

• Apache Kylin (Open Source by eBay)

• Doradus (Open Source by Dell Software Group)

• AtScale

• Kyvos

Results attained with Kyvos

• 13 months of data: 89.3 billion rows

• Size of raw fact data (ORC): 3.83 TB

• Size of processed cube: 9.3 TB

• 29 dimensions

Metric | Traditional OLAP | OLAP on Hadoop
Cube size | Limited to 750 GB (1 month) | 9.3 TB (13 months)
Time to re-process | 8 hours | 10 min
Query performance | 5 – 10 min | < 5 seconds
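The comparison above implies some ratios that are worth spelling out. The short calculation below uses only the figures on this slide; because query time on Hadoop is given as "< 5 seconds", the query speedups are lower bounds. The variable and function names are illustrative.

```python
# Figures copied from the results slide.
RAW_FACT_TB = 3.83          # raw fact data (ORC)
CUBE_TB = 9.3               # processed cube
REPROCESS_OLD_MIN = 8 * 60  # traditional OLAP: 8 hours
REPROCESS_NEW_MIN = 10      # OLAP on Hadoop: 10 minutes
QUERY_OLD_SEC = (5 * 60, 10 * 60)  # traditional OLAP: 5 to 10 minutes
QUERY_NEW_SEC = 5                  # OLAP on Hadoop: under 5 seconds

def speedup(old, new):
    """How many times faster `new` is than `old` (same units)."""
    return old / new

reprocess_speedup = speedup(REPROCESS_OLD_MIN, REPROCESS_NEW_MIN)
# Lower bounds, since the new query time is strictly under 5 seconds.
query_speedup_range = tuple(speedup(old, QUERY_NEW_SEC) for old in QUERY_OLD_SEC)
# Size of the materialized cube relative to the raw fact data.
cube_expansion = CUBE_TB / RAW_FACT_TB

print(f"Re-processing: {reprocess_speedup:.0f}x faster")
print(f"Queries: at least {query_speedup_range[0]:.0f}x to {query_speedup_range[1]:.0f}x faster")
print(f"Cube is about {cube_expansion:.1f}x the size of the raw fact data")
```

The expansion factor is the price of pre-aggregation: the cube is larger than the raw ORC data it was built from, in exchange for the query latency shown in the table.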

Benefits of achieving interactive BI

1 Faster time to insights from faster processing

2 Interactive access to big data for analysis without interruption

3 Self service model to empower users

4 Turn any BI tool into a native-on-Hadoop big data tool

5 Consolidate multiple cubes to one cube for single source of truth

6 No data movement to access all the data at every granular level

7 No new infrastructure to minimize technical costs

Thank you.


Kyvos Insights

Kyvos Insights’ Heritage

• Leader in Big Data Services and Products

• Founded 1996

• StreamAnalytix

• EDW Migration

• White-labelled BI and Enterprise Reporting

• Spun out in 2004

• Solution Provider for Cyber Security, Law Enforcement and Intelligence Agencies

• Spun out in 2004

• Massive real-time machine-learning based analytics systems

• Interactive BI for big data (Hadoop, S3)

• Founded 2014

• Service Supply Chain Planning Solution

• World leader installed at most global manufacturing companies

• Spun out in 1999

• Acquired by PTC in 2012

• Headquarters in Silicon Valley

• Self-funded

• Fortune 500 customers

• 1700+ Employees

• Locations: Silicon Valley, Atlanta, Austin, NYC, and India

Kyvos makes data lakes BI ready for analysts

Self service interactive BI on big data

• No data movement

• Access granular data

• Use your favorite BI tool

• No waiting for queries

• Unlimited scalability

• Secure access control

[Diagram labels: Big Data Lake, Kyvos BI Consumption Layer, Security and authentication]

What makes Kyvos unique?

• Distributed cubes on HDFS allow unlimited cube size

• Pre-built cubes fully materialized for speed

• Incremental cube builds

• Zero footprint architecture that scales with Hadoop

100B rows, 1B cardinality, 300 dimensions
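The general idea behind an incremental cube build is that newly arrived facts are aggregated on their own and merged into the existing aggregates, so historical data is not reprocessed. The sketch below shows that idea for a single additive measure; it is a generic illustration, not a description of how Kyvos implements incremental builds, and the row shape `(product, month, amount)` and function names are hypothetical.

```python
from collections import Counter

def build_aggregates(rows):
    """Aggregate (product, month, amount) rows into {(product, month): total}."""
    agg = Counter()
    for product, month, amount in rows:
        agg[(product, month)] += amount
    return agg

def incremental_build(existing, new_rows):
    """Merge aggregates of the new rows into an existing cuboid.

    Only `new_rows` are scanned; the facts behind `existing` are not re-read.
    """
    merged = Counter(existing)                  # copy, leave the input untouched
    merged.update(build_aggregates(new_rows))   # Counter.update adds values per key
    return merged

history = [("LCD Monitor", "Jan05", 365), ("LCD Monitor", "Feb05", 269)]
arrivals = [("LCD Monitor", "Feb05", 1), ("Game Console", "Mar05", 111)]

cube = build_aggregates(history)
cube = incremental_build(cube, arrivals)
```

This merge works because sums (and counts) are additive; measures such as distinct counts cannot be merged this way and need a different representation.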

Query performance

Cold Queries

SSB Benchmark tests
• 100B+ rows, 30M cardinality
• Kyvos response time 200X faster than Impala and Spark SQL

Customer tests at financial services company
• Kyvos response time for a complex query under 15 seconds compared to 11 minutes for Impala

Warm Queries

• Less than one second for all SSB benchmark queries

Use Cases

• IoT analytics improved truck roll efficiency

• Customer behavior analysis improved airline revenue

• Streamlined risk compliance at global investment bank

Key benefits

• Keep your favorite BI tool

• Drill down to transaction data

• Scalable natively on Hadoop

• Fastest query performance

• Secure access to data lake

• No coding

• No data movement

• No IT intervention

Q&A