How Verizon is Solving Big Data Problems with Interactive BI
IBRAHIM ITANI
Leader of Big Data Architecture and Technology, Verizon
SANJAY KUMAR
General Manager, Telecom, Hortonworks
SANCHA NORRIS
Director, Product Marketing, Kyvos Insights
Webinar Housekeeping
• All attendees on mute
• View in full screen
• Send in questions via chat (Q&A) box
• Copy of webinar recording will be available
Speakers
• Ibrahim Itani: Executive Leader of Big Data Architecture and Technology
• Sanjay Kumar: General Manager, Telecom
• Sancha Norris: Director, Product Marketing
5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Service Providers Areas of Focus
Customer Experience & Marketing
• Enhance End-to-end Experience of Customer
• Become Trusted Partner to Customer
• Awareness of customer’s needs when and where needed
New Business & Consumer Services
• New Digital & Infrastructure Services
• Data Monetization
• M2M, IoT, Analytics-as-a-service
Network Optimization
• Move to Software Driven Networks
• Leverage Network Data Assets
• Self optimizing and provisioning
[Diagram labels: Customer, Network, Service]
Tomorrow: A Data-Centric Model
Old Model: App-Centric
[Diagram, APP-CENTRIC: App 1, App 2, App 3, App 4, App 5, App 6]
Limitations:
• Multiple copies of data
• Difficult cross-system integration
• Upper limit on data volumes before harming performance
New Model: Data-Centric
[Diagram, DATA-CENTRIC: Central Data Lake with App 1, App 2, App 3, App 4, App 5, App 6]
Advantages:
• One version of the data
• No need for cross-app integration
• System scales linearly
App-Centric will break down with x10, x100, x1000… Need to shift to Data-Centric
Blind Spots Block Your Ability to Use All the Data
[Diagram: Internet of Anything; Group 1, Group 2, Group 3, Group 4]
• Fragmented data-at-rest increases the cost of insight
• Data-in-motion streams through your blind spots
The Future of Data: Actionable Intelligence
[Diagram: Internet of Anything; Data in Motion; Storage; Group 1, Group 2, Group 3, Group 4; Data at Rest]
Combining Data-in-Motion and Data-at-Rest powers Actionable Intelligence
The Target: Data-Centric Operations
[Diagram: Streaming Ingestion (NiFi); Central Data Lake (Hadoop); YARN]
• Streaming: Network Probes, Click Stream, Sensor, Location
• Batch: Call Detail Records
• On-Line: Customer Sentiment
• Unstructured: Text, Pictures, Video, Voice2Text
• Clickstream, Web & Social, Geolocation, Event Records, Call Logs, Files, emails, Server Logs
Existing OSS/BSS/DWH: App 1, App 2, App 3, App 4, App 5, App 6
DELIVERY
• IOT Solutions
• Cyber Security
• Network Planning Optimization
• Contextual Marketing
• 360 Customer Awareness
• Personal Data Analysis & Customer Insight Services to Customers & Partners
Communications Industry Use Cases – Data At Rest
Customer Experience & Marketing
360 Customer Household View
• View of all family members across omni-channel, with demographics inferred from social media, around customer experience
• 60% improvement in customer experience NPS
Customer Churn Reduction
• Track dissatisfaction in near-real time to see what customers encounter on the web site that drives them to call contact centers in frustration, and initiate changes to resolve it
• 35% reduction in customer churn
Target Advertising
• Look-alike audience profile models across 5000+ parameters and attributes for precision-targeted advertising
• Double advertising revenue in 1st year
Customer Next Best Action
• Develop detailed customer churn models through sentiment from social media to drive next best action at retail stores & across channels
• Improved customer retail store & omni-channel experience
Network Optimization & Field Services
Field: Predictive Maintenance (Trucks)
• Predict, to the day, when batteries & other truck equipment will need replacement, before failure
• Tens of millions in cost avoidance on fleet services
Network Optimization
• Develop network over-subscription models based on complete network event details for a 13-month period for an accurate view of demand
• Tens/hundreds of millions in cost avoidance on network build
Network Decommissioning
• Ingestion of network events correlated with customer demand to manage circuit & equipment decommissioning without customer impact
• Up to 20% reduction in network operating costs
Communications Industry Use Cases – Data In Motion
Customer Experience & Marketing
Customer Location & Context Based Service
• Triangulate customer location with customer-specific context for targeted promotions, advertising & customer experience management
• 10x greater chance of customer offer acceptance
Customer Sentiment
• Track social media with all other customer touch points for a complete view of customer sentiment and a computed net promoter score
• Improve customer experience and reduce churn as it happens
Customer Next Best Action
• Drive next best action by streaming data into analytics systems like Apache Storm or Apache Spark Streaming for event-triggered actions
• Enhance customer experience through context-driven actions
Network Optimization & Field Services
Network Yield Optimization
• Prioritize data to deliver lower-priority data on available network capacity at no additional network cost
• Ability to provide low-cost IOT connectivity
Device Analytics
• Edge device analytics through MiNiFi agents prioritizing and delivering data groups with device-level event correlation
• Analytics at the source device
Set-top Box Ingest for Customer Experience
• Stream billions of events directly from set-top boxes to understand customer behavior & network performance
• Customer Behavior: subscriber experience, navigation paths, clicking patterns and viewing habits
• Network Performance: network traffic and congestion in real time for enhancing the viewing experience of customers
• New RT models for content programming & advertising
Modern Data Apps: Custom or Off the Shelf
• Real-Time Cyber Security protects systems with superior threat detection
• Smart Manufacturing dramatically improves yields by managing more variables in greater detail
• Connected, Autonomous Cars drive themselves and improve road safety
• Future Farming optimizes soil, seeds and equipment to measured conditions on each square foot
• Automatic Recommendation Engines match products to preferences in milliseconds
[Diagram: Data at Rest; Data in Motion; Actionable Intelligence; Modern Data Applications; Hortonworks DataFlow; Hortonworks Data Platform]
Delivering the promise of the digital world.
• Annual Revenue: $131.6B in 2015 revenues
• Dividends: $8.5B paid in 2015
• Fortune Rank: 13 as of 2016
• Employees: 162K worldwide
• Network: 98% US wireless coverage
• Broadband: 100% fiber optic network
• Video & Advertising: 150M hours of video streamed monthly
• Internet of Things: 4K+ developers hosted on ThingSpace
• Security: 200+ data centers in 24 countries
Agenda
• Big data business challenges
• Verizon’s big data architecture
• What is OLAP and why is it needed?
• Traditional vs Modern OLAP
• How modern OLAP achieves interactive BI
Progress in Big Data Analytics
• 2002: First OLAP cube
• 2010: Hadoop Trials
• 2013: Hadoop in Production
• 2014: Big Data Streaming
• 2015: OLAP on Hadoop
• 2016: Hybrid Architecture
A challenging big data landscape
• Increased data volume
• Varied data sources
• Real-time aspect of data
• Data access and reuse
• Self service needs
• Cost and hidden system debt
Dealing with the complexity of big data
Challenge | Technology Solution
Increased data volumes | Hadoop (Hortonworks distribution)
Varied data sources | Unified Data Models and standardized KPI stores
Real time aspect of data | Kafka/Storm/NiFi/StreamAnalytix
Data access and reuse | Hybrid architecture
Self service needs | Pre-Connected data models/OLAP on Hadoop
Cost and hidden system debt | Data locality and compute
Our expanded architecture
[Architecture diagram labels]
• Touchpoints (internal and external; usages and interactions): Sales Channels, Digital Channels, Self-Serve Channels, Operations, Customer Contact, Vendors/Partners
• Data Exchange
• Stages: Understand and Visualize; Model, Score, and Act
• Analytics: Cross Channel Analytics, Path Analysis, Discovery, Predictive Analytics, Machine Learning, Prescriptive Models
• Structured / Unstructured / Data in Motion
• Structured / Data at Rest
• Legacy Data Warehouse
• OLAP on Hadoop (Kyvos Engine)
Why the need for OLAP?
• Speed of responses for multi-dimensional queries
• Iterative queries for data discovery
• The need to drill down for more details
• Access to historical side-by-side comparative analysis
What is OLAP?
OLAP is a hierarchical multidimensional data model
• Analyze multidimensional data interactively from multiple perspectives
• Measures (quantitative values) are analyzed across Dimensions (qualitative attributes)
• Cube Operations: Rolling up, Drilling down, Slice & Dice, and Pivot
• Cube Query Language: MultiDimensional eXpressions (MDX)
[Example cube with axes TIME, PRODUCT, CUSTOMER; Product by Time face shown]
Product | Jan05 | Feb05 | Mar05 | Apr05 | Yr05
LCD Monitor | 365 | 269 | 295 | 377 | 1306
Digital Camera | 234 | 465 | 255 | 678 | 1632
40G Drive | 164 | 135 | 153 | 145 | 597
Game Console | 132 | 144 | 111 | 555 | 942
All Products | 895 | 1013 | 814 | 1755 | 4477
Customer axis: Fred Smith, Joe Green, Abner Kennedy, Abigail Rudy, All Customers
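The cube operations named on this slide can be illustrated with a minimal Python sketch over the slide's own example figures. It assumes the rows of the example are products and the columns are months; the function names (`roll_up_time`, `roll_up_product`, `slice_month`, `pivot`) are hypothetical and are not part of any OLAP product's API.

```python
# Figures copied from the example cube on the slide.
# Assumption: rows are products, columns are months.
MONTHS = ["Jan05", "Feb05", "Mar05", "Apr05"]
SALES = {
    "LCD Monitor":    [365, 269, 295, 377],
    "Digital Camera": [234, 465, 255, 678],
    "40G Drive":      [164, 135, 153, 145],
    "Game Console":   [132, 144, 111, 555],
}

def roll_up_time(sales):
    """Roll up the time dimension: months -> one yearly total per product."""
    return {product: sum(values) for product, values in sales.items()}

def roll_up_product(sales):
    """Roll up the product dimension: products -> one total per month."""
    return [sum(column) for column in zip(*sales.values())]

def slice_month(sales, month):
    """Slice: fix the time dimension to a single month."""
    i = MONTHS.index(month)
    return {product: values[i] for product, values in sales.items()}

def pivot(sales):
    """Pivot: swap the axes so months become rows and products columns."""
    return {m: {p: v[i] for p, v in sales.items()} for i, m in enumerate(MONTHS)}

year_totals = roll_up_time(SALES)       # matches the Yr05 column
month_totals = roll_up_product(SALES)   # matches the All Products row
grand_total = sum(month_totals)         # matches the bottom-right cell
```

Drilling down is the reverse of rolling up: starting from `year_totals["LCD Monitor"]`, a user returns to the four monthly values in `SALES["LCD Monitor"]`.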
Limitations of traditional OLAP
Very rigid and not scalable
• Massive increase in data volumes (size of cubes)
• Explosion of cardinality (granularity) from added dimensions
• Processing is not linearly scalable, forcing users towards fixed-column reporting
• Data movement and frequency of updates impact Service Level Agreements
• Requires expert developers
TRADITIONAL OLAP ON SQL
[Diagram: BI Tools; ODBC; SQL MDX Query Engine; Cube Processing; Data Movement; Star Schema; Data Repository; OLAP Cube]
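One way to see why added dimensions strain a traditional cube is to count what a fully pre-aggregated cube has to hold. The sketch below is an illustration under simplifying assumptions (flat dimensions without hierarchies, every combination materialized); the helper names are hypothetical and the slides do not describe any vendor's storage layout.

```python
from math import prod

def cuboid_count(num_dimensions):
    """Number of distinct group-by combinations (cuboids) over n flat
    dimensions: each dimension is either grouped on or rolled up."""
    return 2 ** num_dimensions

def max_cells(cardinalities):
    """Upper bound on cells in a fully materialized cube: each dimension
    contributes its members plus one 'All' member."""
    return prod(c + 1 for c in cardinalities)

# The slide's small example: 4 months, 4 products, 4 customers.
small_cube_cells = max_cells([4, 4, 4])   # 5 * 5 * 5 = 125

# Each added dimension doubles the number of cuboids.
three_dims = cuboid_count(3)              # 8
twenty_nine_dims = cuboid_count(29)       # 536,870,912
```

Real cubes are sparse and rarely materialize every combination, so these figures are upper bounds rather than actual sizes.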
Need for a Modern OLAP
More flexible, scalable, and performant
• Convenient and easy access to data
• Single source of truth (one cube vs many)
• More flexibility to model data
• Empower users to explore data at scale
• Scalable with Hadoop ecosystem
• No data movement
• No new infrastructure
MODERN OLAP ON HADOOP
[Diagram: BI Tools; REST API/MDX/XMLA/ODBC; REST Server; Query Engine; Cube Build Engine (MapReduce); Star Schema; OLAP Cube; Hadoop]
Traditional vs. Modern OLAP
TRADITIONAL OLAP ON SQL
[Diagram: BI Tools; ODBC; SQL MDX Query Engine; Cube Processing; Data Movement; Star Schema; Data Repository; OLAP Cube; Cube Server]
MODERN OLAP ON HADOOP
[Diagram: BI Tools; REST API/MDX/XMLA/ODBC; REST Server; Query Engine; Cube Build Engine (MapReduce); Star Schema; OLAP Cube; Hadoop]
Options for OLAP on Hadoop Tools
• Apache Kylin (Open Source by eBay)
• Doradus (Open Source by Dell Software Group)
• AtScale
• Kyvos
Results attained with Kyvos
• 13 months of data: 89.3 billion rows
• Size of raw fact data (ORC): 3.83 TB
• Size of processed cube: 9.3 TB
• 29 dimensions
Metric | Traditional OLAP | OLAP on Hadoop
Cube size | Limited to 750 GB (1 month) | 9.3 TB (13 months)
Time to re-process | 8 hours | 10 min
Query performance | 5 – 10 min | < 5 seconds
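The ratios implied by these figures can be worked out directly. The short Python calculation below uses only the numbers reported on the slide; it assumes decimal units (1 TB = 1000 GB), and because the reported query time is an upper bound ("< 5 seconds"), the query speedups are lower bounds.

```python
# Figures as reported on the slide.
rows = 89.3e9            # rows across 13 months
months = 13
raw_tb = 3.83            # raw fact data (ORC)
cube_tb = 9.3            # processed cube
traditional_cube_gb = 750

rows_per_month = rows / months                       # about 6.87 billion
cube_expansion = cube_tb / raw_tb                    # about 2.43x the raw data
cube_size_ratio = cube_tb * 1000 / traditional_cube_gb  # 12.4x (assumes 1 TB = 1000 GB)

reprocess_speedup = (8 * 60) / 10                    # 8 hours vs 10 min -> 48x
query_speedup_low = (5 * 60) / 5                     # 5 min vs 5 s -> at least 60x
query_speedup_high = (10 * 60) / 5                   # 10 min vs 5 s -> at least 120x
```

The cube holding 13 times as many months in about 12.4 times the space suggests the per-month footprint is roughly unchanged, while re-processing and query times drop by one to two orders of magnitude.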
Benefits of achieving interactive BI
1 Faster time to insights from faster processing
2 Interactive access to big data for analysis without interruption
3 Self service model to empower users
4 Turn any BI tool into a native big data tool on Hadoop
5 Consolidate multiple cubes to one cube for single source of truth
6 No data movement to access all the data at every granular level
7 No new infrastructure to minimize technical costs
Kyvos Insights’ Heritage
• Leader in Big Data Services and Products
• Founded 1996
• StreamAnalytix
• EDW Migration
• White-labelled BI and Enterprise Reporting
• Spun out in 2004
• Solution Provider for Cyber Security, Law Enforcement and Intelligence Agencies
• Spun out in 2004
• Massive real-time machine-learning based analytics systems
• Interactive BI for big data (Hadoop, S3)
• Founded 2014
• Service Supply Chain Planning Solution
• World leader installed at most global manufacturing companies
• Spun out in 1999
• Acquired by PTC in 2012
• Headquarters in Silicon Valley
• Self-funded
• Fortune 500 customers
• 1700+ Employees
• Locations: Silicon Valley, Atlanta, Austin, NYC, and India
Kyvos makes data lakes BI ready for analysts
Self service interactive BI on big data
• No data movement
• Access granular data
• Use your favorite BI tool
• No waiting for queries
• Unlimited scalability
• Secure access control
[Diagram: Big Data Lake; Kyvos BI Consumption Layer; Security and authentication]
What makes Kyvos unique?
• Distributed cubes on HDFS allow unlimited cube size
• Pre-built cubes fully materialized for speed
• Incremental cube builds
• Zero footprint architecture that scales with Hadoop
100B rows, 1B cardinality, 300 dimensions
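The general idea behind pre-built cubes and incremental builds can be sketched in a few lines of Python. This is a toy illustration, not Kyvos's implementation: the sample rows and amounts are made up, and `build_aggregates`, `incremental_build`, and `query` are hypothetical names. It pre-computes a sum for every combination of dimensions, merges a new batch into the existing aggregates without rescanning old rows, and answers queries with a single lookup.

```python
from collections import Counter
from itertools import combinations

DIMENSIONS = ("month", "product", "customer")

def build_aggregates(rows):
    """Fully materialize SUM(amount) for every subset of dimensions."""
    agg = Counter()
    for row in rows:
        for k in range(len(DIMENSIONS) + 1):
            for dims in combinations(DIMENSIONS, k):
                key = (dims, tuple(row[d] for d in dims))
                agg[key] += row["amount"]
    return agg

def incremental_build(cube, new_rows):
    """Merge aggregates of newly arrived rows into an existing cube.
    Counter.update adds values instead of replacing them."""
    cube.update(build_aggregates(new_rows))
    return cube

def query(cube, **filters):
    """Answer an aggregate query with one lookup instead of a row scan."""
    dims = tuple(d for d in DIMENSIONS if d in filters)
    return cube[(dims, tuple(filters[d] for d in dims))]

# Hypothetical sample data (amounts are invented for illustration).
jan = [
    {"month": "Jan05", "product": "LCD Monitor", "customer": "Fred Smith", "amount": 100},
    {"month": "Jan05", "product": "Game Console", "customer": "Joe Green", "amount": 50},
]
feb = [
    {"month": "Feb05", "product": "LCD Monitor", "customer": "Fred Smith", "amount": 70},
]

cube = build_aggregates(jan)
cube = incremental_build(cube, feb)   # only the new month is processed
```

Because sums are additive, merging the new month's aggregates gives the same cube as rebuilding from all rows, which is what makes the incremental build valid for this kind of measure.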
Query performance
Cold Queries
SSB Benchmark tests
• 100B+ rows, 30M cardinality
• Kyvos response time 200X faster than Impala and Spark SQL
Customer tests at financial services company
• Kyvos response time for a complex query under 15 seconds compared to 11 minutes for Impala
Warm Queries
• Less than one second for all SSB benchmark queries
Use Cases
• IoT analytics improved truck roll efficiency
• Customer behavior analysis improved airline revenue
• Streamlined risk compliance at global investment bank
Key benefits
• Keep your favorite BI tool
• Drill down to transaction data
• Scalable natively on Hadoop
• Fastest query performance
• Secure access to data lake
• No coding
• No data movement
• No IT intervention