Copyright © 2012, SAS Institute Inc. All rights reserved.
BIG DATA:
BIG OPPORTUNITY OR BIG HEADACHE?
Peter Dorrington
SAS
FIRST, A FEW WORDS ABOUT SAS
(Who do, after all, pay my salary)
(Post conference narrative annotations to this presentation are in green italics)
2011 PERFORMANCE A LEADING PROVIDER OF ADVANCED ANALYTICS SOFTWARE
- 12% growth in total revenue over 2011
- 36 consecutive years of revenue growth
- 24% of 2011 revenues invested into R&D
For 37 years, we have focused on giving our customers…
Being privately owned means we can afford to reinvest in R&D, not focus on quarterly share price / dividends
WHY DOES SAS CARE ABOUT BIG DATA?
THE VISION WE HAVE ALWAYS UNDER-PINNED DECISION-MAKING
Organizations are inundated with data – terabytes and
petabytes of it. To put it in context, 1 terabyte contains
2,000 hours of CD-quality music and 10 terabytes could
store the entire US Library of Congress print collection.
Exabytes, zettabytes and yottabytes definitely are on the
horizon.
The hopeful vision of big data is that organizations will
be able to harvest and harness every byte of relevant
data and use it to make the best decisions. Big data
technologies not only support the ability to collect large
amounts, but more importantly, the ability to understand
and take advantage of its full value.
This is the vision – reality is somewhat different
BUT DO YOU
REMEMBER THIS? SINGLE CUSTOMER VIEW (SCV)
“A complete SCV is not currently available in any of the
interviewed organisations. Most have a partial
implementation of some of the data and / or some of the
channels ...”
From a study
this year
A Market Study by Henley Business School
in association with SAS UK and Ireland
When I joined SAS UK as Head of
CRM in 2000, this was already old
news. Over a decade later, with all
the advances in data management
and analytics, it is still an issue.
The danger is that ‘Big Data’ will
make the challenge greater by
adding new data sources and
aspirations before we have fully got
to grips with our current reality.
OUR PERSPECTIVE BIG DATA IS RELATIVE, NOT ABSOLUTE
When volume, velocity and variety of data exceed an
organization’s storage or compute capacity for accurate
and timely decision-making.
The explosion of data isn’t new. It continues a trend that
started in the 1970s. What has changed is the velocity of
growth, the diversity of the data and the imperative to
make better use of information to transform the
business.
‘Big Data is…’
Big data is really just ‘more data, from more sources’. Most organizations already have ‘large data’. (I regularly use Companies House data of 5.3 million rows, far more than Excel can deal with. Some of our customers are using 5.3 billion rows of data and doing so very effectively.)
BIG DATA SOURCES BIG DATA IS EVERYWHERE
“Which of the following data types are you collecting as Big Data and/or using today?”
- Structured data (tables, records)
- Semi-structured data (XML and similar standards)
- Complex data (hierarchical or legacy sources)
- Event data (messages, usually in real time)
- Unstructured data (human language, audio, video)
- Web logs and click streams
- Social media data (blogs, tweets, social networks)
- Other
- Spatial data (long / lat coordinates, GPS output)
- Machine-generated data (sensors, RFID, devices)
- Scientific data (astronomy, genomes, physics)
Based on 450 responses from 109 respondents who report practicing Big Data analytics; 4.1 responses per respondent on average.
Source: TDWI Big Data Analytics Report, 4th Quarter 2011, Philip Russom
It’s happening already; a significant challenge will be in working out how to manage all these sources
THE SCALE OF THE CHALLENGE
Source: IDC Digital Universe Study, sponsored by EMC, May 2010
It’s not hard to imagine a future of super-cheap, ubiquitous, connected ‘chips with everything’; the data growth curve is potentially exponential. Will the future be ‘Even Bigger Data’?
But how much of this data is going to be useful in any given context?
SO WHAT? NOT ALL DATA IS EQUAL
(Chart: data size, today versus the future)
VOLUME, VARIETY, VELOCITY, VARIABILITY, COMPLEXITY
- terabytes, petabytes and up
- from all kinds of sources
- some historic, others real-time
- in fits-and-starts, as well as smooth flowing
- also of dubious quality
- and often without context or clear value
The challenge is to find relevance from within this ‘data deluge’
IMPLICATION WE WILL NEED TO RETHINK DATA MANAGEMENT
Where data integration, data quality, metadata management and data governance are designed and used together. The traditional extract-transform-load (ETL) data approach is augmented with one that minimizes data movement and improves processing power.
From standalone disciplines to integrated processes
- There is no meaningful way we can store all this data (with today’s technologies), never mind build an OLAP cube from it.
- For example, the Large Hadron Collider at CERN produces 15 petabytes of data per year: they can only store a subset of this, and that only by distributing the storage around the world using multiple hubs.
- Now add real-time data feeds into the mix…
BIG DATA & ANALYTICS
Data without analysis has only transactional value
BIG DATA, BI AND ANALYTICS TRADITIONAL VIEW
(Diagram labels: Large Data; Reactive Analytics; Predictive Analytics; Business Intelligence (BI); Big Analytics)
MY DEFINITIONS:
Predictive (Proactive) Analytics:
• Optimisation - How do we do things better? What is the ‘best’ decision?
• Predictive Modelling - What will happen next? How will it affect me?
• Forecasting - What if the trend(s) continue?
• Statistical Analysis - Why is it happening? What am I missing?
Reactive Analytics (Business Intelligence):
• Alerts - When should I react? What action is needed now?
• Query Drilldown (OLAP) - Where exactly? How do I find the answers?
• Ad Hoc Reports - How many? How often?
• Standard Reports - What happened? When?
All have a role to play
Pretty much all organisations have ‘large data’
BIG DATA, BI AND ANALYTICS WHAT CHANGED?
(Diagram labels: Large Data; Big Data; Reactive Analytics; Predictive Analytics; Business Intelligence (BI); Big Data BI; Big Analytics; Big Data Analytics)
Not much has changed when moving from ‘Large Data’ to ‘Big Data’: BI is still BI, Analytics is still Analytics – applying BI to Big Data does not make it inherently analytical
OUR PERSPECTIVE BIG DATA ANALYSIS: A PROMISE AS YET ONLY
PARTIALLY FULFILLED / ADOPTED
The true value of big data lies not just in having it, but in
harvesting it for fast, fact-based decisions that lead to
real business value.
‘There’s gold in
them thar hills’
- Just like mining for gold (a deliberate pun about data mining), you have to work for the reward. It is rarely found just lying on the surface, and if it were it wouldn’t be rare and therefore valuable.
- The problem with ‘low hanging fruit’ is that everyone can
see it and reach for it – your competition included. Your
unique Intellectual Property (what you know, and what you
know about what you know) may be the only thing that
ultimately sets you apart.
SO WHAT’S STOPPING YOU? 10 ROADBLOCKS TO IMPLEMENTING BIG DATA ANALYTICS
- Mary Shacklett, TechRepublic, Nov 2012
1. Budget - Develop a ‘plain English’ business case with £ value
2. IT know-how - Figure out what you need to do, then what capabilities are needed & how they can be obtained
3. Business know-how - Partner with those who do have the skills
4. Data clean-up - Face up to the problem & prepare to invest
5. The storage bulge - Store only what you have to or what is relevant
6. New data centre workloads - Monitor & analyze workloads and plan accordingly
7. Data retention - (see Storage Bulge)
8. Vendor role clarification - Identify who can offer more than ‘canned’ analyses / reports, and partner with them
9. Business and IT alignment - Design a strategy around business, not IT, goals & objectives
10. Developing new talent - If at all possible, develop in-house talent using consistent architectures, rather than buy in skills
All of these are solvable
HEALTH WARNING BYOD: BRING YOUR OWN DATA?
How many views of how many data sources, using how many tools on how many devices?
FINANCE DIRECTOR, SALES DIRECTOR, OPERATIONS DIRECTOR
- Imagine what would happen if the whole leadership team turned up to a meeting with their own sets of data
- Implement a strategy that provides a consistent data foundation
- Bring Your Own View of one set of Data
THE VALUE OF BIG DATA
OPPORTUNITY OR THREAT? WHAT BUSINESS LEADERS SAY ABOUT BIG DATA
Should probably ask “strength or weakness?”
- Opportunities / Threats are often external – not under our control
- Strengths / Weaknesses are internal – we decide where we want to be strong
- Perhaps the internal debate should be about how the value of big data can provide an organisation with new strengths
- In particular, proprietary IP based on data is very hard for competitors to replicate, whereas products typically are not.
EXAMPLE MARKETING & CUSTOMER ACQUISITION
Same / better result
for less investment
- This has been going on for years: by understanding customers / segments better, we can focus our investment on just those most likely to respond. This ‘lift’ in response rates improves RoI
- Big data has the potential to let us know more about customers & develop better models for more customers
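To make ‘lift’ concrete, here is a minimal sketch in Python. It assumes a hypothetical list of customers, each with a model score and a flag for whether they actually responded; the function name, the scores and the 20% cut-off are illustrative and not taken from the presentation. Lift is computed as the response rate among the top-scored customers divided by the response rate across everyone.

```python
from typing import List, Tuple


def response_lift(scored: List[Tuple[float, bool]], top_fraction: float) -> float:
    """Response rate in the top-scored fraction divided by the overall response rate."""
    if not scored:
        raise ValueError("need at least one customer")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    # Rank customers from most to least likely to respond, according to the model score.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    overall_rate = sum(responded for _, responded in ranked) / len(ranked)
    if overall_rate == 0:
        raise ValueError("no responders, lift is undefined")
    top_rate = sum(responded for _, responded in ranked[:n_top]) / n_top
    return top_rate / overall_rate


# Hypothetical campaign: (model score, did the customer respond?)
customers = [
    (0.90, True), (0.80, True), (0.70, False), (0.60, True), (0.50, False),
    (0.40, False), (0.30, False), (0.20, False), (0.10, False), (0.05, False),
]
lift = response_lift(customers, top_fraction=0.2)
print(f"Lift in the top 20%: {lift:.2f}")
```

A lift above 1 means that contacting only the top-scored group yields proportionally more responses than contacting everyone, which is the ‘same / better result for less investment’ argument.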
THE POTENTIAL THE UK’S CORPORATE GOLD
IMPLICATION IMPACT EVERY PAGE OF THE ANNUAL REPORT
Cut losses from fraud by 30% in retail banking
Improved retention rates by 40%, and increased product holding by customers by 10%. (Retail)
Increased the number of customers by 1.7m pa, contributing to a 15% compound annual growth rate in just 2 years
Increased sales by 40% by identifying customer sales opportunities, and matching the best salespeople to close the opportunity.
Increased customer purchase by 65% through data integration and effective targeting.
Maintained bad debt of <0.05%, compared to the industry norm of 3.45%.
Reduced number of financial reports by 82%, providing key fiscal information for rapid decision making.
The applications of analytics to address business challenges / opportunities are not restricted to just one function
WHAT IS YOUR ISSUE?
(Diagram: MARKET OPPORTUNITY; Observe, Orient, Decide, Act. OODA - Colonel John Boyd)
- Time: When you need to be able to reach a decision & act faster than the competition
- Confidence: When you need to consider lots of scenarios
- 100% Sampling: When you need to see the whole picture, not just a sample of it
HOW TO… BIG DATA ANALYTICS
(Based on SAS technologies)
THE ANALYTICS LIFECYCLE THERE IS STILL A PLACE FOR A STRUCTURED APPROACH
How can we create strategic advantage?
Lifecycle stages:
1. IDENTIFY / FORMULATE PROBLEM
2. DATA PREPARATION
3. DATA EXPLORATION
4. TRANSFORM & SELECT
5. BUILD MODEL
6. VALIDATE MODEL
7. DEPLOY MODEL
8. EVALUATE / MONITOR RESULTS
Roles:
- BUSINESS MANAGER: Domain Expert; Makes Decisions; Evaluates Processes and ROI
- IT SYSTEMS / MANAGEMENT: Model Validation; Model Deployment; Model Monitoring; Data Preparation
- BUSINESS ANALYST: Data Exploration; Data Visualization; Report Creation
- DATA MINER / STATISTICIAN: Exploratory Analysis; Descriptive Segmentation; Predictive Modeling
In my opinion, you start out with identifying what question you need an answer to
DISTRIBUTED COMPUTING
Almost all Big Data solutions run in grid environments –
chunking up the task to share across many processors
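The ‘chunking up’ idea can be sketched in a few lines of Python. This is an illustration of the split / process / combine pattern only, not of any SAS product: the function names are made up for this example, and a local thread pool stands in for the separate machines or processors of a real grid.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple


def split_into_chunks(rows: Sequence[float], n_chunks: int) -> List[Sequence[float]]:
    """Split the rows into at most n_chunks contiguous pieces of similar size."""
    if not rows:
        raise ValueError("no rows to split")
    size = -(-len(rows) // n_chunks)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partial_sum_and_count(chunk: Sequence[float]) -> Tuple[float, int]:
    """Work done independently on each chunk (the part a grid node would run)."""
    return sum(chunk), len(chunk)


def distributed_mean(rows: Sequence[float], n_workers: int = 4) -> float:
    """Compute a mean by combining per-chunk partial results."""
    chunks = split_into_chunks(rows, n_workers)
    # Assumption: threads stand in for grid nodes; a real grid would ship chunks to other machines.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_and_count, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


values = list(range(1, 1001))
mean = distributed_mean(values)
print(mean)
```

The key design point is that each chunk returns a small partial result (a sum and a count), so only those summaries, not the raw rows, need to travel back to be combined.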
IN-DATABASE ANALYTICS
Doing the ‘analytics’ in the database keeps it close
to the data and in an easily managed environment
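As a small illustration of keeping the work close to the data, the sketch below uses Python's built-in sqlite3 module with a hypothetical `sales` table (the table, columns and figures are invented for this example; the presentation itself refers to SAS technologies). It compares pulling every row out to the client with asking the database to do the aggregation.

```python
import sqlite3

# Hypothetical in-memory database standing in for a real data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 100.0), ("North", 50.0), ("South", 75.0)],
)

# Approach 1: move every row to the client, then aggregate there.
totals_client = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    totals_client[region] = totals_client.get(region, 0.0) + amount

# Approach 2: aggregate inside the database; only one row per region leaves it.
totals_in_db = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
conn.close()

print(totals_client)
print(totals_in_db)
```

Both approaches give the same totals; the difference is how much data has to move. With three rows it does not matter, but with billions of rows the second approach transfers only the summary.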
IN-MEMORY ANALYTICS ARCHITECTURE
But doing analytics in-memory allows for vast
improvements in speed & enables ‘train of thought’
development of new questions / answers
USAGE EXAMPLES
WHAT IF YOU COULD…
• . . . predict the buying behavior and decision criteria of your prospects weeks before your competition
• . . . gain first-mover advantage by introducing new products and services to micro-segments that haven't been identified by anyone
• . . . evaluate the impact of your marketing campaigns hourly and make adjustments in real-time
• . . . improve customer experience scores that grow products per customer, reduce attrition, and leverage the power of customer recommendations for new business
RETAIL
Big, general purpose retailers have 10,000s of SKUs across
tens of stores – having the right amount / mix of stock, at the
right price is critical to protecting (slim) margins. The challenge
is to adjust pricing as quickly as the market changes – not
monthly or weekly, but daily, or even hourly.
TELCO
Two big issues: the market is saturated (very few ‘new’ customers) and is commoditized (customers driven by price and ‘customer experience’). Network failures directly impact the latter, whereas just providing the infrastructure does not make money. Some Telcos are looking at their IP and working out how they can use it to grow new revenue streams
HEALTH CARE
In a recent case, the DNA of an MRSA bacterium was sequenced in 48 hours
for a cost of £50; we are much closer to personalised health plans than
many would think. Even leaving the genetic issue to one side, it is possible
to use analytics to predict healthcare needs and therefore opportunities to
intervene before the chronic becomes acute.
BANKING
If you have ever had a credit card transaction declined, then you will know that the card issuers are working hard to identify 100% of the potential fraud, whilst at the same time not generating ‘false positives’. Declining genuine transactions because the detection models are incomplete or unresponsive to individual consumer behaviour is bad for business
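The trade-off between catching fraud and declining genuine transactions can be sketched as follows. This is a toy Python example with invented scores and thresholds, not a description of how any card issuer's models work: each transaction has a hypothetical fraud score, and anything at or above the threshold is declined.

```python
from typing import List, Tuple


def fraud_detection_rates(
    scored: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (share of fraud caught, share of genuine transactions wrongly declined)."""
    caught = missed = wrongly_declined = correctly_approved = 0
    for score, is_fraud in scored:
        declined = score >= threshold
        if is_fraud and declined:
            caught += 1
        elif is_fraud:
            missed += 1
        elif declined:
            wrongly_declined += 1  # a 'false positive'
        else:
            correctly_approved += 1
    fraud_total = caught + missed
    genuine_total = wrongly_declined + correctly_approved
    detection_rate = caught / fraud_total if fraud_total else 0.0
    false_positive_rate = wrongly_declined / genuine_total if genuine_total else 0.0
    return detection_rate, false_positive_rate


# Hypothetical transactions: (fraud score from a model, was it really fraud?)
transactions = [
    (0.95, True), (0.80, False), (0.60, True),
    (0.40, False), (0.20, False), (0.10, False),
]
low_threshold = fraud_detection_rates(transactions, threshold=0.5)
high_threshold = fraud_detection_rates(transactions, threshold=0.9)
print(low_threshold)
print(high_threshold)
```

In this toy data the lower threshold catches all the fraud but declines a quarter of the genuine transactions, while the higher threshold declines no genuine transactions but misses half the fraud. Better models, fed with more data about individual behaviour, are what let both numbers improve at once.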
PUBLIC SAFETY
Lots of what goes on in this sector is kept, quite rightly, under wraps, but there are case studies from all over the world where police forces are starting to anticipate where crime hotspots are, or will develop, and fix policing strategy accordingly
INSURANCE
Telematics in cars for insurance is already available in the UK. Because
insurers get a better picture of individual driving patterns they can adjust
their risk calculations accordingly and offer individual (and competitive)
prices to better / safer drivers
FINANCIAL SERVICES
Risk is at the heart of all financial services; banks and insurers just need to know how to price it correctly. In the example of ‘stress testing’, banks are now asked to consider the impacts of a wide range of scenarios on their business. The ability to run lots and lots of different risk scenarios directly impacts price and tactically allows more responsiveness - heading off problems before they become unmanageable
UTILITIES
An incredibly commoditised, mature, competitive sector – leveraging IP is one way it is responding
IN CONCLUSION
ADOPTING BIG DATA ANALYTICS IS NOT WITHOUT CHALLENGES
Source: The Current State of Business Analytics: Where Do We Go From Here?
Prepared by Bloomberg Businessweek Research Services, 2011
BUT PLENTY TO GET EXCITED ABOUT!
“Problems cannot be solved by the same level of
thinking that created them.” - Albert Einstein
• Open Data
• The power to analyse more
• Lots and lots of solutions…..
….. Framing the problem
• Knowledge systems
• Interpretation of data
FURTHER READING http://www.sas.com/reg/wp/corp/46345
www.SAS.com
THANK YOU!