Data Warehousing
An Overview
Outline
• What is Data Warehousing? (Definition)
• Why does anyone need it? (Applications)
• How is the data organized? (Star Schema)
• Implementation Issues.
Data Warehouse Definitions
• Dyché: Used for decision making; duplicates existing data; a combination of hardware, specialized software, and data extracted from other corporate systems.
• Inmon: Subject-oriented, integrated, non-volatile and time-variant collection of data in support of management decisions.
Why Warehouse?
• Provide single view of customers across enterprise
• Improve turnaround time for common reports
• Monitor customer behavior
• Predict future purchases
• Improve responsiveness to business issues
Coca Cola & IBM
• IBM is helping Coca Cola with its data warehouse.
• Deal with global companies like McDonald's: support for negotiating global contracts.
Financial Services Example – Credit Life Cycle
• Product Planning
• Customer Acquisition
• Customer Management
• Collections
Customer Acquisition
• Support for marketing: market segmentation
• Plus forecasts with:
• Response models
• Risk / bankruptcy models
• Profitability models
Customer Management
• Who gets a credit increase?
• Which delinquent customers are likely to default?
• What do you do (call, send a letter, do nothing)?
• Decision support: forecast customer behavior (behavior models)
Collections/Recovery
• What is the likelihood of recovering money from an account sent to collections?
• Decision support: collections models
Other Questions
• How can we reduce attrition?
• How can we activate inactive accounts?
• How well are my current strategies performing?
• How do we detect Fraud?
Where is the data?
• Transaction Systems
• Marketing Database
• Credit Reports
• Customer Service
How is it Organized?
• Separate from transactional data
• Contains Historical data
• Generally aggregated to some extent
• Optimized for flexible querying of large volumes of data
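As a minimal sketch of what "separate, historical, and aggregated" can look like in practice, the example below copies raw transactions into a separate monthly summary table. It uses Python's built-in sqlite3 module purely for illustration; the table names (transactions, monthly_summary), the columns, and the sample rows are assumptions, not part of the original slides.

```python
import sqlite3


def build_monthly_summary(rows):
    """Aggregate raw transactions into a per-customer, per-month summary."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical transactional table standing in for an operational system.
    conn.execute(
        "CREATE TABLE transactions (customer_id INTEGER, txn_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)
    # Warehouse-side table: kept apart from the transactional table and
    # aggregated by month so that history can be queried cheaply.
    conn.execute(
        """
        CREATE TABLE monthly_summary AS
        SELECT customer_id,
               substr(txn_date, 1, 7) AS month,
               COUNT(*) AS txn_count,
               SUM(amount) AS total_amount
        FROM transactions
        GROUP BY customer_id, substr(txn_date, 1, 7)
        """
    )
    result = conn.execute(
        "SELECT customer_id, month, txn_count, total_amount "
        "FROM monthly_summary ORDER BY customer_id, month"
    ).fetchall()
    conn.close()
    return result


# Made-up sample data (dates in ISO format so substr() yields YYYY-MM).
sample = [
    (1, "2024-01-05", 20.0),
    (1, "2024-01-20", 30.0),
    (1, "2024-02-02", 10.0),
    (2, "2024-01-11", 5.0),
]
summary = build_monthly_summary(sample)
for row in summary:
    print(row)
```

A real warehouse would load from several source systems on a schedule; the point here is only that reporting queries run against the summary table rather than the live transactional one.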
Star Schema
• Fact Table plus several dimensional tables
• Un-normalized
• Less flexible than normalized tables
• Faster retrieval than normalized tables for large volumes of data
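The star layout described above can be sketched as one fact table whose rows point at several dimension tables. The example below is an illustration only, again using Python's built-in sqlite3 module; the table names (fact_sales, dim_customer, dim_product, dim_date), their columns, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes, one row per entity.
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date     (date_key     INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- Fact table: numeric measures plus one key per dimension.
    CREATE TABLE fact_sales (
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        product_key  INTEGER REFERENCES dim_product(product_key),
        date_key     INTEGER REFERENCES dim_date(date_key),
        quantity     INTEGER,
        amount       REAL
    );
    """
)

# Made-up sample rows for illustration.
conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)",
                 [(1, "Acme", "East"), (2, "Bolt", "West")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Cola", "Beverage"), (2, "Chips", "Snack")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(20240101, 2024, 1)])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 20240101, 10, 15.0),
    (1, 2, 20240101, 2, 4.0),
    (2, 1, 20240101, 5, 7.5),
    (1, 1, 20240101, 4, 6.0),
])


def sales_by_region_and_category(connection):
    """Join the fact table to two dimensions and total the sales amount."""
    return connection.execute(
        """
        SELECT c.region, p.category, SUM(f.amount) AS total_amount
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_key = f.customer_key
        JOIN dim_product  AS p ON p.product_key  = f.product_key
        GROUP BY c.region, p.category
        ORDER BY c.region, p.category
        """
    ).fetchall()


totals = sales_by_region_and_category(conn)
for row in totals:
    print(row)
```

Every reporting query has the same shape: start from the fact table, join only the dimensions needed, and group by their attributes. That uniformity is what makes retrieval simple, at the cost of the flexibility a fully normalized design would offer.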
Implementation
• Start with the Business Issues
• Project Planning/Human Resources
• Database design / data sources
• Application Development
Business Analysis
• What is the problem?
• Who owns the problem?
• Will data help solve it?
When can data be used to Predict?
[Figure: market types arranged on a grid by Coupling (low to high) and Randomness (low to high)]
• Chaotic Markets (fashion driven)
• Real-Time Markets (stock market)
• Linear Markets (local authority: number of trash cans)
• Statistical Markets (retail)
Source: www.butlergroup.com
Also read the article in Wired Magazine on data mining and terrorism.