Upload
brent-potter
View
213
Download
0
Tags:
Embed Size (px)
Citation preview
Week 6 Lecture
The Data Warehouse
Samuel Conn, Asst. Professor
2
In this lecture, you will learn: How operational data and decision
support differ What a data warehouse is and how its
data are prepared What star schemas are and how they
are constructed What steps are required to implement a
data warehouse successfully What data mining is and what role it
plays in decision support
3
External and internal forces require tactical and strategic decisions
Search for competitive advantage Business environments are dynamic Decision-making cycle time is reduced Different managers require different
decision support systems (DSS)
The Need for Data Analysis
4
Decision Support Is a methodology Extracts information from data Uses information as basis for decision
making
Decision Support Systems
5
Decision Support Systems Decision support system (DSS)
Arrangement of computerized tools Used to assist managerial decision Extensive data “massaging” to produce
information Used at all levels in organization Tailored to focus on specific areas and needs Interactive Provides ad hoc query tools
6
DSS Components
Figure 13.1
7
Operational data Relational, normalized database Optimized to support transactions Real time updates
DSS Snapshot of operational data Summarized Large amounts of data
Data analyst viewpoint Timespan Granularity Dimensionality
Operational vs. Decision Support Data
8
Database schema Support complex (non-normalized) data Extract multidimensional time slices
Data extraction and filtering End-user analytical interface Database size
Very large databases (VLDBs) Contains redundant and duplicated data
The DSS Database Requirements
9
Integrated Centralized Holds data retrieved from entire organization
Subject-Oriented Optimized to give answers to diverse questions Used by all functional areas
Time Variant Flow of data through time Projected data
Non-Volatile Data never removed Always growing
Data Warehouse
10
Creating a Data Warehouse
Figure 13.3
11
Single-subject data warehouse subset
Decision support to small group Can be test for exploring potential
benefits of Data warehouses Address local or departmental
problems
Data Marts
12
DSS Architectural Styles
Traditional mainframe-based OLTP Managerial information system
(MIS) with 3GL First-generation departmental DSS First-generation enterprise data
warehouse using RDMS Second-generation data
warehouse using MDBMS
13
Advanced data analysis environment Supports decision making, business
modeling, and operations research activities
Characteristics of OLAP Use multidimensional data analysis
techniques Provide advanced database support Provide easy-to-use end-user interfaces Support client/server architecture
Online Analytical Processing (OLAP)
14
OLAP Client/Server Architecture
Figure 13.6
15
OLAP Server Arrangement
Figure 13.7
16
OLAP Server with Multidimensional Data Store Arrangement
Figure 13.8
17
OLAP Server with Local Mini-Data-Marts
Figure 13.9
18
OLAP functionality Uses relational DB query tools Extensions to RDBMS
Multidimensional data schema support Data access language and query
performance optimized for multidimensional data
Support for very large databases (VLDBs)
Relational OLAP (ROLAP)
19
Typical ROLAP Client/Server Architecture
Figure 13.10
20
OLAP functionality to multidimensional databases (MDBMS)
Stored data in multidimensional data cube
N-dimensional cubes called hypercubes
Cube cache memory speeds processing
Affected by how the database system handles density of data cube called sparsity
Multidimensional OLAP (MOLAP)
21
MOLAP Client/Server Architecture
Figure 13.11
22
Data-modeling technique Maps multidimensional decision support into
relational database Yield model for multidimensional data
analysis while preserving relational structure of operational DB
Four Components: Facts Dimensions Attributes Attribute hierarchies
Star Schema
23
Simple Star Schema
Figure 13.12
24
Slice and Dice View of Sales
Figure 13.14
25
Facts and dimensions represented by physical tables in data warehouse DB Fact table related to each dimension table (M:1) Fact and dimension tables related by foreign keys Subject to the primary/foreign key constraints
Star Schema Representation
26
Star Schema for Sales
Figure 13.17
27
Normalization of dimensional tables Multiple fact tables representing different aggregation levels
Denormalization of the fact tables Table partitioning and replication
Performance-Improving Techniques for Star Schema
28
Data Warehouse Implementation Road Map
Figure 13.21
29
Seeks to discover unknown data characteristics
Automatically searches data for anomalies and relationships
Data mining tools Analyze data Uncover problems or opportunities Form computer models based on findings Predict business behavior with models Require minimal end-user intervention
Data Mining
30
Extraction of Knowledge from Data
Figure 13.22
31
Data Mining Process
Figure 13.23