DW-1: Introduction to Data Warehousing

Overview

What is Database

What Is Data Warehousing

Data Marts and Data Warehouses

The Data Warehousing Process

Data in a Data Warehouse

What Is Database

Before

Program = Algorithm + Data Structure

Now

Application (Weblication) = Visual I/F + SQL Query + Database

A database is integrated data

consolidated from the data of multiple file systems, for OLTP

Data Base (from "Air Base"?), DB; Korean: 데이타베이스, 자료기지 (North Korea)

Database and Data Model

Computer Representation of Data for efficient understanding and processing

Data Model based on Relationship modeling

Relationship between records: one-to-one (1:1), one-to-many (1:N), many-to-many (N:M)

Hierarchical model: hierarchical relationship, 1:N

Network model: network-like relationship, N:M

Relational model: uses relations (tables) to represent relationships

Object-oriented data model: complex object modeling with SET type, Reference, List
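As a minimal sketch of how the relational model can express these cardinalities, the following uses Python's built-in sqlite3 module. The table and column names (customer, orders, product, order_item) are hypothetical and do not come from the slides: a foreign key on the "many" side gives 1:N, and a junction table gives N:M.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
-- 1:N: one customer has many orders (foreign key on the "many" side)
CREATE TABLE orders (id INTEGER PRIMARY KEY,
                     customer_id INTEGER REFERENCES customer(id));
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
-- N:M: an order holds many products and a product appears in many orders,
-- so the relationship gets its own junction table
CREATE TABLE order_item (order_id INTEGER REFERENCES orders(id),
                         product_id INTEGER REFERENCES product(id),
                         PRIMARY KEY (order_id, product_id));
INSERT INTO customer VALUES (1, 'Kim');
INSERT INTO orders VALUES (10, 1), (11, 1);
INSERT INTO product VALUES (100, 'pen'), (101, 'ink');
INSERT INTO order_item VALUES (10, 100), (10, 101), (11, 100);
""")

# 1:N lookup: all orders that belong to customer 1
orders_per_customer = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id = 1").fetchone()[0]
# N:M lookup: all orders that contain product 100
orders_with_pen = conn.execute(
    "SELECT COUNT(*) FROM order_item WHERE product_id = 100").fetchone()[0]
print(orders_per_customer, orders_with_pen)
```

A 1:1 relationship could be sketched the same way by adding a UNIQUE constraint to the foreign-key column.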

What Is Data Warehousing

Defining Data Warehousing

Operational Systems: A Transactional Solution

Analytical Systems: A Data Warehousing Solution

Comparing Transactional and Data Warehousing Solutions

Defining Data Warehousing

Business Intelligence

Database Marketing: Personalized Product

Especially S/W, Cocoon business etc.

Electronic Commerce

Data Warehouse: a data storehouse (Korean: 자료 창고) for OLAP, Data Mining, DSS

Knowledge Management

Data Warehousing: the process of building a Data Warehouse

Defining Data Warehousing

A Data Warehouse Is a Database That Contains:

Enterprise data

Integrated sets of historical data

Subject-oriented, consolidated, consistent data

Data structured for distribution and querying

A Data Warehousing Solution Is a Process That:

Retrieves and transforms data

Manages the database

Uses tools for building and managing the data warehouse

Operational Systems: A Transactional Solution

Track Individual Events

Used for Real-time Data Entry and Editing

Examples:

Order-tracking applications

Customer service applications

Point-of-sale applications

Service-based sales applications

Banking functions

Analytical Systems: A Data Warehousing Solution

Assist with Strategic Decision Support

Provide Different Levels of Analysis

Allow Users to Navigate to Different Levels of Data

Allow System Searches to Find New Relationships

Examples:

Spreadsheet-based applications

Sales forecasting applications

Comparing Transactional and Data Warehousing Solutions

                    Transactional solutions    Data warehousing solutions
Update frequency    Real-time                  Periodically
Structured for      Data integrity             Ease in querying
Optimized for       Transaction performance    Query performance
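The contrast can be sketched in a few lines, assuming two in-memory SQLite databases stand in for the operational system and the warehouse. The names sale, daily_sales, record_sale, and periodic_load are hypothetical. The transactional side writes one event at a time; the warehouse side is refreshed periodically with data already summarized for querying.

```python
import sqlite3

oltp = sqlite3.connect(":memory:")
oltp.execute("CREATE TABLE sale (id INTEGER PRIMARY KEY, day TEXT, "
             "store TEXT, amount REAL)")

def record_sale(day, store, amount):
    """Transactional side: one small write per business event."""
    with oltp:  # commits on success, rolls back on error
        oltp.execute("INSERT INTO sale (day, store, amount) VALUES (?, ?, ?)",
                     (day, store, amount))

for event in [("1999-05-01", "Miami", 20.0),
              ("1999-05-01", "Miami", 30.0),
              ("1999-05-02", "Tampa", 15.0)]:
    record_sale(*event)

dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE daily_sales (day TEXT, store TEXT, total REAL)")

def periodic_load():
    """Warehouse side: a periodic batch that summarizes events for querying."""
    rows = oltp.execute("SELECT day, store, SUM(amount) FROM sale "
                        "GROUP BY day, store ORDER BY day, store").fetchall()
    with dw:
        dw.execute("DELETE FROM daily_sales")  # full refresh, for simplicity
        dw.executemany("INSERT INTO daily_sales VALUES (?, ?, ?)", rows)

periodic_load()
summary = dw.execute("SELECT * FROM daily_sales ORDER BY day, store").fetchall()
print(summary)
```

The full refresh in periodic_load is only the simplest choice for a sketch; a real load would more likely be incremental.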

Data Marts and Data Warehouses

What Is a Data Mart

Moving Data from a Data Warehouse to Data Marts

Moving Data from Data Marts to a Data Warehouse

What Is a Data Mart

A subset of a data warehouse

Used in an enterprise

Specific to a particular subject or business activity

Why Build Data Marts

Faster queries and fewer users

Faster deployment time

Integrated Data Marts

Ensure consistent data

Require advance planning

Moving Data From a Data Warehouse to Data Marts

Advantages

Shared fields

Common source

Distributed processing

Disadvantages

Longer time to develop

[Diagram: Source 1, Source 2, Source 3 -> Data Warehouse -> Customer Service Mart, Sales Mart, Financial Mart]
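A minimal sketch of this top-down flow, assuming the warehouse is a list of records that each carry a subject tag. The record layout and the build_mart helper are hypothetical. Each mart is a subject-specific subset of the same source, so fields such as customer_id are shared across marts.

```python
# Hypothetical warehouse rows: every record carries a subject tag.
warehouse = [
    {"subject": "sales", "customer_id": 1, "amount": 120.0},
    {"subject": "sales", "customer_id": 2, "amount": 80.0},
    {"subject": "finance", "customer_id": 1, "invoice_total": 120.0},
    {"subject": "customer_service", "customer_id": 2, "tickets": 3},
]

def build_mart(rows, subject):
    """Copy the subset of warehouse rows that belongs to one subject."""
    return [dict(r) for r in rows if r["subject"] == subject]

sales_mart = build_mart(warehouse, "sales")
finance_mart = build_mart(warehouse, "finance")

# Shared field from the common source: customer_id means the same in both marts.
shared = ({r["customer_id"] for r in sales_mart}
          & {r["customer_id"] for r in finance_mart})
print(len(sales_mart), shared)
```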

Moving Data from Data Marts to a Data Warehouse

Advantages

Simpler and faster to implement

Department-specific data

Smaller hardware requirements

Disadvantages

Data duplication

Incompatible data marts

[Diagram: Source 1, Source 2, Source 3 -> Sales Mart, Financial Mart, Customer Service Mart -> Data Warehouse]
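The two disadvantages can be illustrated with a small sketch, assuming two departments built their marts independently. The field names, key formats, and helper functions are all hypothetical. The marts use incompatible customer keys, so each needs its own normalization step before merging, and the same customer turns up in both.

```python
# Hypothetical marts built independently by two departments.
sales_mart = [{"cust": "C1", "name": "Kim"}, {"cust": "C2", "name": "Lee"}]
finance_mart = [{"customer_no": 1, "name": "Kim"},
                {"customer_no": 3, "name": "Park"}]

def normalize_sales(row):
    # Sales keys look like "C1"; strip the prefix to get an integer key.
    return {"customer_id": int(row["cust"][1:]), "name": row["name"]}

def normalize_finance(row):
    # Finance already uses integer keys, under a different field name.
    return {"customer_id": row["customer_no"], "name": row["name"]}

def merge_marts(*normalized_marts):
    """Merge marts into one warehouse table, keeping one row per customer."""
    warehouse, duplicates = {}, 0
    for mart in normalized_marts:
        for row in mart:
            if row["customer_id"] in warehouse:
                duplicates += 1  # same customer stored in more than one mart
            else:
                warehouse[row["customer_id"]] = row
    return warehouse, duplicates

warehouse, duplicates = merge_marts(
    [normalize_sales(r) for r in sales_mart],
    [normalize_finance(r) for r in finance_mart],
)
print(sorted(warehouse), duplicates)
```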

The Data Warehousing Process

Basic Elements of the Process

Tools to Manage the Process

Basic Elements of the Process

[Diagram: Source OLTP Systems -> Data Warehouse -> Data Marts -> Clients]

1. Retrieve Data

2. Transform Data

3. Populate the Data Warehouse

4. Populate Data Marts

5. Query the Data

Tools to Manage the Process

SQL Server

Data Transformation Services

SQL Server OLAP Services

Microsoft Repository

Microsoft English Query

PivotTable Service

ETL process

Extraction, Transformation, Loading

Extraction (Korean: 추출)

Data retrieval from existing data sources such as files, tables, etc.

Transformation (Korean: 변환)

Data modification, sorting, calculation, etc.

Loading (Korean: 적재)

Bulk or incremental loading from the operational DB

Time-consuming process: may use special H/W
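The three ETL steps can be sketched with the Python standard library alone, assuming the source is a CSV export and the target is an SQLite table. The file contents, column names, and function names are hypothetical illustrations, not part of the SQL Server tooling named in the slides.

```python
import csv
import io
import sqlite3

# Hypothetical source file exported from an operational system.
SOURCE_CSV = """city,state,units,sales
Miami,fl,2500,12850
Tampa,fl,2750,14135
Atlanta,ga,3200,16800
"""

def extract(text):
    """Extraction: read raw records from an existing source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transformation: fix types, normalize state codes, and sort."""
    cleaned = [(r["state"].upper(), r["city"], int(r["units"]), float(r["sales"]))
               for r in rows]
    return sorted(cleaned)

def load(conn, rows):
    """Loading: bulk-insert the transformed rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales "
                 "(state TEXT, city TEXT, units INTEGER, sales REAL)")
    with conn:  # one transaction for the whole batch
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute(
    "SELECT state, city, units FROM sales ORDER BY state, city").fetchall()
print(loaded)
```

Calling load again with only the rows added since the last run would be one way to approximate incremental loading.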

Data in a Data Warehouse

Data Characteristics

Example of Organizing Data

Data Characteristics

Data characteristic    Description
Consolidated           Enterprise-wide
Consistent             Within the data warehouse
Subject-oriented       Organized to user perspective
Historical             Snapshots over time
Read-only              Cannot update
Summarized             To appropriate level of detail

Example of Organizing Data

Monthly Southeast Regional Sales Report - May 1999

State                     City        Units Sold    Sales $
FL                        Miami            2,500    $12,850
FL                        Tampa            2,750    $14,135
FL Totals                                  5,250    $26,985
GA                        Atlanta          3,200    $16,800
GA                        Savannah         1,725    $ 9,143
GA Totals                                  4,925    $25,943
SC                        Columbia         1,900    $ 9,595
SC Totals                                  1,900    $ 9,595
Southeast Region Total                    12,075    $62,473
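The subtotals in this report are the kind of summarization a warehouse precomputes. As a rough sketch, the state and region totals can be derived from the five city rows as follows; the variable names are hypothetical. The city rows as transcribed sum to $62,523 in sales, while the report prints $62,473, and the sketch does not try to decide which figure is the typo.

```python
from collections import defaultdict

# City-level rows transcribed from the report: (state, city, units, sales $)
rows = [
    ("FL", "Miami", 2500, 12850),
    ("FL", "Tampa", 2750, 14135),
    ("GA", "Atlanta", 3200, 16800),
    ("GA", "Savannah", 1725, 9143),
    ("SC", "Columbia", 1900, 9595),
]

# Accumulate [units, sales] per state.
state_totals = defaultdict(lambda: [0, 0])
for state, _city, units, sales in rows:
    state_totals[state][0] += units
    state_totals[state][1] += sales

# The region total is the sum of the state totals.
region_units = sum(units for units, _ in state_totals.values())
region_sales = sum(sales for _, sales in state_totals.values())

for state, (units, sales) in sorted(state_totals.items()):
    print(f"{state} Totals  {units:>6,}  ${sales:>7,}")
print(f"Southeast Region Total  {region_units:,}  ${region_sales:,}")
```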

Data Warehouse Schema Example: Star schema

One fact table with four dimension tables: Sales_Fact with Product, Store, Time, Customer

An Example of Cube Browsing
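A star schema with these five tables might look like the following sketch in SQLite. Only the table names come from the slide; every column name and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Four dimension tables (columns are hypothetical)
CREATE TABLE Product  (product_id  INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE Store    (store_id    INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE Time     (time_id     INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
-- The fact table holds the measures plus one foreign key per dimension.
CREATE TABLE Sales_Fact (
    product_id  INTEGER REFERENCES Product(product_id),
    store_id    INTEGER REFERENCES Store(store_id),
    time_id     INTEGER REFERENCES Time(time_id),
    customer_id INTEGER REFERENCES Customer(customer_id),
    units_sold  INTEGER,
    sales       REAL
);
INSERT INTO Product  VALUES (1, 'pen'), (2, 'ink');
INSERT INTO Store    VALUES (1, 'Miami'), (2, 'Tampa');
INSERT INTO Time     VALUES (1, '1999-05');
INSERT INTO Customer VALUES (1, 'Kim');
INSERT INTO Sales_Fact VALUES (1, 1, 1, 1, 10, 50.0),
                              (2, 1, 1, 1,  4, 40.0),
                              (1, 2, 1, 1,  6, 30.0);
""")

# A typical star join: filter on one dimension, group by another.
sales_by_city = conn.execute("""
    SELECT s.city, SUM(f.units_sold), SUM(f.sales)
    FROM Sales_Fact AS f
    JOIN Store AS s ON s.store_id = f.store_id
    JOIN Time  AS t ON t.time_id  = f.time_id
    WHERE t.month = '1999-05'
    GROUP BY s.city
    ORDER BY s.city
""").fetchall()
print(sales_by_city)
```

Every query follows the same shape: join the fact table to whichever dimensions are needed, then filter and group on dimension attributes.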

Drilling Down

Drilling Down to products

Drilling Down to the lowest level of Customer Dimension

Rolling up
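Drilling down and rolling up are the same aggregation run at different levels of a dimension hierarchy. A minimal sketch, assuming a two-level Product hierarchy (category -> product) and hypothetical fact rows:

```python
from collections import defaultdict

# Hypothetical fact rows: (category, product, units). The Product dimension
# is assumed to have two levels: category -> product.
facts = [
    ("Stationery", "pen", 10),
    ("Stationery", "ink", 4),
    ("Stationery", "pen", 6),
    ("Snacks", "chips", 5),
]

LEVELS = {"category": 1, "product": 2}  # number of leading columns to keep

def aggregate(rows, level):
    """Sum units grouped by the dimension columns down to `level`."""
    depth = LEVELS[level]
    totals = defaultdict(int)
    for row in rows:
        totals[row[:depth]] += row[-1]
    return dict(totals)

by_category = aggregate(facts, "category")   # rolled-up view
by_product = aggregate(facts, "product")     # drilled-down view
print(by_category)
print(by_product)
```

Moving from by_category to by_product is a drill-down; the reverse direction is a roll-up.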

Review

What Is Data Warehousing

Data Marts and Data Warehouses

The Data Warehousing Process

Data in a Data Warehouse

Will data warehouses become more popular than databases?