
Database Processing for Business Intelligence Systems

Chapter Eight

DAVID M. KROENKE and DAVID J. AUER

DATABASE CONCEPTS, 4th Edition

Chapter Objectives

• Learn the basic concepts of data warehouses and data marts

• Learn the basic concepts of dimensional databases

• Learn the basic concepts of business intelligence (BI) systems

• Learn the basic concepts of OLAP and data mining

KROENKE and AUER - DATABASE CONCEPTS (4th Edition) © 2010, 2008 Pearson Prentice Hall 8-2

Heather Sweeney Designs: Database Design


Heather Sweeney Designs: HSD Database Diagram in SQL Server 2005


Business Intelligence Systems

• Business intelligence (BI) systems are information systems that
  – Assist managers and other professionals in the analysis of current and past activities and in the prediction of future events
  – Do not support operational activities, such as the recording and processing of orders
    • These are supported by transaction processing systems
  – Support management assessment, analysis, planning and control

• BI systems fall into two broad categories
  – Reporting systems that sort, filter, group, and make elementary calculations on operational data
  – Data mining applications that perform sophisticated analyses on data, analyses that usually involve complex statistical and mathematical processing


The Relationship Among Operational and BI Applications


Characteristics of Business Intelligence Applications


Characteristics of a Data Warehouse


Problems with Operational Data

• “Dirty Data”
  – Example: “G” for Gender
  – Example: “213” for Age

• Missing Values

• Inconsistent Data
  – Example: data that have changed, such as a customer’s phone number
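The first two problems above, dirty data and missing values, can often be caught with simple validation rules before the data reach a warehouse. The sketch below is only an illustration: the field names, the set of accepted gender codes, and the age limit of 120 are assumptions, not values from the text.

```python
# Hypothetical validation rules; the codes and limit are assumptions.
VALID_GENDERS = {"F", "M"}
MAX_PLAUSIBLE_AGE = 120


def find_problems(record):
    """Return a list of data-quality problems found in one record."""
    problems = []

    gender = record.get("gender")
    if gender is None or gender == "":
        problems.append("missing gender")
    elif gender not in VALID_GENDERS:
        problems.append(f"dirty gender: {gender!r}")

    age = record.get("age")
    if age is None:
        problems.append("missing age")
    elif not isinstance(age, int) or not 0 <= age <= MAX_PLAUSIBLE_AGE:
        problems.append(f"dirty age: {age}")

    return problems
```

A record such as `{"gender": "G", "age": 213}` fails both checks, while a record with no `age` key is reported as a missing value rather than a dirty one.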


Problems with Operational Data (Continued)

• Nonintegrated Data
  – Example: data from two or more sources that need to be combined

• Incorrect Format
  – Example: time data in hours when needed in minutes

• Too Much Data
  – Example: an excess number of columns

ETL Data Transformation

• Data may need to be transformed for use in a data warehouse
  – Example: {CountryCode → CountryName}
    • “US” → “United States”
  – Example: Email address to Email domain
    • [email protected] → “somewhere.com”
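The two transformations on this slide can be sketched as small lookup and string functions. This is a minimal illustration, not a full ETL tool: the `COUNTRY_NAMES` table holds only the one pair given in the slide, and the function names and the "Unknown" fallback are assumptions.

```python
# Lookup table for the CountryCode -> CountryName transformation.
# Only the pair from the slide is included; a real table would be larger.
COUNTRY_NAMES = {"US": "United States"}


def country_name(code):
    """Translate a country code to its name, or 'Unknown' if not listed."""
    return COUNTRY_NAMES.get(code.strip().upper(), "Unknown")


def email_domain(address):
    """Return the domain part of an email address, or None if there is no '@'."""
    _, separator, domain = address.rpartition("@")
    return domain.lower() if separator else None
```

For an address whose part after the "@" is `somewhere.com`, `email_domain` returns `"somewhere.com"`; input without an "@" yields `None` so that it can be flagged instead of loaded.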


Characteristics of a Data Mart


Enterprise Data Warehouse (EDW) Architecture

• Combines the data warehouse structure and the data mart structures shown above

• Expensive to create, staff and operate

• Smaller organizations use subsets of the EDW architecture


Dimensional Databases

• A non-normalized database structure used for data warehouses

• May use slowly changing dimensions
  – Values change infrequently
    • Phone Number
    • Address

• Use a Date or Time dimension
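A dimensional database of this kind can be sketched with Python's built-in `sqlite3` module: one fact table holds the measures, and each row points to a date dimension and a customer dimension. The table names, column names, and sample rows below are hypothetical and are not taken from the HSD-DW schema.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE date_dim (
            date_id   INTEGER PRIMARY KEY,
            full_date TEXT, year INTEGER, quarter INTEGER);
        CREATE TABLE customer_dim (
            customer_id INTEGER PRIMARY KEY,
            name TEXT, phone TEXT);   -- phone is a slowly changing value
        CREATE TABLE sales_fact (
            date_id     INTEGER REFERENCES date_dim(date_id),
            customer_id INTEGER REFERENCES customer_dim(customer_id),
            quantity INTEGER, total REAL);
    """)
    conn.executemany("INSERT INTO date_dim VALUES (?, ?, ?, ?)",
                     [(1, "2009-01-15", 2009, 1), (2, "2009-04-20", 2009, 2)])
    conn.executemany("INSERT INTO customer_dim VALUES (?, ?, ?)",
                     [(1, "Customer A", "555-0100"), (2, "Customer B", "555-0101")])
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
                     [(1, 1, 2, 50.0), (1, 2, 1, 20.0), (2, 1, 3, 75.0)])
    return conn


def sales_by_quarter(conn):
    """Sum the fact-table measure, grouped by attributes of the date dimension."""
    return conn.execute("""
        SELECT d.year, d.quarter, SUM(f.total)
        FROM sales_fact AS f
        JOIN date_dim AS d ON f.date_id = d.date_id
        GROUP BY d.year, d.quarter
        ORDER BY d.year, d.quarter
    """).fetchall()
```

Calling `sales_by_quarter(build_star_schema())` totals the sample sales per quarter. Storing year and quarter directly in the date dimension, rather than normalizing them away, is what makes such grouping queries short.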


Star Schema


HSD-DW Star Schema


Two-Dimensional Matrix


Three-Dimensional Matrix


Conformed Dimensions and the Extended HSD-DW Schema


Reporting Systems


Reporting Systems: RFM Analysis

• RFM Analysis analyzes and ranks customers according to purchasing patterns:
  – R = Recent (most recent order)
  – F = Frequent (how often an order is made)
  – M = Money (dollar amount of orders)

• Customers are sorted into five groups, each containing 20% of the customers.

• Each group is given a numerical value:
  – 1 = Top 20%
  – 2, 3, 4 = Each 20% in between top and bottom 20%
  – 5 = Bottom 20%
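The scoring rule above can be sketched in a few lines. This is one possible reading, under stated assumptions: each customer is described by a hypothetical tuple of (days since last order, number of orders, total dollar amount), ties are broken by input order, and when the customer count is not a multiple of five the groups are only approximately 20% each.

```python
def quintile_scores(values, higher_is_better=True):
    """Map each customer to a score from 1 (top 20%) to 5 (bottom 20%)."""
    ranked = sorted(values, key=values.get, reverse=higher_is_better)
    count = len(ranked)
    return {customer: (position * 5) // count + 1
            for position, customer in enumerate(ranked)}


def rfm_scores(customers):
    """Return {customer: (R, F, M)} from {customer: (days, orders, dollars)}."""
    # Fewer days since the last order means a more recent, better customer.
    recency = quintile_scores({c: v[0] for c, v in customers.items()},
                              higher_is_better=False)
    frequency = quintile_scores({c: v[1] for c, v in customers.items()})
    money = quintile_scores({c: v[2] for c, v in customers.items()})
    return {c: (recency[c], frequency[c], money[c]) for c in customers}
```

A customer who ordered very recently and spends heavily but rarely orders would score something like (1, 5, 1), which is the kind of pattern an RFM report is meant to surface.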


The RFM Score Report


Reporting Systems: Report Characteristics


Reporting Systems: Report System Functions

• Report Authoring:
  – Connect to data sources
  – Create the report structure
  – Format the report

• Report Management:
  – Defines who receives what reports, when, and by what means

• Report Delivery:
  – Push reports or allow them to be pulled


OLAP and Data Mining

• OnLine Analytical Processing (OLAP) is a technique for dynamically examining database data
  – OLAP uses arithmetic functions such as Sum and Average

• Data Mining is a mathematically sophisticated technique for analyzing database data
  – Data mining uses mathematical and statistical techniques


OLAP

• OLAP systems produce an OLAP report, also known as an OLAP cube

• The OLAP report uses inputs called dimensions

• The OLAP report calculates outputs called measures

• Excel PivotTables can be used to create OLAP reports
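The relationship between dimensions and measures can be illustrated without Excel. In the sketch below, two dimensions select the row and column of each cell and the measure is summed into that cell, much as a PivotTable would do. The function name and the sample sales rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales rows; "product" and "state" act as dimensions,
# "total" is the measure.
SALES = [
    {"product": "Book", "state": "WA", "total": 25.0},
    {"product": "Book", "state": "OR", "total": 10.0},
    {"product": "Video", "state": "WA", "total": 15.0},
    {"product": "Book", "state": "WA", "total": 5.0},
]


def olap_report(rows, row_dim, col_dim, measure):
    """Sum `measure` for every (row_dim, col_dim) combination."""
    report = defaultdict(lambda: defaultdict(float))
    for row in rows:
        report[row[row_dim]][row[col_dim]] += row[measure]
    # Convert nested defaultdicts to plain dicts for easier display.
    return {key: dict(columns) for key, columns in report.items()}
```

Swapping the `row_dim` and `col_dim` arguments pivots the report, which is the dynamic re-arrangement of dimensions that OLAP tools offer interactively.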


Excel PivotTable OLAP Report I


Excel PivotTable OLAP Report II


Excel PivotTable OLAP Report III


Data Mining Applications: The Convergence of the Disciplines


Data Mining Applications: Popular Data Mining Techniques


• Cluster analysis – Identifies groups of entities that have similar characteristics

• Decision tree analysis – Classifies entities into groups based on past history

• Logistic regression – Produces equations that offer probabilities that certain events will occur

• Neural Networks – Complex statistical prediction techniques

• Market Basket Analysis – Determines patterns of associated buying behavior
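As an illustration of the last technique in the list, market basket analysis is commonly summarized with three ratios: support, confidence, and lift. The slide does not name these measures, so the sketch below uses the standard definitions as an assumption, and the function names and sample baskets in the usage note are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    wanted = set(items)
    hits = sum(1 for basket in transactions if wanted <= set(basket))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of buying `consequent` given `antecedent`."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # antecedent never occurs, so no rule can be inferred
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


def lift(transactions, antecedent, consequent):
    """Confidence relative to how often `consequent` is bought anyway."""
    base = support(transactions, consequent)
    if base == 0:
        return 0.0
    return confidence(transactions, antecedent, consequent) / base
```

With four baskets `{"mask", "fins"}`, `{"mask", "fins", "tank"}`, `{"mask"}`, and `{"tank"}`, buying a mask appears in three of four baskets and fins accompany it in two of those three. A lift above 1 suggests the two items are bought together more often than chance alone would predict.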

Data Mining Applications: Cluster Analysis I


Data Mining Applications: Cluster Analysis II


Data Mining Applications: Cluster Analysis III


Data Mining Applications: Market Basket Analysis


Database Processing for Business Intelligence Systems

End of Presentation on Chapter Eight

DAVID M. KROENKE and DAVID J. AUER

DATABASE CONCEPTS, 4th Edition


All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.

Copyright © 2010 Pearson Education, Inc.  Publishing as Prentice Hall