Business Intelligence: a primer
Rev. April 2012
• Introduction & overview
• The paradigm of BI systems
• Platforms
• Appendix
• Review questions
Our approach to BI
[Figure: ES taxonomy, BI architecture, and BI systems modelling; labels shown: Plan, Exec, Mon, Ctl, Info, Dash, Rep, DSS]
Enterprise Information Modeling: SIRE
KPI grid by performance area and stakeholder
• Flexibility & speed
– Manager: 1. Process duration; 2. Activity timeliness; 3. Resource flexibility
– Customer: 1. Response time; 2. Response timeliness; 3. Vendor flexibility
– Worker: 1. Technology response time; 2. Technology timeliness; 3. Activity & technology flexibility
• Cost
– Manager: 1. Production unit cost; 2. Productivity; 3. Usage / workload
– Customer: 1. Customer access / acquisition unit cost; 2. Customer use cost
– Worker: 1. Execution unit cost; 2. Preparation effort
• Quality & satisfaction
– Manager: 1. Spec conformity of service and products; 2. Technology dependability
– Customer: 1. Expectation conformity of the service; 2. Service dependability; 3. Customer satisfaction
– Worker: 1. Expectation conformity of work / work environment; 2. Technology dependability; 3. Employee satisfaction
KPI identification / mapping: HIGO
GUI modeling: GOA
Analytic information modeling: DFM
Levels: Aggregate Strategic Level (ASL), Rich Semantic Level (RSL), Software Engineering Interface (SEI), Implementation Level
Business Intelligence: the role within Enterprise Systems
• Operations support
– Front-end systems (support the life cycle of customers and end products)
– Back-end systems (support the cycle of production and delivery)
– Administrative systems (Finance, HR, etc.)
• Management support
– Management Information Systems [Planning & Management Control + Business Intelligence]
Acronyms
• ABC: Activity-Based Costing
• ABM: Activity-Based Management
• BI: Business Intelligence
• BW: Business Warehouse (synonym of DW)
• BSC: Balanced Scorecard
• CPM: Corporate Performance Management (synonym of SEM)
• CRM: Customer Relationship Management
• CSF: Critical Success Factor
• DBMS: Database Management System
• DSS: Decision Support System
• DW: Data Warehouse
• EIS: Executive Information System
• EPM: Enterprise Performance Management (synonym of SEM)
• ERP: Enterprise Resource Planning
• ERM: Enterprise Resource Management
• ES: Enterprise System
• KPI: Key Performance Indicator
• MBO: Management By Objectives
• MRP: Manufacturing Resource Planning
• ODS: Operational Data Store
• OLAP: On-Line Analytical Processing
• OLTP: On-Line Transaction Processing
• SCM: Supply Chain Management
• SEM: Strategic Enterprise Management
Characteristics of Analytic & Management Information
• Information is
– Periodic
– The output of computations or aggregations
– A reflection of objectives or of actual data
• E.g. the P&L data of an imaginary car company come from different transaction processing systems
– Sales
– Purchasing
– Accounting
– Etc.
• Therefore, the design of BI / MIS
– Is top-down
– First defines the target data, i.e. the variables that BI should process
– Identifies the corresponding source data
– Defines the process that extracts source data and transforms them into target data
The 4-layer paradigm of BI / MIS systems
[Figure: data entry feeds the transaction databases; extraction, transformation, and loading fill the data warehouse and the data marts, which serve decision support engines (DSS), the presentation / reporting engine (EIS, reporting), and mining & other application engines]
• BI/MIS applications are based on 4 layers
• Layer 1 contains source data, typically stored in transaction databases
• Layer 2 extracts information and transforms source data into multi-key, time-dependent data
• Layer 3 stores the transformed information
• Layer 4 processes the transformed information for various purposes
– Support decisions (DSS), e.g. define the sales budget
– Prepare reports and dashboards (Report), e.g. sales performance
– Mine stored data (Mining), e.g. identify customers who may churn
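The four layers can be sketched as a tiny Python pipeline. This is only an illustration under stated assumptions: the function names (`extract_transform`, `load`, `report`), the shop codes, and the sample rows are hypothetical and do not come from any BI platform.

```python
from collections import defaultdict

# Layer 1: source data as it might sit in a transaction database
# (hypothetical rows; amounts in currency units).
transactions = [
    {"date": "2012-01-09", "shop": "S1", "item": "Pen", "qty": 2, "amount": 7.12},
    {"date": "2012-01-09", "shop": "S1", "item": "Mat", "qty": 10, "amount": 5.50},
    {"date": "2012-01-10", "shop": "S2", "item": "Pen", "qty": 1, "amount": 3.56},
]


def extract_transform(rows):
    """Layer 2: turn source rows into multi-key, time-dependent data."""
    totals = defaultdict(lambda: {"qty": 0, "amount": 0.0})
    for row in rows:
        key = (row["date"], row["shop"], row["item"])  # time key plus other keys
        totals[key]["qty"] += row["qty"]
        totals[key]["amount"] += row["amount"]
    return totals


def load(totals):
    """Layer 3: store the transformed information (here, a plain dict)."""
    return dict(totals)


def report(warehouse, shop):
    """Layer 4: process stored information for one purpose (a sales report)."""
    total = sum(m["amount"] for (_, s, _), m in warehouse.items() if s == shop)
    return round(total, 2)


warehouse = load(extract_transform(transactions))
sales_s1 = report(warehouse, "S1")
print(sales_s1)
```

In a real system each layer would be a separate component (transaction DBMS, ETL tool, warehouse, reporting engine); the sketch only shows how data changes shape from one layer to the next.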
“Jones” case study
• CONTEXT
– The supermarket chain «Jones» includes 300 shops in 3 regions, with 60k items on sale
– A POS (Point Of Sale) system supports all the activities of each shop:
• item receiving
• storing
• scrapping
• selling
– Specifically, POS terminals record sales transactions and issue receipts
• REQUIREMENTS
– Management wants to analyze sales
– Facts: Sales
– Measures: amount, quantity, number of tickets
– Analysis dimensions
• Date
• Item
• Shop
– Time span: 24 months, rolling
Level 1 (source data) «Jones» case study
Ticket # 2002a23b11
Store # 0021MI
Item   Des   Price   Qty   Amount
#190   Pen   3560    2     7.12
#69    Mat   550     10    5.50
#90    Lib   32000   1     32.00
TOTAL                      44.62
Payment: Fidelity P.
Date: 120109
Item Master Data
• # Item
• # Store
• Description
• Price
• Quantity measure
• Stock on hand
• Stock at the beginning of the day
• Average forecast daily sales
Receipt heading
• # Store
• # Ticket
• Amount
• Payment
• Date
Receipt detail
• # Ticket
• # Item
• Amount
• Qty
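As a rough sketch of how the receipt heading and receipt detail records relate, the sample ticket above can be written as two Python record types. The class and field names are assumptions chosen to mirror the slide; the values are the ones printed on the ticket.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class ReceiptHeading:
    store: str
    ticket: str
    amount: Decimal
    payment: str
    date: str  # kept as printed on the ticket


@dataclass
class ReceiptDetail:
    ticket: str
    item: str
    amount: Decimal
    qty: int


heading = ReceiptHeading("0021MI", "2002a23b11", Decimal("44.62"), "Fidelity P.", "120109")
details = [
    ReceiptDetail("2002a23b11", "190", Decimal("7.12"), 2),
    ReceiptDetail("2002a23b11", "69", Decimal("5.50"), 10),
    ReceiptDetail("2002a23b11", "90", Decimal("32.00"), 1),
]


def detail_total(ticket, rows):
    """Sum the detail amounts that belong to one ticket."""
    return sum((r.amount for r in rows if r.ticket == ticket), Decimal("0"))


# A simple consistency check an extraction step could run on source data.
is_consistent = detail_total(heading.ticket, details) == heading.amount
print(is_consistent)
```

`Decimal` is used instead of `float` so that the detail amounts add up exactly to the heading total.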
Level 2
• Extraction includes
– Selecting source data
– Checking and cleaning source data (data cleaning or data cleansing)
– Staging the extracted data (as needed)
– Logging the extractions
• Extraction can be
– Automatic: a batch procedure that runs periodically (e.g. daily, weekly, monthly)
– Interactive: a user integrates and fixes the automatically extracted data
• ETL can use intermediate databases
– Staging Area: where extracted data are temporarily parked (e.g. the data of each individual shop)
– Operational Data Store (ODS): where granular data are stored and reconciled for future use (e.g. receipt data)
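The extraction activities listed above (select, check and clean, stage, log) can be sketched as follows. This is a minimal illustration, assuming hypothetical row fields (`shop`, `item`, `qty`) and two simple validation rules; real ETL tools offer far richer checks.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("extraction")


def extract(source_rows, known_items):
    """Select, check, and clean source rows; return (staged, rejected)."""
    staged, rejected = [], []
    for row in source_rows:
        # Check: the item must exist in master data and the quantity must be positive.
        if row.get("item") in known_items and row.get("qty", 0) > 0:
            # Clean: normalise the shop code before staging the row.
            staged.append({**row, "shop": row["shop"].strip().upper()})
        else:
            rejected.append(row)
    # Log of the extraction run.
    log.info("extracted=%d rejected=%d", len(staged), len(rejected))
    return staged, rejected


source_rows = [
    {"shop": " 0021mi ", "item": "190", "qty": 2},
    {"shop": "0021MI", "item": "999", "qty": 1},  # unknown item
    {"shop": "0021MI", "item": "69", "qty": 0},   # invalid quantity
]
staging_area, rejects = extract(source_rows, known_items={"190", "69", "90"})
```

The rejected rows are the kind of data that an interactive extraction step would let a user inspect and fix.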
Level 3
• Data are stored in the Data Warehouse and in Data Marts
• A Data Warehouse is a “subject-oriented, integrated, time-variant (temporal), non volatile collection of summary and detailed data, used to support strategic decision-making process for the enterprise” (Inmon 1996)
• A Data Mart is a smaller warehouse, often a subset or an extraction of a warehouse
• Warehouses and marts typically adopt different data schemas
Level 3 : Data Warehouse
Fact table: Key 1, Key 2, Key …, Measure 1, Measure 2, Measure …
Key table 1: Key 1, Attribute 1, Attribute 2, Attribute …
Key table 2: Key 2, Attribute 1, Attribute 2, Attribute …
Key table …: Key …, Attribute 1, Attribute 2, Attribute …
• The warehouse is typically implemented as a relational database whose schema reflects the corresponding DFM (Dimensional Fact Model)
• In relational schemas:
– Fact tables store the values of facts (measures) and are identified by multiple keys (K >= 2)
– Key tables describe the attributes of dimensions
Level 3: Data Warehouse: star schema (Jones case study)
Sales (fact table): Date#, Item#, Shop#, Sales amount, Sales qty, Number of receipts
Time (key Date#): Week-day, Flag work/holiday for local calendar, Date in Muslim calendar, Flag work/holiday for Muslim calendar
Item (key Item#): Billing-metric, Item description, Bar-code#, Package qty, Package-class, Supplier-brand, Item-class
Shop (key Shop#): Description, Shop-class, ZIP-code
• A simple implementation of the DFM is a star schema, where key tables are implemented only for the immediate keys of the fact table
• Further analysis / segmentation is obtained by queries on the attributes of the key tables
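A star schema like the Jones one can be tried out with Python's built-in `sqlite3` module. The sketch below is an assumption-laden miniature: the table names (`time_key`, `item_key`, `shop_key`, `sales`), the reduced attribute lists, the shop codes, and the sample rows are all hypothetical, and amounts are stored in cents to keep the sums exact.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE time_key (date_id TEXT PRIMARY KEY, week_day TEXT);
CREATE TABLE item_key (item_id TEXT PRIMARY KEY, description TEXT, item_class TEXT);
CREATE TABLE shop_key (shop_id TEXT PRIMARY KEY, description TEXT, shop_class TEXT);
-- Fact table: identified by the three keys, stores the measures.
CREATE TABLE sales (
    date_id TEXT REFERENCES time_key(date_id),
    item_id TEXT REFERENCES item_key(item_id),
    shop_id TEXT REFERENCES shop_key(shop_id),
    sales_amount_cents INTEGER,
    sales_qty INTEGER,
    receipts INTEGER,
    PRIMARY KEY (date_id, item_id, shop_id)
);
""")
con.executemany("INSERT INTO time_key VALUES (?, ?)",
                [("2012-01-09", "Mon"), ("2012-01-10", "Tue")])
con.executemany("INSERT INTO item_key VALUES (?, ?, ?)",
                [("190", "Pen", "Stationery"), ("69", "Mat", "Household")])
con.executemany("INSERT INTO shop_key VALUES (?, ?, ?)",
                [("S1", "City centre", "Large"), ("S2", "Suburb", "Small")])
con.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)", [
    ("2012-01-09", "190", "S1", 712, 2, 1),
    ("2012-01-09", "69", "S1", 550, 10, 1),
    ("2012-01-10", "190", "S2", 356, 1, 1),
])

# Segmentation by an attribute of a key table (shop class).
amount_by_shop_class = con.execute("""
    SELECT s.shop_class, SUM(f.sales_amount_cents)
    FROM sales f JOIN shop_key s ON s.shop_id = f.shop_id
    GROUP BY s.shop_class
    ORDER BY s.shop_class
""").fetchall()
print(amount_by_shop_class)
```

The query joins the fact table to one key table and groups by one of its attributes, which is the "query on attributes of key tables" pattern the slide describes.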
Level 3: Data Warehouse: snowflake schema (Jones case study)
• A full implementation of the DFM requirements implies a snowflake schema, with a key table for every hierarchy node
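To illustrate the difference from a star schema, the sketch below gives the Shop dimension its own parent key table for the region node of the hierarchy. It is only an example under assumptions: the `region` table, its names, and the quantities are hypothetical (the case study states only that shops are grouped into regions).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- One key table per hierarchy node: shop -> region.
CREATE TABLE region (region_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE shop (
    shop_id TEXT PRIMARY KEY,
    region_id TEXT REFERENCES region(region_id)
);
CREATE TABLE sales (
    shop_id TEXT REFERENCES shop(shop_id),
    sales_qty INTEGER
);
""")
con.executemany("INSERT INTO region VALUES (?, ?)", [("R1", "North"), ("R2", "South")])
con.executemany("INSERT INTO shop VALUES (?, ?)", [("S1", "R1"), ("S2", "R1"), ("S3", "R2")])
con.executemany("INSERT INTO sales VALUES (?, ?)", [("S1", 10), ("S2", 5), ("S3", 7)])

# Rolling up to region level requires walking the chain of key tables.
qty_by_region = con.execute("""
    SELECT r.name, SUM(f.sales_qty)
    FROM sales f
    JOIN shop s ON s.shop_id = f.shop_id
    JOIN region r ON r.region_id = s.region_id
    GROUP BY r.name
    ORDER BY r.name
""").fetchall()
print(qty_by_region)
```

Compared with a star schema, the region attributes are stored once per region rather than repeated on every shop row, at the price of one extra join per hierarchy level.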
Level 3: design steps
1. Source database identification
2. Target data design
3. Mapping of source data into target data
4. ETL code generation
5. Creation of the data warehouse
6. Data extraction
The process from extraction up to data warehouse creation is supported by warehouse-building tools that are incorporated in most BI platforms.
Level 3: design steps: detail
[Figure: detail of the design steps]
Level 3: Data Mart
• Data marts store frequently accessed information
• Multiple data marts can be created from the same warehouse
• Data marts are typically implemented as hypercubes (OLAP technology)
Level 3: Data Mart
[Figure: one data warehouse feeding several data marts: Shop, Marketing, Sales Analysis, Customer History, Accounting]
Level 3: Data Mart: Hyper-cube: display
[Figure: hyper-cube display, with labels Pages, Columns, and Facts]
Level 3: Data Mart : Hyper-cube : logic
• A hypercube is a matrix of tables
• A Fact (e.g. Sales) is identified in a multidimensional space whose axes are the Analysis Dimensions (e.g. Shop, Time, Item)
• A hypercube makes it possible to retrieve complex information instantly, e.g.:
– Sales in the last year (= aggregation of Time)
– by Region (= aggregation of Shops)
– by Category (= aggregation of Products)
[Figure: the fact Sales in a space with dimensions Time, Item, and Shop; one event (a cell of the cube) holds Quantity = 20, Amount = 100]
Level 3: Data Mart: Hyper-cube: logic
Shop      Item   Month   BUDGET
MB21000   0601   Jan     50
MB21000   0601   Feb     55
MB21000   0602   Jan     50
MB21000   0602   Feb     60
MB31000   0601   Jan     65
MB31000   0601   Feb     45
MB31000   0602   Jan     55
MB31000   0602   Feb     50
MB41000   0601   Jan     60
MB41000   0601   Feb     70
MB41000   0602   Jan     65
MB41000   0602   Feb     75
[Figure: the same data as a cube with axes SHOP (MB21000, MB31000, MB41000), ITEM (0601, 0602), and MONTH (Jan, Feb, Mar, Apr)]
OLAP dimensions = warehouse keys
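The budget figures from this slide can be loaded into a small in-memory cube to show how a cell is addressed by its three keys and how a roll-up aggregates over the other dimensions. The dictionary representation and the `roll_up` helper are assumptions made for illustration; OLAP engines use their own storage structures.

```python
from collections import defaultdict

# (shop, item, month, budget) rows as listed on the slide.
rows = [
    ("MB21000", "0601", "Jan", 50), ("MB21000", "0601", "Feb", 55),
    ("MB21000", "0602", "Jan", 50), ("MB21000", "0602", "Feb", 60),
    ("MB31000", "0601", "Jan", 65), ("MB31000", "0601", "Feb", 45),
    ("MB31000", "0602", "Jan", 55), ("MB31000", "0602", "Feb", 50),
    ("MB41000", "0601", "Jan", 60), ("MB41000", "0601", "Feb", 70),
    ("MB41000", "0602", "Jan", 65), ("MB41000", "0602", "Feb", 75),
]

# The cube: one cell per combination of dimension values (the warehouse keys).
cube = {(shop, item, month): budget for shop, item, month, budget in rows}


def roll_up(cube, axis):
    """Aggregate over every dimension except the one at position `axis`."""
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[key[axis]] += value
    return dict(totals)


cell = cube[("MB31000", "0602", "Jan")]  # direct access to one cell
by_shop = roll_up(cube, 0)               # budget per shop
by_month = roll_up(cube, 2)              # budget per month
print(cell, by_shop, by_month)
```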
Level 3: Data Mart : Hyper-cube : logic
• Dimensions are arranged in «aggregation hierarchies» (roll-up)
• The levels of a hierarchy are called «dimensional attributes»
• A multidimensional analysis is performed by navigating through the aggregation levels of the dimensions
Dimension hierarchy (levels: Category, Type, Product)
All Products
• House Cleaning: Washing powder, Soap
• Food: Dairy, Bread & Biscuit, Drinks
• Hardware: Tools, Nuts & bolts
Products shown: Dash, …, Palmolive, Svelto, …, Ajax
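Navigating an aggregation hierarchy amounts to repeatedly replacing each member with its parent and summing. The sketch below uses the Category / Type / Product levels of the example; the assignment of individual products to types and the sales figures are assumptions made purely for illustration.

```python
# Parent links for two hierarchy levels (product -> type is assumed).
product_type = {
    "Dash": "Washing powder",
    "Ajax": "Washing powder",
    "Palmolive": "Soap",
    "Svelto": "Soap",
}
type_category = {
    "Washing powder": "House Cleaning",
    "Soap": "House Cleaning",
    "Dairy": "Food",
    "Bread & Biscuit": "Food",
    "Drinks": "Food",
    "Tools": "Hardware",
    "Nuts & bolts": "Hardware",
}

# Hypothetical sales quantities at the lowest level (Product).
sales_by_product = {"Dash": 40, "Ajax": 10, "Palmolive": 25, "Svelto": 15}


def roll_up(values, parent_of):
    """Aggregate values one level up the hierarchy."""
    totals = {}
    for child, value in values.items():
        parent = parent_of[child]
        totals[parent] = totals.get(parent, 0) + value
    return totals


sales_by_type = roll_up(sales_by_product, product_type)
sales_by_category = roll_up(sales_by_type, type_category)
print(sales_by_type, sales_by_category)
```

Drill-down is the reverse navigation: starting from a category total and listing the values of its children at the next level down.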
Level 3: Data Mart: Hyper-cube: implementation
[Figure: the FACT table (keys Date, Item, Shop; measures Sales-amount, Sales-qty, Receipt-number) with its key tables TIME (Time key, Time attributes to be defined), ITEM (Product key, Product attributes to be defined), and SHOP (Shop key, Shop attributes to be defined), implemented as three hyper-cubes over Time, Item, and Shop: one for Sales-amount, one for Sales-qty, and one for Receipt-number]
• A sensible way to implement multidimensional information is to have one hyper-cube for each measure
• This eases arithmetic operations and keeps the hyper-cubes light
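One way to read the "one hyper-cube per measure" advice is that derived figures become cell-by-cell arithmetic between cubes that share the same keys. The sketch below assumes two hypothetical cubes stored as dictionaries keyed by (month, item, shop), with amounts in cents; the `divide` helper is illustrative only.

```python
# Two single-measure cubes sharing the same (time, item, shop) keys.
sales_amount = {
    ("2012-01", "Pen", "S1"): 712,
    ("2012-01", "Mat", "S1"): 550,
}
sales_qty = {
    ("2012-01", "Pen", "S1"): 2,
    ("2012-01", "Mat", "S1"): 10,
}


def divide(numerator, denominator):
    """Cell-by-cell division; cells with a missing or zero denominator are skipped."""
    return {
        key: numerator[key] / denominator[key]
        for key in numerator
        if denominator.get(key)
    }


# Derived measure: average price per unit, in cents.
avg_price = divide(sales_amount, sales_qty)
print(avg_price)
```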
Level 4
• Layer 4 processes information for management from various perspectives
– Define / assess decisions and programs (DSS)
– Present information with a friendly navigation that enables roll-up and drill-down (EIS & dashboard)
– Produce structured reports (reporting)
– Identify trends and patterns in stored information (mining and profiling)
Level 4: reporting
[Figure: reporting components: sources (data warehouse, data marts, databases), semantic layer, format editing, information distribution and privileges handling]
Level 4: reporting : semantic layer
• Purpose: to map data from heterogeneous sources
• Generally, the semantic layer includes a set of types, e.g.:
– Dimensions (= warehouse keys)
– Dimension attributes (= key attributes)
– Measures and Facts
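A semantic layer can be pictured as a catalogue that maps business terms to physical columns, so that a report designer never writes SQL directly. The sketch below is a toy version under stated assumptions: the catalogue entries, the `sales` / `shop` table names, and the `build_query` helper are hypothetical and do not reproduce any vendor's semantic layer.

```python
# Catalogue of business terms mapped to physical sources (all names assumed).
CATALOGUE = {
    "Shop": {"kind": "dimension", "column": "shop.shop_id"},
    "Shop class": {"kind": "attribute", "column": "shop.shop_class"},
    "Sales amount": {"kind": "measure", "column": "SUM(sales.sales_amount)"},
    "Sales qty": {"kind": "measure", "column": "SUM(sales.sales_qty)"},
}


def build_query(measure, group_by):
    """Translate two business terms into SQL over an assumed sales/shop star schema."""
    m, g = CATALOGUE[measure], CATALOGUE[group_by]
    if m["kind"] != "measure":
        raise ValueError(f"{measure!r} is not a measure")
    if g["kind"] == "measure":
        raise ValueError(f"{group_by!r} cannot be used for grouping")
    return (
        f"SELECT {g['column']}, {m['column']} "
        f"FROM sales JOIN shop ON shop.shop_id = sales.shop_id "
        f"GROUP BY {g['column']}"
    )


query = build_query("Sales amount", "Shop class")
print(query)
```

The type attached to each term (dimension, attribute, measure) is what lets the layer reject meaningless combinations, such as grouping by a measure.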
Level 4: reporting : format editing
• Includes the editing functions by which report pages are defined
• The content of the report is obtained by dragging and dropping information items from the catalogue of the semantic layer
• Further activities manage the layout of the pages
Level 4: reporting : information distribution
Level 4 : DSS
• A DSS is a computer-based application designed to support semi-structured management decisions by
– Searching and analyzing information from a collection of sources
– Computing and assessing results (e.g. sensitivity analysis)
• Typical application fields are:
– Planning
– Budgeting
– Optimization
– Funding and investment decisions
• ERP / CRM vendors offer DSS suites for corporate planning, such as Oracle’s EPM and SAP’s BO
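Sensitivity analysis, mentioned above as a typical DSS computation, can be sketched in a few lines: recompute a result while varying one input and keeping the others fixed. The profit model, its parameters, and the base values below are hypothetical and serve only to show the mechanics.

```python
def profit(units, price, unit_cost, fixed_cost):
    """A deliberately simple budget model (an assumption for illustration)."""
    return units * (price - unit_cost) - fixed_cost


base = {"units": 1000, "price": 10.0, "unit_cost": 6.0, "fixed_cost": 2500.0}


def sensitivity(model, base, parameter, pct_changes):
    """Recompute the model while varying one parameter by the given percentages."""
    results = {}
    for pct in pct_changes:
        scenario = dict(base)
        scenario[parameter] = base[parameter] * (100 + pct) / 100
        results[pct] = model(**scenario)
    return results


price_sensitivity = sensitivity(profit, base, "price", [-10, 0, 10])
for pct, value in price_sensitivity.items():
    print(f"price {pct:+d}% -> profit {value:.2f}")
```

Running the same function over other parameters (e.g. `unit_cost`) shows which inputs the result depends on most, which is the question a planner usually asks of a budget.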
Level 4 : DSS : an example (budgeting)
[Figure: budgeting DSS, in two stages: storage and calculation, then report processing. Labels shown: sales system, administrative system, revenues, costs, budgeted revenues, budgeted expenses and costs, financial data, income and balance-sheet items, calculation processes, KPI, balance sheet, income statement, cash flow]
Every month the control system produces a financial report and a report of physical performance indicators (KPIs).
The financial report and the KPI report are organized on 5 dimensions:
1. Time
2. Cost centers
3. Item
4. Sales channel
5. Activity
Sales data come from the sales system and are stored in a data mart; the same approach applies to the sales budget, actual costs, and budgeted costs.
The data marts are merged into two hyper-cubes, KPI and Financial.
On top of the hyper-cubes, software produces reports on P&L, A&L, cash flow, and KPIs.
Level 4 : Analysis Engines
• Data mining applications for research and marketing are designed to
– Discover previously unknown relations and associations in a database (“data mining helps end user extract useful business information from large databases” (Berson 1997))
– Calculate predictive indicators, for which mining software is key in marketing, such as
• Churning
• Fraud risk
• Saving attitude
• Economic potential, etc.
• Customer profiling systems (Analytic CRM)
BI solutions are offered by all main vendors
• BI is 5-10% of the ES market
• The main vendors offer BI products & applications
– ES vendors
• Oracle: the largest DB vendor
– Products: warehousing products and applications from acquired vendors (Essbase, Hyperion)
– Applications: EPM, analogous to SAP’s SEM
• SAP: the largest ERP vendor
– Applications: Strategic Enterprise Management (SEM), to support the entire management and analysis life cycle
– Products: Crystal Reports, BusinessObjects (founder of the reporting paradigm)
• Microsoft: Office products, SQL Server family
– BI vendors
• SAS: founder of BI and the largest independent BI vendor; offers a wide range of applications by industry and business area, and specific solutions
• MicroStrategy
• Open source platforms, e.g. Pentaho
Business Intelligence Platforms : SAS
• By industry
– …
– Education
– Financial Services
– Government
– …
• By solution
– Analytics
– Business Analytics
– Business Intelligence
– Customer Intelligence
– Data Management
– Fraud & Financial Crimes
– High-Performance Analytics
– IT & CIO Enablement
– On Demand Solutions
– Performance Management
– Risk Management
– SAS® 9.3
– Supply Chain Intelligence
– Sustainability Management
• Featured solutions
– SAS® 9.3
– SAS® Clinical Data Integration
– SAS® Curriculum Pathways®
– SAS® Enterprise Guide®
– SAS® Enterprise Miner™
– SAS Fraud Framework for Government
– SAS® High-Performance Analytics
– SAS® Inventory Optimization
– SAS® OnDemand for Academics
– SAS® Social Media Analytics
– SAS® Text Analytics
– SAS® Visual Data Discovery
Data Warehouse and Data Mart vs Database
• Conceptual modeling (Rich Semantic Layer)
– Database: ERA
– Data Warehouse: DFM
– Data Mart: DFM
• Information type (Master, Event, Analysis)
– Database: Master + Event
– Data Warehouse: Analysis
– Data Mart: Analysis
• Information organization
– Database: Normalized (e.g. 3NF)
– Data Warehouse: Star or snowflake
– Data Mart: Hypercube
• Data schema
– Database: Relational
– Data Warehouse: Relational
– Data Mart: OLAP or Relational
• Processing orientation
– Database: Create + Update
– Data Warehouse: Read
– Data Mart: Read
• Typical data operations
– Database: Insert one individual record, or modify one or multiple records
– Data Warehouse: Access a vector of records; roll-up, drill-down, dice
– Data Mart: Access one or multiple vectors of records; roll-up, drill-down, dice
• Transaction example
– Database: Enter a customer order
– Data Warehouse: Segment customers in Italy with a degree of loyalty >70% by age and region
– Data Mart: Segment customers in Italy with a degree of loyalty >70% by age and region
Review questions
• Illustrate the input, process, and output of the four layers of BI systems
• What is a hyper-cube?
• What is a data mart?
• What is a data warehouse? Compare a data warehouse with a classic database in terms of
– Conceptual modeling (Rich Semantic Layer)
– Implementation (DB schema)
– Information type (Master, Event, Analysis)
– Processing orientation