42
The IT Perspective: Data Warehousing, Management, and Analytical Structures

The IT Perspective of Business Intelligence

Embed Size (px)

DESCRIPTION

The IT Perspective of Business Intelligence

Citation preview

Page 1: The IT Perspective of Business Intelligence

The IT Perspective: Data Warehousing, Management, and Analytical

Structures

Page 2: The IT Perspective of Business Intelligence

• Explain the basics of:

Master Data Management

Data Warehousing

ETL

OLAP/Multidimensional Data

Objectives

Page 3: The IT Perspective of Business Intelligence
Page 4: The IT Perspective of Business Intelligence

• Install only the ones you need

• Which?

– Integration Services • Get your data from the world outside (ETL)

– Analysis Services • Cubes, Data Mining, support for PowerPivot on SharePoint

– Reporting Services • DIY Report Builder and traditional “big” reports

– Master Data Services • Quality of critical master data (cities, colours, customers)

– Database Engine • Data warehouse and OLTP relational storage

SQL Services – Why?

Page 5: The IT Perspective of Business Intelligence

Master Data Management

Page 6: The IT Perspective of Business Intelligence

• Ensures consistency of data across all organisational uses

• Impacts overall data quality

• Processes and tools for:

– Collection, aggregation, matching, distribution, and persistence of master data • Consistently

– Related to Federated Data Management

• Key to MDM: Modelling

MDM

Page 7: The IT Perspective of Business Intelligence

Why MDM? It’s About Evolution of Enterprise Architecture

Page 8: The IT Perspective of Business Intelligence

MDM Processes

• Batched Acquisition from Staging Tables

• Members, Attributes, Parent-Child Relationships

• SQL Integration Services

Import & Integration

• Versioning Changes

• Auditing

• Compliance

• Tracking of Instances

Modeling • Subscription Views

• Export to:

• Operational Systems

• Data Warehouses

• BI Analytics

• Reporting Tools

Export & Subscription

Page 9: The IT Perspective of Business Intelligence

• Tools:

– Master Data Manager • Primary tool for managing

your master data

– MDS Configuration Manager • IT Pro tool

– MDS Web Service • For developers wanting to

extend MDS

• Concepts:

– Models

– Entities

– Attributes

– Members

– Hierarchies

– Collections

– Versions

– Database

Microsoft Master Data Services SQL 2008 R2 Enterprise, Datacenter, Developer

Page 10: The IT Perspective of Business Intelligence

• Model organises data at highest level

– Allowing versioning of changes to data

• There are typically four categories of models:

– People (Customers, Staff)

– Places (Geographies, Cities, Countries)

– Things (Products)

– Concepts (Accounts, Behaviours, Transactions)

Modelling Master Data

Page 11: The IT Perspective of Business Intelligence

Example: Product MDM Model

Product (model)

Product (entity)

Name (free-form attr)

Code (free-form attr)

Subcategory (domain-

based attr)

Name (free-form attr)

Code (free-form attr)

Category (domain-

based attr)

Name (free-form attr)

Code (free-form attr)

StandardCost (free-form

attr)

ListPrice (free-form

attr)

Photo (file attr)

Page 12: The IT Perspective of Business Intelligence

Data Warehouse

Page 13: The IT Perspective of Business Intelligence

OLE DB

ODBC

DB2 Oracle

XML

SQL Server

Analysis Services

SQL Server

Report Server Models

SQL Server

Data Mining Models

SQL Server

Integration Services

MySAP

Hyperion Essbase

SAP

NetWeaver BI SQL Server

Teradata

Rich Connectivity Data Providers

Page 14: The IT Perspective of Business Intelligence

Star Schema

Page 15: The IT Perspective of Business Intelligence

Star Schema Benefits

• Simple, not-so-normalised model

• High-performance queries

– Especially with Star Join Query Optimisation

• Mature and widely supported

• Low-maintenance

Page 16: The IT Perspective of Business Intelligence

• Define hierarchies using multiple dimension tables

• Support fact tables with varying granularity

• Simplify consolidation of heterogeneous data

Snowflake Dimension Tables

Potential for slower query performance in relational reporting

No difference in performance in Analysis Services database

Page 17: The IT Perspective of Business Intelligence

Fact Table Fundamentals

• Collection of measurements associated with a specific business process

• Specific column types

– Foreign keys to dimensions

– Measures – numeric and additive

– Metadata and lineage

• Consistent granularity – the most atomic level by which the facts can be defined

Page 18: The IT Perspective of Business Intelligence

Fact Table Examples

Day Grain

Quarter Grain

Reseller sales data by:

•Product

•Order Date

•Reseller

•Employee

•Sales Territory

Sales quota data by:

•Employee

•Time

Page 19: The IT Perspective of Business Intelligence

• Most common dimension used in analysis (aka Time dimension)

• Use consistently with all facts

• Useful common attributes – Year, Quarter, Month, Day

– Time series analysis support

– Navigation and summarisation enabled with hierarchies, such as calendar or fiscal

• Single table design (typically not snowflake design)

Date Dimension Table

Tip: Format the key of the dimension as yyyymmdd (e.g. 20100115) to make it readily understandable

Page 20: The IT Perspective of Business Intelligence

Parent-Child Hierarchy

• A dimension that contains a parent attribute

• A parent attribute describes a self-referencing relationship, or a self-join, within a dimension table

• Common examples

– Organisational charts

– General Ledger structures

– Bill of Materials

Page 21: The IT Perspective of Business Intelligence

Parent-Child Hierarchy Example

Brian

Amy

Stacia

Stephen

Shu Michael

Peter

José

Syed

Page 22: The IT Perspective of Business Intelligence

Slowly Changing Dimensions

• Maintain historical context as dimension data changes

• Three common ways (there are more):

– Type 1: Overwrite the existing dimension record

– Type 2: Insert a new ‘versioned’ dimension record

– Type 3: Track limited history with attributes

Page 23: The IT Perspective of Business Intelligence

Integration and ETL

Page 24: The IT Perspective of Business Intelligence

Let’s do ETL with SSIS

• SQL Server Integration Services (SSIS) service

• SSIS object model

• Two distinct runtime engines:

– Control flow

– Data flow

• 32-bit and 64-bit editions

Page 25: The IT Perspective of Business Intelligence

• The basic unit of work, deployment, and execution

• An organised collection of: – Connection managers

– Control flow components

– Data flow components

– Variables

– Event handlers

– Configurations

• Can be designed graphically or built programmatically

• Saved in XML format to the file system or SQL Server

The Package

Page 26: The IT Perspective of Business Intelligence

Control Flow

• Control flow is a process-oriented workflow engine

• A package contains a single control flow

• Control flow elements

– Containers

– Tasks

– Precedence constraints

– Variables

Page 27: The IT Perspective of Business Intelligence

• The Data Flow Task

– Performs traditional ETL and more

– Fast and scalable

• Data Flow Components

– Extract data from Sources

– Load data into Destinations

– Modify data with Transformations

• Service Paths

– Connect data flow components

– Create the pipeline

Data Flow

Page 28: The IT Perspective of Business Intelligence

demos Using SQL Server Integration Services

Objectives:

Gain an overview understanding of UI and approach of Integration Services

Appreciate additional capability and functionality

Page 29: The IT Perspective of Business Intelligence

demos What did we see?

How to build up SSIS tasks

How to deploy SSIS packages and deliver comprehensive ETL

Page 30: The IT Perspective of Business Intelligence

OLAP/Multidimensional Data

Page 31: The IT Perspective of Business Intelligence

• Multidimensional data

• Combination of measures and dimensions as one conceptual model

– Measures are sourced from fact tables

– Dimensions are sourced from dimension tables

Cube = Unified Dimensional Model

Page 32: The IT Perspective of Business Intelligence

• Members from tables/views in a data source view (based on a Data Warehouse)

• Contain attributes matching dimension columns

• Organise attributes as hierarchies

– One All level and one leaf level

– User hierarchies are multi-level combinations of attributes

– Can be placed in display folders

• Used for slicing and dicing by attribute

Dimensions

Page 33: The IT Perspective of Business Intelligence

• Benefits

– View of data at different levels of summarisation

– Path to drill down or drill up

• Implementation

– Denormalised star schema dimension

– Normalised snowflake dimension

– Self-referencing relationship

Hierarchies

Page 34: The IT Perspective of Business Intelligence

• Defined in Analysis Services

• Ordered collection of attributes into levels

• Navigation path through dimensional space

• Very important to get right!

Hierarchy

Customers by Geography

Country

State

City

Customer

Customers by Demographics

Marital

Gender

Customer

Page 35: The IT Perspective of Business Intelligence

• Group of measures with same dimensionality

• Analogous to a fact table

• Cube can contain more than one measure group

– E.g. Sales, Inventory, Finance

• Defined by dimension relationships

Measure Group

Page 36: The IT Perspective of Business Intelligence

Sales Inventory Finance

Customers X

Products X X

Time X X X

Promotions X

Warehouse X

Department X

Account X

Scenario X

Measure Group

Measure Group D

ime

ns

ion

Page 37: The IT Perspective of Business Intelligence

• Define interaction between dimensions and measure groups

• Relationship types

– Regular

– Reference

– Fact (Degenerate)

– Many-to-many

– Data mining

Dimension Relationships

Page 38: The IT Perspective of Business Intelligence

• Expressions evaluated at query time for values that cannot be stored in fact table

• Types of calculations

– Calculated members

– Named sets

– Scoped assignments

• Calculations are defined using MDX

Calculations

MDX = MultiDimensional EXpressions

Page 39: The IT Perspective of Business Intelligence

demos Using BIDS to Review Dimension Design

Cube Design and Functionality

Objectives:

Introduction to Business Intelligence Development Studio

Illustrate Cube concepts and design, editing and deployment of Analysis

Services

Page 40: The IT Perspective of Business Intelligence

demos What did we see?

How to design and build SSAS databases

How to deploy SSAS databases

Page 41: The IT Perspective of Business Intelligence

• As a platform for enterprise Business Intelligence you should consider four services:

– Data Warehouse (can be relational)

– Process for Data Management (MDS)

– Process for Data Integration (ETL)

– Analysis (OLAP, Data Mining, Columnar)

= SQL Server 2008 R2

Summary

Page 42: The IT Perspective of Business Intelligence

© 2010 Microsoft Corporation & Content and Code Ltd. All rights reserved. The information herein is for informational purposes only and represents the opinions and views of Content and Code. The material presented is not certain and may vary based on several factors. Microsoft makes no warranties, express, implied or statutory, as to the information in this presentation. © 2010 Microsoft Corp. Some slides contain quotations from copyrighted materials by other authors, as individually attributed or as already covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Content and Code as of the date of this presentation. Because Content and Code & Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Content and Code cannot guarantee the accuracy of any information provided after the date of this presentation. Content and Code no warranties, express, implied or statutory, as to the information in this presentation. E&OE.