8/8/2019 Business Analytics on IDS-WAIUG
1/41
Business Analytics with IDS
Fred Ho, IDS Development
Disclaimer
Copyright IBM Corporation 2009. All rights reserved. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM'S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.
IBM, the IBM logo, ibm.com, Informix, solid, DataMirror, Optim, Cognos are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries.
Other company, product, or service names may be trademarks or service marks of others.
Contents
Definition of BI/DW/BA
Types of IDS BI Users
OLTP vs. Data Warehousing
Informix Warehouse
IDS Storage Optimization
Your Feedback and Requirements
Business Intelligence
"A set of concepts and methodologies to improve decision making in business through use of facts and fact-based systems"
  Howard Dresner, The Gartner Group
"The processes, technologies, and tools needed to turn data into information, information into knowledge, and knowledge into plans that drive profitable business actions"
  David Loshin, Business Intelligence: The Savvy Manager's Guide
The foundation that enables BI is the enterprise architecture: business, data, and technology. A well-implemented data warehousing program provides much of that foundation.
Data Warehousing
"A data warehouse is a subject-oriented, integrated, non-volatile, time-variant collection of data organized to support management needs"
  W. H. Inmon
"The Data Warehouse is nothing more than the union of all the constituent data marts"
  Ralph Kimball, et al., The Data Warehouse Life Cycle Toolkit
The data warehousing process turns raw data into potentially valuable information usable by people and systems. Warehousing enhances the value of data assets by:
  Applying standards and consistency to the data
  Organizing the data into subject areas that cross business functional lines
  Integrating the data
  Enforcing data consistency over time to provide meaningful history
  Acting as a stable and reliable source
  Providing easy access to data
Business Analytics
The process of using information to enhance knowledge and apply that knowledge to help a business achieve its objectives. Analytic applications provide tools to facilitate the business analytics process.
Business Metrics and Business Management
  Business Process Management
  Business Performance Management
  Business Activity Monitoring
  Customer Relationship Management
  Supply Chain Management
Performance Dashboards for Information Delivery
  Real-time (or near real-time) monitoring
Scorecards for Information Delivery
  Monitoring history & trends
Analytic Applications for Information Delivery
  Customer Analysis, Marketplace Analysis, Sales Channel Analysis, ...
Range of Business Analytics
[Figure: four stages of business analytics plotted on axes of business value (low to high) and complexity]
Reporting: using query, reporting and search tools
Analysis: using OLAP & visualization tools
Monitoring: using dashboards & scorecards
Prediction: using predictive analysis tools
Source: TDWI
IDS in BI/Warehousing
Given the IDS characteristics of reliability, high availability, performance, and ease of use, why isn't IDS in this space?
  IDS has traditionally been viewed as an OLTP solution
  However, there are a lot more warehousing users on IDS than one realizes!
    Some customers have implemented IDS warehouses at terabyte levels
    There are a lot of features already in IDS that make it suitable for BI/Warehousing
  BI tools have become very sophisticated over the years
We recognize the need to provide better warehousing capabilities for IDS users
What's Available? IDS Warehousing Features
Performance & Scalability
  Inherent SMP multi-threading
  Parallel Database Query (PDQ)
  Light Scan for fast table scans
  Online index build
  Efficient hash joins
  Auto fragment elimination
  Memory Grant Manager (MGM)
  High Performance Loader
  Optimistic concurrency
Ease of Management
  Time-cyclic data management using range partitioning
  OPTCOMPIND optimization
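The idea behind fragment elimination with range partitioning can be sketched in a few lines. This is not IDS code: the fragment layout, the names `Fragment` and `fragments_to_scan`, and the dates are all hypothetical, and the sketch only shows the overlap test that lets a query skip fragments whose range cannot contain matching rows.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fragment:
    """One range fragment: holds rows with lo <= sale_date < hi."""
    name: str
    lo: date
    hi: date


def fragments_to_scan(fragments, query_lo, query_hi):
    """Return the fragments whose range overlaps the half-open query range."""
    return [f for f in fragments if f.lo < query_hi and query_lo < f.hi]


# Hypothetical quarterly fragments of a sales table.
FRAGMENTS = [
    Fragment("sales_2009q1", date(2009, 1, 1), date(2009, 4, 1)),
    Fragment("sales_2009q2", date(2009, 4, 1), date(2009, 7, 1)),
    Fragment("sales_2009q3", date(2009, 7, 1), date(2009, 10, 1)),
    Fragment("sales_2009q4", date(2009, 10, 1), date(2010, 1, 1)),
]

# A query restricted to May and June 2009 touches a single fragment.
needed = fragments_to_scan(FRAGMENTS, date(2009, 5, 1), date(2009, 7, 1))
print([f.name for f in needed])
```

With time-based ranges like these, the oldest period can also be retired by dropping one fragment rather than deleting rows, which is the point of the time-cyclic item above.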
BI Users Classification
1. BI on Existing OLTP Schema (Operational BI)
2. BI on Star Schema (Data Mart)
3. BI in a Mixed-Workload Environment
4. Enterprise BI
Type 1: BI/Analytics on OLTP Schema
The majority of today's IDS customers need to do BI/analytics on their existing IDS (OLTP) database.
They currently use a combination of 4GL programs, Excel, and BI tools (Business Objects, Cognos, Crystal Reports)
  Custom code and maintenance required by customer
  Performance may be acceptable even on an OLTP schema
  Allows for operational BI
OLTP vs. Data Warehousing Workload
                 OLTP                                     Data Warehousing
Transactions     Short                                    Longer
SQL              Relatively simple                        Complex, with analytics
Updates          Random                                   Sequential
Rows accessed    Few                                      Many
Response time    Sub-second                               Seconds to minutes
Modeling         ER modeling (minimizes redundancy)       Dimensional modeling (redundancy is OK)
Data             Normalized, 5NF (minimizes duplicates)   De-normalized, 3NF (duplicates are OK)
Indexes          Few (avoids index maintenance)           More indexes are OK (mostly read-only)
Queries          Pre-compiled (repeated execution)        Ad hoc (unpredictable load)
Type 2. BI/Analytics on IDS on Star Schema
Transform OLTP database into Star Schema database
Better performance for data warehousing and dimensional queries
Star Schema database may be on a separate machine/domain
Suitable for customers building a separate data mart
Use IDS as is against Star Schema
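A minimal sketch of what a star schema and a dimensional query look like, using Python's built-in SQLite as a stand-in so the example runs anywhere. Nothing here is IDS-specific: the table names (`fact_sales`, `dim_product`, `dim_date`) and the data are hypothetical, and the point is only the shape of one fact table joined to small dimension tables and then aggregated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: small, descriptive, de-normalized.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
    -- Fact table: one row per sale, keyed by the dimensions.
    CREATE TABLE fact_sales  (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        amount     REAL
    );
    INSERT INTO dim_product VALUES (1, 'widgets'), (2, 'gadgets');
    INSERT INTO dim_date    VALUES (10, 2009, 1), (11, 2009, 2);
    INSERT INTO fact_sales  VALUES (1, 10, 100.0), (1, 11, 150.0),
                                   (2, 10, 40.0),  (2, 11, 60.0);
""")

# Dimensional query: total sales by category and quarter for one year.
rows = conn.execute("""
    SELECT p.category, d.quarter, SUM(f.amount) AS total
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d    ON d.date_id = f.date_id
    WHERE d.year = 2009
    GROUP BY p.category, d.quarter
    ORDER BY p.category, d.quarter
""").fetchall()
print(rows)
```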
What's Available? BI Tools
The Performance Management Framework
Cognos identifies best-practice decision areas, or information "sweet spots", by business function:
Cognos 8 provides a comprehensive set of BI tools for:
  Reporting
  Analysis
  Dashboards
  Scorecards
Performance Management Framework for:
  Solutions for different areas of the organization
Cognos Business Intelligence and Performance Management
One Platform, One Architecture
Industry and Functional Solutions
Complete Coverage of all capabilities
Enterprise-Class SOA Platform
Data Warehouse Architecture
SQL Warehousing Tool Overview
Warehousing Process
Design Studio
Admin Console
Summary
SQL Warehousing Tools Overview
Typical process (step: role)
  Identify requirements: Data Architect
  Define data transformation (ETL/ELT) process: SQL/ETL developer
  Development of SQL/shell scripts: SQL/ETL developer
  Deployment in production system: Application Architect, DBA
  Reporting: Business user
  Refine requirements
Challenges
  Dynamic requirements
    Constant refinement
  Multiple roles, tools
    Each has a different perspective
    Communication cost / information loss
  Unreadable, hard-to-debug scripts
    Poor productivity
SQW Solution
  Data Modeling
    Physical Data Model (reverse engineering, new from scratch, generate DDL), compare & sync
  Data Flows
    Visual design
    Optimized SQL code generation
  Control flow supports programming logic
  Admin Console
    Schedule, monitor, parameterized values
  Eclipse free reporting tool, e.g. BIRT
  Reusable flows
    Easy refinement
    Copy & paste, refactor
Values
  Easy to design & reuse
    Increased productivity
  Integrated tools
    Seamless integration inside Eclipse
  Auto-generated code from visualized flows
    Optimized SQL code
  Impact analysis for any data model change
SQW Architecture
[Diagram: three stages]
DESIGN: Design Studio / Design Center (Eclipse), producing data flows + control flows
DEPLOY: deployment preparation builds a deployment package (code units, build profile, user scripts), which is deployed through the Admin Console
RUNTIME: HTTP service (WAS) hosting the SQW Runtime, with an SQW control DB (IDS), an SQW execution DB (IDS), and the warehouse DB; applications on other servers (DataStage)
Data source databases: IDS, DB2, Oracle, SQL Server
SQW: Design Studio
Design Studio
  Eclipse-based IDE
  Integrated tools, shell sharing
  Team development
    CVS, ClearCase for check-in/check-out of projects, flows
Data Warehousing Project
  Data Models
  Data Flows
  Control Flows
  Warehouse Applications (deployment packages)
  Subflow & Subprocess (reusable flow module)
  Variables
Data Source Explorer
  Database connections to multiple vendors, e.g. Informix, DB2 LUW, Oracle, SQL Server, MySQL, DB2 z/OS
DataStage Servers
  Integration with IBM DataStage
SQW: Data Modeling
Physical Data Model
  Visualized data modeling
  Impact analysis
  Reverse engineering or new from scratch
  Compare & sync
  Generate DDL
  Overview diagram
Shell sharing with Rational Data Architect & other Data Studio products
SQW: Data Flows
Data Flow Operators:
-- source & target operators (table, file)
-- SQL Transformation operators
-- Warehousing operators
[Figure: example data flow built from file source, table source, table join, aggregation, and table target operators]
SQW: Data Flows
A simple flow
Generated SQL code
-- optimization across SQL statements.
-- optimized staging strategy
-- in-database transformation
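To make "in-database transformation" concrete, here is a small sketch of the kind of flow described on these slides (file source and table source, joined, aggregated, written to a table target). It is not output from SQW: SQLite stands in for the warehouse database, and the table names, CSV contents, and staging approach are assumptions chosen for illustration. The point is that after the file is staged, the join, aggregation, and load run as one SQL statement inside the database rather than row by row in application code.

```python
import csv
import io
import sqlite3

# "File source": a small in-memory CSV standing in for a flat file.
ORDERS_CSV = "cust_id,amount\n1,10.0\n1,15.0\n2,7.5\n"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer     (cust_id INTEGER PRIMARY KEY, region TEXT);  -- table source
    CREATE TABLE stage_orders (cust_id INTEGER, amount REAL);              -- staging for the file
    CREATE TABLE region_sales (region TEXT PRIMARY KEY, total REAL);       -- table target
    INSERT INTO customer VALUES (1, 'west'), (2, 'east');
""")

# Stage the file contents in the database.
reader = csv.DictReader(io.StringIO(ORDERS_CSV))
conn.executemany(
    "INSERT INTO stage_orders VALUES (?, ?)",
    [(int(r["cust_id"]), float(r["amount"])) for r in reader],
)

# Join + aggregation + insert into the target, all in one statement.
conn.execute("""
    INSERT INTO region_sales (region, total)
    SELECT c.region, SUM(o.amount)
    FROM stage_orders o
    JOIN customer c ON c.cust_id = o.cust_id
    GROUP BY c.region
""")

result = conn.execute("SELECT region, total FROM region_sales ORDER BY region").fetchall()
print(result)
```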
SQW: Control Flows
Control flow
  Common utility operators
  Control logic, parallel execution, loop iteration
  Error handling
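The control-flow building blocks listed above (parallel execution, loop iteration, control logic, error handling) can be sketched in plain Python. This is only an analogy for what a control flow orchestrates, not how SQW implements it; `load_table`, `run_control_flow`, and the table names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

log = []


def load_table(name):
    """Placeholder for one data-flow step; one table fails to exercise error handling."""
    if name == "bad_table":
        raise RuntimeError(f"load failed for {name}")
    return f"loaded {name}"


def run_control_flow(tables):
    # Parallel execution: the independent loads run concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {name: pool.submit(load_table, name) for name in tables}

    # Loop iteration: inspect each outcome in a fixed order.
    failed = []
    for name in tables:
        try:
            log.append(futures[name].result())
        except RuntimeError as exc:
            # Error handling: record the failure and continue with the rest.
            log.append(f"error: {exc}")
            failed.append(name)

    # Control logic: branch on the overall outcome.
    log.append("notify operator" if failed else "refresh reports")
    return failed


failed = run_control_flow(["customer", "bad_table", "orders"])
print(log)
```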
SQW Overview
Design Studio
  Eclipse-based design environment
  Creates:
    Generated code
    Application package (zip file)
    Deployment profile (database connections, machine resources, variable definitions, DDL files, etc.)
Deploy
Admin Console
  Production environment in WebSphere
  Manages warehouse applications
    Schedule
    Monitor
Admin Console
Flex RIA-based Warehouse Admin Console
Admin Console manages common resources (e.g. database connections, FTP servers, DataStage servers)
Schedule & monitor warehouse processes
XPS Customers Looking to Migrate to IDS
External Tables
  XPS-style loader for easy migration
Partitioning Strategies
  Auto fragmentation
  Fragment Advisor
  Fragment stats update
  Truncate fragments
Primary Storage Manager (PSM)
  For simpler, easier management of backups (replacing ISM)
Merge
  UpSert capabilities
* Features to be included in the next release(s)
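For readers unfamiliar with the term, "UpSert" means: update the row when the key already exists, otherwise insert it. The sketch below shows those semantics with SQLite and an explicit update-then-insert helper. It does not show the IDS MERGE syntax; the `inventory` table and the `upsert` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (part TEXT PRIMARY KEY, qty INTEGER);
    INSERT INTO inventory VALUES ('bolt', 10), ('nut', 5);
""")


def upsert(conn, part, qty):
    """Update the row when the key matches, otherwise insert a new row."""
    cur = conn.execute("UPDATE inventory SET qty = ? WHERE part = ?", (qty, part))
    if cur.rowcount == 0:  # no matching key, so this is the "not matched" branch
        conn.execute("INSERT INTO inventory VALUES (?, ?)", (part, qty))


# Incoming batch: one existing key (updated) and one new key (inserted).
for part, qty in [("bolt", 12), ("washer", 30)]:
    upsert(conn, part, qty)

final = conn.execute("SELECT part, qty FROM inventory ORDER BY part").fetchall()
print(final)
```

A single MERGE statement expresses both branches at once, which is why it is useful for loading changed data into a warehouse table.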
Using MACH 11 for OLTP/Warehousing in IDS
[Diagram: two deployment options]
Use Separate Boxes: OLTP apps run against an OLTP database; SQW runs against a separate data warehouse database, fed by ETL
OR
Use MACH 11: on a blade server with shared disk, a Connection Manager routes OLTP apps to an OLTP node group (primary + SDS) and SQW to an SQW node group (SDS)
  User transparency
  Single database view
IDS Storage Optimization
Now available as of 11.50.xC4
Deep Compression + Storage Optimization
Row Compression Concepts
Compression looks for repeating patterns across the entire table
  When a pattern is found, the string is replaced by a 12-bit symbol
  Symbols are stored in a dictionary for fast lookup
Data resides compressed on pages (both on disk and in the buffer pool)
  Significant I/O bandwidth savings: better performance
  Significant memory savings: more efficient memory utilization
  Some CPU overhead costs
Rows must be uncompressed before being processed for evaluation
Row Compression Using a Compression Dictionary
Dictionary contains repeated information from the rows in the table
Compression candidates can be across column boundaries or within columns

PartCode  SPart  Quantity  LotNum  BinLoc  Aisle
ANCPRPLT  220J   200       Z165-3  NE132   6157
SNCPRPLT  580T   132       Z165-3  NE132   6157

Dictionary
01  NCPRPLT
02  Z165-3NE1326157

Uncompressed rows: ANCPRPLT 220J 200 Z165-3 NE132 6157 | SNCPRPLT 580T 132 Z165-3 NE132 6157
Compressed rows:   A (01) 220J 200 (02) | S (01) 580T 132 (02)
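The slide's example can be reproduced with a toy dictionary substitution. This is a sketch of the concept only, assuming the two dictionary entries from the slide and rows stored as concatenated column values; it says nothing about how IDS actually builds its dictionary or encodes symbols on the page. The function names are hypothetical.

```python
# Dictionary from the slide: symbol number -> repeated pattern.
DICTIONARY = {1: "NCPRPLT", 2: "Z165-3NE1326157"}


def compress_row(row, dictionary):
    """Replace dictionary patterns in `row` with their symbol numbers.

    Returns a list of tokens: str for literal text, int for a dictionary symbol.
    """
    tokens, literal, i = [], "", 0
    # Try longer patterns first so the longest match wins.
    patterns = sorted(dictionary.items(), key=lambda kv: -len(kv[1]))
    while i < len(row):
        for symbol, pattern in patterns:
            if row.startswith(pattern, i):
                if literal:
                    tokens.append(literal)
                    literal = ""
                tokens.append(symbol)
                i += len(pattern)
                break
        else:
            # No pattern starts here: keep the character as literal text.
            literal += row[i]
            i += 1
    if literal:
        tokens.append(literal)
    return tokens


def uncompress_row(tokens, dictionary):
    """Expand symbols back to their patterns, as needed before a row is evaluated."""
    return "".join(dictionary[t] if isinstance(t, int) else t for t in tokens)


# The two rows from the slide, with column values concatenated.
rows = ["ANCPRPLT220J200Z165-3NE1326157", "SNCPRPLT580T132Z165-3NE1326157"]
compressed = [compress_row(r, DICTIONARY) for r in rows]
print(compressed)
```

Note that pattern 02 spans three columns (LotNum, BinLoc, Aisle), which is what "across column boundaries" refers to.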
Storage savings
Tables will often compress in the range of 60% - 80%
Overall database storage savings will be between 40% and 50%
That's up to 50% less disk space needed to support an IDS 11 database!
[Chart: Sales table 81% smaller; Product table 78% smaller]
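One plausible reading of why the overall figure (40-50%) is lower than the per-table figure (60-80%) is that only part of a database is compressible row data. The arithmetic below is a sketch under that assumption; the 65% compressible share and the 70% table savings are illustrative numbers, not measurements from the presentation.

```python
def overall_savings(compressible_fraction, table_savings):
    """Fraction of total database storage saved when only part of it compresses."""
    return compressible_fraction * table_savings


# Hypothetical mix: 65% of the database is compressible row data, compressed by 70%.
saved = overall_savings(0.65, 0.70)
print(f"{saved:.1%}")
```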
Performance Benefit
Performance can be improved using compression
  Many queries will benefit from compression with fewer I/Os
  Consumes more CPU; most customers are not 100% CPU bound
Lab tests show I/O-bound workloads improve by 30-40%
Many utilities (backup and recovery, for example) will be faster
  2x as fast in some cases, as the database may now be half the size
[Chart label: 40% faster]
IDS 11 Compression Operations
estimate_compression
  Estimates the compression ratio on a table
create_dictionary
  Creates a compression dictionary for a table
compress
  Does an implicit create_dictionary and compresses all previous data
uncompress
  Uncompresses the table and deactivates compression
uncompress_offline
  XLOCKs the table and uncompresses it; also deactivates compression
purge_dictionary
  Deletes old inactive dictionaries
Storage Optimization Operations
repack
  Moves rows within a table or fragment to consolidate free space
repack_offline
  XLOCKs the table and moves rows within a table or fragment to consolidate free space
shrink
  Returns free space at the end of a table or fragment to the dbspace
  Normally done after a repack
Compression On Data Page With Multiple Rows
[Animated figure: uncompressed data pages are compressed into multiple partly filled compressed pages plus a dictionary; repack consolidates the compressed rows and leaves empty data pages; shrink releases the empty pages]
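The compress, repack, shrink sequence can be simulated on a toy model of data pages. This is a conceptual sketch only: pages are lists of row sizes, the 100-byte page size and 40% compressed size are hypothetical, and the functions simply mirror the operation names without reflecting IDS internals.

```python
PAGE_SIZE = 100  # hypothetical page capacity in bytes


def compress(pages, keep_percent):
    """Shrink each row in place; rows stay on their original pages."""
    return [[max(1, size * keep_percent // 100) for size in page] for page in pages]


def repack(pages):
    """Move rows toward the front of the table, leaving free space at the end."""
    packed = [[] for _ in pages]
    target = 0
    for page in pages:
        for size in page:
            if sum(packed[target]) + size > PAGE_SIZE:
                target += 1
            packed[target].append(size)
    return packed


def shrink(pages):
    """Release trailing empty pages (here: drop them from the list)."""
    while pages and not pages[-1]:
        pages = pages[:-1]
    return pages


# Four full pages, each holding four 25-byte rows.
table = [[25, 25, 25, 25] for _ in range(4)]
table = compress(table, 40)                 # rows become 10 bytes each
pages_after_compress = len([p for p in table if p])
table = repack(table)                       # rows consolidated at the front
table = shrink(table)                       # empty pages at the end released
print(pages_after_compress, len(table))
```

Compression alone leaves the table occupying the same number of pages, each partly empty; the space is only reclaimed once repack and shrink have run, which is why shrink is normally done after a repack.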
Admin API Interface
All compression and storage optimization operations are invoked via the IDS Admin API built-in UDRs
execute function task();
execute function admin();
Example:
execute function task
(
  "table compress repack shrink",
  "table_name", "database_name", "owner_name"
);
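The call shape shown above (a command string followed by the table, database, and owner names) lends itself to a small helper when the same operations are applied to many tables. The sketch below only builds the statement text and does not connect to a server; `build_task_call` and the example names `sales`, `stores`, and `informix` are hypothetical, and only the three operations combined in the example above are accepted.

```python
VALID_OPERATIONS = ("compress", "repack", "shrink")  # operations combined in the example


def build_task_call(operations, table, database, owner):
    """Build the text of an EXECUTE FUNCTION task(...) call for one table."""
    for op in operations:
        if op not in VALID_OPERATIONS:
            raise ValueError(f"unsupported operation: {op}")
    command = "table " + " ".join(operations)
    values = (command, table, database, owner)
    for value in values:
        # Illustration only: reject quotes instead of attempting to escape them.
        if "'" in value or '"' in value:
            raise ValueError(f"quote character not allowed in: {value}")
    args = ", ".join(f"'{value}'" for value in values)
    return f"EXECUTE FUNCTION task({args});"


stmt = build_task_call(["compress", "repack", "shrink"], "sales", "stores", "informix")
print(stmt)
```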
Features That Cannot Be Compressed
Out-of-row data (e.g. blobs)
Indexes
Temp tables
Catalog tables (Data Dictionary)
Partition tables (Tablespace Tablespace)
Dictionary partitions
Tables in the following databases:
  sysmaster
  sysutils
  sysuser
  syscdr
  syscdcv1
HDR, ER, CDC (DataMirror) and Compression
All are supported on compressed tables
HDR
  Tables will be compressed on the secondary if and only if they are compressed on the primary
ER
  Compression status of tables is independent between source and target, specified by the user
CDC
  Compression of targets is a function of what the target database supports and what the user specifies
Summary
Storage optimization through IDS 11 compression can save 40-50% of your database storage requirements
For I/O-bound workloads, compression can also improve performance
You not only see your online database shrink but, often more importantly, your backup storage and disaster recovery storage are cut in half as well
In real customer examples, storage savings are realized and performance benefits are apparent
Add in the time savings with utilities processing (in particular, database backup and recovery time is cut in half) and you can see the benefits of IDS 11 compression