sunil-m-kanta
Data Warehouse Concepts
Contents
• Data & Information
• Introduction to Data Warehouse (DWH)
• Characteristics of DWH
• Operational System Vs DWH
• DWH Architectures
• Data Marts
• Metadata
Data & Information
• A fundamental concept of data warehousing is the distinction between data and information.
• Data is composed of observable and recordable facts that are often found in operational or transactional systems.
• In a data warehouse environment, data has value to end-users only when it is organized and presented as information.
• Information is an integrated collection of facts and is used as the basis for decision making.
Introduction to Data Warehouse
• Definitions:
  ◦ "A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process".
  ◦ A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing.
  ◦ A data warehouse is a structured (subject-oriented) repository of historical data.
• A data warehouse separates the analysis workload from the transaction workload and enables the organization to consolidate data from several sources.
Characteristics of Data Warehouse
A data warehouse is usually:
• Subject Oriented
• Integrated
• Non-Volatile
• Time-Variant
• Accessible & Process Oriented
Subject Oriented
• Information is presented according to specific subjects or areas of interest.
• Data is manipulated to provide information about a particular subject.
[Figure: the DWH organized by subject area: Sales, Marketing, Finance.]
Integrated
• Value encodings differ across the operational systems:
  ◦ Appln A – m/f
  ◦ Appln B – 1/0
  ◦ Appln C – Male/Female
• Field names differ across the operational systems:
  ◦ Appln A – Bal_On_Hand
  ◦ Appln B – Current_Balance
  ◦ Appln C – Cash_On_Hand
[Figure: the operational systems feed the DWH, which keeps a single convention: m/f and Current_Balance.]
• Although the data in a data warehouse may be spread across different tables, databases, or even servers, it is integrated consistently in the values of variables, naming conventions, and physical data definitions (data types).
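The integration described above can be sketched as a small mapping exercise. This is a minimal sketch, assuming three hypothetical source records shaped like the Appln A, B, and C examples. The target convention (m/f and Current_Balance) follows the slide; the function name `integrate`, the sample values, and the reading of 1/0 as male/female are illustrative assumptions.

```python
# Per-application conventions, taken from the slide's examples.
# Assumption: Appln B's 1 means male and 0 means female (the slide does not say).
GENDER_MAPS = {
    "A": {"m": "m", "f": "f"},
    "B": {1: "m", 0: "f"},
    "C": {"Male": "m", "Female": "f"},
}
BALANCE_FIELDS = {"A": "Bal_On_Hand", "B": "Current_Balance", "C": "Cash_On_Hand"}


def integrate(app, record):
    """Convert one source record to the warehouse convention."""
    return {
        "gender": GENDER_MAPS[app][record["gender"]],           # one value encoding
        "Current_Balance": float(record[BALANCE_FIELDS[app]]),  # one name, one datatype
    }


rows = [
    integrate("A", {"gender": "f", "Bal_On_Hand": 120.0}),
    integrate("B", {"gender": 1, "Current_Balance": 75.5}),
    integrate("C", {"gender": "Female", "Cash_On_Hand": 10}),
]
for row in rows:
    print(row)
```

Whatever the source, each row leaves `integrate` with the same field names, the same value encoding, and the same datatype, which is the consistency the bullet describes.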
Time-Variant
• Contains a history of the subject, as well as current information.
• Historical information is an important component of a data warehouse.
• Operational systems: view of the business today.
• DWH: designated time frame (3 – 10 years); the DWH stores historical data.
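The contrast between "today's view" and a stored history can be illustrated with a short sketch. It assumes a hypothetical account balance that changes once a month; the names `record_balance` and `balance_as_of`, the account ID, and the figures are invented for illustration.

```python
from datetime import date

# Operational view: one current value per account, overwritten on each change.
operational = {}

# Warehouse view: every load is kept together with its snapshot date.
warehouse = []  # list of (snapshot_date, account, balance)


def record_balance(snapshot_date, account, balance):
    operational[account] = balance                       # keeps only "today"
    warehouse.append((snapshot_date, account, balance))  # keeps the history


def balance_as_of(account, as_of):
    """Return the latest warehouse balance on or before `as_of`, or None."""
    history = [(d, b) for d, acct, b in warehouse if acct == account and d <= as_of]
    return max(history)[1] if history else None


record_balance(date(2023, 1, 31), "ACC-1", 100.0)
record_balance(date(2023, 2, 28), "ACC-1", 140.0)
record_balance(date(2023, 3, 31), "ACC-1", 90.0)

print(operational["ACC-1"])                       # current value only
print(balance_as_of("ACC-1", date(2023, 2, 28)))  # value as of an earlier date
```

Only the warehouse list can answer a question about a past date; the operational dictionary has already overwritten the earlier values.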
Non-Volatile
• Stable information that doesn’t change each time an operational process is executed. Information is consistent regardless of when the warehouse is accessed.
• Only two operations exist: time-based loading of data and accessing the loaded data.
[Figure: operational systems support Create, Update, Insert, Delete, and Read; the DWH supports only Load and Read Only access.]
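The two permitted operations, load and read, can be sketched as an append-only table. This is only an illustration under stated assumptions: the class `WarehouseTable` and its sample rows are hypothetical, and a real warehouse would enforce the rule through database permissions and load processes rather than a Python class.

```python
class WarehouseTable:
    """Append-only table: rows can be loaded and read, never changed in place."""

    def __init__(self):
        self._rows = []

    def load(self, batch):
        """Time-based load: append a batch of rows as immutable tuples."""
        self._rows.extend(tuple(row) for row in batch)

    def read(self):
        """Return a read-only snapshot of everything loaded so far."""
        return tuple(self._rows)


table = WarehouseTable()
table.load([("2023-01-31", "Sales", 1200), ("2023-01-31", "Finance", 300)])
table.load([("2023-02-28", "Sales", 1350)])

snapshot = table.read()
print(len(snapshot))

try:
    snapshot[0] = ("2023-01-31", "Sales", 0)  # an in-place update is rejected
except TypeError as exc:
    print("update refused:", exc)
```

The class deliberately exposes no update or delete method, so earlier loads read the same no matter when they are accessed.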
Accessible & Process Oriented
• Accessible: The primary purpose of a data warehouse is to provide readily accessible information to end-users.
• Process-Oriented: It is important to view data warehousing as a process for delivery of information.
Operational System Vs Data Warehouse

|                   | Operational System                                  | Data Warehouse                                          |
| Characteristics   | Data focused, transaction-processing-focused system | Subject Oriented, Integrated, Non-Volatile, Time-Variant |
| Age of the data   | Current, near-term (today, last week)               | Historic (last month, quarterly, five years)            |
| Primary use       | Day-to-day decisions, current operational results   | Long-term decisions, reporting, trend detection         |
| Frequency of load | Twice daily, daily, weekly                          | Weekly, monthly, quarterly                              |
DWH Architectures
• Data Warehouse Architecture (Basic)
• Data Warehouse Architecture (with a Staging Area)
• Data Warehouse Architecture (with a Staging Area and Data Marts)
DWH Architectures (contd..)
• Data Warehouse Architecture (Basic)
[Figure: data flows from the operational systems and legacy systems through data extraction, data transformation, and data loading into the data warehouse (DWH data and metadata), where it is stored and then accessed by users for analysis, reporting, and mining.]
DWH Architectures (contd..)
• Data Warehouse Architecture (with a Staging Area)
[Figure: data flows from the operational systems and legacy systems through data extraction, data transformation, and data loading via a staging area into the data warehouse (DWH data and metadata), where it is stored and then accessed by users for analysis, reporting, and mining.]
DWH Architectures (contd..)
• Data Warehouse Architecture (with a Staging Area and Data Marts)
[Figure: data flows from the operational systems and legacy systems through data extraction, data transformation, and data loading via a staging area into the data warehouse (DWH data and metadata). The warehouse feeds data marts for Sales, Marketing, and Finance, which users access for analysis, reporting, and mining.]
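The flow in the third architecture (sources, staging area, warehouse, data marts) can be sketched end to end. This is a minimal sketch under stated assumptions: the source names, the row layout, and the rule that unparseable rows are rejected are all hypothetical, and in-memory lists stand in for the staging area and the warehouse.

```python
# Hypothetical rows from two operational systems (names are illustrative).
SOURCES = {
    "orders_app": [{"dept": "sales", "amount": "100.50"}, {"dept": "SALES", "amount": "20"}],
    "ledger_app": [{"dept": "Finance", "amount": "75"}, {"dept": "marketing", "amount": "bad"}],
}


def extract(sources):
    """Data extraction: copy raw rows into the staging area unchanged."""
    return [dict(row, source=name) for name, rows in sources.items() for row in rows]


def transform(staging):
    """Data transformation: normalize names, cast amounts, drop unusable rows."""
    clean = []
    for row in staging:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # assumption: rows that cannot be parsed are rejected
        clean.append({"subject": row["dept"].title(), "amount": amount, "source": row["source"]})
    return clean


def load(warehouse, rows):
    """Data loading: append the cleaned rows to the warehouse."""
    warehouse.extend(rows)


def build_data_mart(warehouse, subject):
    """Data mart: the subset of warehouse rows for one subject."""
    return [row for row in warehouse if row["subject"] == subject]


staging_area = extract(SOURCES)
warehouse = []
load(warehouse, transform(staging_area))
sales_mart = build_data_mart(warehouse, "Sales")

print(len(staging_area), len(warehouse), len(sales_mart))
print(sum(row["amount"] for row in sales_mart))
```

Keeping the raw rows in `staging_area` separate from the cleaned rows in `warehouse` mirrors the purpose of a staging area: cleansing happens before anything reaches the tables that users query.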
Data Marts
• Data Marts:
  ◦ A data mart is a subset of a DWH.
  ◦ A data mart is a specialized version of a DWH.
  ◦ A data mart configuration emphasizes easy access to relevant information.
[Figure: data marts shown as subsets of the DWH.]
Data Marts (contd..)
• Dependent data mart: Data is derived from an enterprise-wide data warehouse.
• Independent data mart: Data is collected directly from the source systems.
Data Marts (contd..)
• Reasons for creating a data mart:
  ◦ Eases access to frequently needed data
  ◦ Creates a collective view for a group of users
  ◦ Improves end-user response time
  ◦ Ease of creation
  ◦ Lower cost than implementing a full data warehouse
Metadata
• Metadata:
  ◦ Metadata is data about data.
  ◦ Something can be data and metadata at the same time.
  ◦ It is possible to create meta-meta-...-metadata.
  ◦ Metadata is used to speed up and enrich searching for resources.
  ◦ E.g., browsers automatically download and locally cache metadata to improve the speed at which files can be accessed and searched.
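The idea that metadata speeds up searching can be shown with a small catalog. This is a minimal sketch: the table names, columns, descriptions, and the `find_tables` helper are hypothetical and do not represent any particular metadata repository.

```python
# Data: the rows themselves.
TABLES = {
    "sales_fact": [(1, "2023-01-31", 1200.0), (2, "2023-02-28", 1350.0)],
    "customer_dim": [(1, "Asha"), (2, "Ravi")],
}

# Metadata: data about the data (column names, a description, a row count).
CATALOG = {
    "sales_fact": {"columns": ["id", "load_date", "amount"], "description": "monthly sales totals"},
    "customer_dim": {"columns": ["id", "name"], "description": "customer master list"},
}
for name, rows in TABLES.items():
    CATALOG[name]["row_count"] = len(rows)


def find_tables(keyword):
    """Search the catalog only; the table rows are never scanned."""
    keyword = keyword.lower()
    return sorted(
        name
        for name, meta in CATALOG.items()
        if keyword in meta["description"].lower()
        or any(keyword in col.lower() for col in meta["columns"])
    )


print(find_tables("amount"))
print(find_tables("customer"))
print(CATALOG["sales_fact"]["row_count"])
```

The catalog is itself ordinary data, so it could be described by a further catalog, which is the sense in which something can be data and metadata at the same time.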
Questions?
Thank You!