Data Warehouse Concepts

Page 1: Data Warehouse Concepts

Data Warehouse Concepts

Page 2: Data Warehouse Concepts

Contents

• Data & Information

• Introduction to Data Warehouse (DWH)

• Characteristics of DWH

• Operational System Vs DWH

• DWH Architectures

• Data Marts

• Metadata

Page 3: Data Warehouse Concepts

Data & Information

• A fundamental concept of a data warehouse is the distinction between data and information.

• Data is composed of observable and recordable facts that are often found in operational or transactional systems.

• In a data warehouse environment, data only has value to end-users when it is organized and presented as information.

• Information is an integrated collection of facts and is used as the basis for decision making.
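
As a rough illustration of the data versus information distinction, the sketch below turns individual recorded facts into a summary a decision maker could use. The `transactions` rows, the field names and the `summarize_by_region` helper are all hypothetical and are not taken from the slides.

```python
from collections import defaultdict

# Hypothetical raw facts, as an operational system might record them (data).
transactions = [
    {"region": "North", "amount": 120.0},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": 40.0},
]


def summarize_by_region(rows):
    """Organize individual facts into per-region totals (information)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


sales_by_region = summarize_by_region(transactions)
print(sales_by_region)
```

The individual rows are the data; the per-region totals are one possible form of information derived from them.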

Page 4: Data Warehouse Concepts

Introduction to Data Warehouse

• Definitions:

  • "A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process."

• A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing.

• A data warehouse is a structured, subject-oriented repository of historical data.

• A data warehouse separates the analysis workload from the transactional workload and enables the organization to collect data from several sources.

Page 5: Data Warehouse Concepts

Characteristics of Data Warehouse

A data warehouse is usually:

• Subject Oriented

• Integrated

• Non-Volatile

• Time-Variant

• Accessible & Process Oriented

Page 6: Data Warehouse Concepts

Subject Oriented

• Information is presented according to specific subjects or areas of interest.

• Data is manipulated to provide information about a particular subject.

[Diagram: the DWH organized by subject area: Sales, Marketing, Finance.]

Page 7: Data Warehouse Concepts

Integrated

[Diagram: differing representations in the Operational Systems map to one consistent representation in the DWH.]

• Values of variables: Appln A uses m/f, Appln B uses 1/0, Appln C uses Male/Female. The DWH uses m/f.

• Naming conventions: Appln A uses Bal_On_Hand, Appln B uses Current_Balance, Appln C uses Cash_On_Hand. The DWH uses Current_Balance.

• Although the data in a data warehouse may be spread across different tables, databases or even servers, it is integrated consistently in the values of variables, naming conventions and physical data definitions (data types).
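
The integration step in the Appln A/B/C example can be sketched as a small normalization function. This is only a sketch under stated assumptions: the slide does not say whether 1 or 0 means male in Appln B, so the mapping below is a guess, and the record layout, the `gender` key and the `integrate` helper are hypothetical.

```python
# Per-application value encodings, following the slide's m/f, 1/0, Male/Female example.
GENDER_MAPS = {
    "A": {"m": "m", "f": "f"},
    "B": {"1": "m", "0": "f"},  # assumption: the slide does not say which digit is which
    "C": {"Male": "m", "Female": "f"},
}

# Per-application names for the same balance field, as listed on the slide.
BALANCE_FIELDS = {
    "A": "Bal_On_Hand",
    "B": "Current_Balance",
    "C": "Cash_On_Hand",
}


def integrate(app, record):
    """Map one source record onto the warehouse's value codes, names and types."""
    return {
        "gender": GENDER_MAPS[app][str(record["gender"])],
        "Current_Balance": float(record[BALANCE_FIELDS[app]]),
    }


print(integrate("B", {"gender": 1, "Current_Balance": "250.5"}))
print(integrate("C", {"gender": "Female", "Cash_On_Hand": 10}))
```

Whatever the source application, the result carries the same value codes (m/f), the same field name (Current_Balance) and the same data type (float).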

Page 8: Data Warehouse Concepts

Time-Variant

• Contains a history of the subject, as well as current information.

• Historical information is an important component of a data warehouse.

[Diagram: Operational Systems give a view of the business today; the DWH stores historical data over a designated time frame (3 to 10 years).]
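
One way to picture time variance is a table in which every row carries the date it describes, so that past states remain queryable. The sketch below assumes a hypothetical `history` of dated balance snapshots and a hypothetical `balance_as_of` helper; neither comes from the slides.

```python
from datetime import date

# Hypothetical history of one balance; each row is stamped with its snapshot date.
history = [
    (date(2021, 12, 31), 100.0),
    (date(2022, 12, 31), 150.0),
    (date(2023, 12, 31), 90.0),
]


def balance_as_of(rows, when):
    """Return the most recent snapshot value on or before `when`, or None."""
    eligible = [(snapshot_date, value) for snapshot_date, value in rows if snapshot_date <= when]
    return max(eligible)[1] if eligible else None


print(balance_as_of(history, date(2023, 6, 1)))
```

An operational system would typically hold only the latest value, whereas the dated rows here let a query ask what the balance was at any point in the retained time frame.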

Page 9: Data Warehouse Concepts

Non-Volatile

• Stable information that doesn’t change each time an operational process is executed. Information is consistent regardless of when the warehouse is accessed.

• Only two operations exist: time-based loading of data and access to the loaded data.

[Diagram: Operational Systems support Create, Update, Insert, Delete and Read; the DWH supports Load and Read only.]
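
The load and read-only behavior can be sketched as an append-only structure. The `WarehouseTable` class below is a hypothetical illustration, not a description of any real warehouse product: it exposes a load operation and a read operation and deliberately nothing else.

```python
class WarehouseTable:
    """Append-only table: rows are loaded in batches and afterwards only read."""

    def __init__(self):
        self._rows = []

    def load(self, load_date, batch):
        # Freeze each loaded row as a tuple and stamp it with its load date.
        self._rows.extend((load_date, tuple(row)) for row in batch)

    def read(self):
        # Return an immutable copy so callers cannot alter stored rows.
        return tuple(self._rows)


table = WarehouseTable()
table.load("2024-01-31", [["acct-1", 100.0]])
table.load("2024-02-29", [["acct-1", 140.0]])
print(table.read())
```

A later load adds new rows rather than overwriting the January row, so a reader sees the same January figure no matter when the table is accessed.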

Page 10: Data Warehouse Concepts

Accessible & Process Oriented

• Accessible: The primary purpose of a data warehouse is to provide readily accessible information to end-users.

• Process-Oriented: It is important to view data warehousing as a process for delivery of information.

Page 11: Data Warehouse Concepts

Operational System Vs Data Warehouse

                  | Operational System                                 | Data Warehouse
------------------+----------------------------------------------------+---------------------------------------------------------
Characteristics   | Data focused, transaction-processing focused system | Subject Oriented, Integrated, Non-Volatile, Time-Variant
Age of the data   | Current, near-term (today, last week)              | Historic (last month, quarterly, five years)
Primary use       | Day-to-day decisions, current operational results  | Long-term decisions, reporting, trend detection
Frequency of load | Twice daily, daily, weekly                         | Weekly, monthly, quarterly

Page 12: Data Warehouse Concepts

DWH Architectures

• Data Warehouse Architecture (Basic)

• Data Warehouse Architecture (with a Staging Area)

• Data Warehouse Architecture (with a Staging Area and Data Marts)

Page 13: Data Warehouse Concepts

DWH Architectures (contd.)

• Data Warehouse Architecture (Basic)

[Diagram: Operational Systems (Operational System, Legacy Systems) -> Data Extraction, Data Transformation, Data Loading -> DWH (Data Warehouse and Metadata; Data Storing) -> Data Access -> Users (Analysis, Reporting, Mining).]

Page 14: Data Warehouse Concepts

DWH Architectures (contd.)

• Data Warehouse Architecture (with a Staging Area)

[Diagram: Operational Systems (Operational System, Legacy Systems) -> Staging Area (Data Extraction, Data Transformation, Data Loading) -> DWH (Data Warehouse and Metadata; Data Storing) -> Data Access -> Users (Analysis, Reporting, Mining).]
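
The extraction, transformation and loading flow through a staging area can be sketched in a few lines. This is a minimal sketch under stated assumptions: the source rows, the cleaning rules and the function names `extract`, `transform` and `load` are all hypothetical, and real ETL tooling is far more involved.

```python
def extract(sources):
    """Copy raw rows from every source system into one staging list, tagged by source."""
    return [dict(row, source=name) for name, rows in sources.items() for row in rows]


def transform(staged):
    """Clean staged rows: skip rows without an amount, normalize names and types."""
    return [
        {
            "source": row["source"],
            "customer": row["customer"].strip().title(),
            "amount": float(row["amount"]),
        }
        for row in staged
        if row.get("amount") not in (None, "")
    ]


def load(warehouse, rows):
    """Append the cleaned rows to the warehouse and report how many were loaded."""
    warehouse.extend(rows)
    return len(rows)


# Hypothetical source systems feeding the staging area.
sources = {
    "operational": [{"customer": " alice ", "amount": "10.5"}],
    "legacy": [{"customer": "BOB", "amount": ""}, {"customer": "carol", "amount": 3}],
}

warehouse = []
staging_area = extract(sources)
loaded = load(warehouse, transform(staging_area))
print(loaded, warehouse)
```

Keeping the raw copy in `staging_area` separate from `warehouse` means cleaning can be rerun or inspected without touching either the source systems or the data already loaded.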

Page 15: Data Warehouse Concepts

DWH Architectures (contd.)

• Data Warehouse Architecture (with a Staging Area and Data Marts)

[Diagram: Operational Systems (Operational System, Legacy Systems) -> Staging Area (Data Extraction, Data Transformation, Data Loading) -> DWH (Data Warehouse and Metadata; Data Storing) -> Data Marts (Sales, Marketing, Finance) -> Data Access -> Users (Analysis, Reporting, Mining).]

Page 16: Data Warehouse Concepts

Data Marts

• Data Marts:

  • A data mart is a subset of a DWH.

  • A data mart is a specialized version of a DWH.

  • A data mart configuration emphasizes easy access to relevant information.

[Diagram: Data Marts shown as subsets of the DWH.]

Page 17: Data Warehouse Concepts

Data Marts (contd.)

• Dependent data mart: Data can be derived from an enterprise-wide data warehouse.

• Independent data mart: Data can be collected directly from sources.
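
A dependent data mart can be pictured as a subject-specific slice of rows that already sit in the warehouse. The sketch below assumes a hypothetical `warehouse` list and a hypothetical `build_dependent_mart` helper; an independent mart would instead read from the source systems directly.

```python
# Hypothetical enterprise-wide warehouse rows covering several subjects.
warehouse = [
    {"subject": "Sales", "year": 2023, "amount": 500.0},
    {"subject": "Finance", "year": 2023, "amount": 120.0},
    {"subject": "Sales", "year": 2024, "amount": 650.0},
]


def build_dependent_mart(dwh_rows, subject):
    """Derive a data mart holding only the rows for one subject area."""
    return [row for row in dwh_rows if row["subject"] == subject]


sales_mart = build_dependent_mart(warehouse, "Sales")
print(sales_mart)
```

Because the mart is derived from the warehouse, it inherits the integration already applied there, which is the main design difference from an independent mart.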

Page 18: Data Warehouse Concepts

Data Marts (contd.)

• Reasons for creating a data mart:

  • Eases access to frequently needed data

  • Creates a collective view for a group of users

  • Improves end-user response time

  • Ease of creation

  • Lower cost than implementing a full data warehouse

Page 19: Data Warehouse Concepts

Metadata

• Metadata:

  • Metadata is data about data.

  • Something can be data and metadata at the same time.

  • It is possible to create meta-meta-...-metadata.

  • Metadata is used to speed up and enrich searching for resources.

  • E.g., browsers automatically download and locally cache metadata to improve the speed at which files can be accessed and searched.
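
To make "data about data" concrete, the sketch below describes two warehouse tables with metadata records and searches those records instead of the tables themselves. The `TableMetadata` class, the catalog entries and the `find_tables_with_column` helper are hypothetical examples, not part of the slides.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadata:
    """Describes a table: where it came from and what it contains."""
    name: str
    source_system: str
    columns: dict  # column name -> data type
    tags: list = field(default_factory=list)


# Hypothetical catalog entries for two warehouse tables.
catalog = [
    TableMetadata("sales_fact", "order entry", {"amount": "decimal", "sold_on": "date"}, ["sales"]),
    TableMetadata("customer_dim", "CRM", {"customer_id": "int", "gender": "char(1)"}, ["customer"]),
]


def find_tables_with_column(entries, column):
    """Search the metadata, not the data, for tables that hold a given column."""
    return [entry.name for entry in entries if column in entry.columns]


print(find_tables_with_column(catalog, "gender"))
```

The lookup never touches the tables' rows, which is how metadata can speed up finding the right resource.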

Page 20: Data Warehouse Concepts

Questions?

Page 21: Data Warehouse Concepts

Thank You!