Data Warehouses What are they and how will they benefit your organization?

A White Paper by Guident Technologies, Inc. Adam Getz Business Intelligence Architect December, 2006

Guident Technologies, Inc. || 198 Van Buren Street, Suite 120, Herndon, VA 20170 Ph.:703-326-0888 – Fax: 703-326-0677 || email: [email protected] || web: www.guident.com


© 2006 Guident

Introduction

Since the early 1990s, data warehouses have been at the forefront of computer applications as a way for management, executives, and business users to effectively use organizational data for decision support and organizational planning. Since that time, data warehouses have been designed and built as separate technology entities from operational and transactional systems and have become the primary source for transforming the masses of data stored in business systems into meaningful information used for effective decision making. This white paper provides a broad understanding of data warehouse concepts and describes a business case for why data warehouse implementations have benefited many organizations.

What is a Data Warehouse?

Data warehouses are centralized data repositories that integrate data from various transactional, legacy, or external systems, applications, and sources. Data warehouses have become the principal source for managers and decision makers to easily and rapidly access information to answer questions about their organizations. In a nutshell, data warehouses are read-only, integrated databases designed to provide insight into past organizational performance and project future results. Unlike operational databases, which are designed to manage transactions and are accurate as of the last transaction, data warehouses are analytical and subject-oriented, and are designed to read transactions as a snapshot in time. As such, the data warehouse is designed and optimized specifically for the retrieval and analysis of data.

Technically, the data warehouse provides an environment separate from the operational systems for analytical reporting and ad hoc query support. This enables complex queries to be performed without any impact on the systems that support the business’ primary transactions.

Fundamentally, a data warehouse can be defined as a central repository that integrates organizational data collected from various disparate corporate systems. The data warehouse is organized and optimized for retrieval and analysis of data and provides managers and executives with a single view of the truth. By converting operational and transactional data into enterprise information, the data warehouse enables optimal decision making.

Further, the data warehouse allows for organizational barriers to be broken, as distributed information is consolidated from various sources. Well-built data warehouses include coordination, architecture, and periodic migration of data from transactional and operational systems into an environment optimized for business intelligence, decision support, and information retrieval. A key component of this architecture is the extraction, transformation and loading (ETL) of data into the data warehouse. This ETL or data integration component is used to extract data from various sources and to transform it into an integrated model that is most efficient for information analysis.
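The ETL flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two source systems, their field names and formats, and the `fact_revenue` table are hypothetical, and an in-memory SQLite database stands in for the warehouse. It is not a description of any particular ETL product.

```python
import sqlite3

# Hypothetical extracts from two source systems with different conventions.
SALES_SYSTEM = [
    {"cust": "ACME Corp ", "amount": "1,200.50", "date": "12/01/2006"},
    {"cust": "Globex", "amount": "300.00", "date": "12/03/2006"},
]
BILLING_SYSTEM = [
    {"customer_name": "acme corp", "total_cents": 45000, "billed_on": "2006-12-02"},
]


def extract():
    """Extract: yield each raw record tagged with the system it came from."""
    for row in SALES_SYSTEM:
        yield "sales", row
    for row in BILLING_SYSTEM:
        yield "billing", row


def transform(source, row):
    """Transform: map each source layout onto one integrated model."""
    if source == "sales":
        month, day, year = row["date"].split("/")
        return (
            row["cust"].strip().upper(),            # standard customer name
            float(row["amount"].replace(",", "")),  # text amount to number
            f"{year}-{month}-{day}",                # ISO date
            source,
        )
    return (
        row["customer_name"].strip().upper(),
        row["total_cents"] / 100.0,                 # cents to currency units
        row["billed_on"],
        source,
    )


def load(conn, records):
    """Load: write the integrated records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_revenue "
        "(customer TEXT, amount REAL, txn_date TEXT, source TEXT)"
    )
    conn.executemany("INSERT INTO fact_revenue VALUES (?, ?, ?, ?)", records)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, [transform(src, row) for src, row in extract()])

# After integration, one query spans both source systems.
totals = warehouse.execute(
    "SELECT customer, SUM(amount) FROM fact_revenue "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Because both systems' customer names are normalized to the same form during the transform step, the final query can total revenue per customer across sources, which is the kind of integrated view the paragraph above describes.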

In summary, data warehouses transform data from software systems within the operational environment, and form the basis for analyzing data within the business intelligence environment. The true value of a data warehouse is realized when this information is utilized to improve business processes.


Figure 1: Basic Data Warehouse Components

Classifications of Data Warehouses

Although the size, complexity, and magnitude of data warehouses will be tailored to each organization’s unique needs, requirements, schedule, budget constraints, available resources, and technology infrastructure, there are basically two types of data warehouses that organizations will build and maintain:

Enterprise Data Warehouse: Strategic and broad in nature, the enterprise data warehouse is typically a large organization-wide implementation that crosses over every business function and includes data elements from every organizational unit and department. The enterprise data warehouse contains a broad range of related subject areas, and includes every data element that an organization needs to broadly analyze information. Data entities and fields from all organizational units and departments are consolidated and converted into a single central repository. Business units such as sales, marketing, operations, accounting, and customer support are typically all involved and work with each other to centralize the analysis of their disparate data. Data is converted into standard formats to provide improved methods of analysis across organizational units, enhanced organizational data quality, consistent results, and a broad overview of organizational efficiency.


Enterprise data warehouse implementations tend to be quite large, as they can contain data volumes involving hundreds of gigabytes. The implementations are usually technology driven, affect multiple organizational units and departments, and can have lengthy development schedules. The business impact of enterprise data warehouses tends to be high, as multiple organizational units are affected and a broad perspective of the entire organization can be provided.

Data Mart: Designed for more tactical and quick-strike purposes, the data mart is usually a repository of data for one business function or organizational unit in order to answer a specific set of business questions within relatively narrow confines. The data mart contains data feeds from a minimal number of source systems, and is focused on a rapid development and deployment schedule. Many times, a data mart will serve as the reporting and analytical solution for a particular department within an organization, such as accounting, accounts receivable, sales, customer support, or marketing. These departments will design the data mart with enough data entities and fields to be able to analyze data for their own unit’s needs.

Data mart implementations are typically focused on solving a particular business issue or meeting an individual department’s needs. These solutions usually have a rapid implementation schedule and include a manageable volume of data. Further, data residing in data marts can be converted either from an enterprise data warehouse or directly from the source systems.

Figure 2: Enterprise Data Warehouses / Data Mart Components
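As a rough illustration of a data mart fed from an enterprise data warehouse, the sketch below derives a small departmental table from a broader one. The table names (`edw_transactions`, `sales_mart`), columns, and rows are hypothetical, and SQLite is used only so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical enterprise-wide table covering several departments.
conn.execute(
    "CREATE TABLE edw_transactions (department TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO edw_transactions VALUES (?, ?, ?)",
    [
        ("sales", "East", 100.0),
        ("sales", "West", 250.0),
        ("marketing", "East", 40.0),
        ("support", "West", 15.0),
    ],
)

# The data mart keeps only what one department needs, already summarized
# for the narrow set of questions it is meant to answer.
conn.execute(
    """
    CREATE TABLE sales_mart AS
    SELECT region, SUM(amount) AS total_amount, COUNT(*) AS txn_count
    FROM edw_transactions
    WHERE department = 'sales'
    GROUP BY region
    """
)

mart_rows = conn.execute(
    "SELECT region, total_amount, txn_count FROM sales_mart ORDER BY region"
).fetchall()
print(mart_rows)
```

A mart loaded directly from source systems would follow the same pattern, with the `SELECT` reading from the operational extract instead of the enterprise table.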


Business Intelligence Environment

Once data has been migrated into the organization’s large enterprise data warehouse or into the multitude of smaller data marts designed for a particular business function, analytics can be performed to convert the raw data into formats that are useful for making decisions. In contrast to operational systems, which are targeted to efficiently conduct the transactions of the organization, the business intelligence environment focuses on providing access to large amounts of data to assist the organization in making better business decisions. Hence, business intelligence environments consist of a suite of software applications that enable business users, management, and executives to gain a better understanding of the data they have within their organizations, and give them the information they need to make informed decisions.

Business intelligence software includes the tools that change raw data into information and provide a mechanism for making knowledgeable decisions.

Business Reports: Primarily focusing on display and organization of data, business reports are highly formatted methods to display data with rich presentation and within a structured layout. Business reports are typically developed by information technology personnel and/or knowledgeable business experts who understand the underlying database structure and can create templates for a repeatable method of retrieving data. These types of tools enable organizations to present data in a formal and logical manner, execute and publish data on a regular schedule, and create robust organized listings of data. Further, business reports can be called upon and displayed from a variety of sources, and can be easily integrated into corporate client-server and web applications.

Figure 3: Example of a Business Report


Query and Analysis: Used primarily by business users, query and analysis tools provide an application environment that enables interactive methods to query data, present data in an ad-hoc manner, and find information on an as-needed basis. These tools typically provide intuitive interfaces with powerful query features. With minimal understanding of database structures, business users can rapidly generate queries of data and can analyze key indicators on demand. Most often, query and analysis tools include a versatile semantic layer that converts database conventions into business terminology and logic. Controlled and secure access to information is ensured, while an understanding of business data can be gained in a timely manner.

Figure 4: Example of Query and Analysis
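The idea of a semantic layer can be sketched in a few lines. This is a simplified illustration, not how any specific query tool works: the business terms, the cryptic column names (`cust_nm`, `rgn_cd`, `rev_amt`), the table `f_sls`, and the helper `build_query` are all hypothetical.

```python
import sqlite3

# Hypothetical semantic layer: business terms mapped to database conventions.
SEMANTIC_LAYER = {
    "Customer": "cust_nm",
    "Region": "rgn_cd",
    "Revenue": "SUM(rev_amt)",
}
AGGREGATES = {"Revenue"}  # terms that are computed measures


def build_query(fields, table="f_sls"):
    """Translate a list of business terms into a SQL statement."""
    unknown = [f for f in fields if f not in SEMANTIC_LAYER]
    if unknown:
        # Only terms exposed by the layer can be queried (controlled access).
        raise KeyError(f"Unknown business terms: {unknown}")
    select = ", ".join(f'{SEMANTIC_LAYER[f]} AS "{f}"' for f in fields)
    group_by = [SEMANTIC_LAYER[f] for f in fields if f not in AGGREGATES]
    sql = f"SELECT {select} FROM {table}"
    # Group only when measures and descriptive fields are mixed.
    if group_by and len(group_by) < len(fields):
        sql += " GROUP BY " + ", ".join(group_by)
    return sql


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE f_sls (cust_nm TEXT, rgn_cd TEXT, rev_amt REAL)")
conn.executemany(
    "INSERT INTO f_sls VALUES (?, ?, ?)",
    [("Acme", "E", 100.0), ("Acme", "E", 50.0), ("Globex", "W", 70.0)],
)

sql = build_query(["Region", "Revenue"])
result = sorted(conn.execute(sql).fetchall())
print(sql)
print(result)
```

The business user asks for "Region" and "Revenue" and never needs to know the physical column names; the layer also rejects any term it does not expose.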


Performance Management: With the use of dashboards, scorecards, and alerts, performance management tools provide a graphical interface and real-time methods to monitor organizational metrics and key performance indicators. These tools are primarily used by managers and executives, provide proactive insight into organizational efficiency, and enable instant visibility into critical data thresholds. The dashboards within performance management environments are intuitive, easily personalized, and can notify decision makers when business metrics approach or exceed accepted ranges and targets. Further, performance management enables team consensus, as it typically includes tools for collaboration and workflow. Performance management tools typically provide a mechanism for notification of exception situations and help control the efficiency of the organization.

Figure 5: Example of Performance Management
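The threshold notifications described above can be sketched as a simple status check. The `Kpi` class, the 90% warning band, and the sample metrics are assumptions chosen for illustration; real dashboards typically let each metric define its own ranges.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    name: str
    value: float
    target_max: float            # upper bound of the accepted range
    warn_fraction: float = 0.9   # assumed: warn at 90% of the bound


def status(kpi):
    """Classify a metric as ok, approaching its limit, or exceeding it."""
    if kpi.value > kpi.target_max:
        return "alert"
    if kpi.value >= kpi.warn_fraction * kpi.target_max:
        return "warning"
    return "ok"


# Hypothetical metrics for a scorecard.
scorecard = [
    Kpi("Open support tickets", 50, 100),
    Kpi("Days sales outstanding", 95, 100),
    Kpi("Monthly travel spend", 120, 100),
]

notifications = {kpi.name: status(kpi) for kpi in scorecard}
for name, state in notifications.items():
    print(f"{name}: {state}")
```

A dashboard would render these states graphically and route the "warning" and "alert" cases to the responsible decision makers.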


OLAP (Online Analytical Processing): Dynamic in nature, OLAP tools provide a high degree of variability in the user interaction model and organize data into information that can be rapidly viewed from many perspectives. The main distinguishing feature of OLAP is that it enables users to perform multi-dimensional processing and to query and view data from a variable number of viewpoints. At the core of these systems is the concept of an OLAP cube, which consists of numeric facts, called measures, that are categorized by text values, called dimensions. Primary features of OLAP tools that differentiate them from other business intelligence tools include drill-up/drill-down analysis, reach-through analysis, data pivoting, and trending. OLAP tools provide advanced insight into the past performance of an organization and enable a deep understanding of why prior events occurred in the manner that they did.

Figure 6: Example of OLAP
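The relationship between measures and dimensions can be shown with a very small in-memory sketch. The fact rows, dimension names, and the `aggregate` helper are hypothetical; production OLAP engines precompute and index such aggregates rather than scanning rows as done here.

```python
from collections import defaultdict

# Hypothetical fact rows: one measure (sales) categorized by four dimensions.
FACTS = [
    {"year": 2006, "quarter": "Q1", "region": "East", "product": "Widgets", "sales": 100},
    {"year": 2006, "quarter": "Q1", "region": "West", "product": "Widgets", "sales": 80},
    {"year": 2006, "quarter": "Q2", "region": "East", "product": "Gadgets", "sales": 50},
    {"year": 2006, "quarter": "Q2", "region": "West", "product": "Widgets", "sales": 70},
]


def aggregate(facts, dimensions, measure="sales"):
    """Sum a measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)


# Drill-up: the coarsest view, sales by year.
by_year = aggregate(FACTS, ["year"])

# Drill-down: add a finer level of the time dimension.
by_year_quarter = aggregate(FACTS, ["year", "quarter"])

# Pivot: view the same measure from a different pair of dimensions.
by_region_product = aggregate(FACTS, ["region", "product"])

print(by_year)
print(by_year_quarter)
print(by_region_product)
```

Each view is the same set of facts summed along a different choice of dimensions, which is what lets a user move between perspectives without writing a new report.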


Data mining: Using advanced statistical techniques, models, and algorithms, together with sophisticated data search capabilities, data mining applications discover patterns and relationships in large volumes of data and predict future results. These tools identify trends within data that go beyond simple analysis, extrapolate on past performance to forecast potential outcomes, and identify key attributes of business processes and target opportunities. The discovery orientation of these tools provides answers to questions that have not been asked and demonstrates correlation strength between data elements. Further, the predictive features of these tools enable organizations to exploit useful patterns in massive data volumes.

Figure 7: Data Mining Examples
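Two of the ideas above, correlation strength between data elements and extrapolation of past performance, can be illustrated with basic statistics. This sketch uses a Pearson correlation and a least-squares line as simple stand-ins; data mining products use far more sophisticated models, and the sample figures are invented for the example.

```python
import math


def pearson_r(xs, ys):
    """Correlation strength between two data elements, from -1 to 1."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares trend line; returns (slope, intercept)."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical history: period number and units sold in that period.
periods = [1, 2, 3, 4, 5]
units_sold = [2, 4, 5, 4, 5]

r = pearson_r(periods, units_sold)
slope, intercept = fit_line(periods, units_sold)
forecast_next = slope * 6 + intercept  # extrapolate to period 6

print(f"correlation: {r:.3f}")
print(f"forecast for period 6: {forecast_next:.1f}")
```

A straight-line extrapolation like this is only meaningful when the historical relationship is reasonably strong, which is why the correlation is computed alongside the forecast.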


Benefits of a Data Warehouse

Benefits realized and value gained from the successful implementation of a data warehouse and its related business intelligence environment are numerous and substantial. These benefits will more than justify the financial investment and resource commitment that organizations will make. Upon completion of the construction of a data warehouse, organizations will see immediate and long-term positive gains. In addition to a high return on investment, organizations will benefit from enhanced business decisions, timely access to data, consistency of data, and increased system performance.

Return on Investment (ROI): ROI refers to the amount of increased revenue or decreased expense an organization will realize from any given use of money. Implementations of data warehouses and analytical applications have been found to provide substantial cost savings for organizations and have positive effects on an organization’s financial “bottom line.” According to a 2002 International Data Corporation (IDC) study, “The Financial Impact of Business Analytics”, analytics projects have been achieving a substantial impact on an organization’s financial state. The study found that business analytics implementations have generated a median five-year return on investment of 112% with a mean payback of 1.6 years. Of the organizations included in the study, 54% have had a return on investment of 101% or more.
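For readers unfamiliar with the two measures quoted above, the sketch below works through ROI and payback period using the common textbook definitions (net gain divided by cost, and cost divided by annual benefit). The dollar figures are invented for illustration and are not taken from the IDC study, whose exact methodology is not described here; the payback calculation also assumes benefits arrive evenly each year.

```python
def roi_percent(total_benefit, cost):
    """Return on investment: net gain as a percentage of the amount invested."""
    return (total_benefit - cost) / cost * 100


def payback_years(cost, annual_benefit):
    """Years needed for cumulative benefits to equal the investment,
    assuming the same benefit is realized every year."""
    return cost / annual_benefit


# Hypothetical project: 500,000 invested, 250,000 of benefit per year.
cost = 500_000
annual_benefit = 250_000
years = 5

five_year_roi = roi_percent(annual_benefit * years, cost)
payback = payback_years(cost, annual_benefit)

print(f"Five-year ROI: {five_year_roi:.0f}%")
print(f"Payback period: {payback:.1f} years")
```

In this invented case, five years of benefits total 1,250,000 against a 500,000 investment, giving a 150% return, and the investment is recovered after two years.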

Enhanced Business Decisions: Decisions that affect the strategy and operations of organizations will be based upon credible facts and will be backed up with evidence and data encapsulated within the organization. Managers and executives will be freed from making their decisions based on limited data and their own “gut feelings”. Moreover, decision makers will be well-informed, as they will be able to query actual organizational data and retrieve highly organized information tailored to their personal needs. Insights gained through improved information access can be applied directly to business processes to have an immediate impact on business results. In addition, data warehouses and related business intelligence have been used to improve marketing processes (e.g. campaign management), inventory management, financial management, and sales processes.

Timely Access to Data: Organizations will have access to data from many different sources as they need it, will spend little time in the retrieval process, and will query and analyze data as they need to. Scheduled routines, known as ETL, are set up within the data warehouse environment to assemble data from the disparate source systems and transform the data into a format that is useful for query and analysis. Decision makers can easily access data from one interface and will no longer need to compile data from multiple locations. Further, business users will be able to query data directly with less information technology support. The waiting time for information technology professionals to develop reports and queries is greatly diminished, as the business users are given the ability to generate reports and queries on their own. The use of query and analysis tools against a consistent and consolidated data source enables managers and analysts to spend more time performing valuable analyses and less time gathering data.

Consistency of Data: Data will be consolidated from many disparate source systems and converted into a standard format. Nomenclature and data formats in the various organizational units and departments will be standardized throughout the enterprise, and the inconsistent nature of data within the disparate operational systems will be removed. In addition, all organizational units, such as sales, marketing, and operations, will use the same data repository as the source for their individual queries and analysis, providing a single version of the truth. Thus, each of these individual organizational units will produce results that are consistent with the other organizational units within the enterprise, increasing the overall confidence in the resulting information.

System Performance: Data warehousing environments are built and organized with speed of data retrieval and analysis as their primary focus. The underlying structure is optimized for storing large volumes of data and querying them rapidly. The systems are engineered differently from operational systems, which focus on processing transactions. In contrast, the data warehouse is built for optimal analysis and retrieval of data rather than efficient creation and modification of data. Further, the data warehouse allows a large system burden to be taken off the operational environment and effectively distributes system load across the enterprise’s technology infrastructure.

Key Terms

• Business Intelligence: Application programs used to gather, store, analyze, and provide access to data to assist organizations in making better business decisions. Business intelligence applications include the activities of report development, query and analysis, performance management, online analytical processing (OLAP), and data mining.

• Business Reports: Highly formatted methods to display data with rich presentation and within a structured layout.

• Data mart: Repository of data used to analyze information for one organizational unit, business function, and/or department that typically answers a specific set of business questions within narrow confines.

• Data mining: Software applications that use advanced statistical techniques, models, and algorithms, together with sophisticated data search capabilities, to discover patterns and relationships in large volumes of data and predict future results for the organization.

• Enterprise data warehouse: Central repository of data for an entire organization that converts data collected from many disparate corporate systems into a format that is organized and optimized for retrieval and analysis of data.

• ETL (Extraction, Transformation, & Loading): Data integration processes that include extracting data from their operational or external data sources, transforming the data into an appropriate format, and loading the data into a data warehouse repository.

• Query and Analysis: Application environment that enables interactive methods to query data, present data in an ad-hoc manner, and find information on an as-needed basis.

• OLAP (Online Analytical Processing): Software applications that organize data in an enterprise data warehouse or data mart into information that can be rapidly viewed from many perspectives and multiple dimensions.


• Performance Management: Graphical user interface and real-time methods, including dashboards, scorecards, and alerts, that monitor organizational metrics and key performance indicators.

About Guident

Guident is a leading information technology services firm providing enterprise-consulting services to clients in the federal government and commercial industry sectors. Guident specializes in building and implementing solutions that provide rapid, sustainable value in the areas of Business Intelligence Solutions, Oracle Solutions, and Systems Integration. Guident was founded in 1996 on the belief in combining the disciplined methodologies of the "Big Four" consultancies with the flexibility and cost benefits of a smaller practice unit.

For more information visit www.guident.com
