Reviewing features of Data Warehouse Architectures: A Taxonomy
Qishan Yanga and Tai Mai
Linz, Austria, June 11, 2018



Introduction

A data warehouse (DWH) is a subject-oriented, integrated, non-volatile and time-variant collection of data that supports decision-making in organisations [1]. Over decades of development and innovation, DWH architectures have been extended into a variety of derivatives to meet the different business or technical requirements of organisations.


Problem Statement

[Figure: example features such as "Data Flow" and "Staging Area", captioned "Confused!!!"]

Presentation Notes
Data warehouse architectures (DWHAs) typically contain numerous features related to essential aspects of a DWH, such as its components, functions and usability. This disorder can leave experts and system architects unsure of what to consider when designing, building, evaluating and improving a DWHA.

Motivation

To the best of our knowledge, no existing work in the literature collects and classifies features for data warehouse architecture evaluation in a general and systematic way.

This gap motivates us to investigate DWHA features, which will benefit DWHA evaluations from different perspectives. This research conducts a systematic literature review to collect and classify data warehouse architecture features.


Theoretical Background

The Data Warehouse Architecture is a principled system with fundamental properties and rational relationships for data manipulation, storage and analysis, which includes the Source Layer, ETL Layer, Data Warehouse Layer and Data Presentation Layer [2] [3] [4].

A feature of the DWHA is a distinguishing characteristic that reflects an attribute or component constituting an important portion of the DWHA [5] [6].

Presentation Notes
The Data Warehouse Architecture is the structure that brings all the components of a data warehouse together; it has three major areas: data acquisition, data storage, and information delivery [2]. The architecture is based on a data-layering concept, in which three distinct structures can be identified: the single-layer, two-layer and three-layer architectures [3]. A data warehouse architecture (DWHA) has been developed for the purpose of integrating data from multiple heterogeneous, distributed, and autonomous external data sources (EDSs) as well as for providing means for advanced analysis of integrated data. The major components of this architecture include an external data source (EDS) layer, an extraction-transformation-loading (ETL) layer, a data warehouse (DW) layer, and an on-line analytical processing (OLAP) layer [4].

The Institute of Electrical and Electronics Engineers defines the term feature in IEEE 829 as "A distinguishing characteristic of a software item (e.g., performance, portability, or functionality)" [5]. The definition of feature from [6] is "a typical quality or an important part of something". It has a variety of synonyms: characteristic, attribute, quality, property, trait, mark, hallmark, trademark; aspect, facet, side, point, detail, factor, ingredient, component, constituent, element, theme; peculiarity, idiosyncrasy, quirk, oddity.

Hypothesis

Features of data warehouse architectures can be categorised into several classes to provide guidelines for better requirement understanding and efficient evaluation from different perspectives.

"What requirements should I consider?"

"I only care about components of the DWHA."


Research Method

The systematic literature review follows the process described in [7, 8].

1. Research question
2. Identify concepts for searching
3. Searching
   i. Databases: IEEE, dblp, Google Scholar.
   ii. Year range: 2008 - 2018.
   iii. Keywords: “Data Warehouse Architecture, feature and its synonyms”
   iv. Papers are screened by reading the title and abstract to identify those potentially eligible for inclusion.
4. Feature extraction
   i. Collect the features mentioned in each selected paper.
   ii. Remove duplicated features.
5. Feature classification
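Step 4 of the method (collect the features mentioned in each selected paper, then remove duplicates) can be sketched in a few lines of Python. This is only an illustration: the slides do not say how duplicates were identified, so the normalisation rule (lower-casing and collapsing whitespace) is an assumption, and the paper identifiers and feature lists are hypothetical.

```python
from typing import Dict, List


def normalise(feature: str) -> str:
    """Lower-case and collapse whitespace (an assumed notion of 'duplicate')."""
    return " ".join(feature.lower().split())


def unique_features(papers: Dict[str, List[str]]) -> List[str]:
    """Collect features from every paper and keep the first spelling of each."""
    seen: Dict[str, None] = {}  # dict keys preserve insertion order
    for features in papers.values():
        for feature in features:
            seen.setdefault(normalise(feature), None)
    return list(seen)


# Hypothetical input, made up for illustration only.
papers = {
    "paper_a": ["Meta data", "ETL", "data mart"],
    "paper_b": ["meta  data", "security", "ETL"],
}

print(unique_features(papers))
```

In practice, duplicates in a literature review often include synonyms that a string comparison cannot catch, so a manual pass would likely still be needed.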


Initial Results

In total, 247 features were extracted and collected. After duplicates were removed, 148 unique features remained. Some of them are listed below.

user meta data ETL management cycle OLAP external data
operational data data flow schema functionality internal data data mart information interdependence
interface security urgency cost unstructured data data age strategic
respond time team skill integration area view of the data warehouse usability data storage
component direct cost
information delivery component efficiency data access tool distributed operational data
quality compatibility with existing systems size
business requirement technical issues reliability availability development skill computing budget portability
information systems vendor stability labor usage vendor reputation vendor experience transformation tool development approach
vendor support DW administration warehouse engine data model project team data staging component data quality check
indirect cost lifecycle of data operational system operational data store
source of sponsorship query database support


Initial Results

These features are categorised into six groups, which together form the data warehouse architecture feature taxonomy.

Features
    Technical: development technology
    Informational: meta data
    Competent: ETL
    Functional: Query
    Componential: …
    Organisational: …
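One possible way to hold the six groups as a data structure is a mapping from category name to example features. The sketch below assumes the category/example pairings as they appear on the slide; the examples for Componential and Organisational are elided there, so they are left empty. The helper `category_of` is a hypothetical name introduced only for this illustration.

```python
from typing import Dict, List, Optional

# Category -> example features, as read from the slide (pairings assumed).
TAXONOMY: Dict[str, List[str]] = {
    "Technical": ["development technology"],
    "Informational": ["meta data"],
    "Competent": ["ETL"],
    "Functional": ["query"],
    "Componential": [],    # examples elided on the slide
    "Organisational": [],  # examples elided on the slide
}


def category_of(feature: str) -> Optional[str]:
    """Return the category listing this feature, or None if it is not listed."""
    wanted = " ".join(feature.lower().split())
    for category, examples in TAXONOMY.items():
        if wanted in (example.lower() for example in examples):
            return category
    return None


print(category_of("ETL"))
print(category_of("security"))
```

A lookup like this only covers features that have already been assigned; classifying a new feature remains a judgement call for the reviewer.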


Conclusion

This research conducts a systematic literature review to collect and classify data warehouse architecture features. The contributions are summarised below:

1. It organises and provides features for people who want to investigate or evaluate data warehouse architectures.
2. Various features are provided, which would benefit requirement collection.
3. A taxonomy of DWHA features will be proposed as a guideline for further evaluation (To-Do).


References

[1] Inmon, W.H., 1997. What is a data warehouse?. Prism Tech. Topic 1(1).
[2] Laney, D., 2000. Warehouse factors to address for success. HP Professional, 14(5).
[3] Chan, A., Crane, G., MacGregor, I., Meyer, S. and Sass, R., 2001. WWW-based Accelerator Data Warehouse.
[4] Wrembel, R., 2011. On handling the evolution of external data sources in a data warehouse architecture. In Integrations of data warehousing, data mining and database technologies: innovative approaches (pp. 106-147). IGI Global.
[5] ANSI/IEEE STD 829-1983, 1983. Standard for Software Test Documentation. Institute of Electrical and Electronics Engineers, New York.
[6] Cambridge Dictionary, (2017) Meaning of “feature” in the English Dictionary. Available at: https://en.oxforddictionaries.com/definition/feature [Accessed 18 May 2017].
[7] Okoli, C. and Schabram, K., 2010. A guide to conducting a systematic literature review of information systems research.
[8] Webster, J. and Watson, R.T., 2002. Analyzing the past to prepare for the future: Writing a literature review. MIS Quarterly, pp.xiii-xxiii.


Thanks For Your Attention!