
DEVELOPMENT OF REAL-TIME OLAP

ALGORITHM USING MULTICORE DISTRIBUTED

PROCESSING

BY

HAYTHAM I. M. ALZEINI

A dissertation submitted in fulfilment of the requirement

for the degree of Master of Science in Computer and

Information Engineering

Kulliyyah of Engineering

International Islamic University Malaysia

AUGUST 2014


ABSTRACT

Online analytical processing (OLAP) is becoming an increasingly essential technique, particularly for decision support systems (DSS). OLAP is considered a more suitable technology for online analysis than its counterpart, online transaction processing (OLTP), because it offers instantaneous answers to the immediate queries that decision makers urgently need in order to make decisions at critical moments, based on the latest updates of the warehouse. However, despite its fast processing capabilities, OLAP does not satisfy the stringent requirements of Real-Time applications; rather, current OLAP only approaches Real-Time. In other words, OLAP can deliver part of its results in Real-Time, while the rest is materialized. Our study addresses this shortcoming and proposes a novel solution that takes advantage of revolutionary hardware developments on two levels, namely multi-core processors and distributed heterogeneous processing systems. This new approach exploits the hardware resources optimally and, as a result, significantly increases the processing speed. Our results show a gain of 350% to 1200% in terms of response time compared to our benchmark, in which only the multi-core CPU is utilized. In addition, the results show that the gain increases proportionally with the size of the data, because the Graphical Processing Unit (GPU) becomes a more dominant component in the searching process as the data size increases. We argue that, with these results, the heterogeneous solution is a very strong candidate for solving our Real-Time OLAP problem.


ABSTRACT IN ARABIC

Online analytical processing (OLAP) has become an essential technology, especially in decision support systems. OLAP is a suitable technology for online analysis compared with its counterpart, online transaction processing (OLTP), because it provides instantaneous answers to the current queries that decision makers urgently need in order to make their decisions at critical moments, incorporating all updates of the data warehouse. Despite the speed that OLAP offers, it does not fully meet the requirements of Real-Time applications; rather, it approaches Real-Time. Our study discusses these shortcomings and the attempts made to address them, and offers a new solution that takes into account the revolutionary development of hardware on two levels: multi-core processors and hybrid processing systems. The new approach exploits the hardware resources optimally, and the results show a noticeable increase in response speed. The research results show a gain of 350% to 3300% in terms of response time compared with the previous work, which relied on multi-core processors only. In addition, the results show that the gain increases proportionally with the size of the data, owing to the growing dominance of the graphics processing unit (GPU) in the processing and searching operation. Consequently, we argue in light of the new results that hybrid systems are a strong candidate for solving the problem of Real-Time online analytical processing.


APPROVAL PAGE

I certify that I have supervised and read this study and that in my opinion, it conforms

to acceptable standards of scholarly presentation and is fully adequate, in scope and

quality, as a dissertation for the degree of Master of Science (Communications

Engineering)

…………………………………..

Shihab A. Hameed

Main Supervisor

………………………………….

Mohamed H. Habaebi

Co-Supervisor

I certify that I have read this study and that in my opinion it conforms to acceptable

standards of scholarly presentation and is fully adequate, in scope and quality, as a

dissertation for the degree of Master of Science (Communications Engineering)

…………………………………..

Aisha Hassan Abdalla Hashim

Internal Examiner

…………………………………..

Azizah Binti Abdul Manaf

External Examiner

This dissertation was submitted to the Department of Electrical and Computer

Engineering and is accepted as a fulfilment of the requirement for the degree of

Master of Science (Communications Engineering)

…………………………………..

Othman O. Khalifa

Head, Department of Electrical

and Computer Engineering

This dissertation was submitted to the Kulliyyah of Engineering and is accepted as a

fulfilment of the requirement for the degree of Master of Science (Communications

Engineering)

…………………………………..

Md. Noor B. Salleh

Dean, Kulliyyah of Engineering


DECLARATION

I hereby declare that this dissertation is the result of my own investigation, except

where otherwise stated. I also declare that it has not been previously or concurrently

submitted as a whole for any other degrees at IIUM or other institutions.

Haytham I. M. Alzeini

Signature…………………. Date …..................


INTERNATIONAL ISLAMIC UNIVERSITY MALAYSIA

DECLARATION OF COPYRIGHT AND AFFIRMATION

OF FAIR USE OF UNPUBLISHED RESEARCH

Copyright ©2014 by International Islamic University Malaysia. All rights reserved.

DEVELOPMENT OF REAL-TIME OLAP ALGORITHM USING MULTICORE

DISTRIBUTED PROCESSING

No part of this unpublished research may be reproduced, stored in a retrieval system,

or transmitted, in any form or by any means, electronic, mechanical, photocopying,

recording or otherwise without prior written permission of the copyright holder except

as provided below.

1. Any material contained in or derived from this unpublished research may

be used by others in their writing with due acknowledgement.

2. IIUM or its library will have the right to make and transmit copies (print

or electronic) for institutional and academic purposes.

3. The IIUM library will have the right to make, store in a retrieval system

and supply copies of this unpublished research if requested by other

universities and research libraries.

Affirmed by Haytham I. M. Alzeini

……..……..…………… …………………..

Signature Date


ACKNOWLEDGEMENTS

This work would not have been possible without the support and encouragement of my supervisor, Dr. Shihab A. Hameed, and my co-supervisor, Dr. Mohamed H. Habaebi. I would like to thank them both for guiding and helping me throughout the research, from the very beginning to the end.

I would like to thank all my teachers for the knowledge they provided during the coursework period; that knowledge has added a highly valuable touch to my research. Special thanks go to Dr. Aisha Hassan Abdalla, who lit the way of research for me.

I cannot end without thanking my amazing parents, Mr. Ibrahim Alzeini and Mrs. Zubaida Alkurd, who taught me the alphabet and how to count from one to ten. Every countable piece of knowledge I have obtained ever since is built upon that. Finally, my thanks go to my beloved Sharon Jones, whose opinions were always helpful, encouraging and supportive over the past eight months of hard work.


TABLE OF CONTENTS

Abstract
Abstract in Arabic
Approval Page
Declaration
Copyright Page
Acknowledgements
List of Tables
List of Figures
List of Abbreviations

CHAPTER ONE: INTRODUCTION
  1.1 OLAP Technology
    1.1.1 OLAP History
    1.1.2 OLAP vs. OLTP
  1.2 Problem Statement and Motivation
  1.3 Research Scope
  1.4 Study Objectives
  1.5 Research Methodology
  1.6 Contributions
  1.7 Thesis Breakdown

CHAPTER TWO: LITERATURE REVIEW
  2.1 Overview
  2.2 Materialization
    2.2.1 Cube and Sub-Cube Construction
    2.2.2 Compression
    2.2.3 View Selection
    2.2.4 Distributed OLAP
  2.3 Real-Time OLAP
    2.3.1 Multi-core OLAP Processing
    2.3.2 GPU OLAP Processing
    2.3.3 Heterogeneous OLAP Processing
      2.3.3.1 General Performance Enhancement
      2.3.3.2 OLAP Cube Creation
      2.3.3.3 OLAP Queries Improvement
      2.3.3.4 OLAP Memory Ameliorating
    2.3.4 Sequential OLAP Processing
      2.3.4.1 Intensive Computing
      2.3.4.2 Multidimensional Complex Patterns
  2.4 OLAP State of the Art Summary
  2.5 String Search Algorithms
  2.6 Aho-Corasick Algorithm
    2.6.1 Aho-Corasick Time Complexity
  2.7 Boyer-Moore Algorithm
    2.7.1 Shift Rules
      2.7.1.1 Bad Character Shifting Rule
      2.7.1.2 Good Suffix Shifting Rule
    2.7.2 Boyer-Moore Time Complexity
  2.8 Knuth-Morris-Pratt Algorithm
    2.8.1 Knuth-Morris-Pratt Time Complexity
  2.9 Rabin-Karp Algorithm
    2.9.1 Rabin-Karp Time Complexity
  2.10 Summary
    2.10.1 Rabin-Karp Justification

CHAPTER THREE: OLAP SERVER SCHEMA DESIGN
  3.1 Overview
  3.2 Distributed OLAP System (high level design and workflow)
    3.2.1 High Level Schema Design
    3.2.2 High Level Workflow
  3.3 Heterogeneous OLAP System (low level design and workflow)
    3.3.1 Rabin-Karp Algorithm
    3.3.2 Optimized Rabin-Karp Algorithm
    3.3.3 Low Level Workflow
      3.3.3.1 CPU Tasks Workflow
      3.3.3.2 GPU Tasks Workflow
      3.3.3.3 Upper and Lower Thresholds
  3.4 Summary

CHAPTER FOUR: IMPLEMENTATION AND RESULTS
  4.1 Overview
  4.2 Experimental Setup
    4.2.1 Hardware Specifications
      4.2.1.1 Sony VPCSB36FG
      4.2.1.2 Dell XPS 8700
    4.2.2 Software Specifications
    4.2.3 The Code
  4.3 Experiments Flowchart
  4.4 Empirical Results
  4.5 Upper and Lower Thresholds
  4.6 Summary

CHAPTER FIVE: CONCLUSION AND RECOMMENDATIONS
  5.1 Conclusion
  5.2 Future Work
    5.2.1 Sequential OLAP
    5.2.2 OLAP Physical Data Structure
    5.2.3 OLAP and Big Data
    5.2.4 ETL Architecture for Real-Time OLAP and Streaming Data

REFERENCES

APPENDIX A: THE CODE
APPENDIX B: GLOSSARY
APPENDIX C: LIST OF PUBLICATIONS


LIST OF TABLES

1.1 Sample sales spreadsheet
1.2 Comparison between OLTP and OLAP systems
2.1 State of the art conclusion and criticism
2.2 Algorithms comparison from OLAP point of view

LIST OF FIGURES

1.1 OLAP and OLTP interaction
1.2 OLAP operations
1.3 Sample cube
1.4 Three-dimensional cube
1.5 Four-dimensional cube
1.6 Four-dimensional cube
1.7 Building four-dimensional cube
1.8 Five-dimensional cube
1.9 Research Methodology
2.1 Bad Character Shifting match (Boyer-Moore)
2.2 Bad Character Shifting Mismatches (Boyer-Moore)
2.3 Good Suffix Shifting Matches (Boyer-Moore)
2.4 Good Suffix Mismatches (Boyer-Moore)
2.5 Boyer-Moore rules example
2.6 KMP Algorithm’s First Shift
2.7 KMP Algorithm’s Second Shift
2.8 KMP Algorithm First Match
2.9 KMP Algorithm Second Match
3.1 Distributed Enterprise
3.2 Distributed OLAP Schema
3.3 Frontend and Backend OLAP Servers Schema
3.4 Rabin-Karp Algorithm
3.5 Pseudo-code for Optimized Rabin-Karp Algorithm
3.6 Quad-Core Intel CPU
3.7 Simplified AMD GPU architecture
3.8 Interaction between GPU and CPU
3.9 Optimized Rabin-Karp algorithm workflow
4.1 Response Time Calculation Code
4.2 Experiments Flowchart
4.3 Phase (1): 4KB patterns recognition time – VPCSB
4.4 Phase (2): 8KB patterns recognition time – VPCSB
4.5 Phase (3): 16KB patterns recognition time – VPCSB
4.6 Phase (1): 4KB patterns recognition time – XPS 8700
4.7 Phase (2): 8KB patterns recognition time – XPS 8700
4.8 Phase (3): 16KB patterns recognition time – XPS 8700
4.9 Threshold β and the comparative gain
4.10 Proportional gain of Heterogeneous Rabin-Karp algorithm – VPCSB
4.11 Proportional gain of Heterogeneous Rabin-Karp algorithm – XPS 8700
4.12 Achieved Gain Using Heterogeneous RK Algorithm Comparison

LIST OF ABBREVIATIONS

BAA. Blending-As-Aggregation
BIA. Business Intelligence Accelerator
CPU. Central Processing Unit
CUDA. Compute Unified Device Architecture
DB. Database
DFA. Deterministic Finite Automaton
DPA. Data Placement Advisor
DSS. Decision Support System
DW. Data Warehouse
ETL. Extract, Transform, and Load Process
FAR. Fragment Aggregation and Recombination
FSM. Finite State Machine
GA. Genetic Algorithm
GFS. Google File System
GPU. Graphical Processing Unit
I3DC. Interactive Three-Dimensional Cubes
MPI. Message Passing Interface
MQT. Materialized Query Tables
OLAP. Online Analytical Processing
OLTP. Online Transactional Processing
RDBMS. Relational Database Management Systems


CHAPTER ONE

INTRODUCTION

1.1 OLAP TECHNOLOGY

Online analytical processing (OLAP) is, by definition, a category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user. OLAP functionality is characterized by dynamic multidimensional analysis of consolidated enterprise data supporting end-user analytical and navigational activities, including calculations and modeling applied across dimensions, through hierarchies and/or across members, trend analysis over sequential time periods, slicing subsets for on-screen viewing, drill-down to deeper levels of consolidation, rotation to new dimensional comparisons in the viewing area, and so on (OLAP council 2013). OLAP offers a new approach to storing dimensional relational data: the data are stored in ‘multi-dimensional cubes’ instead of traditional tables. This makes it possible to visualize the data in a multi-dimensional manner, so that they can be seen from different points of view, and to apply four main operations to these cubes (slice, dice, drill down and roll up), which are elaborated in the following sections.
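The four cube operations named above can be illustrated with a minimal sketch. It assumes a tiny sparse cube held in a Python dictionary; the dimensions (product, city, quarter), the sample values and all function names are hypothetical and are not taken from this study. Drill-down is approximated here by re-aggregating with one more dimension kept, whereas real OLAP tools typically drill along a hierarchy within a dimension.

```python
from collections import defaultdict

# Axis positions inside each cell key (hypothetical dimensions).
PRODUCT, CITY, QUARTER = 0, 1, 2

# A tiny sparse cube: (product, city, quarter) -> sales amount (made-up data).
CUBE = {
    ("milk", "Penang", "Q1"): 10,
    ("milk", "Penang", "Q2"): 12,
    ("milk", "Johor", "Q1"): 20,
    ("bread", "Penang", "Q1"): 5,
    ("bread", "Johor", "Q2"): 8,
}


def slice_cube(cube, axis, value):
    """Slice: keep only the cells where one dimension equals a single value."""
    return {key: v for key, v in cube.items() if key[axis] == value}


def dice_cube(cube, allowed):
    """Dice: keep the cells whose coordinates fall inside the allowed value sets."""
    return {
        key: v
        for key, v in cube.items()
        if all(key[axis] in values for axis, values in allowed.items())
    }


def roll_up(cube, keep_axes):
    """Roll up: sum the measure over every dimension not listed in keep_axes."""
    totals = defaultdict(int)
    for key, v in cube.items():
        totals[tuple(key[axis] for axis in keep_axes)] += v
    return dict(totals)


by_city = roll_up(CUBE, (CITY,))                  # coarse view: sales per city
by_city_quarter = roll_up(CUBE, (CITY, QUARTER))  # drill down: add quarter back
q1_only = slice_cube(CUBE, QUARTER, "Q1")
milk_in_penang = dice_cube(CUBE, {PRODUCT: {"milk"}, CITY: {"Penang"}})
```

Keeping the cube sparse (only non-empty cells are stored) is a design choice of this sketch; a dense array would be an equally valid representation for low-cardinality dimensions.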

OLAP resolves several issues in IT and offers solutions to many of the difficulties encountered, including slow query results for online transactions and the lack of flexibility in data projection. The major goal was to reduce on-the-fly processing by preprocessing every possible query, so that the data appear instantaneously whenever the user runs one of these queries. What strengthens this technology is its ability to present statistical information, identify unusual patterns and explore trends by analyzing multidimensional cubes of data interactively. However, owing to the high complexity of the data structures in the business and accounting sectors, formulating complex queries has not been an easy task for either IT specialists or ordinary users (Chen, Dehne, Eavis, 2008). Querying becomes even harder when drilling down through more than three or four levels of related tables.
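The idea of preprocessing every possible query can be sketched as follows. This is only an illustration under stated assumptions: the fact rows, the three dimensions and the helper names `materialize_all` and `query` are hypothetical, and the sketch precomputes a sum for every subset of dimensions, which yields 2^n aggregate tables for n dimensions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (product, city, quarter, sales).
ROWS = [
    ("milk", "Penang", "Q1", 10),
    ("milk", "Johor", "Q1", 20),
    ("bread", "Penang", "Q2", 5),
]
N_DIMS = 3  # the last column of each row is the measure


def materialize_all(rows, n_dims):
    """Precompute the summed measure for every subset of the dimensions."""
    cuboids = {}
    for size in range(n_dims + 1):
        for axes in combinations(range(n_dims), size):
            totals = defaultdict(int)
            for row in rows:
                totals[tuple(row[a] for a in axes)] += row[n_dims]
            cuboids[axes] = dict(totals)
    return cuboids


CUBOIDS = materialize_all(ROWS, N_DIMS)


def query(axes, key):
    """Answer a group-by query with a dictionary lookup instead of a scan."""
    return CUBOIDS[axes].get(key, 0)


total_sales = query((), ())                     # grand total
penang_sales = query((1,), ("Penang",))         # sales in one city
milk_q1_sales = query((0, 2), ("milk", "Q1"))   # product by quarter
```

The lookup is fast, but the precomputed tables grow exponentially with the number of dimensions and become stale as soon as new rows arrive, which is one way to see why fully materialized results fall short of a Real-Time requirement.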

Nonetheless, OLAP technology has been considered a breakthrough in the data mining field, and OLAP is placed in the same category of IT as data mining, relational databases and so forth. Typical OLAP applications include, but are not limited to, reporting, process management, budgeting, market and weather forecasting, medical applications and intelligence applications in the business, finance, healthcare and military fields. However, we can argue that most, if not all, OLAP applications serve one goal: decision support, as provided by decision support systems (DSS), which encompass a wide range of systems and applications belonging to the aforementioned IT categories (Chen, Dehne, Eavis, 2008).

1.1.1. OLAP History

The seeds of the OLAP idea can be seen in the early 1960s, when Kenneth Iverson introduced the first multidimensional programming language, APL (A Programming Language), which offered processing operators and multidimensional variables. However, owing to its hardware resource requirements, the APL market declined significantly, even though some of its ideas still surface in a few modern OLAP tools. Although the term OLAP only appeared in the 1990s (Gray, Chaudhuri, Bosworth, Layman, Reichart, Venkatrao, Pellow, Pirahesh, 1997) (in 1993, to be specific, introduced by Edgar F. Codd, the father of the relational database), the first commercial tool can be traced back to the 1970s, when the first market-related product, Express, was released. Indeed, Oracle9i OLAP by Oracle is one of the main successors of Express. Essbase was the first commercial OLAP product of the 1990s to be marketed under the new term; it was followed by many other products, with strong growth in the late 1990s. SSAS by Microsoft, released in 1998, was one of the main products.

By enabling users to manipulate transactional data in relational databases smoothly and dynamically within genuinely multidimensional environments, PowerOLAP, released in 1997 by PARIS Technologies, was considered a big leap. The product has been described as a milestone in the evolution of OLAP and a revolutionary event, as it was used with Excel and the Web to connect users across an organization. The OLAP@Work Excel Add-In allows users to take full advantage of OLAP services with all their features. In 2004, this feature went mainstream and many vendors launched their own versions concurrently.

Today, OLAP has a wide range of applications in numerous industries. Regardless of the commercial version being used, the idea of OLAP still dominates the data mining field.


1.1.2. OLAP vs. OLTP

OLTP (online transaction processing) is characterized by a huge number of short online transactions. These include INSERT, DELETE and UPDATE operations, which are used on a daily basis, mostly to support business applications (Seagat series 2002). OLAP, on the other hand, is characterized by a relatively low number of complex queries, which usually involve multidimensional aggregations. OLAP is used for long-term business planning and decision support. Figure 1.1 shows the interaction between the two technologies.

Figure 1.1: OLAP and OLTP interaction
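The contrast between the two workloads can be made concrete with a small sketch using Python's built-in sqlite3 module and an in-memory database. The `sales` table, its columns and its rows are hypothetical; the point is only that the OLTP-style statements each touch a single row, while the OLAP-style query aggregates over the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, product TEXT, city TEXT, amount INTEGER)"
)

# OLTP-style workload: many short statements, each touching one row.
conn.executemany(
    "INSERT INTO sales (product, city, amount) VALUES (?, ?, ?)",
    [
        ("milk", "Penang", 10),
        ("bread", "Penang", 5),
        ("milk", "Johor", 7),
        ("milk", "Johor", 20),
    ],
)
conn.execute("UPDATE sales SET amount = ? WHERE id = ?", (12, 1))
conn.execute("DELETE FROM sales WHERE id = ?", (3,))
conn.commit()

# OLAP-style workload: one complex, read-only query that aggregates
# the measure along a dimension.
totals_by_city = conn.execute(
    "SELECT city, SUM(amount) FROM sales GROUP BY city ORDER BY city"
).fetchall()
conn.close()
```

Here both workloads run against the same table for brevity; in the arrangement described above, the analytical queries would normally run against a separate warehouse fed from the transactional system.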

In contrast to OLTP, which is integrated with databases, OLAP has a completely different data structure that is based on cubes rather than traditional tables. This cube-based approach to storing data offers the essential advantage that data can be analyzed and conclusions drawn by applying certain cube operations. Figure 1.2 shows these operations, which include drill-down, slice, dice and roll-up. Although this mechanism causes a delay owing to the enormous amount of data and the cube processing operations, OLAP has an advantage over OLTP in terms of processing results. To present the OLAP vs. OLTP issue clearly, the drawbacks of OLTP relative to OLAP can be summarized in the following points (Gray et al, 1997):

i. Limited data storage: retrieving tens of thousands of rows in order to analyze and present them in a short time is a challenge, particularly when many users use the same application at the same time and each user retrieves completely different rows.

ii. Data versus information: businesses demand answers. Users need information, that is, data accompanied by metadata, and an OLAP application gives such answers on the basis of analysis that responds within a few seconds.

iii. Data layout: OLAP applications store data in a way that serves large numbers of questions in parallel. This requires working with aggregated data in order to answer high-level questions. In other words, to overcome the drawbacks of OLTP it is better to use different techniques than to spend more money on bigger and faster databases.

Figure 1.2: OLAP operations


All OLAP tools use the same high-level concept to present data, namely the multidimensional cube, since cubes are easy to imagine and keep in mind. Cubes differ considerably from traditional databases: the cube is both a conceptual design and a logical design. To understand the cube, it helps to compare it with a database table. Table 1.1 (Pedersen, Jensen, Dyreson, 2001) is a useful tool for analyzing sales data. A pivot table is a two-dimensional spreadsheet with associated subtotals and totals, which makes complex views easier to understand, although it still has only two axes. Hence traditional tables (spreadsheets) are not sufficient for managing and storing multidimensional data, since they cannot separate the real information (the structural view) from the desired view of the information. Multidimensional databases view data as cubes, which generalize spreadsheets to any number of dimensions (Pedersen et al, 2001).

Table 1.1 Sample sales spreadsheet

Figure 1.3 shows the same information as Table 1.1 in cube form. In general, a cube supports viewing two or three dimensions simultaneously; however, it can show up to four low-cardinality dimensions using nesting. The dimensionality is reduced further at query time by projecting the data down to 2D or 3D through aggregation of the values, e.g. in order to view sales by city and time.


Figure 1.3: Sample cube

Although the term cube implies three dimensions, a cube can theoretically have any number of dimensions. Indeed, most real-world cubes have four or more dimensions.

Although imagining a multidimensional cube can be somewhat difficult, understanding its advantages helps. Suppose that we have a few three-dimensional cubes, each with the same structure, and that we need another measure, say the day’s trading. We then need to merge them, and to do so we create a fourth dimension. Such a cube is not easy to draw; however, it is not difficult to appreciate the integrity of the design. Most of the literature suggests that we can simply retrieve and work with such a cube without needing to draw or imagine it in its entirety, since most applications present only a two-dimensional view of the data even when the actual cube has more than three dimensions. Here, a few figures suffice to show how a multidimensional cube can be imagined and drawn. Figure 1.4 shows a three-dimensional cube, while Figures 1.5 and 1.6 show the same information with the additional fourth dimension in two different ways (Seagat series 2002). Figures 1.7 and 1.8 (Hybercube geometry, 2013) depict the construction of multidimensional cubes in a spatial manner.
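The construction described above, in which several same-shaped three-dimensional cubes are merged by adding a fourth dimension, can be sketched in a few lines. The sketch assumes each daily cube is a dictionary keyed by (product, city, seller); these dimensions, the sample values and the function names are hypothetical. It also shows the projection back down to a two-dimensional view that most applications present.

```python
from collections import defaultdict

# Axis positions in the merged four-dimensional cube (hypothetical dimensions).
DAY, PRODUCT, CITY, SELLER = 0, 1, 2, 3

# Two three-dimensional cubes with the same structure, one per trading day.
monday = {("milk", "Penang", "Ali"): 3, ("bread", "Johor", "Mei"): 4}
tuesday = {("milk", "Penang", "Ali"): 5}


def add_dimension(cubes_by_label):
    """Merge same-shaped cubes into one cube with an extra leading dimension."""
    merged = {}
    for label, cube in cubes_by_label.items():
        for key, value in cube.items():
            merged[(label,) + key] = value
    return merged


def project(cube, keep_axes):
    """Project onto fewer dimensions by summing over the dropped ones."""
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[tuple(key[axis] for axis in keep_axes)] += value
    return dict(totals)


cube4d = add_dimension({"Mon": monday, "Tue": tuesday})
by_day_and_city = project(cube4d, (DAY, CITY))  # a two-dimensional view
by_city = project(cube4d, (CITY,))              # a one-dimensional view
```

Nothing in the code needs to "draw" the four-dimensional cube: each cell is simply addressed by one more coordinate, which is the sense in which the design extends to any number of dimensions.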

Figure 1.4: Three-dimensional cube

Figure 1.5: Four-dimensional cube

Figure 1.6: Four-dimensional cube

Figure 1.7: Building four-dimensional cube

Finally, we summarize the differences between OLAP and OLTP in Table 1.2 (Mauve, Fuessler, Widmer, Lang, 2011).

Table 1.2: Comparison between OLTP and OLAP systems

Feature | OLTP | OLAP
Characteristic | operational processing | informational processing
Orientation | transaction | analysis
User | clerk, DBA, database professional | manager, executive, analyst
Function | day-to-day operations | long-term informational requirements, decision support
DB design | E-R based, application-oriented | star/snowflake, subject-oriented
Data | current | historical
Summarization | primitive, highly detailed | summarized, consolidated
View | flat relational | multidimensional
Unit of work | short, simple transaction | complex query
Access | read/write | mostly read
Focus | data in | information out
DB size | 100 MB to GB | 100 GB to TB
Priority | high performance, availability | high flexibility
Metric | transaction throughput | query throughput, response time

Figure 1.8: Five-dimensional cube

1.2 PROBLEM STATEMENT AND MOTIVATION

Generally, OLAP works with data warehouses whose size ranges from gigabytes to, in certain instances, petabytes (10^15 bytes). Analyzing and processing data of this size therefore requires powerful processing capabilities. Moreover, most OLAP applications are time-sensitive and critical (e.g. medical and financial applications), which entails the need for a very short response time. In particular cases, processors therefore have to read and process billions of rows in a few seconds or less. Many enhancements have been proposed, stemming from different diagnoses; they can be divided into two main streams: materialization-oriented solutions and hardware-oriented solutions. It has been explained elsewhere why materialization does not meet all our optimization criteria, especially the Real-Time requirement (Alzeini, Hameed, Habebi, 2013). Thus, the problem is how to achieve a Real-Time OLAP application using the existing resources, and optimally taking