© 2013 IBM Corporation IBM Confidential BAFEDM2: Fundamentals of Enterprise Data Management Week 01

MELJUN CORTES Fundamentals of Enterprise Data Management Week 01


BAFEDM2: Fundamentals of Enterprise Data Management

Week 01


Agenda


Course Overview
• Introduction to the Course
• Setting of Course Objectives
• Administrative Matters
• Project Introduction

Module 1: Introduction to Data Warehousing
• What is a Data Warehouse
• Why a Data Warehouse


Course Overview

BAFEDM2: Fundamentals of Enterprise Data Management


Course Description

The course introduces students to the fundamentals of database management systems and of enterprise data management using a data warehouse, which can then be used for data mining, reporting, and data analysis. It describes the activities involved in data mining tasks, such as anomaly detection, association rule learning, clustering, classification, regression, and summarization. The course also introduces students to formalized means of organizing and storing structured and unstructured data in an organization.


Course Objectives

The course will enable the student to:
• Understand database management systems
• Describe the process of data discovery and data patterns in large data sets
• Understand various methods related to intersection of artificial intelligence, machine learning, statistics, and database systems
• Understand various techniques related to data extraction and data pre-processing before using for data modeling
• Understand the concept of master data management (MDM)
• Describe data inference considerations, interestingness metrics, complexity considerations
• Understand various techniques used for post-processing of discovered structures and visualization


Course Objectives (continued)

Describe the importance of data warehouses for reporting and data analysis and understand how they differ from operational data sources

Describe formalized means of organizing and storing of documents and other content in an organization related to the organization’s processes

Describe the need and policy around data security and privacy and techniques to restrict information from unauthorized access, use, disclosure, disruption, modification, perusal, inspection, recording or destruction

Describe online fraud and its consequences and understand predictive analytics for detection of fraudulent activities


Learning Outcomes

Upon completion of this course, the student should be able to:
• Understand data management concepts and criticality of data availability in order to make reliable business decisions
• Demonstrate understanding of business intelligence including the importance of data gathering, data storing, data analyzing and accessing data
• Describe where to look for data in an organization and create required reports
• Understand the functions and data access constraints of various departments within an organization and provide compliance reports
• Perform high-quality tasks required by the organization in particular, and the industry in general


Course Modules

Module 1: Introduction to Data Warehousing
In this module, the business reasons behind undertaking a data warehousing project are outlined and the framework of data warehouse architecture is explained. The key factors to be considered while developing a data warehouse are analyzed, and the goals of data warehousing are arrived at.

Module 2: Data Warehouse Design Considerations
In this module, a realistic approach to modeling a business process is given. It aims at answering questions on how to model complex hierarchies, how to determine the granularity of data required by a business, and other design considerations during dimensional modeling.


Course Modules (continued)

Module 3: Extract, Transform and Loading Process
This module expounds on the different ETL architectures and strategies available, and how to choose the best strategy for any business environment.

Module 4: Measuring the Effectiveness of a Data Warehouse
This module stresses the importance of being able to measure the progress and quality of a data warehouse in effectively meeting business goals.

Module 5: Data Security
This module describes the need for and policy around data security and privacy, and techniques to restrict information from unauthorized access and use.


Course Outline and Timeframe

Week 01 (3.0 hours)
Course Overview
• Introduction to the Course
• Setting of Course Objectives
• Administrative Matters
• Project Introduction
Module 1: Introduction to Data Warehousing
• Data Evolution
• What is a Data Warehouse
• Data Warehouse vs Business Analytics
• Why a Data Warehouse
• The Goals of a Data Warehouse

Week 02 (3.0 hours)
• Framework of the Data Warehouse
• Data Warehouse Options

Week 03 (3.0 hours)
Module 2: Data Warehouse Design Considerations
• Data Models
• The Dimensional Model
• Facts and Dimensions
• Four-Step Dimensional Design Process
• Case Study: Retail


Course Outline and Timeframe (continued)

Week 04 (3.0 hours)
• Case Study: Education
• Case Study: Communications
• Dimensional Modeling Best Practices
• Project Identification and Sign-Off (Data Model)

Week 05 (3.0 hours)
Project Consultation

Week 06 (3.0 hours)
• Long Exam 1 (Modules 1 to 2)
• Project Consultation

Week 07 (3.0 hours)
Module 3: Extract, Transform and Loading Process
• Extract Processing
• Transform and Prepare for Load
• Load Process

Week 08 (3.0 hours)
Module 4: Measuring the Effectiveness of a Data Warehouse
• First Step: Measure
• Next Step: Manage and Improve
• Project Development (ETL Process, Measures)

Week 09 (3.0 hours)
Project Consultation


Course Outline and Timeframe (continued)

Week 10 (3.0 hours)
• Definition of Acceptable Performance
• Capacity Planning for the Data Warehouse

Week 11 (3.0 hours)
Module 5: Data Security
• Industry Standards on Data Security
• Securing the Data Warehouse
• Project Development (Data Security)

Week 12 (3.0 hours)
Project Consultation

Week 13 (3.0 hours)
• Long Exam 2 (Modules 3 to 5)
• Project Consultation

Week 14-15 (6.0 hours)
Project Presentation Dry-Run

Week 16-17 (6.0 hours)
Finals: Project Presentation

Week 18 (3.0 hours)
• Final Project Submission
• Course Debrief


Readings

“Data Warehousing: Design, Development and Best Practices” by Soumendra Mohanty

“Building the Data Warehouse” by William H. Inmon

“The Data Warehouse Toolkit” by Ralph Kimball

“The Data Warehouse Lifecycle Toolkit” by Ralph Kimball


Course Requirements

Class Participation
• Lectures and class discussions
• Reading and written assignments
• Long exams

Project
• The class will be divided into groups of 3 or 4. A designated “Project Sponsor” will be assigned to each group.
• Each group will be asked to put together a data warehouse design for the analytics project that they will put up in BAFBAN1.
• The deliverables will be discussed and built upon as the course progresses. Checkpoints may occur during the course to check the progress of the deliverables.
• The goal of the group is to get approval from their “Project Sponsor” for their project, which would mean that they will secure funding for it.
• The deliverables will be submitted to the instructor(s) at the appointed time as indicated in the course timeframe. The instructor(s) will then distribute the deliverables to the “Project Sponsor”.
• Each group will be given 1.0 hour to conduct the presentation to their “Project Sponsor”.
• The final grade of the group will be determined by a point system.


Grading System

Breakdown of Marks

Exams
• Quizzes: 10%
• Long Exams: 20%
• Case Analysis: 20%

Project
• Quality of Deliverables*: 25%
• Presentation: 10%
• Final Output: 10%
• Contribution Rating (peer evaluation): 5%

*Consider project ranking or extra merit.


Grading System (continued)

Grading Scale

Final Mark | Numerical Equivalent | Quality Point Equivalent
A | 92 to 100 | 3.76 to 4.00
B+ | 87 to 91 | 3.31 to 3.75
B | 83 to 86 | 2.81 to 3.30
C+ | 79 to 82 | 2.31 to 2.80
C | 76 to 78 | 1.81 to 2.30
D | 70 to 75 | 1.00 to 1.80
F | Below 70 | Below 1.00
W | Overcut | Overcut
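As a rough illustration of how the breakdown of marks and the grading scale combine, the Python sketch below computes a weighted score and looks up the final mark. The component names, the 0 to 100 input scores, and the rounding to a whole number are assumptions made for illustration; the slides do not say how fractional scores are handled, and the "W" mark is not modeled.

```python
# Weights taken from the Breakdown of Marks slide (they sum to 100%).
WEIGHTS = {
    "quizzes": 0.10,
    "long_exams": 0.20,
    "case_analysis": 0.20,
    "deliverables": 0.25,
    "presentation": 0.10,
    "final_output": 0.10,
    "contribution": 0.05,
}

# Lower bound of each band on the Grading Scale slide, highest first.
SCALE = [(92, "A"), (87, "B+"), (83, "B"), (79, "C+"), (76, "C"), (70, "D")]


def final_score(scores):
    """Weighted sum of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def final_mark(score):
    """Map a numerical score to a final mark (rounding is an assumption)."""
    rounded = round(score)
    for lower_bound, mark in SCALE:
        if rounded >= lower_bound:
            return mark
    return "F"
```

For example, a student who scores 90 on every component has a weighted score of about 90, which falls in the 87 to 91 band.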


Classroom Policies

Attendance will be checked at the start of the sessions. Students are allowed to miss a maximum of nine (9) class hours for this course. Hours missed due to tardiness will be counted towards this maximum number.

Deadlines will be strictly enforced. Deliverables received after the designated deadlines will not be checked.

Graded work will be returned to the students within a reasonable period of time. One week after the release of graded work, students are allowed to appeal for changes of grade. Beyond this period, appeals will no longer be entertained.

Make-up activities may be given only to students who have missed or are unable to complete or undertake a major class requirement due to:
• Participation in an official school activity
• Illness which involves hospitalization or contagious diseases
In either case, students are required to present proper documentation prior to taking the make-up exam.


Classroom Policies (continued)

Students are not allowed to eat or drink inside the classrooms. If students should choose to eat dinner or any snack during the break, they must take their food outside the classroom.

Students are required to turn off their mobile devices before the start of class. Any device that goes off during class may be confiscated. A first offense is punishable with a warning. A second offense may be subject to disciplinary proceedings.

Students should come to class in proper attire. Students not in proper attire will not be allowed inside the classroom.

Other rules and general academic policies will apply.


Module 1: Introduction to Data Warehousing

BAFEDM2: Fundamentals of Enterprise Data Management


Data Evolution

Evolving data to information to knowledge to action creates the outcome that has the most impact and value to an organization.

“Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime.” This is what Business Intelligence is really about!

[Figure: progression from Data to Information to Knowledge to Action to Outcome (Impact and Value), split into “Done by Software” and “Done by People”. Labels in the figure: Descriptive, Quantitative, Qualitative; Facts, Metrics; Recall, Instincts, Experience, Belief; Insight, Resolve, Decision, Innovation; Achievement, Discovery.]


Data Evolution (continued)

Data: is composed of individual discrete facts that capture descriptive, quantitative, and qualitative values of business interest. Data warehousing involves three types of data:
• Run-the-Business Data: produced by corporate applications, such as the one used to fill customer orders for products or the one used to manage financial transactions.
• Integrate-the-Business Data: built to improve the quality of and synchronize two or more applications, such as a master list of customers.
• Monitor-the-Business Data: presented to end users for reporting and decision support, such as financial dashboards.

Information: is an organized collection of data presented in a specific and meaningful way.

Knowledge: encompasses the familiarity, awareness, understanding, and perceptions of a person about a given subject.

Action: is the process of doing something; effective action is the process of doing the right thing.


What is a Data Warehouse


A data warehouse is a subject-oriented, integrated, non-volatile, time-variant collection of data in support of management’s decisions.


In a data warehouse, information used for analysis is organized around subjects (e.g., employees, accounts, sales, products) rather than activities.

Integrated data refers to de-duplicating information and merging it from many sources into one consistent definition (e.g., when shortlisting the top banks in the country, you must know that “BPI” and “Bank of the Philippine Islands” are one and the same).

Since the information in a data warehouse is heavily queried against time, it is extremely important to preserve the data pertaining to each and every business event of an organization.

Time-referenced data essentially refers to its time-valued characteristic (e.g., what were the total sales of Product A for the past three years).
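The integrated and time-referenced properties can be sketched in a few lines of Python. The alias table, the sample records, and the function names below are all hypothetical. They only illustrate mapping source-specific names onto one consistent definition, and then answering a question over a range of years.

```python
from collections import defaultdict

# Hypothetical alias table: every source-specific spelling maps to one
# canonical name, as in the "BPI" example above.
BANK_ALIASES = {
    "bpi": "Bank of the Philippine Islands",
    "bank of the philippine islands": "Bank of the Philippine Islands",
}


def canonical_name(raw_name):
    """Return the single agreed name for a bank, or the cleaned input if unknown."""
    cleaned = raw_name.strip()
    return BANK_ALIASES.get(cleaned.lower(), cleaned)


def integrate(records):
    """Merge (bank, amount) rows from several sources under canonical names."""
    totals = defaultdict(float)
    for raw_name, amount in records:
        totals[canonical_name(raw_name)] += amount
    return dict(totals)


def total_sales(sales, product, first_year, last_year):
    """Sum sales of one product over an inclusive range of years."""
    return sum(
        amount
        for year, name, amount in sales
        if name == product and first_year <= year <= last_year
    )


# Made-up rows from two source systems that spell the same bank differently.
deposits = [("BPI", 100.0), ("Bank of the Philippine Islands", 50.0)]
print(integrate(deposits))

# Made-up, time-stamped sales facts: (year, product, amount).
sales = [
    (2010, "Product A", 5.0),
    (2011, "Product A", 7.0),
    (2012, "Product A", 9.0),
    (2013, "Product A", 11.0),
    (2012, "Product B", 4.0),
]
print(total_sales(sales, "Product A", 2011, 2013))
```

Because every sales fact keeps its year, the three-year question is just a filter and a sum. Without the alias table, the two deposit rows would be reported as two different banks.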


What is a Data Warehouse (continued)

A data warehouse is a powerful database model that significantly enhances the user’s ability to quickly analyze large, multidimensional data sets. It cleanses and organizes data to allow users to make business decisions based on facts.

A data warehouse is a collection of integrated, subject-oriented databases designed to support the decision support function, where each unit of data is relevant to some moment in time.

A data warehouse is a repository of data summarized or aggregated in simplified form from operational systems. End-user-oriented data access and reporting tools let users get at the data for decision support.
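The idea of summarizing operational data into a simpler form can be sketched with Python's built-in sqlite3 module. The table names, columns, and sample rows are hypothetical, and a real warehouse load would involve far more than a single query; this only shows detailed order rows being aggregated into one row per month and product, which a reporting tool could then read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical operational table: one row per order.
    CREATE TABLE orders (order_date TEXT, product TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('2013-01-05', 'Product A', 100.0),
        ('2013-01-20', 'Product A', 50.0),
        ('2013-02-03', 'Product A', 75.0),
        ('2013-01-11', 'Product B', 20.0);

    -- Hypothetical warehouse table: one summarized row per month and product.
    CREATE TABLE monthly_sales AS
        SELECT substr(order_date, 1, 7) AS month,
               product,
               SUM(amount) AS total_amount,
               COUNT(*) AS order_count
        FROM orders
        GROUP BY substr(order_date, 1, 7), product;
""")

# A report reads the summarized table rather than the operational one.
rows = conn.execute(
    "SELECT month, product, total_amount, order_count "
    "FROM monthly_sales ORDER BY month, product"
).fetchall()
for row in rows:
    print(row)
```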


Data Warehouse vs Business Analytics

Data Warehousing
• is a way of storing data and creating information through leveraging data marts. Data marts are segments or categories of information and/or data that are grouped together to provide insights into that segment or category. A data warehouse does not require business intelligence to work. Reporting tools can generate reports from the data warehouse.

Business Analytics
• is the leveraging of a data warehouse to help make business decisions and recommendations. Information and data rules engines are leveraged here to help make these decisions along with statistical analysis tools and data mining tools.
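A data mart, as described above, can be pictured as one segment of the warehouse grouped together for reporting. The short Python sketch below is only an illustration under that reading; the row layout, the "subject" field, and the function names are hypothetical.

```python
# Hypothetical warehouse rows; "subject" marks the segment each row belongs to.
warehouse = [
    {"subject": "sales", "region": "North", "amount": 120.0},
    {"subject": "sales", "region": "South", "amount": 80.0},
    {"subject": "sales", "region": "North", "amount": 30.0},
    {"subject": "hr", "region": "North", "amount": 3.0},
]


def build_data_mart(rows, subject):
    """Group together only the rows that belong to one segment."""
    return [row for row in rows if row["subject"] == subject]


def report_by_region(mart):
    """A plain report: total amount per region, with no analytics involved."""
    totals = {}
    for row in mart:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


sales_mart = build_data_mart(warehouse, "sales")
print(report_by_region(sales_mart))
```

The report here is a simple total, which matches the point that reporting tools can work directly from the warehouse. Statistical or data mining tools would sit on top of the same data for business analytics.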


Why a Data Warehouse: Need and Advantages

Operational Efficiency
• Make the right information available at the right time
• Manage data volumes and business complexities

Compliance and Transparency
• Leverage the value of data across the enterprise
• Adhere to federal rules and regulations

Information Integration
• Manage information complexity with data integration
• Manage information technology costs

Competitive Differentiation
• Outperform the competition rather than just stay in business
• Use analytics to gain insight into the information within data

Data Governance
• Manage data as an asset
• Make data secure and reliable


Why a Data Warehouse: Implementation Challenges

Cost in Building the Data Warehouse
• Operational cost for a data warehouse
• Maintenance cost for a data warehouse

Big Bang Approach
• Justifying return on investment (ROI) on data warehouse development
• Time required to build and get results in a data warehouse

Architectural Challenges
• Manage information consistency with architectural changes
• Manage information complexity with data consolidation

Data Governance Issues
• Data ownership issues
• Data integration, consistency, and quality issues

Business Complexities
• Adaptability of a data warehouse to ever-changing business scenarios
• Business complexities due to mergers and acquisitions


The Goals of a Data Warehouse


"We have mountains of data in this company, but we can't access it."

"We need to slice and dice the data every which way."

"You've got to make it easy for business people to get at the data directly."

"Just show me what is important."

"It drives me crazy to have two people present the same business metrics at a meeting, but with different numbers."

"We want people to use information to support more fact-based decision making."


The Goals of a Data Warehouse (continued)

The data warehouse must make an organization's information easily accessible.

The data warehouse must present the organization's information consistently.

The data warehouse must be adaptive and resilient to change.

The data warehouse must be a secure bastion that protects our information assets.

The data warehouse must serve as the foundation for improved decision making.

The business community must accept the data warehouse if it is to be deemed successful.


For the Next Session

BAFEDM2: Fundamentals of Enterprise Data Management


For the Next Sessions

Agenda
• Framework of the Data Warehouse
• Data Warehouse Options