BUSINESS INTELLIGENCE & DATAWAREHOUSING · BUSINESS INTELLIGENCE & DATAWAREHOUSING Professor: JOSÉ CURTO DÍAZ ... Agile Data Warehouse Design: Collaborative Dimensional Modeling,

BUSINESS INTELLIGENCE & DATAWAREHOUSING

Professor: JOSÉ CURTO DÍAZ
E-Mail: [email protected]

Academic Background
Adjunct professor, Social and Behavioral Area, IE Business School
Adjunct professor, Information Systems Area, IE Business School
Associate professor, EIMT Department, UOC
Part-time professor, U-TAD
Part-time professor, IEB
Part-time professor, EOI
Part-time professor, KSchool
Associate professor, Department of Economics, Universidad Autonoma de Barcelona (Barcelona: 2001 – 2005)
Ph.D in Network and Information Technologies (in process), Universitat Oberta de Catalunya
Digital Marketing, IE Business School
International Executive MBA, IE Business School
Master in Executive Management of IT Systems, Universitat Oberta de Catalunya
Master in Business Intelligence, Universitat Oberta de Catalunya
Web and e-commerce postgraduate studies, Universitat Oberta de Catalunya
B.S. in Mathematics, Universitat Autonoma de Barcelona
Author of books / academic material: “Introduccion al Business Intelligence”, “Data Warehousing”, “Customer Analytics” and “Data-driven Organizations” (Spanish)

Corporate Experience
CEO, Delfos Research (UK)
Co-founder and Data Strategist, Questionity (Spain: 2011 – 2015)
Independent Research Analyst (UK: 2013 – 2014)
Senior Research Analyst, IDC Research (Spain: 2010 – 2013)
Business Intelligence Manager, ICNET Consulting (Spain: 2008 – 2010)
Solutions Manager, Stratebi Business Solutions (Spain: 2007 – 2008)
PLM Analyst, Thales Group (Spain, France: 2006 – 2007)
Java Programmer, EDS (Spain: 2005 – 2006)

Published by IE Publishing Department.
Last revised, November 2016.


LEARNING OBJECTIVES
In recent years, organisations have aligned technology and business in order to adapt to a challenging and hyper-competitive market. In a situation where organisations in many industries offer similar products and use similar technologies, business processes are established as one of the last points of differentiation. Also, data is becoming the new stronghold on which to leverage the differentiation of business processes. As a result, more and more companies use different analytical strategies to improve decision-making, optimise processes and create data-driven products.

However, as in many other areas, technology alone is not sufficient to generate competitive advantage. For example, decision-making requires both art (based on experience and intuition) and science (based on analysis) to cope with the increasing complexity of the market. Understanding how to use data and analytics to create competitive advantage can be the difference between the success and failure of a business.

This course consists of twelve sessions and is designed to introduce Business Intelligence and Data Warehousing: the first steps to transform a company into a data-driven organization.

This course has several objectives:

Introducing Business Intelligence (BI) and Data Warehousing (DW) to the student
Learning and practicing some BI & DW techniques required for the data scientist's toolkit: data warehousing, ETL and dashboards
Understanding how to develop a BI and DW project from beginning to end
Identifying the main roles and tasks in BI and DW

MATERIALS
During this course the following resources will be used across the different sessions:

Book: Kimball, R. and Ross, M. (2013) The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling (3rd Edition). Indiana: Wiley Publishing
Book: Linstedt, D. and Olschimke, M. (2015) Building a Scalable Data Warehouse with Data Vault 2.0. Waltham: Morgan Kaufmann
Software: Tableau (Analysis), Pentaho Data Integration (ETL) and MySQL/MySQL Workbench (Relational Database, Modeller)
Data set(s): the professor will provide the required data sets to be used during this course.

Further readings or complementary cases are specified in every session. Specific software documentation and instructions will be provided at the beginning of the course.


PROGRAM

SESSION 1 (FACE TO FACE)

What is Business Intelligence and Data Warehousing

Business Intelligence & Data Warehousing
Relationship between concepts
Why Business Intelligence & Data Warehousing are needed
BI in the Age of Big Data
Examples of companies
Benefits and Challenges for a Company
Market: situation and players

The case must be read in advance as it will be discussed in class.

M.D.: Business Intelligence (SI1-131-I-M)
M.D.: Big Data (SI2-107-I-M)
P.C.: Business Intelligence Software at SYSCO (604080-PDF-ENG)
R.A.: You May Not Need Big Data After All (R1312F-PDF-ENG)

SESSION 2

Components of Business Intelligence and Data Warehousing

Data Warehousing Components: Data Mart, Data Warehouse, Operational Data Store, Staging Area, Data Integration, Metadata
Data Warehouse architecture
Data Warehouse Design methodologies: Kimball vs. Inmon vs. Data Vault
Business Intelligence Components: Data Warehouse, Platform, OLAP, Reporting, Dashboards, Scorecards, Balanced Scorecard, Analytics and Alerts

The group project will be presented in this session.

B.C.: Chapter 1: Data Warehousing, Business Intelligence, and Dimensional Modeling Primer (The Data Warehouse Toolkit)

SESSION 3

Mastering data warehouse design (I)

Definition of facts, dimensions and metrics
Types of facts, dimensions and metrics
Discussion of several examples
Tool: MySQL, MySQL Workbench

B.C.: Chapter 2: Kimball Dimensional Modeling Techniques Overview (The Data Warehouse Toolkit)


SESSION 4

Mastering data warehouse design (II)

How to design a data warehouse
Practice: Designing a data warehouse
Tool: MySQL, MySQL Workbench

The "Caterpillar Case" is the individual assignment. Every student must present the proposed solution one week after session 4.

B.C.: Chapter 18: Dimensional Modeling Process and Tasks (The Data Warehouse Toolkit)
P.C.: Caterpillar Tunnelling: Revitalizing User Adoption of Business Intelligence (W13513-PDF-ENG)

SESSION 5

Mastering ETL design (I)

Data Integration: techniques and technologies
Introduction to the 34 ETL Subsystems
ETL development Lifecycle
Tool: Pentaho Data Integration

B.C.: Chapter 19: ETL Subsystems and Techniques (The Data Warehouse Toolkit)

SESSION 6

Mastering ETL design (II)

Data extraction
Cleansing and Conforming
Handling dimension tables
Loading Fact Tables
Practice: How to design an ETL process
Tool: Pentaho Data Integration

B.C.: Chapter 20: ETL System Design and Development Process and Tasks (The Data Warehouse Toolkit)

SESSION 7

Mastering ETL design (III)

Practice: How to design an ETL process
Tool: Pentaho Data Integration

B.C.: Chapter 20: ETL System Design and Development Process and Tasks (The Data Warehouse Toolkit)


SESSION 8

Data Governance

What is Data Governance
Principles of Data Governance
How to implement a Data Governance program
Technologies in Data Governance

SESSION 9

Mastering Analysis (I): Reporting

Reporting: definition, types, elements
Types of metrics
Types of graphs (how to choose a graph)
Practice: how to analyze data
Tool: Tableau Software

SESSION 10

Mastering Analysis (II): Dashboard

Dashboard: definition, types, elements
Dashboard vs. Scorecard vs. Balanced Scorecard
Practice: how to analyze data
Tool: Tableau Software

SESSION 11

Mastering Analysis (III)

Data Visualization
Data Storytelling
How Data Visualization can benefit reports and dashboards
Practice: how to analyze data
Tool: Tableau Software

SESSION 12

The value of BI & DW

Session with companies. Some companies will present real business cases where Business Intelligence helped to create competitive advantages. The students will have the opportunity to ask questions and discuss any topic of interest related to the course.

EVALUATION METHOD
The evaluation consists of three workgroup assignments, one individual assignment and class participation. The three practical activities are linked and belong to the same Business Intelligence project.


Criteria                                      Score %
Data Warehouse Design                         25%
ETL Design                                    25%
Analysis                                      20%
Case Discussion (s1) & Class Participation    10%
Individual Case                               20%
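
As an illustration of how the weights in the table combine, the following minimal sketch computes a weighted final grade. The weights come from the table; the function name `final_grade`, the 0-10 scoring scale and the sample scores are assumptions made only for this example and are not stated in the syllabus.

```python
# Weights taken from the evaluation criteria table (percent of the final grade).
WEIGHTS = {
    "Data Warehouse Design": 25,
    "ETL Design": 25,
    "Analysis": 20,
    "Case Discussion (s1) & Class Participation": 10,
    "Individual Case": 20,
}


def final_grade(scores):
    """Return the weighted average of the per-criterion scores.

    `scores` maps every criterion name in WEIGHTS to a numeric score.
    The scale of the scores is an assumption (0-10 in the example below).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


# Hypothetical scores on an assumed 0-10 scale, for illustration only.
example_scores = {
    "Data Warehouse Design": 8,
    "ETL Design": 7,
    "Analysis": 9,
    "Case Discussion (s1) & Class Participation": 10,
    "Individual Case": 6,
}

print(final_grade(example_scores))
```

With these hypothetical scores, the three workgroup activities together account for 70% of the result, which is why they dominate the final grade.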

Students are expected to attend every class and to participate in the class discussions. Class participation grades are based on two aspects: your attendance in class and your contributions to the class discussions. Contributions to discussions will focus on the quality, not the quantity of the contribution; therefore students who participate often will not necessarily receive a better grade than those who participate less often. One must recognize, however, that there is an art to quality participation that is only learned by trial and error.

Therefore, students are encouraged to begin contributing to the discussions early in the course. As the value of this course stems from class discussion, participation and practice, your attendance at class sessions is critical to learning the material and to enhancing the discussions. Therefore, your participation grade will include your class attendance. If you are unable to attend a class, please call the instructor prior to the class period to let him know. If you must miss a session, you may write and submit a THREE-page analysis of the issues discussed in the references (chapters, cases, technical notes or articles) in order to avoid penalizing your participation grade. It is due by the beginning of the next class and no late write-ups will be accepted.

Each student in the class is required to participate in a working team. Working teams, therefore, will serve as a forum where students test and refine their analysis of the topic addressed. The working teams may be particularly useful in providing students with a sense of their increasing expertise in the application of research and problem-solving skills and methodologies that are developed by a "student-centered" learning approach.

Workgroup - Data Warehouse Design: 25%

Each group must present a design of a data warehouse based on the case to be presented at the second session. This design will be based on the methodologies learned in sessions 3, 4 and 5. The student will use a dataset to be provided at the beginning of the course. The purpose of this activity is to acquire the necessary skills for the development of data warehouses. More detailed information will be presented in Session 5. This activity must be delivered before session 6.

Workgroup - ETL Design: 25%

From the data warehouse model presented in the previous activity, each group must create ETL processes for loading data into the data warehouse. This design will be based on the topics learned in sessions 6, 7 and 8. The purpose of this activity is to acquire the necessary skills for data warehousing. More detailed information will be presented in Session 8. This activity must be delivered before session 9.

Workgroup - Analysis: 20%

Each team will have to propose one analysis (reporting, dashboard, data visualization, data storytelling) related to the dataset and data mart presented in session 8. The dashboard will be the final step in the development of the Business Intelligence project. This design will be based on the methodologies learned in sessions 9, 10 and 11. More detailed information will be presented in Session 9. This activity must be delivered two days after session 12.

Individual Case: 20%

Each student will propose a solution for a business case related to business intelligence. This case will focus on the managerial aspects of this kind of project and its impact on the organization.


REFERENCES
https://ie.on.worldcat.org/courseReserves/course/id/10041209

Course Readings

Kimball, R. and Ross, M. (2013) The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling (3rd Edition). Indiana: Wiley Publishing

Business Intelligence

Atre, S. and Moss, L. (2003) Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications. Boston: Addison-Wesley Professional
Devlin, B. (2013) Business unIntelligence: Insight and Innovation beyond Analytics and Big Data. New Jersey: Technics Publications
Loshin, D. (2012) Business Intelligence: The Savvy Manager's Guide (The Morgan Kaufmann Series on Business Intelligence) (2nd Edition). New York: Morgan Kaufmann
Howson, C. (2013) Successful Business Intelligence, Second Edition: Unlock the Value of BI & Big Data. New York: McGraw-Hill Osborne Media

Data Warehousing

Adamson, C. (2006) Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance. New York: Wiley
Becker, B., Ross, M., Thornthwaite, W., Mundy, J. and Kimball, R. (2008) The Data Warehouse Lifecycle Toolkit. Indiana: Wiley Publishing
Caserta, J. and Kimball, R. (2004) The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data. Indiana: Wiley Publishing
Corr, L. (2011) Agile Data Warehouse Design: Collaborative Dimensional Modeling, from Whiteboard to Star Schema. Leeds: DecisionOne Press
Koncilia, C. and Wrembel, R. (2006) Data Warehouses and OLAP: Concepts, Architectures and Solutions. Hershey: IGI Global
Thomsen, E. (2002) OLAP Solutions: Building Multidimensional Information Systems (2nd Edition). New York: Wiley
Venerable, M. (1998) Data Warehouse Design Solutions. New York: Wiley

Balanced Scorecard

Kaplan, R. and Norton, D. (1996) The Balanced Scorecard: Translating Strategy into Action. Boston: Harvard Business Review Press
Niven, P. (2014) Balanced Scorecard Evolution: A Dynamic Approach to Strategy Execution. Indiana: Wiley Publishing

Project Management (in the context of Business Intelligence and Data Warehousing)

Collier, K. (2011) Agile Analytics: A Value-Driven Approach to Business Intelligence and Data Warehousing (Agile Software Development Series). Boston: Addison-Wesley Professional
Moss, L. (2013) Extreme Scoping: An Agile Approach to Enterprise Data Warehousing and Business Intelligence. New Jersey: Technics Publications
Corr, L. (2011)
Reeves, L.L. (2009) A Manager’s Guide to Data Warehousing. Indiana: Wiley Publishing

Visualization

Cairo, A. (2012) The Functional Art: An introduction to information graphics and visualization. Berkeley: PeachPit Press, a division of Pearson Education
Eckerson, W. (2010) Performance Dashboards: Measuring, Monitoring, and Managing Your Business (2nd Edition). New York: Wiley


Few, S. (2012) Show Me the Numbers: Designing Tables and Graphs to Enlighten (2nd Edition). Burlingame: Analytic Press
Few, S. (2009) Now You See It: Simple Visualization Techniques for Quantitative Analysis. Burlingame: Analytic Press
Few, S. (2013) Information Dashboard Design: Displaying Data for At-a-Glance Monitoring, Second Edition. Burlingame: Analytic Press

Beyond Business Intelligence

Krishnan, K. (2013) Data Warehousing in the Age of Big Data (The Morgan Kaufmann Series on Business Intelligence). New York: Morgan Kaufmann
Laursen, G. and Thorlund, J. (2010) Business Analytics for Managers: Taking Business Intelligence Beyond Reporting. New York: Wiley
