Business Intelligence


BI Fundamentals

Business Transactions

Databases

Data Warehouses

Data Marts

Data Mining

Databases (DB or DBMS)

“A collection of information organized in such a way that a computer program can quickly select desired pieces of data.”

An electronic filing system

Organized by
  Fields: a single piece of information
  Records: one complete set of fields
  Files: a collection of records

Data warehouse (DW)

“Contain a wide variety of data that present a coherent picture of business conditions at a single point in time.”

“A database system which contains periodically collected samples or summarized (aggregated) transactional data; e.g., daily totals, or monthly averages”

Typically a compilation of information from multiple transactional databases
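As a rough illustration of "summarized (aggregated) transactional data", the sketch below rolls individual transactions up into daily totals and monthly averages. The record layout, the function names and the figures are invented for this example; a real warehouse would normally do this with scheduled database jobs rather than in-memory Python.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactional records: (transaction date, sale amount).
transactions = [
    (date(2024, 1, 1), 120.0),
    (date(2024, 1, 1), 80.0),
    (date(2024, 1, 2), 50.0),
    (date(2024, 2, 1), 300.0),
]

def daily_totals(rows):
    """Sum the amounts of all transactions that share a calendar day."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[day] += amount
    return dict(totals)

def monthly_averages(rows):
    """Average the daily totals within each (year, month)."""
    per_month = defaultdict(list)
    for day, total in daily_totals(rows).items():
        per_month[(day.year, day.month)].append(total)
    return {month: sum(v) / len(v) for month, v in per_month.items()}

daily = daily_totals(transactions)
monthly = monthly_averages(transactions)
```

The warehouse would then store `daily` and `monthly` instead of (or alongside) the raw transactions, which is what makes later reporting queries cheap.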

Data mart

“A database, or collection of databases, designed to help managers make strategic decisions about their business.”

A smaller and more focused form of a data warehouse.

Usually created for a particular department or position

A data mart created as a subset of data warehouse data is referred to as a “dependent data mart”.

Data mining

“A class of database applications that look for patterns in data to be used to predict and direct future behavior.”

Increasingly being used by marketers to find consumer data through the web and store purchases.


What is BI?

The new technology for understanding the past and predicting the future

A broad category of technologies that allows for
  Gathering, storing, accessing and analyzing data to help business users make better decisions
  Analyzing business performance through data-driven insight

A broad category of applications, which includes the activities of
  Decision support systems
  Query and reporting
  OLAP
  Statistical analysis, forecasting and data mining

BI vs. AI

AI systems make decisions for the users

BI systems help users make the right decisions, based on the available data

However, many BI techniques have roots in AI

BI Processes


Data-Information-Knowledge-Decision Making Cycle

What does BI seek to find?

Patterns

What kind of patterns?
  Sales
  Stocks
  Anything useful

Techniques for Finding Patterns

Statistics
  Trends
  Correlation
  (searching for a best fit)
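To make "searching for a best fit" concrete, here is a minimal sketch that fits a least-squares trend line and computes the Pearson correlation coefficient. The helper names and the monthly sales figures are made up for illustration, and ordinary least squares is only one of several ways to fit a trend.

```python
from math import sqrt

def best_fit_line(xs, ys):
    """Least-squares slope and intercept of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def pearson_correlation(xs, ys):
    """Pearson r: covariance scaled by the spread of both variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

# Hypothetical data: month number and units sold in that month.
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = best_fit_line(months, sales)
r = pearson_correlation(months, sales)
```

With this perfectly linear toy data the fitted line is `sales = 2 * month + 8` and the correlation is 1; real sales data would give a noisier fit and a value of `r` somewhere between -1 and 1.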

Patterns continued

Combinatorial
  If-then relationships

Example
  If we put chips on sale on a Friday, then we also sell more soda.
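An if-then pattern such as the chips-and-soda example is usually scored with two numbers, support and confidence. The sketch below computes both for the rule "chips -> soda"; the baskets are invented, and the helper names are illustrative rather than taken from any particular tool.

```python
# Hypothetical Friday shopping baskets; each set is one customer's purchase.
baskets = [
    {"chips", "soda"},
    {"chips", "soda", "salsa"},
    {"chips"},
    {"soda"},
    {"bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimate of P(consequent | antecedent) from the baskets."""
    both = support(antecedent | consequent, baskets)
    return both / support(antecedent, baskets)

rule_support = support({"chips", "soda"}, baskets)
rule_confidence = confidence({"chips"}, {"soda"}, baskets)
```

Here 2 of 5 baskets contain both items (support 0.4) and 2 of the 3 chip buyers also bought soda (confidence about 0.67). A rule is typically kept only if both values clear thresholds chosen by the analyst.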

Leading the Industry

Cognos
  BI software company

Software
  Used for reporting, analysis, scorecarding, dashboards, business event management, and data integration

Cognos

Multiple Solutions

Industry
  Banking
  Education
  Defense
  Government

Department
  Executive Management
  Finance
  Marketing

Open Source Tools for BI

ETL (Extract, Transform, Load) tools
OLAP (Online Analytical Processing) servers
OLAP clients
DBMSs (Database Management Systems)

ETL Tools

Bee
  ROLAP (Relational OLAP) oriented ETL tool

CloverETL
  ROLAP oriented ETL tool
  Implemented in Java and uses JDBC to transfer data
  cloveretl.berlios.de

Octopus
  ROLAP oriented ETL tool
  Implemented in Java and uses JDBC
  octopus.objectweb.org
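The tools above are Java programs that move data over JDBC. As a language-neutral sketch of the three ETL stages themselves, the following Python example extracts rows from CSV text, transforms them, and loads them into a relational table. It uses Python's standard `csv` and `sqlite3` modules as stand-ins; the sample data, table name and function names are assumptions made for this example and do not reflect how Bee, CloverETL or Octopus are implemented.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real job would read files or source databases.
RAW_CSV = """date,store,amount
2024-01-01,north,120.50
2024-01-01,south, 80.00
2024-01-02,north,not_available
"""

def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize store names and drop non-numeric amounts."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows that cannot be loaded as numbers
        clean.append((row["date"], row["store"].strip().upper(), amount))
    return clean

def load(rows, conn):
    """Load: write the cleaned rows into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (day TEXT, store TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT day, store, amount FROM sales ORDER BY store"
).fetchall()
```

Keeping the three stages as separate functions mirrors how ETL tools let each step be configured and rerun independently.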


OLAP Servers

Bee
  ROLAP oriented server
  Uses MySQL to manage the DB
  sourceforge.net/products/bee/

Lemur
  HOLAP oriented server
  www.nongnu.org/lemur

Mondrian
  ROLAP oriented server
  Implemented in Java
  Can be used with any DBMS
  sourceforge.net/projects/mondrian/

OLAP Client

Bee
  Web-based, used with Bee OLAP server
  Generates pie charts, bar charts, etc. (in 2D & 3D)
  Exports data to Excel, PDF, PNG, PowerPoint, XML

JPivot
  Web-based, used with Mondrian OLAP server
  Generates 2D & 3D graphics
  Exports data to PDF
  jpivot.sourceforge.net

DBMSs

MonetDB
  Runs on Linux, Windows, Mac OS, etc.
  monetdb.cwi.nl

MySQL
  Runs on Linux, Windows
  www.mysql.com/products/mysql

MaxDB
  Formerly SAP DB (by SAP AG)
  Runs on Linux, Windows
  www.mysql.com/products/maxdb

PostgreSQL
  www.postgresql.org
  Runs on Linux, Unix, Windows (version 8.0 or later)

PALO OLAP

Palo OLAP Server
  http://www.jedox.com/
  Open source MOLAP server
  Can be installed locally or in a company network

Palo ETL Server
  Enables the efficient extraction of mass data from heterogeneous data sources, i.e. all common relational database systems and flat files

Palo OLAP Client
  http://www.jpalo.com/en/
  Two versions: Palo Client and Palo Web Client

Data Mining Software

Open source
  Borgelt data mining suite
  Gnome Data Mine
  Weka
  RapidMiner

Commercial
  See5 (Rulequest)
  Clementine (SPSS)
  Enterprise Miner (SAS)
  GhostMiner (Fujitsu)
  Statistica Data Miner (StatSoft)
  Oracle Data Miner (Oracle)

Borgelt Data Mining Suite

Tasks:
  Association: apriori, eclat
  Classification: Bayesian networks, decision trees, naive Bayes
  Regression: neural networks
  Clustering: self-organizing maps (SOM)

Platforms: Linux, Unix, MS Windows

Website: http://fuzzy.cs.uni-magdeburg.de/~borgelt/software.html

Gnome Data Mine

Tasks:
  Association: apriori
  Classification: decision trees

Platforms: Linux, Unix, MS Windows

Website: http://www.togaware.com/datamining/gdatamine

Owner: Togaware, Canberra, Australia.

WEKA

Tasks:
  Association: apriori
  Classification: decision trees, support vector machines, conjunctive rules
  Clustering: k-means

Platforms: Linux, Unix, MS Windows

Website: http://www.cs.waikato.ac.nz/ml/

Owner: University of Waikato, Hamilton, New Zealand
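For readers unfamiliar with the k-means clustering task listed above, the following is a minimal one-dimensional sketch of the basic algorithm: assign each value to its nearest centroid, then move each centroid to the mean of its cluster, and repeat. It is a teaching example written in plain Python, not WEKA's implementation; the function name, the starting centroids and the customer spend figures are all assumptions.

```python
def k_means_1d(values, centroids, iterations=10):
    """Basic k-means on plain numbers, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters

# Hypothetical weekly spend per customer.
spend = [5.0, 6.0, 7.0, 50.0, 52.0, 54.0]
centroids, clusters = k_means_1d(spend, centroids=[0.0, 60.0])
```

On this toy data the customers split into a low-spend group centred on 6 and a high-spend group centred on 52. Results in general depend on the starting centroids, which is why practical tools usually try several initializations.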


RapidMiner

http://rapid-i.com/

The world-leading open-source system for knowledge discovery and data mining

Multiplatform: implemented in Java

Supports about 400 data mining operators

Who uses BI?

Businesses
The Government
?
?

What are some ethical implications of the use of BI?