praburam-chandrasekaran
8/2/2019 MIS - 7 [Compatibility Mode]
Data Management
Data
A necessity for almost any enterprise to carry out its business. Consists of raw facts which, when organized, may be transformed into information.
Database
A collection of data organized to meet users' needs.
Database Management System (DBMS)
A group of programs that manipulate the database and provide an interface between the database and the user of the database or other application programs.
Brief History of Data Management
Early DBMSs (late 1960s) evolved from file-based processing systems
Visualize the data much as it was stored
Tree-based (hierarchical model)
[Figure: hierarchical tree with nodes DEPTS, EMPS, MGR, ITEMS, NAME, SS#]
Advent of Modern DBMS
Early 1970s: Ted Codd invented a new data model (the relational data model) and the concept of data abstraction
Soon thereafter, a team of IBMers invented SQL (Structured Query Language)
Became the de facto standard for query languages based on the relational data model
Commercial DBMSs based on the relational model are now widely accepted in industry, e.g., Microsoft Access, Oracle 9i, Sybase Adaptive Server
A >20 billion dollar industry!
Data Hierarchy in a Computer System
File Management Systems: a physical interface
The Traditional Approach: Separate files are created and stored for each application program.
[Figure: diagram with labels Student Admin, Student Data, Year List, Timetable Scheduler, Course Data, Payroll, Lecturer Data, Cheques]
File Management Systems: Sharing
[Figure: diagram with labels Student Admin, Student Data, Year List, Timetable Scheduler, Course Data, Payroll, Lecturer Data, Cheques]
Problems with the Traditional File Environment
Data redundancy
Program-Data dependence
Lack of flexibility
Poor security
Lack of data-sharing and availability
The Contemporary Database Environment
The Database Approach: A pool of related data is shared by multiple programs. Rather than having separate data files, each application uses a collection of data that is either joined or related in the database.
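The contrast with the traditional file approach can be sketched with Python's standard-library sqlite3 module. The table names, column names, sample rows, and the two "application" functions below are hypothetical, chosen only to echo the student administration and timetabling programs in the earlier figures. This is a minimal illustration of a shared data pool, not a prescribed design.

```python
import sqlite3

# One in-memory database stands in for the shared pool of related data.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, year INTEGER);
    CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE enrolment (
        student_id INTEGER REFERENCES student(student_id),
        course_id TEXT REFERENCES course(course_id)
    );
    INSERT INTO student VALUES (1, 'Asha', 1), (2, 'Ben', 2);
    INSERT INTO course VALUES ('MIS7', 'Data Management');
    INSERT INTO enrolment VALUES (1, 'MIS7'), (2, 'MIS7');
""")

def year_list(conn, year):
    """Hypothetical student-admin program: names of students in a given year."""
    rows = conn.execute(
        "SELECT name FROM student WHERE year = ? ORDER BY name", (year,))
    return [name for (name,) in rows]

def class_roll(conn, course_id):
    """Hypothetical timetabling program: students enrolled in a course."""
    rows = conn.execute(
        """SELECT s.name FROM enrolment e
           JOIN student s ON s.student_id = e.student_id
           WHERE e.course_id = ? ORDER BY s.name""", (course_id,))
    return [name for (name,) in rows]

# A single update to the shared data is seen by both programs,
# because neither keeps its own private copy of the student file.
db.execute("UPDATE student SET name = 'Asha K' WHERE student_id = 1")
```

Because the student's name is stored once, the update does not have to be repeated in a second file, which is the redundancy problem the file-based approach suffers from.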
Database Management System (DBMS)
Creates and maintains databases
Eliminates requirement for data definition statements
Acts as interface between application programs and physical data files
Separates logical and physical views of data
Data Base Structures
Hierarchical and Network DBMS
Relational DBMS
Object-Oriented Databases
Hierarchical Database Model
Hierarchical Database Model
A data model in which the data is organized in a top-down, or inverted tree, structure.
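A small sketch may make the inverted tree concrete. The `Node` class, the `path_to_root` helper, and the department and employee names are all hypothetical; the only property being illustrated is that every record has exactly one parent, so there is a single path from any record back to the root.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A record in a hierarchical model: one parent, any number of children."""
    name: str
    children: list = field(default_factory=list)
    # Excluded from repr/compare so parent and child links do not recurse.
    parent: "Node | None" = field(default=None, repr=False, compare=False)

    def add(self, child_name):
        """Attach a new child record under this one and return it."""
        child = Node(child_name, parent=self)
        self.children.append(child)
        return child

def path_to_root(node):
    """Follow the single parent link upward and return the path root-first."""
    path = []
    while node is not None:
        path.append(node.name)
        node = node.parent
    return list(reversed(path))

root = Node("Departments")
sales = root.add("Sales")
emp = sales.add("Employee: R. Kumar")
```

Calling `path_to_root(emp)` walks from the employee record up through its department to the root, which is how access works in a top-down structure.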
A Network Data Model
Network Data Model
An expansion of the hierarchical database model with an owner-member relationship in which a member may have many owners.
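The difference from the hierarchical model can be sketched as follows. The `NetworkStore` class and the course and student names are hypothetical; the point is only that a member record may be linked to more than one owner.

```python
from collections import defaultdict

class NetworkStore:
    """Owner-member links kept in both directions (illustrative only)."""

    def __init__(self):
        self.members_of = defaultdict(set)  # owner  -> its members
        self.owners_of = defaultdict(set)   # member -> its owners

    def link(self, owner, member):
        """Record that `member` belongs to `owner`; a member may be linked repeatedly."""
        self.members_of[owner].add(member)
        self.owners_of[member].add(owner)

store = NetworkStore()
store.link("Course: MIS", "Student: Asha")
store.link("Course: Finance", "Student: Asha")  # same member, second owner
store.link("Course: MIS", "Student: Ben")
```

In a strict hierarchy the second link for the same student would require duplicating the student record under each course.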
A Relational Data Model
Relational Data Model
All data elements are placed in two-dimensional tables, called relations, that are the logical equivalent of files.
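As a rough illustration, two relations can be modeled as lists of rows and combined with the three basic relational operations. The helper functions `select`, `project`, and `join`, and the employee and department data, are assumptions made for this sketch. A real relation is a set, so a full implementation of projection would also remove duplicate rows.

```python
# Each relation is a list of rows; each row maps column names to values.
employees = [
    {"emp_id": 1, "name": "Asha", "dept_id": 10},
    {"emp_id": 2, "name": "Ben", "dept_id": 20},
    {"emp_id": 3, "name": "Chen", "dept_id": 10},
]
departments = [
    {"dept_id": 10, "dept_name": "Sales"},
    {"dept_id": 20, "dept_name": "Finance"},
]

def select(relation, predicate):
    """Keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, columns):
    """Keep only the named columns of every row."""
    return [{c: row[c] for c in columns} for row in relation]

def join(left, right, column):
    """Combine rows from two relations that agree on a shared column."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]

sales_staff = project(
    select(join(employees, departments, "dept_id"),
           lambda row: row["dept_name"] == "Sales"),
    ["name", "dept_name"],
)
```

The tables are related only through the shared `dept_id` values, not through physical pointers, which is what distinguishes this model from the hierarchical and network models.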
Types of Databases
Centralized database
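Used by a single central processor or by multiple processors in a client/server network
Distributed database
Stored in more than one physical location
Partitioned database: parts of the database are stored and maintained at different locations
Duplicated database: the central database is copied in its entirety at other locations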
(Analytical Database)
Multidimensional data analysis
Supports manipulation and analysis of large volumes of data from multiple dimensions/perspectives
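Looking at the same facts from different dimensions can be sketched as a simple roll-up. The sales figures, the dimension names (`region`, `product`, `quarter`), and the `roll_up` function are hypothetical and stand in for what an analytical database does at much larger scale.

```python
from collections import defaultdict

# Each fact: (region, product, quarter, amount). Values are made up.
sales = [
    ("East", "Widget", "Q1", 100),
    ("East", "Gadget", "Q1", 50),
    ("West", "Widget", "Q1", 70),
    ("East", "Widget", "Q2", 120),
]
DIMENSIONS = ("region", "product", "quarter")

def roll_up(facts, *dims):
    """Total the amount for every combination of the chosen dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[i] for i in positions)
        totals[key] += fact[-1]  # the amount is the last field of each fact
    return dict(totals)

by_region = roll_up(sales, "region")
by_product_quarter = roll_up(sales, "product", "quarter")
```

The same four facts yield a regional view and a product-by-quarter view without being stored twice.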
Operational Databases
Databases store detailed data needed to support the operations of the entire organization.
Also called subject area databases (SADB), transaction databases, production databases, or personal databases
The Web and Hypermedia database
Organizes data as network of nodes
Supports text, graphic, sound, video and executable programs
Characteristics of a Database
Structure: data types, data behavior
Persistence: store data on secondary storage
Retrieval: a declarative query language, a procedural database programming language
Performance: retrieve and store data quickly
Correctness
Sharing: concurrency
Reliability and resilience
Large volumes
Designing Databases
Conceptual design: Abstract model of database from a business perspective
Physical design: Detailed description of business information needs
Data Modeling and Database Models
Data Model
A map or diagram of entities and their relationships.
Enterprise data modeling
Data modeling done at the level of the entire organization.
Entity-Relationship (ER) diagrams
A data model that uses basic graphical symbols to show the organization of and relationships between data.
Data Entities, Attributes, and Keys
Entity
A generalized class of people, places, or things (objects) for which data is collected, stored, and maintained.
Attribute
A characteristic of an entity; something the entity is identified by.
Keys
A field or set of fields in a record that is used to identify the record.
Entities
Customer, Employee
Attributes
Customer name, Employee name
Primary key
A field or set of fields that uniquely identifies the record.
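The role of a primary key can be shown with a small sqlite3 sketch. The `customer` table, its columns, the sample names, and the `try_insert` helper are hypothetical. The example assumes a numeric customer ID as the key, since a customer name alone may not be unique.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Entity: customer. Attributes: customer_id, customer_name.
# customer_id is declared as the primary key, so it must be unique.
db.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL
    )
""")
db.execute("INSERT INTO customer VALUES (1, 'Asha Traders')")
db.execute("INSERT INTO customer VALUES (2, 'Asha Traders')")  # same name is allowed

def try_insert(conn, customer_id, name):
    """Return True if the row was stored, False if the key was rejected."""
    try:
        conn.execute("INSERT INTO customer VALUES (?, ?)", (customer_id, name))
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_key_accepted = try_insert(db, 1, "Someone Else")
row_count = db.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Two customers may share a name, but the database rejects a second record with an existing key, so each record stays uniquely identifiable.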
An Entity-Relationship Diagram
The Use of Schemas and Subschemas
Schema
A description of the entire database.
Subschema
A file that contains a description of a subset of the database and identifies the modifications allowed on the data items in that subset.
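A SQL view is a loose modern analogy for a subschema, and the sketch below uses one on that assumption. The `employee` table, the `staff_directory` view, and the sample rows are hypothetical. In SQLite a plain view is read-only, which here stands in for the idea that a subschema also governs what may be modified.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- The full table definition plays the role of the schema.
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL
    );
    INSERT INTO employee VALUES (1, 'Asha', 'Sales', 52000), (2, 'Ben', 'Finance', 61000);
    -- The view plays the role of a subschema: a named subset without salary.
    CREATE VIEW staff_directory AS SELECT emp_id, name, dept FROM employee;
""")

cursor = db.execute("SELECT * FROM staff_directory ORDER BY emp_id")
visible_columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()

# Attempt a modification through the subset; SQLite refuses it for a plain view.
try:
    db.execute("UPDATE staff_directory SET dept = 'HR' WHERE emp_id = 1")
    view_is_writable = True
except sqlite3.Error:
    view_is_writable = False
```

A user working through `staff_directory` sees only three of the four columns and cannot change them, while the underlying schema is unchanged.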
Management Requirements for Database Systems
Key elements in a database environment:
Data Administration
Data Planning and Modeling Methodology
Database Technology and Management
Users
Management Requirements for Database Systems
Advanced Databases
Today's Problem: Heterogeneous Information Sources
Heterogeneities are everywhere
Different interfaces
Different data representations
Duplicate and inconsistent information
[Figure: information sources labeled Personal Databases, Digital Libraries, Scientific Databases, World Wide Web]
Our Goal: Unified Access to Data
Integration System
Collects and combines information
Provides integrated view, uniform user interface
Supports sharing
[Figure: Integration System shown with World Wide Web, Digital Libraries, Scientific Databases, Personal Databases]
Data Warehouse Evolution
[Figure: timeline marked 1960, 1975, 1980, 1985, 1990, 1995, 2000 and divided into "Prehistoric Times", "Middle Ages", and "Data Revolution". Labeled milestones: Relational Databases; PCs and Spreadsheets; End-user Interfaces; 1st DW Article; Data Replication Tools; "Building the DW", Inmon (1992); DW Confs.; Vendor DW Frameworks; Company DWs; Information-Based Management]
What is a Data Warehouse?
A Data Warehouse is a
subject-oriented,
integrated,
time-variant,
non-volatile
collection of data used in support of management decision making processes.
Subject-Oriented
The data warehouse is organized around the key subjects (or high-level entities) of the enterprise. Major subjects include Customers, Patients, Students, Products, etc.
Integrated
The data housed in the data warehouse are defined using consistent
Naming conventions
Formats
Encoding structures
Related characteristics
Time-variant
The data in the warehouse contain a time dimension so that they may be used as a historical record of the business
Non-volatile
Data in the data warehouse are loaded and refreshed from operational systems, but cannot be updated by end-users
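These two properties can be sketched together as an append-only table of dated snapshots. The `WarehouseTable` class, its method names, and the account balances are hypothetical; the sketch only shows that each load adds rows with a time dimension and that nothing already loaded is overwritten.

```python
from datetime import date

class WarehouseTable:
    """Append-only store: each load adds dated snapshots; rows are never edited."""

    def __init__(self):
        self._rows = []  # (as_of_date, key, value)

    def load(self, as_of, records):
        """Refresh from an operational system by appending a dated snapshot."""
        for key, value in records.items():
            self._rows.append((as_of, key, value))

    def history(self, key):
        """Return the (date, value) history for one key, oldest first."""
        return sorted((d, v) for d, k, v in self._rows if k == key)

    def as_of(self, key, when):
        """Value that was current on a given date, or None if nothing was loaded yet."""
        earlier = [(d, v) for d, v in self.history(key) if d <= when]
        return earlier[-1][1] if earlier else None

balances = WarehouseTable()
balances.load(date(2019, 6, 30), {"ACME": 1200})
balances.load(date(2019, 7, 31), {"ACME": 900})  # new snapshot, old one is kept
```

An operational system would simply overwrite the June balance with the July one. Here both remain, so the warehouse can answer what the balance was on any past date.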
The Data Warehouse Continued
Characteristics of data warehousing are:
Time variant. The data are kept for many years so they can be used for trends, forecasting, and comparisons over time.
Nonvolatile. Once entered into the warehouse, data are not updated.
Relational. Typically the data warehouse uses a relational structure.
Client/server. The data warehouse uses the client/server architecture mainly to provide the end user an easy access to its data.
Web-based. Data warehouses are designed to provide an efficient computing environment for Web-based applications.
Data Warehouse: A Practitioner's Viewpoint
A data warehouse is simply a single, complete, and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use it in a business context.
-- Barry Devlin, IBM Consultant
Warehousing and Industry
Warehousing is big business
$2 billion in 1995
$3.5 billion in early 1997
Predicted: $8 billion in 1998 [Metagroup]
WalMart has largest warehouse
900-CPU, 2,700 disk, 23 TB Teradata system
~7TB in warehouse
40-50GB per day
Warehouse is a Specialized DB
Standard DB
Mostly updates
Many small transactions
MB - GB of data
Current snapshot
Index/hash on p.k.
Raw data
Thousands of users (e.g., clerical users)
Data Warehouse
Mostly reads
Queries are long and complex
GB - TB of data
Lots of scans
Summarized, reconciled data
Hundreds of users (e.g., decision-makers, analysts)
Position of the Data Warehouse Within
the Organization
The Data Warehouse Architecture
The Data Mart
A data mart is a small scaled-down version of a data warehouse designed for a strategic business unit (SBU) or a department. Since they contain less information than the data warehouse, they provide more rapid response and are more easily navigated than enterprise-wide data warehouses.
Replicated (dependent) data marts are small subsets of the data warehouse. In such cases one replicates some subset of the data warehouse into smaller data marts, each of which is dedicated to a certain functional area.
Stand-alone data marts. A company can have one or more independent data marts without having a data warehouse. Typical data marts are for marketing, finance, and engineering applications.
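Building a replicated (dependent) data mart can be sketched as copying one functional area's rows out of the warehouse into a smaller, separate database. The `sales_fact` table, the `business_unit` column, the sample rows, and the `build_data_mart` function are hypothetical, and in-memory SQLite databases stand in for the warehouse and the mart.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
    CREATE TABLE sales_fact (
        sale_date TEXT, business_unit TEXT, product TEXT, amount REAL
    );
    INSERT INTO sales_fact VALUES
        ('2019-07-01', 'marketing', 'Widget', 100.0),
        ('2019-07-01', 'finance', 'Widget', 40.0),
        ('2019-07-02', 'marketing', 'Gadget', 60.0);
""")

def build_data_mart(source, business_unit):
    """Replicate one business unit's rows from the warehouse into its own database."""
    mart = sqlite3.connect(":memory:")
    mart.execute(
        "CREATE TABLE sales_fact (sale_date TEXT, product TEXT, amount REAL)")
    rows = source.execute(
        "SELECT sale_date, product, amount FROM sales_fact WHERE business_unit = ?",
        (business_unit,),
    ).fetchall()
    mart.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    return mart

marketing_mart = build_data_mart(warehouse, "marketing")
mart_rows = marketing_mart.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
mart_total = marketing_mart.execute("SELECT SUM(amount) FROM sales_fact").fetchone()[0]
```

The mart holds fewer rows than the warehouse, which is the reason the slide gives for its faster response. A stand-alone data mart would be loaded directly from source systems instead of from a warehouse.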
Position of the Data Mart Within
the Organization
[Figure: diagram with labels Data Delivery, Data Mart (three instances), and Decision Support Information (three instances)]
Data Warehousing: Two Distinct Issues
(1) How to get information into warehouse
Data warehousing
(2) What to do with data once it's in warehouse
Warehouse DBMS
Both rich research areas
Industry has focused on (2)
What Can a Data Warehouse Do?
Some of the benefits of a DW are:
Immediate information delivery
Data integration from across and even
Future vision from historical trends
Tools for looking at data in new ways
Freedom from IS department resource limitations (you don't need programmers to use a data warehouse)
Examples of Common DW Applications
Sales Analysis
Determine real-time product sales to make vital pricing and distribution decisions.
Analyze historical product sales to determine success or failure attributes.
Evaluate successful products and determine key success factors.
Quickly isolate past preferred customers who no longer buy.
Identify daily what product is in the manufacturing and distribution pipeline.
Instantly determine which salespeople are performing, on both a revenue and margin basis, and which are behind.
Financial Analysis
Compare actual to budgets on an annual, monthly and month-to-date basis.
Review past cash flow trends and forecast future needs.
Instantly generate a current set of key financial ratios and indicators.
Receive near-real-time, interactive financial statements.
Human Resource Analysis
Evaluate trends in benefit program use.
Identify the wage and benefits costs to determine company-wide variation.
Other Areas
Warehouses have also been applied to areas such as: logistics, inventory, purchasing, detailed transaction analysis and load balancing.
The Future of Data Warehousing
Typical Nonintegrated Information Architecture
[Figure: source systems i2 Supply Chain, Oracle Financials, Siebel CRM, and 3rd Party Data; an Oracle Financial DW, a Marketing DW, and a Supply Chain Data Mart; Subset Non-Architected Data Marts]
Federated Integrated Information Architecture
[Figure: source systems i2 Supply Chain, Oracle Financials, Siebel CRM, and 3rd Party Data; Common Data Staging; a Federated Financial DW, a Federated Marketing DW, and a Federated Supply Chain Data Mart; Subset Non-Architected Data Marts]