Click here to load reader

Data Modeling and Analysis Done

Embed Size (px)

DESCRIPTION

Data Modeling and Analysis

Citation preview

DATA MODELLING AND ANALYSISPRSENTED BY: PRATEEK JAIN NITIN SHARMA

1

DATA CONTEXT Data without context is merely data

8-2

INFORMATION Data within a context is information

8-3

INFORMATION PLUS STRUCTURE

Definitions of the data and its structure are called meta-data, that is, data about data8-4

INFORMATION MANAGEMENT The practice of giving context to data Often referred to as Data Management or

Data Administration within an organization

8-5

Data ModelingData modeling Data modeling is a process used to define and analyze data requirements needed to support the business processes within the scope of corresponding information systems in organizations. Therefore, the process of data modeling involves professional data modelers working closely with business stakeholders, as well as potential users of the information system

8-6

Without Data Model

8-7

There are three different types of data models

8-8

produced while progressing from requirements to the actual database to be used for the information system. The data requirements are initially recorded as a conceptual data model which is essentially a set of technology independent specifications about the data and is used to discuss initial requirements with the business stakeholders. The conceptual model is then translated into a logical data model, which documents structures of the data that can be implemented in databases. Implementation of one conceptual data model may require multiple logical data models. The last step in data modeling is transforming the logical data model to a physical data model that organizes the data into tables, and

8-9

Data modeling defines not just data elements,

but their structures and relationships between them. Data modeling techniques and methodologies are used to model data in a standard, consistent, predictable manner in order to manage it as a resource. The use of data modeling standards is strongly recommended for all projects requiring a standard means of defining and analyzing data within an organization8-10

How data models deliver benefit..? Data models provide a structure for data used

within information systems by providing specific definition and format. If a data model is used consistently across systems then compatibility of data can be achieved. If the same data structures are used to store and access data then different applications can share data seamlessly. The results of this are indicated in the diagram.

8-11

Data modeling Process In the context of business process

integration ,(data modeling complements business process modeling, and ultimately results in database generation

8-12

8-13

Modeling Methodologies While there are many ways to create data models only

8-14

two modeling methodologies stand out, top-down and bottom-up: Bottom-up models are often the result of a reengineering effort. They usually start with existing data structures forms, fields on application screens, or reports. These models are usually physical, application-specific, and incomplete from an enterprise perspective. Top-down logical data models, on the other hand, are created in an abstract way by getting information from people who know the subject area. A system may not implement all the entities in a logical model, but the model serves as a reference point or template Sometimes models are created in a mixture of the two methods. Unfortunately, in many environments the distinction between a logical data model and a physical data model is blurred.

Data Modeling Concepts: EntityEntity a class of persons, places, objects, events, or concepts about which we need to capture and store data. Named by a singular noun

Persons: agency, contractor, customer, department, division, employee, instructor, student, supplier. Places: sales region, building, room, branch office, campus. Objects: book, machine, part, product, raw material, software license, software package, tool, vehicle model, vehicle. Events: application, award, cancellation, class, flight, invoice, order, registration, renewal, requisition, reservation, sale, trip. Concepts: account, block of time, bond, course, fund, qualification, stock.

8-15

Data Modeling Concepts: EntityEntity instance a single occurrence of an entity.entity Student ID Last Name First Name 2144 3122 3843 instances Arnold Taylor Simmons Betty John Lisa

98442837 2293

MacyLeath Wrench

BillHeather Tim

8-16

Data Modeling Concepts: AttributesAttribute a descriptive property or characteristic of an entity. Synonyms include element, property, and field. Just as a physical student can have attributes, such as hair color, height, etc., data entity has data attributes8-17

Data Modeling Concepts: Data TypeData type a property of an attribute that identifies what type of data can be stored in that attribute. Representative Logical Data Types for AttributesData Type NUMBER TEXT MEMO DATE TIME YES/NO VALUE SET8-18IMAGE

Logical Business Meaning Any number, real or integer. A string of characters, inclusive of numbers. When numbers are included in a TEXT attribute, it means that we do not expect to perform arithmetic or comparisons with those numbers. Same as TEXT but of an indeterminate size. Some business systems require the ability to attach potentially lengthy notes to a give database record. Any date in any format. Any time in any format. An attribute that can assume only one of these two values. A finite set of values. In most cases, a coding scheme would be established (e.g., FR=Freshman, SO=Sophomore, JR=Junior, SR=Senior). Any picture or image.

Data Modeling Concepts: DomainsDomain a property of an attribute that defines what values an attribute can legitimately take on. Representative Logical Domains for Logical Data TypesData Type NUMBER TEXT DATE Domain For integers, specify the range. For real numbers, specify the range and precision. Maximum size of attribute. Actual values usually infinite; however, users may specify certain narrative restrictions. Variation on the MMDDYYYY format. Examples {10-99} {1.000-799.999} Text(30) MMDDYYYY MMYYYY

TIMEYES/NO VALUE SET8-19

For AM/PM times: HHMMT For military (24-hour times): HHMM{YES, NO} {value#1, value#2,value#n} {table of codes and meanings}

HHMMT HHMM{YES, NO} {ON, OFF} {M=Male F=Female}

Data Modeling Concepts: Default ValueDefault value the value that will be recorded if a value is not specified by the user.Permissible Default Values for AttributesDefault Value A legal value from the domain NONE or NULL Required or NOT NULL Interpretation Examples For an instance of the attribute, if the user does not specify 0 a value, then use this value. 1.00 For an instance of the attribute, if the user does not specify NONE a value, then leave it blank. NULL For an instance of the attribute, require that the user enter REQUIRED a legal value from the domain. (This is used when no value NOT NULL in the domain is common enough to be a default but some value must be entered.)

8-20

KEYA key is a single or combination of multiple fields.It is sometimes called an identifier.

Its purpose is to access or retrieve data rows from table according to the requirement.

8-21

Concatenated key - group of attributes that uniquely identifies an instance. Synonyms: composite key, compound key.Candidate key one of a number of keys that may serve as the primary key. Synonym: candidate identifier. Primary key a candidate key used to uniquely identify a single entity instance. Alternate key a candidate key not selected to become the primary key. Synonym: secondary key.

8-22

8-23

Data Modeling Concepts: Foreign KeysForeign key a primary key of an entity that is used in another entity to identify instances of a relationship. A foreign key is a primary key of one entity that is

contributed to (duplicated in) another entity to identify instances of a relationship. A foreign key always matches the primary key in the another entity A foreign key may or may not be unique (generally not) The entity with the foreign key is called the child. The entity with the matching primary key is called the parent.8-24

Data Modeling Concepts: Foreign KeysPrimary Key Student ID Last Name First Name Dorm

21443122 3843 9844 2837 2293 Primary KeyDorm Smith Jones8-25

ArnoldTaylor Simmons Macy Leath Wrench

BettyJohn Lisa Bill Heather Tim Foreign Key Duplicated from primary key of Dorm entity (not unique in Student entity)

SmithJones Smith Smith Jones

Residence Director Andrea Fernandez Daniel Abidjan

Data Modeling Concepts: DegreeDegree the number of entities that participate in the relationship.A relationship between two entities is called a binary relationship.

A relationship between three entities is called a 3-ary or ternary relationship.

8-26

MAPPING CARDINALITIESEntities to which one entity can be associated Types: One to one One to many Many to one Many to many

8-27

One to one1COLLEGE HA S

1DIRECTOR

One to many1 DEPARTMENTS HAS N FACULTY

8-28

Many to oneN Course 1 team s institutes

Many to many M Assi gned N project

employees

8-29

Data Modeling Concepts: Parent and Child EntitiesParent entity - a data entity that contributes one or more attributes to another entity, called the child. In a one-to-many relationship the parent is the entity on the "one" side.

Child entity - a data entity that derives one or more attributes from another entity, called the parent. In a one-to-many relationship the child is the entity on the "many" side.For example- Student and Faculty are child entity of DSM8-30

NETWORK MODELSome data were more naturally modelled with more than one parent per child. So, the network model permitted the modelling of many-to-many relationships in data

8-31

HEIRARCHIAL MODEL

In this model data is organized into a tree-like structure

8-32

RELATIONAL MODEL

In such a database the data and relations between them are organised in tables. A table is a collection of records and each record in a table contains the same fields.

CUSTOME R ID 454

FIRST NAME NITIN

LAST NAME SHARMA

650123

PRATEEKVIKAS

JAINGUPTA

8-33

ENTITY RELATIONSHIP DIAGRAM Major entities and relationships within a

business area Attributes included only as examples Many-to-many relationships allowed Keys not included

8-34

ER MODEL

CUSTOMER

BORROWE R

LOAN

CUSTOME R ID

NAME

AMOUNT

8-35

What is a Good Data Model? A good data model is simple. Data attributes that describe any given entity should describe only that entity. Each attribute of an entity instance can have only one value. A good data model is essentially

nonredundant. Each data attribute, other than foreign keys,

describes at most one entity. Look for the same attribute recorded more than once under different names. A good data model should be flexible and8-36

adaptable to future needs.

Data Analysis & NormalizationData analysis a technique used to improve a data model for implementation as a database.Goal is a simple, nonredundant, flexible, and adaptable database.

Normalization a data analysis technique that organizes data into groups to form nonredundant, stable, flexible, and adaptive entities.

8-37

Normalization: 1NF, 2NF, 3NFFirst normal form (1NF) entity whose attributes have no more than one value for a single instance of that entity Any attributes that can have multiple values actually describe a separate entity, possibly an entity and relationship. Second normal form (2NF) entity whose nonprimary-key attributes are dependent on the full primary key. Any nonkey attributes dependent on only part of the primary key should be moved to entity where that partial key is the full key. May require creating a new entity and relationship on the model.

8-38

Third normal form (3NF) entity whose nonprimary-key attributes are not dependent on any other non-primary key attributes. Any nonkey attributes that are dependent on other nonkey attributes must be moved or deleted. Again, new entities and relationships may have to be added to the data model.

Definition of 1NFFirst Normal Form is a relation in which the intersection of each row and column contains one and only one value. There are two approaches to removing repeating groups from unnormalized tables: 1. Removes the repeating groups by entering appropriate data in the empty columns of rows containing the repeating data. 2. Removes the repeating group by placing the repeating data, along with a copy of the original key attribute(s), in a separate relation. A primary key is identified for the new relation.

8-39

First Normal Form (1NF)Unnormalized form (UNF) A table that contains one or more repeating groups.ClientNo cName propertyNoPG4 PG16

pAddress6 lawrence St,Glasgow 5 Novar Dr, Glasgow 6 lawrence St,Glasgow 2 Manor Rd, Glasgow 5 Novar Dr, Glasgow

rentStart1-Jul-00

rentFinish31-Aug-01

rent350

ownerNoCO40

oNameTina Murphy Tony Shaw Tina Murphy Tony Shaw Tony Shaw

CR76

John kay

1-Sep-02

1-Sep-02

450

CO93

PG4

1-Sep-99

10-Jun-00

350

CO40

CR56

Aline Stewart

PG36

10-Oct-00

1-Dec-01

370

CO93

PG16

1-Nov-02

1-Aug-03

450

CO93

Figure 8-40

3 ClientRental unnormalized table

1NF ClientRental relation with the second approachWith the second approach, we remove the repeating group (property rented details) by placing the repeating data along with a copy of the original key attribute (clientNo) in a separte relation.ClientNoCR76 CR56

cNameJohn Kay Aline Stewart

ClientNoCR76 CR76 CR56 CR56 CR56

propertyNoPG4 PG16 PG4 PG36 PG16

pAddress6 lawrence St,Glasgow 5 Novar Dr, Glasgow 6 lawrence St,Glasgow 2 Manor Rd, Glasgow 5 Novar Dr, Glasgow

rentStart1-Jul-00 1-Sep-02 1-Sep-99 10-Oct-00 1-Nov-02

rentFinish31-Aug-01 1-Sep-02 10-Jun-00 1-Dec-01 1-Aug-03

rent350 450 350 370 450

ownerNoCO40 CO93 CO40 CO93 CO93

oNameTina Murphy

Tony ShawTina Murphy Tony Shaw Tony Shaw

8-41

Figure 5 1NF ClientRental relation with the second approach

Second Normal Form (2NF)Second normal form (2NF) is a relation that is in first normal form and every non-primary-key attribute is fully functionally dependent on the primary key. The normalization of 1NF relations to 2NF involves the removal of partial dependencies. If a partial dependency exists, we remove the function dependent attributes from the relation by placing them in a new relation along with a copy of their determinant.

8-42

2NF ClientRental relationThe ClientRental relation has the following functional dependencies:fd1 fd2 fd3 fd4 fd5 fd6

clientNo, propertyNo rentStart, rentFinish (Primary Key) clientNo cName (Partial dependency) propertyNo pAddress, rent, ownerNo, oName (Partial dependency) ownerNo oName (Transitive Dependency) clientNo, rentStart propertyNo, pAddress, rentFinish, rent, ownerNo, oName (Candidate key) propertyNo, rentStart clientNo, cName, rentFinish (Candidate key)

8-43

2NF ClientRental relation

After removing the partial dependencies, the creation of the thre new relations called Client, Rental, and PropertyOwnerClientClientNoCR76 CR56

RentalcNameJohn Kay Aline Stewart

ClientNoCR76 CR76 CR56 CR56 CR56

propertyNoPG4 PG16 PG4 PG36 PG16

rentStart1-Jul-00 1-Sep-02 1-Sep-99 10-Oct-00 1-Nov-02

rentFinish31-Aug-01 1-Sep-02 10-Jun-00 1-Dec-01 1-Aug-03

PropertyOwnerpropertyNoPG4 PG16 PG36

pAddress6 lawrence St,Glasgow 5 Novar Dr, Glasgow 2 Manor Rd, Glasgow

rent350 450 370

ownerNoCO40 CO93 CO93

oNameTina Murphy Tony Shaw Tony Shaw

8-44

Figure 6 2NF ClientRental relation

Third Normal Form (3NF)Transitive dependency A condition where A, B, and C are attributes of a relation such that if A B and B C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).

Third normal form (3NF) A relation that is in first and second normal form, and in which no non-primary-key attribute is transitively dependent on the primary key. The normalization of 2NF relations to 3NF involves the removal of transitive dependencies by placing the attribute(s) in a new relation along with a copy of the determinant.8-45

3NF ClientRental relationThe functional dependencies for the Client, Rental and PropertyOwner relations are as follows: Client fd2 clientNo cName (Primary Key) clientNo, propertyNo rentStart, rentFinish clientNo, rentStart propertyNo, rentFinish (Candidatekey) propertyNo, rentStart clientNo, rentFinish (Candidatekey)

Rental fd1 fd5 fd6

(Primary Key)

8-46

PropertyOwner fd3 propertyNo pAddress, rent, ownerNo, oName (Primary Key) fd4 ownerNo oName (Transitive Dependency)

3NF ClientRental relationClientClientNoCR76 CR56

RentalcNameJohn Kay Aline Stewart

ClientNoCR76 CR76 CR56 CR56 CR56

propertyNoPG4 PG16 PG4 PG36 PG16

rentStart1-Jul-00 1-Sep-02 1-Sep-99 10-Oct-00 1-Nov-02

rentFinish31-Aug-01 1-Sep-02 10-Jun-00 1-Dec-01 1-Aug-03

PropertyOwnerpropertyNoPG4 PG16 PG36

Ownerrent350 450 370

pAddress6 lawrence St,Glasgow 5 Novar Dr, Glasgow 2 Manor Rd, Glasgow

ownerNoCO40 CO93 CO93

ownerNoCO40 CO93

oNameTina Murphy Tony Shaw

8-47

Figure 7 2NF ClientRental relation