ENGISTAN.COM

DBMS Concepts For IBPS IT-Officer 2014

This document is prepared for IBPS SO (IT-Officer) Examination 2014. The key concepts of DBMS are explained in a very precise & lucid way to assist the aspirants in their preparation. If you have any queries, doubts, or suggestions, please do share with us in our Forum.

We wish you All The Best – TEAM Engistan

Contents

1. Basic Terms
2. Database Models
3. RDBMS
4. Database Keys
5. Database Users
6. Normalization
7. E-R Diagram
8. Generalization & Specialization
9. SQL Basics
10. Data Languages
11. SQL Queries
12. Transactions-ACID Properties

Data: Data consists of the quantities, characters, or symbols on which operations are performed by a computer.

Data (or) Information Processing: The process of converting raw facts into meaningful information is known as data processing. It is also known as information processing.

Metadata: The term metadata refers to "data about data". Metadata is data that provides information about one or more aspects of the data, such as:

• Means of creation of the data

• Purpose of the data

• Time and date of creation

• Creator or author of the data

• Location on a computer network where the data were created

• Standards used

Database: A database is a structured collection of data, organized into files called tables. More precisely, it is a logically coherent collection of related data that (i) describes the entities and their inter-relationships, and (ii) is designed, built and populated for a specific purpose.

Database Model

A database model defines the logical design of data. The model describes the relationships between different parts of the data. In the history of database design, three models have been in use.

• Hierarchical Model

• Network Model

• Relational Model

Hierarchical Model: In this model each entity has only one parent but can have several children. At the top of the hierarchy there is a single entity, which is called the root.

Engistan.com | Engineer’s Community

Network Model: In the network model, entities are organised in a graph, in which some entities can be accessed through several paths.

Relational Model: In this model, data is organised in two-dimensional tables called relations. The tables, or relations, are related to each other.

RDBMS Concepts

A Relational Database Management System (RDBMS) is a database management system based on the relational model introduced by E. F. Codd. In the relational model, data is represented in terms of tuples (rows).

An RDBMS is used to manage a relational database. A relational database is an organized collection of tables from which data can be accessed easily. It is the most commonly used kind of database. It consists of a number of tables, and each table has its own primary key.

What is a Table?

In a relational database, a table is a collection of data elements organised in terms of rows and columns. A table is also considered a convenient representation of a relation. However, a table can have duplicate tuples, while a true relation cannot. A table is the simplest form of data storage. Below is an example of an Employee table.

ID   Name     Age   Salary
1    Adam     34    13000
2    Alex     28    15000
3    Stuart   20    18000
4    Ross     42    19020

What is a Record?

A single entry in a table is called a record or row. A record in a table represents a set of related data. For example, the above Employee table has 4 records. Following is an example of a single record.

1 Adam 34 13000

What is a Field?

A table consists of several records (rows), and each record can be broken into several smaller entities known as fields. The above Employee table consists of four fields: ID, Name, Age and Salary.

What is a Column?

In a relational table, a column is a set of values of a particular type. The term attribute is also used for a column. For example, in the Employee table, Name is a column that holds the names of the employees.

Name
Adam
Alex
Stuart
Ross
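The terms table, record, field and column can be tried out with a short sketch. It uses Python's built-in sqlite3 module and an in-memory database, both chosen here only for illustration; the column types are assumptions, since the Employee table above does not state them.

```python
import sqlite3

# In-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Each column (field) gets a name and an assumed type; ID is the primary key.
conn.execute(
    "CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Salary INTEGER)"
)

# Each tuple below becomes one record (row) of the table.
rows = [
    (1, "Adam", 34, 13000),
    (2, "Alex", 28, 15000),
    (3, "Stuart", 20, 18000),
    (4, "Ross", 42, 19020),
]
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", rows)

# A single record: all fields of the row whose ID is 1.
record = conn.execute("SELECT * FROM Employee WHERE ID = 1").fetchone()

# A single column: the Name value of every row.
names = [r[0] for r in conn.execute("SELECT Name FROM Employee ORDER BY ID")]

print(record)  # (1, 'Adam', 34, 13000)
print(names)   # ['Adam', 'Alex', 'Stuart', 'Ross']
```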

Database Management System (DBMS):

A collection of programs that enables users to perform the following actions on a particular database:

• define the structure of database information (descriptive attributes, data types, constraints, etc.), storing this as metadata
• populate the database with appropriate information
• manipulate the database (for retrieval, update, removal and insertion of information)
• protect the database contents against accidental or deliberate corruption (this involves secure access by users and automatic recovery in the case of user or hardware faults)
• share the database among multiple users, possibly concurrently

Examples of DBMS are Oracle, Sybase, MySQL, DB2, SQL Server, Informix, MS Access, FileMaker, etc.

Sample Databases

Shown below is an extract from a (relational) database that might be part of a University's Academic Information System:

Terminology:

relation = table (file)

attribute = column (field)

tuple = row (record)

Database Keys:

Keys are a very important part of a relational database. They are used to establish and identify relationships between tables. They also ensure that each record within a table can be uniquely identified by a combination of one or more fields.

Super Key: A super key is a set of attributes within a table that uniquely identifies each record. Every candidate key is a super key, but a super key may also contain extra attributes that are not needed for uniqueness.

Candidate Key: Candidate keys are the keys from which the primary key can be selected. A candidate key is an attribute, or a minimal set of attributes, that uniquely identifies each record in a table, so it can act as the primary key.

Primary Key: The primary key is the candidate key that is chosen as the main key of the table. It uniquely identifies each record in the table.

Foreign Key: A foreign key is generally a primary key from one table that appears as a field in another where the first table has a relationship to the second. In other words, if we had a table A with a primary key X that linked to a table B where X was a field in B, then X would be a foreign key in B.
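The A, B and X of the description above can be sketched as follows, using Python's built-in sqlite3 module (an illustrative choice). The extra columns label and id are hypothetical and exist only to make the tables complete. Note that SQLite enforces foreign keys only after the pragma shown below is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Table A: X is the primary key ("label" is a hypothetical extra column).
conn.execute("CREATE TABLE A (X INTEGER PRIMARY KEY, label TEXT)")
# Table B: X is an ordinary field that references A.X, so X is a foreign key in B.
conn.execute(
    "CREATE TABLE B (id INTEGER PRIMARY KEY, X INTEGER, FOREIGN KEY (X) REFERENCES A (X))"
)

conn.execute("INSERT INTO A VALUES (1, 'first')")
conn.execute("INSERT INTO B VALUES (10, 1)")  # accepted: A has a row with X = 1

try:
    conn.execute("INSERT INTO B VALUES (11, 99)")  # no row in A has X = 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # True
```

The second insert is refused because the foreign key would point at a row of A that does not exist.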

Composite Key: A key that consists of two or more attributes that together uniquely identify an entity occurrence is called a composite key. No single attribute of a composite key is a key on its own.

Secondary or Alternative Key: The candidate keys that are not selected as the primary key are known as secondary keys or alternative keys.

Non-key Attribute: Non-key attributes are the attributes of a table that are not part of any candidate key.

Non-prime Attribute: Non-prime attributes are attributes other than prime attributes, where a prime attribute is one that belongs to some candidate key.

Database Users:

Database Administrators (DBA):

• individual(s) who determine and implement policy regarding users, their permissions on a database, and the design and construction of that database

Database Designers:

• individual(s), possibly also software engineers, who apply design techniques to produce database structures pertinent to a specific application

End Users:

• people who, from time to time, access the contents of a database:
  - casual end users may submit ad-hoc queries as the need arises, using a high-level query language
  - naïve, or parametric, end users access the database through pre-written programs that provide an appropriate interface to the database
  - database programmers write code, using a relevant programming language and the high-level query language, that can later be used by parametric users

Normalization

Normalization is a systematic approach to decomposing tables in order to eliminate data redundancy and undesirable characteristics such as insertion, update and deletion anomalies. It is a two-step process that puts data into tabular form by removing duplicated data from the relation tables.

Normalization is used mainly for two purposes:

• Eliminating redundant (duplicated) data.

• Ensuring data dependencies make sense, i.e. data is logically stored.

Problem Without Normalization

Without normalization, it becomes difficult to handle and update the database without facing data loss. Insertion, update and deletion anomalies are very frequent if a database is not normalized. To understand these anomalies, let us take the example of a Student table.

S_id   S_Name   S_Address   Subject_opted
401    Adam     Noida       Bio
402    Alex     Panipat     Maths
403    Stuart   Jammu       Maths
404    Adam     Noida       Physics

Update Anomaly: To update the address of a student who occurs twice or more in the table, we have to update the S_Address column in all of those rows, or else the data becomes inconsistent.

Insertion Anomaly: Suppose that for a new admission we have the student id (S_id), name and address of a student, but the student has not opted for any subject yet. We then have to insert NULL in Subject_opted, which leads to an insertion anomaly.

Deletion Anomaly: If student 401 has only one subject and temporarily drops it, then deleting that row deletes the entire student record along with it.
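The deletion and insertion anomalies can be reproduced with a small sketch on the Student table. It uses Python's built-in sqlite3 module for illustration. The row for student 405 (Ross, Delhi) is a hypothetical new admission that does not appear in the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (S_id INTEGER, S_Name TEXT, S_Address TEXT, Subject_opted TEXT)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [
        (401, "Adam", "Noida", "Bio"),
        (402, "Alex", "Panipat", "Maths"),
        (403, "Stuart", "Jammu", "Maths"),
        (404, "Adam", "Noida", "Physics"),
    ],
)

# Deletion anomaly: student 401 drops Bio, so the row recording it is deleted...
conn.execute("DELETE FROM Student WHERE S_id = 401 AND Subject_opted = 'Bio'")
# ...and the student's name and address disappear with it.
remaining = conn.execute("SELECT * FROM Student WHERE S_id = 401").fetchall()

# Insertion anomaly: a (hypothetical) new student without a subject
# can only be stored with NULL in Subject_opted.
conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?)", (405, "Ross", "Delhi", None))
null_subjects = conn.execute(
    "SELECT COUNT(*) FROM Student WHERE Subject_opted IS NULL"
).fetchone()[0]

print(remaining)      # []
print(null_subjects)  # 1
```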

Normalization Rule

Normalization rules are divided into the following normal forms:

1. First Normal Form

2. Second Normal Form

3. Third Normal Form

4. BCNF

1. First Normal Form (1NF): A row of data cannot contain a repeating group of data, i.e. each column must hold a single value. Each row of data must have a unique identifier, i.e. a primary key. For example, consider a table which is not in First Normal Form.

Student Table:

S_id   S_Name   subject
401    Adam     Biology
401    Adam     Physics
402    Alex     Maths
403    Stuart   Maths

You can see that the student name Adam appears twice in the table and the subject Maths is also repeated, so S_id does not identify a row uniquely. This violates the First Normal Form as stated above. To reduce the table to First Normal Form, break it into two different tables.

New Student Table:

S_id   S_Name
401    Adam
402    Alex
403    Stuart

Subject Table:

subject_id   student_id   subject
10           401          Biology
11           401          Physics
12           402          Math
12           403          Math

In the Subject table, the concatenation of subject_id and student_id is the primary key. Now both the Student table and the Subject table are normalized to First Normal Form.

2. Second Normal Form (2NF): To be in Second Normal Form, a table must meet all the requirements of First Normal Form, and there must not be any partial dependency of any column on the primary key. This means that for a table with a concatenated primary key, each column that is not part of the primary key must depend on the entire concatenated key. If any column depends on only one part of the concatenated key, the table fails Second Normal Form. For example, consider a table which is not in Second Normal Form.

Customer Table:

customer_id   Customer_Name   Order_id   Order_name   Sale_detail
101           Adam            10         order1       sale1
101           Adam            11         order2       sale2
102           Alex            12         order3       sale3
103           Stuart          13         order4       sale4

In the Customer table, the concatenation of customer_id and Order_id is the primary key. This table is in First Normal Form but not in Second Normal Form, because there are partial dependencies of columns on the primary key: Customer_Name depends only on customer_id, and Order_name depends only on Order_id. There is also no link between Sale_detail and Customer_Name.

To reduce the Customer table to Second Normal Form, break it into the following three tables.

Customer_Detail Table:

customer_id   Customer_Name
101           Adam
102           Alex
103           Stuart

Order_Detail Table:

Order_id   Order_Name
10         Order1
11         Order2
12         Order3
13         Order4

Sale_Detail Table:

customer_id   Order_id   Sale_detail
101           10         sale1
101           11         sale2
102           12         sale3
103           13         sale4

Now all three of these tables comply with Second Normal Form.
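As a quick check that the decomposition loses no information, the three tables can be joined back together. The sketch below uses Python's built-in sqlite3 module for illustration; the column types and the composite primary key on Sale_Detail are assumptions based on the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer_Detail (customer_id INTEGER PRIMARY KEY, Customer_Name TEXT);
    CREATE TABLE Order_Detail (Order_id INTEGER PRIMARY KEY, Order_Name TEXT);
    CREATE TABLE Sale_Detail (
        customer_id INTEGER,
        Order_id INTEGER,
        Sale_detail TEXT,
        PRIMARY KEY (customer_id, Order_id)
    );
    INSERT INTO Customer_Detail VALUES (101, 'Adam'), (102, 'Alex'), (103, 'Stuart');
    INSERT INTO Order_Detail VALUES
        (10, 'Order1'), (11, 'Order2'), (12, 'Order3'), (13, 'Order4');
    INSERT INTO Sale_Detail VALUES
        (101, 10, 'sale1'), (101, 11, 'sale2'), (102, 12, 'sale3'), (103, 13, 'sale4');
    """
)

# Joining the three tables rebuilds one row per sale, with the customer
# and order names looked up instead of being stored repeatedly.
original = conn.execute(
    """
    SELECT s.customer_id, c.Customer_Name, s.Order_id, o.Order_Name, s.Sale_detail
    FROM Sale_Detail AS s
    JOIN Customer_Detail AS c ON c.customer_id = s.customer_id
    JOIN Order_Detail AS o ON o.Order_id = s.Order_id
    ORDER BY s.Order_id
    """
).fetchall()

for row in original:
    print(row)
```

Each customer name is now stored once, so changing it requires updating a single row in Customer_Detail.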

3. Third Normal Form (3NF): Third Normal Form requires that every non-prime attribute of a table depend directly on the primary key, so any transitive functional dependency must be removed from the table. The table must also be in Second Normal Form. For example, consider a table with the following fields.

Student_Detail Table:

Student_id   Student_name   DOB   Street   city   State   Zip

In this table Student_id is the primary key, but Street, city and State depend on Zip. These fields therefore depend on the primary key only through Zip, which is called a transitive dependency. Hence, to apply 3NF, we need to move Street, city and State to a new table with Zip as the primary key.

New Student_Detail Table:

Student_id   Student_name   DOB   Zip

Address Table:

Zip   Street   city   state

The advantages of removing transitive dependency are:

• Amount of data duplication is reduced.

• Data integrity is achieved.

4. Boyce and Codd Normal Form (BCNF): Boyce and Codd Normal Form is a stricter version of Third Normal Form. It deals with a certain type of anomaly that is not handled by 3NF. A relation is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key. A 3NF table which does not have multiple overlapping candidate keys is said to be in BCNF.
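One common way to test the BCNF condition is to compute attribute closures: the left side of every non-trivial functional dependency must determine all attributes of the relation. The sketch below is a minimal illustration in plain Python. The function names closure and is_bcnf are made up for this example, and the dependencies are the ones assumed in the Student_Detail discussion (Student_id determines everything, Zip determines Street, city and State).

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already known, the right side follows.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_bcnf(relation, fds):
    """True if the left side of every non-trivial dependency is a super key."""
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != set(relation):
            return False
    return True


# Student_Detail before decomposition: Zip is not a super key, so BCNF fails.
student_detail = ["Student_id", "Student_name", "DOB", "Street", "city", "State", "Zip"]
fds_before = [
    (["Student_id"], ["Student_name", "DOB", "Street", "city", "State", "Zip"]),
    (["Zip"], ["Street", "city", "State"]),
]

# The two tables obtained after moving the address fields out.
new_student = ["Student_id", "Student_name", "DOB", "Zip"]
fds_student = [(["Student_id"], ["Student_name", "DOB", "Zip"])]
address = ["Zip", "Street", "city", "State"]
fds_address = [(["Zip"], ["Street", "city", "State"])]

print(is_bcnf(student_detail, fds_before))  # False
print(is_bcnf(new_student, fds_student))    # True
print(is_bcnf(address, fds_address))        # True
```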

E-R Diagram

An E-R (Entity-Relationship) diagram is a visual representation of data that describes how data items are related to each other.

Symbols and Notations

Components of E-R Diagram

The E-R diagram has three main components.

1) Entity

An entity can be any object, place, person or class. In an E-R diagram, an entity is represented using a rectangle. Consider the example of an organisation: Employee, Manager, Department, Product and many more can be taken as entities.

Weak Entity

A weak entity is an entity that depends on another entity. A weak entity does not have a key attribute of its own. A double rectangle represents a weak entity.

2) Attribute

An attribute describes a property or characteristic of an entity. For example, Name, Age, Address, etc. can be attributes of a Student. An attribute is represented using an ellipse.

Key Attribute

A key attribute represents the main characteristic of an entity. It is used to represent the primary key. An ellipse with the attribute name underlined represents a key attribute.

Composite Attribute

An attribute can also have its own attributes. Such an attribute is known as a composite attribute.

3) Relationship

A Relationship describes relations between entities. Relationship is represented using diamonds.

There are three types of relationship that exist between Entities.

• Binary Relationship

• Recursive Relationship

• Ternary Relationship

Binary Relationship

Binary Relationship means relation between two Entities. This is further divided into three types.

1. One to One: This type of relationship is rarely seen in the real world. For example, one student can enroll in only one course, and a course has only one student. This is not what you will usually see in a relationship.

2. One to Many: It reflects the business rule that one entity is associated with many instances of another entity. For example, a Student enrolls in only one Course, but a Course can have many Students.

3. Many to Many: Many students can enroll in more than one course, i.e. a Student can enroll in many Courses and a Course can have many Students.

Recursive Relationship

When an entity is related to itself, it is known as a recursive relationship.

Ternary Relationship

A relationship of degree three, i.e. one that involves three entities, is called a ternary relationship.

Generalization and Specialization

Generalization: Generalization is a bottom-up approach in which two lower-level entities combine to form a higher-level entity. The higher-level entity can in turn combine with other lower-level entities to form a further higher-level entity.

Specialization: Specialization is the opposite of generalization. It is a top-down approach in which one higher-level entity is broken down into two lower-level entities. In specialization, some higher-level entities may not have lower-level entity sets at all.

Aggregation: Aggregation is a process in which the relationship between two entities is treated as a single entity. For example, the relationship between Center and Course can act as an entity in a relationship with Visitor.

SQL Basics

Introduction to SQL

Structured Query Language (SQL) is a language used for storing and managing data in an RDBMS. SQL was the first commercial language introduced for E. F. Codd's relational model. Today almost all RDBMS (MySQL, Oracle, Informix, Sybase, MS Access) use SQL as the standard database language.

SQL is used to perform all type of data operations in RDBMS.

SQL Command

SQL defines the following data languages to define and manipulate the data of an RDBMS.

DDL : Data Definition Language

In many RDBMS (for example Oracle and MySQL), DDL commands are auto-committed. That means they save all their changes permanently in the database.

Command     Description
create      to create a new table or database
alter       to alter the structure of a table
truncate    to delete all rows from a table
drop        to drop a table
rename      to rename a table

DML : Data Manipulation Language

DML commands are not auto-committed. This means their changes are not permanent in the database until they are committed, and they can be rolled back.

Command     Description
insert      to insert a new row
update      to update an existing row
delete      to delete a row
merge       to merge rows from one table into another (insert or update)

TCL : Transaction Control Language

These commands keep a check on other commands and their effect on the database. They can annul changes made by other commands by rolling back to the original state, and they can also make changes permanent.

Command     Description
commit      to permanently save changes
rollback    to undo changes
savepoint   to save temporarily, marking a point to roll back to
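The three TCL commands can be seen working together in the sketch below. It uses Python's built-in sqlite3 module for illustration; the savepoint name after_alex and the reduced Employee table are made up for the example. Passing isolation_level=None is an assumption of this sketch: it stops the module from opening transactions on its own, so every BEGIN, COMMIT and ROLLBACK is written out explicitly.

```python
import sqlite3

# isolation_level=None: the module does not start transactions implicitly,
# so the BEGIN / COMMIT / ROLLBACK statements below are in full control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Name TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO Employee VALUES (1, 'Adam')")
conn.execute("COMMIT")  # Adam is now saved permanently

conn.execute("BEGIN")
conn.execute("INSERT INTO Employee VALUES (2, 'Alex')")
conn.execute("SAVEPOINT after_alex")  # mark a point inside the transaction
conn.execute("INSERT INTO Employee VALUES (3, 'Stuart')")
conn.execute("ROLLBACK TO SAVEPOINT after_alex")  # undo Stuart only
conn.execute("COMMIT")  # Alex is saved

conn.execute("BEGIN")
conn.execute("INSERT INTO Employee VALUES (4, 'Ross')")
conn.execute("ROLLBACK")  # undo the whole transaction, Ross is discarded

names = [r[0] for r in conn.execute("SELECT Name FROM Employee ORDER BY ID")]
print(names)  # ['Adam', 'Alex']
```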

DCL : Data Control Language

Data Control Language provides commands to grant and take back authority.

Command     Description
grant       to grant a permission or right
revoke      to take back a permission

DQL : Data Query Language

Command     Description
select      to retrieve records from one or more tables

Basic Structure of SQL Queries:

The basic structure of an SQL query consists of three clauses: SELECT, FROM, and WHERE.

1. SELECT Statement: Defines WHAT is to be returned (items separated by commas):

• Database columns (from tables or views)
• Constant text values
• Formulas
• Pre-defined functions
• Group functions (COUNT, SUM, MAX, MIN, AVG)

"*" means all columns from all tables in the FROM statement.

Example: SELECT state_code, state_name

2. FROM Statement: Defines the table(s) or view(s) used by the SELECT or WHERE statements.

• You MUST have a FROM statement
• Multiple tables/views are separated by commas

3. WHERE Clause: Defines which records are to be included in the query.

• It is optional
• Uses comparison operators (=, >, >=, <, <=, !=, <>)
• Multiple conditions are linked with AND and OR
• Strings are contained within SINGLE QUOTES

- AND & OR Statements:

• Multiple WHERE conditions are linked by AND / OR statements
• "AND" means all conditions are TRUE for the record
• "OR" means at least one of the conditions is TRUE
• You may group statements with ( )
• BE CAREFUL MIXING "AND" & "OR" conditions
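The warning about mixing AND and OR comes from operator precedence: AND is evaluated before OR unless parentheses say otherwise. The sketch below shows the difference using Python's built-in sqlite3 module, chosen for illustration. The rows are made-up values, not real states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE states (state_name TEXT, state_population INTEGER)")
# Made-up rows, for illustration only.
conn.executemany(
    "INSERT INTO states VALUES (?, ?)",
    [("NORTH ALPHA", 1000000), ("SOUTH BETA", 9000000), ("SOUTH GAMMA", 2000000)],
)

# Without parentheses, AND is evaluated before OR:
# NORTH%  OR  (SOUTH% AND population > 5000000)
ungrouped = conn.execute(
    "SELECT state_name FROM states "
    "WHERE state_name LIKE 'NORTH%' OR state_name LIKE 'SOUTH%' "
    "AND state_population > 5000000 ORDER BY state_name"
).fetchall()

# With parentheses, the population test applies to both name patterns.
grouped = conn.execute(
    "SELECT state_name FROM states "
    "WHERE (state_name LIKE 'NORTH%' OR state_name LIKE 'SOUTH%') "
    "AND state_population > 5000000 ORDER BY state_name"
).fetchall()

print(ungrouped)  # [('NORTH ALPHA',), ('SOUTH BETA',)]
print(grouped)    # [('SOUTH BETA',)]
```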

Examples:

1. SELECT * FROM annual_summaries WHERE sd_duration_code = '1'

2. SELECT state_name FROM states WHERE state_population > 15000000

3. SELECT state_name, state_population FROM states WHERE state_name LIKE '%NORTH%'

4. SELECT * FROM annual_summaries WHERE sd_duration_code IN ('1', 'W', 'X') AND annual_summary_year = 2000
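Queries of this shape can be tried against a small test database. The sketch below uses Python's built-in sqlite3 module for illustration. The table layouts are assumptions that contain only the columns named in the examples, and every row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE states (state_code TEXT, state_name TEXT, state_population INTEGER);
    CREATE TABLE annual_summaries (sd_duration_code TEXT, annual_summary_year INTEGER);
    -- Made-up rows, for illustration only.
    INSERT INTO states VALUES
        ('NA', 'NORTH ALPHA', 20000000),
        ('SB', 'SOUTH BETA', 3000000),
        ('NG', 'NORTH GAMMA', 1000000);
    INSERT INTO annual_summaries VALUES ('1', 2000), ('W', 1999), ('Z', 2000);
    """
)

# Comparison operator on a numeric column.
big_states = conn.execute(
    "SELECT state_name FROM states WHERE state_population > 15000000"
).fetchall()

# LIKE with % wildcards matches NORTH anywhere in the name.
north_states = conn.execute(
    "SELECT state_name, state_population FROM states "
    "WHERE state_name LIKE '%NORTH%' ORDER BY state_name"
).fetchall()

# IN list combined with a second condition through AND.
summaries = conn.execute(
    "SELECT * FROM annual_summaries "
    "WHERE sd_duration_code IN ('1', 'W', 'X') AND annual_summary_year = 2000"
).fetchall()

print(big_states)    # [('NORTH ALPHA',)]
print(north_states)  # [('NORTH ALPHA', 20000000), ('NORTH GAMMA', 1000000)]
print(summaries)     # [('1', 2000)]
```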

Transaction Management:

Transaction: A transaction is a unit of program execution that accesses and possibly updates various data items. In simple words, a transaction is an event which occurs on the database. Generally, a transaction reads values from the database or writes values to it.

Goal Of Transactions: The ACID properties

Atomicity: Either all actions are carried out, or none are.

Consistency: If each transaction is consistent, and the database is initially consistent, then it is left consistent.

Isolation: Transactions are isolated, or protected, from the effects of other scheduled transactions.

Durability: If a transaction completes successfully, then its effects persist.

1. Atomicity: A transaction can commit after completing its actions, or abort because of:

- Internal DBMS decision: restart
- System crash: power, disk failure, …
- Unexpected situation: unable to access disk, data value, …

A transaction interrupted in the middle could leave the database inconsistent. The DBMS needs to remove the effects of partial transactions to ensure atomicity: either all of a transaction's actions are performed or none.

2. Consistency: Database consistency is the property that every transaction sees a consistent database instance. It follows from transaction atomicity, isolation and transaction consistency. Users are responsible for ensuring transaction consistency:

- when run to completion against a consistent database instance, the transaction leaves the database consistent

For example, a consistency criterion could be that an inter-account transfer transaction does not change the total amount of money in the accounts.
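The inter-account transfer can be sketched as a single transaction that is rolled back if any step fails, so that the total never changes. The example uses Python's built-in sqlite3 module for illustration. The account table, its CHECK constraint, the balances and the helper names total and transfer are all assumptions made for this sketch.

```python
import sqlite3

# isolation_level=None: transactions are started and ended explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
# The CHECK constraint is one consistency rule: no account may go negative.
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")


def total(conn):
    """Sum of all balances; a transfer must leave this unchanged."""
    return conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on commit."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The debit failed after the credit was applied: undo both (atomicity).
        conn.execute("ROLLBACK")
        return False


before = total(conn)
ok_small = transfer(conn, "A", "B", 30)   # succeeds
ok_large = transfer(conn, "A", "B", 500)  # would overdraw A, so it is rolled back
after = total(conn)

print(ok_small, ok_large)  # True False
print(before, after)       # 150 150
```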

3. Isolation: Guarantees that even though transactions may be interleaved, the net effect is identical to executing the transactions serially. For example, if transactions T1 and T2 are executed concurrently, the net effect is equivalent to executing:

- T1 followed by T2, or
- T2 followed by T1

NOTE: The DBMS provides no guarantee about which of these effective orders of execution is chosen.
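A toy model in plain Python shows what serial equivalence rules out. The two transactions and the starting balance are invented for the example: T1 deposits 100 and T2 doubles the balance. Either serial order is acceptable, but an interleaving in which both read the balance before either writes produces a result that matches neither.

```python
def t1(balance):
    """T1: deposit 100."""
    return balance + 100


def t2(balance):
    """T2: double the balance."""
    return balance * 2


start = 100

# The two serial executions; isolation allows either of these outcomes.
serial_outcomes = {t2(t1(start)), t1(t2(start))}  # T1 then T2, T2 then T1

# A bad interleaving: both transactions read the balance before either writes.
read_by_t1 = start
read_by_t2 = start
balance = t1(read_by_t1)  # T1 writes 200
balance = t2(read_by_t2)  # T2 overwrites with 200, so T1's deposit is lost

interleaved_is_serializable = balance in serial_outcomes

print(sorted(serial_outcomes))       # [300, 400]
print(balance)                       # 200
print(interleaved_is_serializable)   # False
```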

4. Durability: The DBMS uses the log to ensure durability. If the system crashes before the changes made by a completed transaction are written to disk, the log is used to remember and restore these changes when the system is restarted. This is handled by the recovery manager.