DBMS (BA9176) - SYLLABUS

 UNIT – I INTRODUCTION

Database and DBMS – characteristics – importance – advantages – evolution – Codd rules – database architecture; data organization – file structures and indexing

 UNIT – II MODELING AND DESIGN FRAME WORK

Data models- Conceptual design- ER diagram-relationships- normalization -data management and system integration

 UNIT – III DATABASE IMPLEMENTATION

Query languages-SQL for data creation, retrieval and manipulation, database transactions, concurrency control, atomicity, recovery, security, backup and recovery, data base administration- client server architecture based RDBMS.

 UNIT – IV DISTRIBUTED DATABASE AND OBJECT ORIENTED DATABASES

Concepts of distributed databases and design, Object oriented databases-object life cycle modeling conceptual design-UML.

 UNIT – V EMERGING TRENDS

Overview of visual databases and knowledge based databases-conceptual design and business impacts. Scope for professionals and certifications such as Oracle Certified Professional.

DATABASE MANAGEMENT SYSTEM

UNIT I

DATABASE

A database is a collection of information that is organized so that it can easily be accessed, managed, and updated.

In one view, databases can be classified according to types of content: bibliographic, full-text, numeric, and images.

DBMS

A database-management system (DBMS) is a collection of interrelated data and a set of programs to access those data.

The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient.

CHARACTERISTICS OF DBMS

• Represents complex relationships between data.

• Controls data redundancy.

• Enforces user-defined rules.

• Ensures data sharing.

• Has automatic and intelligent backup and recovery procedures.

• Has a central dictionary to store information pertaining to data and its manipulation.

• Has different interfaces through which users can manipulate the data.

• Enforces data access authorization.

IMPORTANCE OF DBMS

• It helps make data management more efficient and effective.

• Its query language allows quick answers to ad hoc queries.

• It provides end users better access to more and better-managed data.

• It promotes an integrated view of the organization’s operations (the “big picture”).

• It reduces the probability of inconsistent data.

ADVANTAGES OF DBMS

• Redundancies and inconsistencies can be reduced

• Better service to the users

• Flexibility of the system is improved

• Cost of developing and maintaining systems is lower

• Standards can be enforced

• Security can be improved

• Integrity can be improved

• Enterprise requirements can be identified

• Data model must be developed

EVOLUTION

• MIS and early database concepts
– SDC, a group of RAND Corp, adopted the term DBMS

• File management system
– To reduce the cost of producing routine administrative programs
– Reuse of subroutines written to handle

• IDS - Integrated Data Store
– First commercial DBMS, by General Electric

• IMS - Information Management System
– By IBM for the Apollo moon landing project

• CODASYL and DBTG - Committee On Data SYstems Languages & Database Task Group
– Played an important role for DBMS
– Subgroup DBTG for promoting the DBMS idea
– Technical savior for MIS
– Ex: Prime DBMS from PRIME Computer, IDS II from Honeywell, DMS 170 from CDC, DMS II and DMS 1100 from UNISYS, and DBMS 10 and DBMS 11 from Digital Equipment Corp

CODD RULES

1. Information Rule - All information in a relational database, including table names and column names, is represented by values in tables.

2. Guaranteed Access Rule - Every piece of data in a relational database can be accessed by using a combination of a table name, a primary key value and a column name.

3. Systematic Treatment of Nulls Rule - The RDBMS handles records that have unknown or inapplicable values in a pre-defined fashion.

4. Active On-line Catalog Based on the Relational Model - The description of a database and of its contents is itself held in database tables and can therefore be queried on-line via the data manipulation language.

5. Comprehensive Data Sub-language Rule - An RDBMS may support several languages, but at least one of them should allow the user to do all of the following: define tables and views, query and update the data, set integrity constraints, set authorizations and define transactions.

6. View Updating Rule - Any view that is theoretically updatable can be updated using the RDBMS.

7. High-Level Insert, Update and Delete - The RDBMS supports insertion, update and deletion at the table level.

8. Physical Data Independence - The execution of ad hoc requests and application programs is not affected by changes in the physical data access and storage methods.

9. Logical Data Independence - Logical changes in tables and views, such as adding/deleting columns or changing field lengths, need not necessitate modifications in the programs or in the format of ad hoc requests.

10. Integrity Independence - Like table/view definitions, integrity constraints are stored in the on-line catalog and can therefore be changed without necessitating changes in the application programs.

11. Distribution Independence - Application programs and ad hoc requests are not affected by changes in the distribution of physical data.

12. No Subversion Rule - If the RDBMS has a language that accesses information one record at a time, this language should not be usable to bypass the integrity constraints. This is necessary for data integrity.

DATABASE ARCHITECTURE

• The external level (or View Level) defines how each end-user type understands the organization of its respective relevant data in the database, i.e., the different needed end-user views. A single database can have any number of views at the external level.

• The conceptual level unifies the various external views into a coherent whole, global view. It provides the common-denominator of all the external views.

• The internal level (or physical level) is the part of the database implementation inside a DBMS. It is concerned with cost, performance, scalability and other operational matters.

DATA ORGANIZATION

• Data Value (Cell) - Contents of a field contained in a record; “raw facts” that can be recognized.

• Relation – Table. Collection of related records

• Tuple -Record/Row. Collection of related fields

• Attributes - Field/Columns. Group of characters representing something

• Domain - Set of valid values of attributes

• Degree -Number of columns in a table

• Cardinality -Number of rows in a table

FILE STRUCTURE AND INDEXING

• The database is stored as a collection of files. Each file is a sequence of records.

• A record is a sequence of fields.

• One approach: assume record size is fixed

• This case is the easiest to implement, compared with variable-length records.

FIXED LENGTH RECORD

Deletion of record i: alternatives:

• move records i + 1, . . ., n to i, . . . , n – 1

• move record n to i

• do not move records, but link all free records on a free list

FREE LISTS

• Store the address of the first deleted record in the file header.

• Use this first record to store the address of the second deleted record, and so on

• These stored addresses can be used as pointers since they “point” to the location of a record.
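The free-list idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name FixedLengthFile, the in-memory list standing in for the file, and the ("rec", ...) / ("free", ...) slot tags are all hypothetical and not part of the notes.

```python
class FixedLengthFile:
    """Toy fixed-length record file that reuses deleted slots via a free list."""

    def __init__(self):
        self.slots = []        # each slot holds a record or a free-list link
        self.free_head = None  # "file header": address of the first deleted slot

    def insert(self, record):
        if self.free_head is not None:
            slot = self.free_head
            # The deleted slot stores the address of the next deleted slot.
            self.free_head = self.slots[slot][1]
            self.slots[slot] = ("rec", record)
        else:
            slot = len(self.slots)
            self.slots.append(("rec", record))
        return slot

    def delete(self, slot):
        # No records are moved; the slot is linked onto the free list.
        self.slots[slot] = ("free", self.free_head)
        self.free_head = slot


f = FixedLengthFile()
a, b, c = f.insert("A"), f.insert("B"), f.insert("C")
f.delete(b)
f.delete(a)
reused_first = f.insert("D")   # takes the most recently freed slot
reused_second = f.insert("E")  # takes the next slot on the free list
```

Both new records land in previously deleted slots, so the file does not grow.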

VARIABLE LENGTH RECORD

Fixed-length representation

POINTER METHOD

ORGANIZATION OF RECORDS IN FILES

Heap – a record can be placed anywhere in the file where there is space

Sequential – store records in sequential order, based on the value of the search key of each record

Hashing – a hash function computed on some attribute of each record; the result specifies in which block of the file the record should be placed.

SEQUENTIAL FILE ORGANIZATION

CLUSTERING FILE ORGANISATION

DATA DICTIONARY

• Data dictionary stores metadata, that is, data about data, such as:

– Information about relations

– User and accounting information, including passwords

– Statistical and descriptive data, e.g. number of tuples in each relation

– Physical file organization information, e.g. how a relation is stored (sequential/hash/…)

– Physical location of a relation: operating system file name or disk addresses of blocks containing records of the relation

INDEXING

An index is a small table having only two columns.

The first column contains a copy of the primary or candidate key of a table and the second column contains a set of pointers holding the address of the disk block where that particular key value can be found.

ADVANTAGE OF USING INDEX

• Index makes search operation perform very fast.

• Suppose a table has several rows of data and each row is 20 bytes wide.

• To search for record number 100 without an index, the management system must read each and every row; after reading 99 x 20 = 1980 bytes it will find record number 100.

• If we have an index, the management system starts to search for record number 100 not from the table, but from the index. The index, containing only two columns, may be just 4 bytes wide in each of its rows, so only 99 x 4 = 396 bytes are read before the entry for record number 100 is reached.

• The result is much quicker access to the record (a speed advantage of 1980:396, i.e. 5:1).
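The arithmetic behind this comparison can be checked with a short Python sketch. The variable names are illustrative; the row width, index entry width and target record number are the figures used in the example above.

```python
ROW_BYTES = 20          # width of one table row
INDEX_ENTRY_BYTES = 4   # width of one index row
target = 100            # record number being searched for

# Bytes read before reaching the target with a sequential scan.
bytes_scanned_table = (target - 1) * ROW_BYTES
bytes_scanned_index = (target - 1) * INDEX_ENTRY_BYTES
speed_ratio = bytes_scanned_table / bytes_scanned_index
```

The ratio depends only on the two widths, so it stays the same for any target record.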

DISADVANTAGE

• The index needs a little extra space in addition to the main table

• The index must be updated whenever records are inserted into or deleted from the main table

TYPES OF INDEX

PRIMARY INDEX

In a primary index, there is a one-to-one relationship between the entries in the index table and the records in the main table.

DENSE PRIMARY INDEX

SPARSE / NON- DENSE PRIMARY INDEX

CLUSTERING INDEX

SECONDARY INDEX

INDEX IN A TREE LIKE STRUCTURE

M-WAY SEARCH TREE

The above mentioned concept can be further expanded with the notion of the m-Way Search Tree, where m represents the number of pointers in a particular node.

If m = 3, then each node of the search tree contains 3 pointers, and each node would then contain 2 values.

A sample m-Way Search Tree with m = 3 is given in the following.
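As a rough sketch of how a search proceeds in such a tree, the Python below builds a small tree with m = 3 (at most 2 values and 3 pointers per node). The Node class, the search function and the key values are hypothetical and are not taken from the original figure.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys  # sorted values, at most m - 1 of them
        # One more pointer than values, at most m of them.
        self.children = children or [None] * (len(keys) + 1)


def search(node, key):
    """Return True if key is stored somewhere in the tree rooted at node."""
    while node is not None:
        i = 0
        # Find the first stored value that is not smaller than key.
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == key:
            return True
        node = node.children[i]  # follow the pointer between the values
    return False


# m = 3: the root holds 2 values and 3 pointers.
root = Node([20, 40], [Node([5, 10]), Node([30]), Node([50, 60])])
found_30 = search(root, 30)
found_45 = search(root, 45)
```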

DATA MODEL

• A data model in software engineering is an abstract model that describes how data are represented and accessed.

• The three types of database models are the network model, the hierarchical model and the relational model.

NETWORK DATABASE MODEL (NDBM) - Bachman’s diagram

[Figure: Bachman diagram linking customer records (Hodges, Shiver, Lowary) to account records (556, 801, 647).]

HIERARCHICAL DATABASE MODEL (HDBM)

[Figure: hierarchical tree with customer records (Lowary, Hodges, Shiver) as parents of account records (556, 801, 647).]

• HDBM starts with a root and has several roots.

• A root will have several branches.

• Each branch is connected to one and only one root.

• A branch has several leaves, and a set of leaves is connected to one branch.

RELATIONAL DATABASE MODEL

[Figure: ER diagram with entities Subject (SubjectId, Title) and Teacher (TeacherId, FirstName, LastName) linked by the relationship Supervises.]

• Item - key attribute: Item code; other attributes: Description (30 characters), Specification (20 characters), Unit of measure

• Vendor - key attribute: Vendor Code; other attributes: Vendor Name, Vendor Location, Vendor Reg. No.

• Purchase Order - key attribute: Purchase Order Number; other attributes: Date, Vendor Name, Vendor Code

RELATIONSHIP TYPES

• One-to-one: PATIENT occupies BED; BED is assigned to PATIENT

• One-to-many: HOSPITAL ROOM accommodates PATIENT

• Many-to-many: SURGEON operates on PATIENT; PATIENT is operated on by SURGEON

OBJECT ORIENTED DATABASE MODEL (OODBMS)

• Object oriented databases store objects rather than data such as integers, strings or real numbers.

• Objects are used in object oriented languages such as Smalltalk, C++, Java, and others.

• Attributes - Attributes are data which defines the characteristics of an object.

• Methods - Methods define the behavior of an object and are what was formerly called procedures or functions.

DATABASE DESIGN

• Database design is the process of producing a detailed data model of a database to meet end users' requirements.

PHASES OF DATABASE DESIGN

• Conceptual database design

• Logical database design

• Physical database design

CONCEPTUAL DATABASE DESIGN

It is a process of constructing a data model for each view of the real world problem which is independent of physical considerations.

This step involves:

• Constructing the ER model

• Checking the model for redundancy

• Validating the model against user transactions to ensure all the scenarios are supported

Ex: ER Model

LOGICAL DATABASE DESIGN

It is a process of constructing a model of information, which can then be mapped into storage objects supported by the Database Management System.

This step involves:

• Table generation from the ER model

• Normalization of tables

PHYSICAL DATABASE DESIGN

• The physical design of the database specifies the physical configuration of the database on the storage media.

• This step involves describing the base relations, file organisations, and indexes design used to achieve efficient access to the data, and any associated integrity constraints and security measures

ER DIAGRAM-RELATIONSHIPS

ENTITY

• An entity is a thing in the real world with an independent existence.

• An entity set is the collection of all entities of a particular entity type at any point of time.

Ex: A company has many employees.

Employees are defined as entities (e1, e2, e3, ...). Because these entities have the same attributes, they are defined under the ENTITY TYPE employee. The set {e1, e2, ...} is called the entity set.

WEAK ENTITY

• A weak entity is an entity that cannot be uniquely identified by its attributes alone; therefore, it must use a foreign key in conjunction with its attributes to create a primary key.

READING AN ERD

NORMALIZATION

Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency.

Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them.

The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships.

THE NORMAL FORMS

• First Normal Form (1NF)
– Eliminate duplicative columns from the same table.
– Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

• Second Normal Form (2NF)
– Meet all the requirements of the first normal form.
– Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
– Create relationships between these new tables and their predecessors through the use of foreign keys.

• Third Normal Form (3NF)
– Meet all the requirements of the second normal form.
– Remove columns that are not dependent upon the primary key.

• Boyce-Codd Normal Form (BCNF or 3.5NF)
– The Boyce-Codd Normal Form, also referred to as the "third and a half (3.5) normal form", adds one more requirement:
– Meet all the requirements of the third normal form.
– Every determinant must be a candidate key.

• Fourth Normal Form (4NF)
– Meet all the requirements of the Boyce-Codd normal form.
– A relation is in 4NF if it has no non-trivial multi-valued dependencies other than those determined by a superkey.

• Fifth Normal Form (5NF)
– Every non-trivial join dependency in the table is implied by the superkeys of the table.

• Sixth Normal Form (6NF)
– The table features no non-trivial join dependencies at all.
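A small worked example may help with the idea of removing columns that do not depend on the primary key. The Python below is only a sketch: the table contents and the names orders_flat, customers and orders are hypothetical. Here customer_name depends on customer_id rather than on the key order_id, so it is moved to its own table.

```python
# Unnormalized rows: (order_id, customer_id, customer_name, item)
orders_flat = [
    (1, "C1", "Asha", "pen"),
    (2, "C1", "Asha", "ink"),
    (3, "C2", "Ravi", "pad"),
]

# customer_name is stored once per customer, keyed by customer_id.
customers = {cid: name for _, cid, name, _ in orders_flat}

# orders keeps customer_id as a foreign key into customers.
orders = [(oid, cid, item) for oid, cid, _, item in orders_flat]

# Joining through the foreign key rebuilds the original rows,
# so no information was lost by the decomposition.
rejoined = [(oid, cid, customers[cid], item) for oid, cid, item in orders]
```

After the split, a change to a customer's name is made in one place only.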

• Data management comprises all the disciplines related to managing data as a valuable resource. Data management is an overarching term that refers to all aspects of creating, housing, delivering, maintaining and retiring data with the goal of valuing data as a corporate asset.

• System Integration: The process of putting a system together, with techniques to ensure all the parts work as a whole. Generally, the main contractor for the project is responsible for systems integration. The sub-contractors will usually be part of the integration team.

DATA DEFINITION LANGUAGE (DDL)

1. Create:
Syntax: CREATE TABLE <table name> (<list of column definitions>);
Example: CREATE TABLE inventory (id int primary key, product varchar(50), quantity int, price decimal(18,2));

2. Alter:
Syntax: ALTER TABLE <table name> <alteration>;
Example: ALTER TABLE department ADD PRIMARY KEY (dname);

3. Drop/Truncate:
Syntax: DROP TABLE <table name>; or TRUNCATE TABLE <table name>;
Example: DROP TABLE customer; or TRUNCATE TABLE employee;
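The DDL statements can be tried out with Python's built-in sqlite3 module, as in the sketch below. SQLite is an assumption made here for convenience; it has no TRUNCATE statement and cannot add a primary key through ALTER TABLE, so the alteration shown adds a hypothetical supplier column instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# CREATE: define the table and its columns.
conn.execute(
    "CREATE TABLE inventory ("
    " id INTEGER PRIMARY KEY,"
    " product VARCHAR(50),"
    " quantity INT,"
    " price DECIMAL(18,2))"
)

# ALTER: change the definition of an existing table.
conn.execute("ALTER TABLE inventory ADD COLUMN supplier VARCHAR(50)")
columns = [row[1] for row in conn.execute("PRAGMA table_info(inventory)")]

# DROP: remove the table and its definition.
conn.execute("DROP TABLE inventory")
remaining_tables = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
).fetchone()[0]
```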

DATA MANIPULATION LANGUAGE (DML)

• Select: To retrieve information stored in the database

• Insert: To insert information into the database

• Delete: To delete information from the database

• Update: To modify or update the information stored in the database.

• Call: To call a PL/SQL or Java subprogram.

Retrieval:
Syntax: SELECT <field names> FROM table_name [WHERE condition];
Example: SELECT * FROM employee;

Insertion:
Syntax: INSERT INTO <tablename> (col1name, col2name... colxname) VALUES (value1, value2... valuex);
Example: INSERT INTO citylist (name, state, population, zipcode) VALUES ('Argos', 'Indiana', '89', '46501');

Deletion:
Syntax: DELETE FROM table_name [WHERE condition];
Example: DELETE FROM employee WHERE id = 100;

Modification:
Syntax: UPDATE <tablename> SET column_1 = [value1], column_2 = [value2] WHERE {condition};
Example: UPDATE StoreInformation SET Sales = 500 WHERE storename = 'Los Angeles';
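The four DML statements can be exercised together with Python's built-in sqlite3 module. This is a minimal sketch; the columns of the employee table and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)

# INSERT: add rows to the table.
conn.executemany(
    "INSERT INTO employee (id, name, salary) VALUES (?, ?, ?)",
    [(100, "Asha", 500), (101, "Ravi", 600)],
)

# UPDATE: modify the rows that match the condition.
conn.execute("UPDATE employee SET salary = 700 WHERE name = 'Ravi'")

# DELETE: remove the rows that match the condition.
conn.execute("DELETE FROM employee WHERE id = 100")

# SELECT: retrieve what is left.
rows = conn.execute("SELECT * FROM employee").fetchall()
```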

DATA CONTROL LANGUAGE (DCL)

• GRANT: To grant users the right to access the database or perform certain tasks
Syntax: GRANT {ALL/SPECIFIC PERMISSIONS} ON {TABLENAME} TO {USER ACCOUNT}
Example: SQL> grant select on emp to endusers;

• REVOKE: To cancel any previously granted or denied permission
Syntax: REVOKE {ALL/SPECIFIC PERMISSIONS} ON {TABLENAME} FROM {USER ACCOUNT} [CASCADE]
Example: SQL> revoke insert, delete on emp from operators;

DATABASE TRANSACTION

• A transaction comprises a unit of work performed within a database management system (or similar system) against a database, and treated in a coherent and reliable way independent of other transactions.

• Transaction should have four properties: atomic, consistent, isolated and durable.

ATOMICITY

• Atomicity is one of the ACID transaction properties. In an atomic transaction, a series of database operations either all occur, or nothing occurs.

• An example of atomicity is ordering an airline ticket where two actions are required: payment, and a seat reservation. The potential passenger must either:
– both pay for and reserve a seat; OR
– neither pay for nor reserve a seat.
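The airline example can be sketched with Python's built-in sqlite3 module. The table names, the seat number and the passenger name are hypothetical. The seat is already taken, so the reservation fails and the payment made in the same transaction is rolled back with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (passenger TEXT, amount INTEGER)")
conn.execute("CREATE TABLE seats (seat TEXT PRIMARY KEY, passenger TEXT)")
conn.execute("INSERT INTO seats VALUES ('12A', 'someone else')")
conn.commit()

try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO payments VALUES ('Joe', 300)")
        # Seat 12A is already reserved, so this violates the primary key.
        conn.execute("INSERT INTO seats VALUES ('12A', 'Joe')")
except sqlite3.IntegrityError:
    pass  # the whole transaction was undone, including the payment

payment_count = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
seat_holder = conn.execute(
    "SELECT passenger FROM seats WHERE seat = '12A'"
).fetchone()[0]
```

Neither action took effect: no payment is recorded and the seat keeps its original holder.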

CONSISTENCY

• Consistency states that only valid data will be written to the database. If, for some reason, a transaction is executed that violates the database’s consistency rules, the entire transaction will be rolled back and the database will be restored to a state consistent with those rules.

• Assume that an integrity rule requires A + B = 100 and that a transaction attempts to subtract 10 from A without altering B. Because consistency is checked after each transaction, it is known that A + B = 100 before the transaction begins. If the transaction removes 10 from A successfully, atomicity will be achieved. However, a validation check will show that A + B = 90, which is inconsistent with the rule, so the entire transaction must be rolled back.

ISOLATION

• Isolation requires that multiple transactions occurring at the same time not impact each other’s execution.

• For example, if Joe issues a transaction against a database at the same time that Mary issues a different transaction, both transactions should operate on the database in an isolated manner. The database should either perform Joe’s entire transaction before executing Mary’s or vice-versa. This prevents Joe’s transaction from reading intermediate data produced as a side effect of part of Mary’s transaction that will not eventually be committed to the database. Note that the isolation property does not ensure which transaction will execute first, merely that they will not interfere with each other.

DURABILITY

• Durability ensures that any transaction committed to the database will not be lost.

• Ex: Assume that a transaction transfers 10 from A to B. It removes 10 from A. It then adds 10 to B. At this point, a "success" message is sent to the user. However, the changes are still queued in the disk buffer waiting to be committed to the disk. Power fails and the changes are lost. The user assumes (understandably) that the changes have been made, so this scenario violates durability.

CONCURRENCY CONTROL

• Concurrency control is a database management systems (DBMS) concept that is used to address conflicts with the simultaneous accessing or altering of data that can occur with a multi-user system.

• Ex: To illustrate the concept of concurrency control, consider two travellers who go to electronic kiosks at the same time to purchase a train ticket to the same destination on the same train.

RECOVERY

• Data recovery is the process of salvaging

data from damaged, failed, corrupted, or

inaccessible secondary storage media

when it cannot be accessed normally.

SECURITY

Security risks to database systems include:

• Unauthorized or unintended activity or misuse by authorized database users, database administrators, or network/systems managers, or by unauthorized users or hackers

• Malware infections causing incidents such as unauthorized access, leakage or disclosure of personal or proprietary data, deletion of or damage to the data

• Overloads, performance constraints and capacity issues resulting in the inability of authorized users to use databases as intended;

• Physical damage to database servers caused by computer room fires or floods, overheating, lightning, accidental liquid spills

• Design flaws and programming bugs in databases and the associated programs and systems, creating various security vulnerabilities

• Data corruption and/or loss caused by the entry of invalid data or commands, mistakes in database or system administration processes, sabotage/criminal damage etc.

BACKUP

• The BACKUP statement backs up a complete SQL Server database to create a database backup, or one or more files or filegroups of the database to create a file backup.

• Syntax: BACKUP DATABASE database TO backup_device [ ,...n ]

• Ex: BACKUP DATABASE AdventureWorks2008R2 TO DISK = 'Z:\SQLServerBackups\AdventureWorks2008R2.Bak'

ACID PROPERTIES

• Atomicity: All actions of a transaction happen, or none happen.

• Consistency: If a transaction is consistent, and the database starts from a consistent state, then it will end in a consistent state.

• Isolation: The execution of one transaction is isolated from other transactions.

• Durability: If a transaction commits, its effects persist in the database.

CONCURRENCY CONTROL

a) Locking
1. Pessimistic concurrency control
2. Optimistic concurrency control
3. Overly optimistic concurrency control

b) Timestamps

LOCKING

• A lock is used when multiple users need to access a database concurrently. This prevents data from being corrupted or invalidated when multiple users try to write to the database.

T1: read-lock(X); Read(X); write-lock(Y); unlock(X); Read(Y); Y = Y + X; Write(Y); unlock(Y)

T2: read-lock(X); Read(X); unlock(X); write-lock(Y); Read(Y); Y = Y + X; Write(Y); unlock(Y)

1. PESSIMISTIC CONCURRENCY CONTROL

• Locks prevent users from modifying data in a way that affects other users.

• Once a lock is applied, other users cannot perform actions that would conflict with the lock until the owner releases it.

• Used in environments where the cost of protecting data with locks is less than the cost of rolling back transactions if concurrency conflicts occur.

2. OPTIMISTIC CONCURRENCY CONTROL

• Users do not lock data when they read it.

• When a user updates data, the system checks to see if another user changed the data after it was read. If so an error is raised.

• On receiving the error, the user rolls back the transaction.

• Used in environments where the cost of rolling back is lower than the cost of locking.
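One common way to detect that another user changed the data after it was read is a version number on each row. The Python below is a minimal sketch of that approach; the Row class, the StaleDataError exception and the version counter are assumptions made for illustration, not something the notes prescribe.

```python
class StaleDataError(Exception):
    """Raised when the data changed after the caller read it."""


class Row:
    def __init__(self, value):
        self.value = value
        self.version = 0  # incremented on every successful update


def read(row):
    # No lock is taken; the reader just remembers the version it saw.
    return row.value, row.version


def update(row, new_value, version_read):
    if row.version != version_read:
        raise StaleDataError("row changed since it was read")
    row.value = new_value
    row.version += 1


row = Row(100)
_, version_user1 = read(row)
_, version_user2 = read(row)

update(row, 90, version_user1)      # first writer succeeds
try:
    update(row, 80, version_user2)  # second writer holds a stale version
    conflict_detected = False
except StaleDataError:
    conflict_detected = True        # this user would now roll back and retry
```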

3. OVERLY OPTIMISTIC CONCURRENCY CONTROL

• You neither try to avoid nor detect collisions, assuming that they will never occur.

• This strategy is appropriate for single-user systems.

TIMESTAMPS

• A timestamp-based concurrency control algorithm is a non-lock concurrency control method.

• It is used in some databases to safely handle transactions, using timestamps.

• Whenever a transaction starts, it is given a timestamp

• The timestamp tells in which order the transactions are supposed to be applied.

• Given two transactions that affect the same object, the transaction that has the earlier timestamp is meant to be applied before the other one.

• If the wrong transaction is actually presented first, it is aborted and must be restarted.

• If a transaction wants to read an object:

– If the transaction started before the object's write timestamp, something changed the object's data after the transaction started, so the transaction must be cancelled and restarted.

– If the transaction started after the object's write timestamp, it is safe to read the object, and the object's read timestamp is set to the transaction's timestamp if that is later than the current read timestamp.

• If a transaction wants to write to an object:

– If the transaction started before the object's read timestamp, a later transaction has already read the object and is assumed to have taken a copy of its data, so the transaction is aborted and must be restarted.

– If the transaction started before the object's write timestamp, something has changed the object since the transaction started. In this case the Thomas Write Rule applies: the write operation is simply skipped and the transaction continues as normal; it does not have to be aborted or restarted.

– Otherwise, the transaction writes to the object, and the object's write timestamp is set to the transaction's timestamp.
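The read and write rules can be expressed compactly in Python. This is a simplified sketch under stated assumptions: timestamps are plain integers, each object carries one read and one write timestamp, and the names Obj, Abort, read and write are hypothetical.

```python
class Abort(Exception):
    """The transaction must be aborted and restarted."""


class Obj:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # timestamp of the latest transaction that read it
        self.write_ts = 0  # timestamp of the latest transaction that wrote it


def read(obj, ts):
    if ts < obj.write_ts:
        raise Abort("object was written after this transaction started")
    obj.read_ts = max(obj.read_ts, ts)
    return obj.value


def write(obj, ts, value):
    if ts < obj.read_ts:
        raise Abort("a later transaction has already read the object")
    if ts < obj.write_ts:
        return False  # Thomas Write Rule: skip the obsolete write
    obj.value = value
    obj.write_ts = ts
    return True


x = Obj(10)
wrote = write(x, 2, 20)           # transaction 2 writes first
skipped = not write(x, 1, 99)     # older write arrives late and is skipped
seen = read(x, 3)                 # transaction 3 reads the value 20
try:
    write(x, 2, 30)               # transaction 2 is older than reader 3
    aborted = False
except Abort:
    aborted = True
```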

 DATABASE RECOVERY TECHNIQUES

a) Deferred update techniques

b) Immediate update techniques

c) Shadow Paging

d) The Aries Recovery algorithm

a) DEFERRED UPDATE TECHNIQUES

• Do not update the database until the transaction reaches its commit point.

• Before reaching the commit point, all transaction updates are recorded in the local transaction workspace (or buffers).

• During commit, the updates are first recorded persistently in the log and then written to the DB.

• If a transaction fails before reaching its commit point, no UNDO is needed because it will not have changed the database anyway.

• If there is a crash, it may be necessary to REDO the effects of committed transactions from the Log because their effect may not have been recorded in the database.

• Deferred update is also known as the NO-UNDO/REDO algorithm.
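A toy model of deferred update may make the NO-UNDO/REDO behaviour concrete. In the Python sketch below the dictionary db stands in for the database on disk and the list log for the persistent log; the class name, the crash_before_apply flag and the sample values are assumptions made for illustration.

```python
class DeferredUpdateDB:
    def __init__(self):
        self.db = {}   # data "on disk"
        self.log = []  # persistent log of committed transactions

    def run(self, writes, crash_before_apply=False):
        workspace = dict(writes)                # updates stay in local buffers
        self.log.append(("commit", workspace))  # commit point: log is written first
        if not crash_before_apply:
            self.db.update(workspace)           # only then is the database updated

    def recover(self):
        # Nothing to UNDO: uncommitted work never reached the database.
        # REDO every committed transaction recorded in the log.
        for kind, writes in self.log:
            if kind == "commit":
                self.db.update(writes)


store = DeferredUpdateDB()
store.run({"A": 90, "B": 110}, crash_before_apply=True)
state_after_crash = dict(store.db)  # committed, but not yet in the database
store.recover()
state_after_recovery = dict(store.db)
```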

b) IMMEDIATE UPDATE TECHNIQUES

• Use two lists of transactions.

• List of all committed transactions since the last checkpoint

• List of all active transactions since the last checkpoint

• Undo the writes of all active transactions using the undo policy

• Redo the write operations of all the committed transactions

• Submit all active transactions again to the DBMS

c) SHADOW PAGING

• This technique does not require a log in a single-user environment.

• In a multi-user environment a log may be needed for the concurrency control method.

• Shadow paging considers:
– The database is partitioned into fixed-length blocks referred to as pages.
– The page table has n entries.
– Each entry contains a pointer to a page on disk.

• The idea is to maintain two page tables during the life of a transaction:
– The current page table
– The shadow page table

• When a transaction starts, both page tables are identical.
– The shadow page table is never changed over the duration of the transaction.
– The current page table may be changed when a transaction performs a write operation.
– All input and output operations use the current page table to locate database pages on disk.
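The two page tables can be illustrated with a few lines of Python. The sketch assumes that a write places the new data in a fresh disk page and redirects only the current page table to it; that detail, along with the names pages, shadow_table, current_table and write, is an assumption made for illustration.

```python
pages = {0: "a0", 1: "b0"}          # disk pages, keyed by disk address
shadow_table = [0, 1]               # page table as of transaction start
current_table = list(shadow_table)  # starts as an identical copy


def write(page_no, data):
    new_addr = max(pages) + 1       # assumption: write to an unused disk page
    pages[new_addr] = data
    current_table[page_no] = new_addr  # only the current table is changed


write(1, "b1")

current_view = [pages[addr] for addr in current_table]
shadow_view = [pages[addr] for addr in shadow_table]
```

The shadow table still points at the old pages, so the state before the transaction remains available if the transaction has to be abandoned.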

D) THE ARIES RECOVERY ALGORITHM

• Analysis: identify dirty pages in buffer pool (i.e., changes not yet written to disk), and identify active transactions at time of crash.

• Redo: repeat all actions, starting from the proper point in the log, thus restoring the DB state to what it was at the time of the crash.

• Undo: undo the actions of transactions that did not commit, so that the DB reflects only committed transactions.

E) LOG-BASED RECOVERY

• Logging is the most popular mechanism for implementing recovery algorithms.

• The recovery manager implements:

– Commit - by writing a commit record to the log and flushing the log (satisfies the Redo Rule)

– Abort - by using the transaction’s log records to restore before-images

– Restart - by scanning the log and undoing and redoing operations as necessary

DATA BASE ADMINISTRATION

• Database administration is the function of managing and maintaining database management systems (DBMS) software.

DBA RESPONSIBILITIES

• Installation, configuration and upgrading of database server software and related products.

• Evaluate Database features and Database related products.

• Establish and maintain sound backup and recovery policies and procedures.

• Take care of the Database design and implementation.

• Implement and maintain database security

• Database tuning and performance monitoring.

• Setup and maintain documentation and standards.

• Plan growth and changes (capacity planning).

• Work as part of a team and provide 24x7 support when required.

• Do general technical troubleshooting and give cons.

• Database recovery.

TYPES OF DBAS

• Systems DBAs: focus on the physical aspects such as DBMS installation, configuration, patching, upgrades, backups, restores, refreshes, performance optimization, maintenance and disaster recovery.

• Development DBAs: focus on the logical and development aspects of database administration.

• Application DBAs: usually found in organizations that have purchased 3rd party application software such as ERP and CRM systems.

CLIENT SERVER ARCHITECTURE BASED RDBMS

3-tier client-server architecture

2-tier client-server architecture

UNIT IV

DISTRIBUTED DATABASE

A logically interrelated collection of shared data, physically distributed over a computer network is called Distributed Database.

COMPONENT ARCHITECTURE FOR A DDBMS

[Figure: two sites connected by a computer network, each with DDBMS, DC and GDD components; site 2 also has an LDBMS and a local DB.]

LDBMS : Local DBMS component

DC : Data communication component

GDD : Global Data Dictionary

PARALLEL DBMS

A DBMS running across multiple processors and disks designed to execute operations in parallel, whenever possible, to improve performance.

Parallel DBMSs link multiple, smaller machines to achieve same throughput as single, larger machine, with greater scalability and reliability.

TYPES OF DDBMS

Homogeneous DDBMS

• All sites use the same DBMS product.

• Much easier to design and manage.

• Approach provides incremental growth and allows increased performance.

Heterogeneous DDBMS

• Sites may run different DBMS products, with possibly different underlying data models.

• Occurs when sites have implemented their own databases and integration is considered later.

• Translations required to allow for:
– Different hardware.
– Different DBMS products.
– Different hardware and different DBMS products.

• Typical solution is to use gateways.

OBJECT ORIENTED DATABASES

• An object-oriented database management system (OODBMS), sometimes shortened to ODBMS (object database management system), is a database management system (DBMS) that supports the modeling and creation of data as objects.

UML

• Unified Modeling Language (UML) is a standardized general-purpose modeling language in the field of object-oriented software engineering.

• UML is used to specify, visualize, modify, construct and document the artifacts of an object-oriented software-intensive system under development.

TYPES OF UML DIAGRAMS

a) Use Case Diagram

b) Class Diagram

c) Sequence Diagram

d) Collaboration Diagram

e) State Diagram

a) USE CASE DIAGRAM

Used for describing a set of user scenarios

Mainly used for capturing user requirements

Works like a contract between end users and software developers


COMPONENTS OF USE CASE DIAGRAM

• Actors - A role that a user plays

• Use Case - set of scenarios

• System boundary - boundary between actors & system

• Association - communication between actor & use case

• Generalization - relationship between general use case & special use case

• Include - chunk of behavior is similar across more than one use case

• Extend - add behavior to the base use case

b) CLASS DIAGRAM

• A Class is represented by a rectangle subdivided into three compartments

– Name

– Attributes

– Operations

• Modifiers are used to indicate visibility of attributes and operations.

– ‘+’ is used to denote Public visibility (everyone)

– ‘#’ is used to denote Protected visibility (friends and derived)

– ‘-’ is used to denote Private visibility (no one)

• By default, attributes are hidden and operations are visible.

STRUCTURE OF CLASS DIAGRAM

Name: Account_Name

Attributes: - Customer_Name, - Balance

Operations: + addFunds( ), + withDraw( ), + transfer( )
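As a rough illustration, the Account_Name class above could be sketched in Python as follows. The diagram gives only names and visibility, so the parameter lists, the validation rules and the balance() accessor are assumptions made for this example. Python does not enforce private visibility; a leading underscore is only a convention standing in for the ‘-’ modifier.

```python
class Account:
    """Sketch of the Account_Name class from the class diagram."""

    def __init__(self, customer_name: str, balance: float = 0.0) -> None:
        # '-' (private) attributes: leading underscore by convention
        self._customer_name = customer_name
        self._balance = balance

    # '+' (public) operations from the diagram; signatures are assumed
    def add_funds(self, amount: float) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount

    def withdraw(self, amount: float) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount

    def transfer(self, other: "Account", amount: float) -> None:
        # Withdraw first so a failed withdrawal leaves both accounts unchanged
        self.withdraw(amount)
        other.add_funds(amount)

    def balance(self) -> float:
        # Hypothetical read-only accessor, not part of the diagram
        return self._balance
```

For example, after a = Account("Asha", 100.0) and b = Account("Ravi"), calling a.transfer(b, 40.0) leaves a with 60.0 and b with 40.0.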

c) SEQUENCE DIAGRAM

• Creation

– Create message

– Object life starts at that point

• Activation

– Symbolized by rectangular stripes

– Place on the lifeline where object is activated.

– Rectangle also denotes when object is deactivated.

• Deletion

– Placing an ‘X’ on lifeline

– Object’s life ends at that point

d) COLLABORATION DIAGRAM

• Shows the relationship between objects and the order of messages passed between them. 

• The objects are listed as rectangles and arrows indicate the messages being passed.

• The numbers next to the messages are called sequence numbers. They show the sequence of the messages as they are passed between the objects.

• Convey the same information as sequence diagrams, but focus on object roles instead of the time sequence.

e) STATE DIAGRAM

• State Diagrams show the sequences of states an object goes through during its life cycle in response to stimuli, together with its responses and actions; an abstraction of all possible behaviors.

Ex: INVOICE

Start → (Invoice created) → Unpaid → (paying) → Paid → (Invoice destroying) → End
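The invoice life cycle above (Start, Unpaid, Paid, End) can be sketched as a small state machine. This is a minimal sketch under stated assumptions: the event names "paying" and "destroying" are taken from the diagram labels, while the InvoiceState, TRANSITIONS and Invoice names, and the choice to raise an error on a disallowed event, are made up for illustration.

```python
from enum import Enum


class InvoiceState(Enum):
    UNPAID = "Unpaid"
    PAID = "Paid"
    END = "End"


# (current state, event) -> next state; one entry per arrow in the diagram
TRANSITIONS = {
    (InvoiceState.UNPAID, "paying"): InvoiceState.PAID,
    (InvoiceState.PAID, "destroying"): InvoiceState.END,
}


class Invoice:
    def __init__(self) -> None:
        # "Invoice created" takes the object from Start into Unpaid
        self.state = InvoiceState.UNPAID

    def handle(self, event: str) -> InvoiceState:
        """Apply an event; reject events that have no arrow from the current state."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                f"event {event!r} not allowed in state {self.state.value}"
            )
        self.state = TRANSITIONS[key]
        return self.state
```

Keeping the transitions in one table mirrors the diagram: each entry is one transition, and any (state, event) pair missing from the table is a stimulus the object does not respond to in that state.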

Ex: TRAFFIC LIGHT

[Figure: state diagram of a traffic light with the states Red, Yellow and Green; the figure labels the start point, a state, a transition and an event.]

UNIT V

VISUAL DATABASES AND KNOWLEDGE BASED DATABASES

• Visual databases deal with the analysis and retrieval of large collections of image and video data, with emphasis on visual semantics, human psychology, and user interfaces.

• A knowledge base is a database used for knowledge sharing and management. It promotes the collection, organization and retrieval of knowledge.

SCOPE FOR PROFESSIONALS

• Competitive edge, differentiation and credibility = Job Opportunities

• Improved Efficiencies

– Certified professionals handle more support calls

– Companies advocating certification report less downtime.

– Employee ROI payback in only nine months because of savings due to increased internal effectiveness

• Quicker Adoption of Technology

• Improved Customer Service

CERTIFICATION PROGRAMS OVERVIEW

• Oracle Certified Professional Credential (OCP)

• Microsoft Certified Database Administrator (MCDBA)

• MySQL Certification

ORACLE CERTIFICATION PROGRAM

• OCA:

– Oracle9i DBA Certified Associate

– Oracle9iAS Web Administrator Certified Associate

– Oracle9i PL/SQL Developer Certified Associate

• OCP:

– Oracle8i DBA Certified Professional

– Oracle9i DBA Certified Professional

– Oracle6i Internet Application Developer Certified Professional

– Oracle9i Forms Developer Certified Professional

• OCM:

– Oracle9i Database Administrator Certified Master

TRENDS IN DBMS

• Multimedia Databases

Multimedia data typically means digital images, audio, video, animation and graphics. Multimedia databases cover the acquisition, generation, storage and processing of such data.

• Distributed Database

A distributed database is a collection of multiple, logically interrelated databases of the same system distributed over various sites of a computer network.

• Document-oriented Databases

Each record/document might have a different format (number and size of fields). They don’t store data in tables. Each record is stored as a document that has certain characteristics. An XML database is a document-oriented database.

• Mobile & embedded Databases

Many daily-use devices contain databases: TVs, washing machines and mobile phones, e.g. Android phones with an SQLite database. Embedded databases in cars, airplanes etc. manage configurations and store sensor data. Ex. db4o object database used in BMW Car IT system.
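To make the embedded-database idea concrete, here is a minimal sketch using Python's standard sqlite3 module, which runs SQLite inside the application process with no separate server. The table name reading, the sensor names and the record_readings function are hypothetical and chosen only for illustration. The example stores sensor readings and returns the average value per sensor.

```python
import sqlite3


def record_readings(readings):
    """Store (sensor, value) pairs in an in-memory SQLite database
    and return a dict mapping each sensor to its average value."""
    conn = sqlite3.connect(":memory:")  # embedded: no server process needed
    try:
        conn.execute(
            "CREATE TABLE reading (sensor TEXT NOT NULL, value REAL NOT NULL)"
        )
        # Parameterized insert of all readings in one call
        conn.executemany(
            "INSERT INTO reading (sensor, value) VALUES (?, ?)", readings
        )
        conn.commit()
        rows = conn.execute(
            "SELECT sensor, AVG(value) FROM reading "
            "GROUP BY sensor ORDER BY sensor"
        ).fetchall()
        return dict(rows)
    finally:
        conn.close()
```

A device would normally pass a file path instead of ":memory:" so that the data survives a restart; the in-memory database is used here only to keep the sketch self-contained.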
