
Database Management Systems Versus File Management Systems


A Database Management System (DMS) is a combination of computer software, hardware, and information designed to electronically manipulate data via computer processing. Two types of database management systems are DBMSs and FMSs. In simple terms, a File Management System (FMS) is a database management system that allows access to a single file or table at a time. FMSs accommodate flat files that have no relation to other files. The FMS was the predecessor of the Database Management System (DBMS), which allows access to multiple files or tables at a time (see Figure 1 below).

File Management Systems

Typically, File Management Systems provide the following advantages and disadvantages:

Advantages:
- Simpler to use
- Less expensive
- Fits the needs of many small businesses and home users
- Popular FMSs are packaged along with the operating systems of personal computers (e.g. Microsoft Cardfile and Microsoft Works)
- Good for database solutions for handheld devices such as Palm Pilot

Disadvantages:
- Typically does not support multi-user access
- Limited to smaller databases
- Limited functionality (e.g. no support for complicated transactions, recovery, etc.)
- Decentralization of data
- Redundancy and integrity issues

The goals of a File Management System can be summarized as follows (Calleri, 2001):

Data Management.  An FMS should provide data management services to the application.

Generality with respect to storage devices. The FMS data abstractions and access methods should remain unchanged irrespective of the devices involved in data storage.

Validity. An FMS should guarantee that at any given moment the stored data reflect the operations performed on them.


Protection. Illegal or potentially dangerous operations on the data should be controlled by the FMS.

Concurrency. In multiprogramming systems, concurrent access to the data should be allowed with minimal differences.

Performance. Balance data access speed and data transfer rate against functionality.

From the point of view of an end user (or application) an FMS typically provides the following functionalities (Calleri, 2001): 

File creation, modification and deletion.

Ownership of files and access control on the basis of ownership permissions.

Facilities to structure data within files (predefined record formats, etc.).

Facilities for maintaining data redundancies against technical failure (back-ups, disk mirroring, etc.).

Logical identification and structuring of the data, via file names and hierarchical directory structures.

Database Management Systems

Database Management Systems provide the following advantages and disadvantages: 

Advantages:
- Greater flexibility
- Good for larger databases
- Greater processing power
- Fits the needs of many medium to large-sized organizations
- Storage for all relevant data
- Provides user views relevant to tasks performed
- Ensures data integrity by managing transactions (ACID test = atomicity, consistency, isolation, durability)
- Supports simultaneous access
- Enforces design criteria in relation to data format and structure
- Provides backup and recovery controls
- Advanced security

Disadvantages:
- Difficult to learn
- Packaged separately from the operating system (e.g. Oracle, Microsoft Access, Lotus/IBM Approach, Borland Paradox, Claris FileMaker Pro)
- Slower processing speeds
- Requires skilled administrators
- Expensive

The goals of a Database Management System can be summarized as follows (Connelly, Begg, and Strachan, 1999, pp. 54-60):

Data storage, retrieval, and update (while hiding the internal physical implementation details)

A user-accessible catalog

Transaction support

Concurrency control services (multi-user update functionality)

Recovery services (damaged database must be returned to a consistent state)

Authorization services (security)

Support for data communication

Integrity services (i.e. constraints)

Services to promote data independence

Utility services (i.e. importing, monitoring, performance, record deletion, etc.)

The components to facilitate the goals of a DBMS may include the following:

Query processor

Data Manipulation Language preprocessor

Database manager (software components to include authorization control, command processor, integrity checker, query optimizer, transaction manager, scheduler, recovery manager, and buffer manager)

Data Definition Language compiler

File manager

Catalog manager

Conclusion

From the File Management System, the Database Management System evolved. Part of the DBMS evolution was the need for a more complex database that the FMS could not support (i.e. interrelationships). Even so, there will always be a need for the File Management System as a practical tool and in support of small, flat file databases. Choosing a DBMS in support of developing databases for interrelations can be a complicated and costly task. DBMSs are themselves evolving into another generation of object-oriented systems. The Object-Oriented Database Management System is expected to grow at a rate of 50% per year (Connelly, Begg, and Strachan, 1999, pg. 755). Object-Relational Database Management System vendors such as Oracle, Informix, and IBM have been predicted to gain a 50% larger share of the market than the RDBMS vendors. Whatever the direction, the Database Management System has gained its permanence as a fundamental root source of the information system.

Advantages of database systems

Database advantages include the following:

shared data; centralized control; redundancy control; improved data integrity; improved data security; and, flexible conceptual design.

Disadvantages of database system

Disadvantages are as follows:

a complex conceptual design process; the need for multiple external databases; the need to hire database-related employees; high DBMS acquisition costs; a more complex programmer environment; potentially catastrophic program failures; a longer running time for individual applications; and, highly dependent DBMS operations.

Nowadays every open source CMS or other script, down to the most basic of ideas such as simple password-protection scripts, demands the use of a (usually) MySQL database to function. Simple scripts using Berkeley DB files or even plain text files are seen as marginal, too limited or too awkward for use. I have several sites which run a simple CMS based on plain text files, and I can't see a reason to switch.

Is MySQL overkill for many uses, or does it offer any significant advantages over flat files other than the fact that you can't reliably use flat files when load-balancing between different servers? flat files are usually faster, aren't they? What do MySQL/PostgreSQL/MS-SQL offer that makes them so popular for web development?

arran

#:1579015

 5:48 pm on Sep. 5, 2005 (utc 0)

Very strange - i was thinking about this yesterday!

SQL (the query language) is the main reason i abandoned my plan to switch from database to flat-files. Although not perfect, it's extremely easy to quickly throw together all the queries you need for a site. The thought of writing a hacky php function every time you think of a new way to display/sort information doesn't appeal.

Other reasons that come to mind are:

- Transactions
- Scalability (Indexing in particular)
- Security

In general, i do agree that many sites could live without databases and in terms of performance would definitely benefit from switching to flat-files.

arran.

txbakers

#:1579016

 6:44 pm on Sep. 5, 2005 (utc 0)

flat files are usually faster

flat files are NOT faster. they can only be read from top to bottom, and usually they have to be read all the way through.

A good database engine will take a query (Select field from table where field = '#*$!x') and first determine the best way to go into the database to solve the query, then leave the processing when the query is solved.
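As a rough illustration of the engine deciding how to go into the data, the sketch below asks a database for its query plan before and after an index exists. It uses Python's built-in sqlite3 module rather than the MySQL discussed in this thread, and the `users` table, its columns and the `plan_for` helper are made up for the example.

```python
import sqlite3

def plan_for(conn, sql, params=()):
    """Return the planner's description lines for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable description.
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

query = "SELECT email FROM users WHERE name = ?"

# Without an index on name the engine has to scan the whole table.
before = plan_for(conn, query, ("user500",))

# With an index it can search the index and stop once the match is found.
conn.execute("CREATE INDEX idx_users_name ON users (name)")
after = plan_for(conn, query, ("user500",))

print(before)
print(after)
```

The exact wording of the plan lines varies between SQLite versions, but the first plan should mention a scan and the second should name the index.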

Another main reason to use databases is the ability to minimize the repetitions of data. Think about having multiple spreadsheets for a project. If you have an ID number assigned to a particular item, you only need to reference the ID number between different sets of data, rather than replicating all of the data for each sheet.

Database tables give you the ability to do this easily with "joins".
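A minimal sketch of that ID-reference idea, again with Python's sqlite3 module standing in for whatever database you use. The `items` and `orders` tables and their contents are invented for illustration: each order stores only an item ID, and the join pulls the name and price from the one place they are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         item_id INTEGER REFERENCES items(id),
                         qty INTEGER);
    INSERT INTO items VALUES (1, 'widget', 2.50), (2, 'gadget', 10.00);
    INSERT INTO orders VALUES (1, 1, 4), (2, 2, 1), (3, 1, 2);
""")

# Orders carry only item_id; name and price are looked up through the join.
rows = conn.execute("""
    SELECT orders.id, items.name, orders.qty * items.price
    FROM orders JOIN items ON items.id = orders.item_id
    ORDER BY orders.id
""").fetchall()

print(rows)
```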

Still, if all you need to do is validate passwords, a flat file might be easier to use. But beyond that, you're much better off using a database.

Lord Majestic

#:1579017

 6:53 pm on Sep. 5, 2005 (utc 0)

Good databases have a key advantage over flat files -- concurrency. When you just read stuff from a file it's easy, but try to synchronise multiple updates or writes into a flat file from scripts that run in different process spaces.
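To give a feel for the extra work involved, here is a hedged sketch of one way to serialise updates to a flat file using an advisory lock. It assumes a Unix-like system (the `fcntl` module is not available on Windows), and the counter file and `locked_increment` helper are hypothetical. Threads stand in for the separate scripts; without the lock, some increments could be lost.

```python
import fcntl
import os
import tempfile
import threading

def locked_increment(path):
    """Read a number from a flat file, add one and write it back under a lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # block until we own the file
        try:
            text = f.read().strip()
            value = (int(text) if text else 0) + 1
            f.seek(0)
            f.truncate()
            f.write(str(value))
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)  # let the next writer in
    return value

counter_path = os.path.join(tempfile.mkdtemp(), "counter.txt")

def worker():
    for _ in range(25):
        locked_increment(counter_path)

# Eight concurrent writers, each opening the file separately.
threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with open(counter_path) as f:
    final = int(f.read())
print(final)
```

A database engine does this kind of coordination for you, usually at a finer granularity than one lock per file.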

AffiliateDreamer

#:1579018

 2:38 am on Sep. 7, 2005 (utc 0)

say you go the flat file route....(maybe i'll start a new thread)

I'm curious as to how many files in a single directory have you guys stored in the past?

I mean, can you store 1 million files in a folder on a windows server? unix server?

Will file access times differ between unix and windows2003?

Lord Majestic

#:1579019

 2:46 am on Sep. 7, 2005 (utc 0)

I mean, can you store 1 million files in a folder on a windows server? unix server?

Possible, but this is a very bad idea to have so many files -- I know, I have got 130,000+ files in one directory :o

physics

#:1579020

 3:04 am on Sep. 7, 2005 (utc 0)

Storing a lot of flat files in one directory is a very bad idea. There is a serious slowdown for the directory listing (though some unix file systems are designed so that this isn't as much of a problem, but not the common ext3). If you must use tons of flat files you should split them up into different directories so that each directory doesn't have too many files in it... But really your time will be better spent getting MySQL running.
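One common way to split files across directories is to derive the subdirectory from a hash of the file name. The sketch below is only an illustration of that idea; the `sharded_path` helper, the two-level layout and the `data` root are assumptions, not something from this thread.

```python
import hashlib
from pathlib import Path

def sharded_path(root, filename, levels=2):
    """Map a filename to root/ab/cd/filename using a hash of the name."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    # Each level uses two hex characters, so at most 256 entries per level.
    parts = [digest[2 * i: 2 * i + 2] for i in range(levels)]
    return Path(root, *parts, filename)

example = sharded_path("data", "page-12345.txt")
print(example)
```

Because the path depends only on the name, a reader can find a file again without listing any directory.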

mrMister

#:1579021

 1:32 pm on Sep. 8, 2005 (utc 0)

I use flat files to cache database output for data that doesn't change very often.


I pull the data out of the database and store it in a flat file. It is quicker, especially if you have a lot of processing to do (if you're using joins for example) whenever you retrieve the data.
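A minimal sketch of this kind of flat-file cache, assuming the query result fits in a JSON file and that a fixed maximum age is an acceptable freshness rule. The `cached_query` helper, the `posts` table and the 300-second default are all invented for the example, and sqlite3 stands in for the real database.

```python
import json
import os
import sqlite3
import tempfile
import time

def cached_query(conn, sql, cache_path, max_age=300):
    """Return (rows, source), reusing a JSON flat-file copy while it is fresh."""
    try:
        if time.time() - os.path.getmtime(cache_path) < max_age:
            with open(cache_path, encoding="utf-8") as f:
                return json.load(f), "cache"
    except OSError:
        pass  # no cache file yet, fall through to the database
    rows = [list(r) for r in conn.execute(sql).fetchall()]
    tmp = cache_path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(rows, f)
    os.replace(tmp, cache_path)  # swap in one step so readers never see a partial file
    return rows, "database"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO posts (title) VALUES (?)", [("first",), ("second",)])

cache_file = os.path.join(tempfile.mkdtemp(), "posts.json")
sql = "SELECT id, title FROM posts ORDER BY id"
rows1, source1 = cached_query(conn, sql, cache_file)  # hits the database
rows2, source2 = cached_query(conn, sql, cache_file)  # served from the flat file
print(source1, source2)
```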

Hester

#:1579022

 9:03 am on Sep. 15, 2005 (utc 0)

Hmm, this forum is actually built on flat files!

aspdaddy

#:1579023

 11:56 am on Sep. 15, 2005 (utc 0)

It has nothing to do with speed or performance. Simple flat files will always out-do a database because manipulating them is closer to the machine language.

The main reason for databases is independence from hardware, user interfaces and data integrity.

trillianjedi

#:1579024

 11:59 am on Sep. 15, 2005 (utc 0)

I use flat files to cache database output for data that doesn't change very often.

I also use that type of hybrid design and have found it gives the best performance/data integrity balance.

Most CMS's are designed to do everything. That functionality always comes with a compromise.

Absolutely any CMS can be customised to work better for you than it does straight out of the box. Reducing unnecessary queries by removing them, or caching them, is one of the simplest things you can do that makes a significant performance difference.

I have several sites which run a simple CMS based on plain text files, and I can't see a reason to switch.

Great - a good sign of slick software and no unnecessary bloat.

I think it's a horses for courses thing. If you have a largely static site in terms of the data type you're displaying (i.e. what would be the DB structure in a DB based site doesn't ever need to change) then flatfile is great. and fast.

If you need to budget in the future for either complex load-balancing, or a structure of data that will need to grow and change, then a DB is probably the favourite choice - for reasons of management rather than speed.

Is MySQL overkill for many uses.....?

Yes. But given the low cost these days of an incredibly fast server, fast hard disk and plenty of RAM, does it really matter all that much?

Why is SQL so popular?

Convenience, simplicity of design, portability and easy management of content data.

TJ

physics

#:1579025

 6:19 pm on Sep. 15, 2005 (utc 0)

It has nothing to do with speed or performance. Simple flat files will always out-do a database because manipulating them is closer to the machine language.

This is first of all assuming that the code that you write is 'more efficient' than the code that the developers of the database software have written. Plus you have to take the time to write all of those functions and debug them. Also, there are other things to consider such as the fact that if you have tons of data stored on a hard disk in all different files then every time you want to access a file a file handle has to be opened at some random point on the disk, etc. With a database it's stored in a 'central' location and the handle is open... There are also all sorts of creature comforts like the MySQL slow query log and things like mytop so you can keep track of what queries might be slowing your system down. I recently wrote an application that initially used flat files, then switched to MySQL, the speedup was immediately evident. Finally, yes many sites are based on flat files but they would probably be better off with an 'advanced' database system.

Hester

#:1579026

 3:39 pm on Sep. 16, 2005 (utc 0)

Only one file to backup too.

webprofessor

#:1579027

 2:10 am on Sep. 19, 2005 (utc 0)

>> Simple flat files will always out-do a database because manipulating them is closer to the machine language.

What a silly thing to say. Do you know how php handles flat files? Do you know how asp handles flat files? Do you know how C++ handles flat files (the notion of which doesn't really even exist)? All do it differently and with different advantages and disadvantages.

Do you mean all flat files? Are you using locks and semaphores? Are you doing row processing? Are you searching the flat file? If you're searching, how many lines are in the flatfile? Are there even lines? and here's the doozy.. do you even know how to write assembly and what advantages it brings by being closer to the metal?

ALWAYS is a term you shouldn't use. Furthermore, for data sets of any significant size, or when doing anything significant with them, you are usually wrong. It takes quite a skilled programmer to use flat files to manage large amounts of data efficiently.

** anyways back to the point of the thread. **

The use of SQL allows for modular design of your application. If you think this is a one time app and you'll rarely if ever need to change it or its data, then a flat file may be the choice for you.

I always use SQL when there is the possibility of having to scale the application larger or the complexity of the data may increase. The reason being that for me it is easier to change a sql based application than rewrite all my functions for parsing files.

If you're worried about performance, getting a better server may be a solution for you. Processing power is cheap these days. For me it is cheaper than the amount of time it takes me to write code.

FIN

FlatFileAdvantages

PmWiki stores pages in flat files instead of using a relational database such as MySQL. This page explains why this design decision has been made.

Pm's Explanation

Pm: I chose flat files to store PmWiki pages because I haven't seen any real advantages of using a database, and there are definitely some disadvantages. For the standard operations (view, edit, page revisions), holding the information in flat files is clearly faster than accessing them in a database, and with page caching abilities (coming soon) it'll be even faster. The only operations that really benefit are searches, but I've always believed that for fast, flexible search capabilities it's much better to use existing search programs such as ht://Dig or Google over reinventing another search engine. PmWiki's Site.Search is functional/fast enough for most purposes, and if more performance is needed it's just better to switch to a real search engine.

Indeed, as of January 2004 the Wikipedia uses a MySQL database to store its 190K+ entries, but even with the database Wikipedia has disabled its online search because of performance issues and just forwards search queries directly to Google.

And there are big disadvantages to using a database -- with a database we'd have to write a bunch of "administrative" tools/scripts to handle things such as mass page deletions in the database, backups/restores of the pages, recovering pages that have been wrongly deleted, etc. Much of that administrative programming overhead is eliminated by using a flat file system, as admins can use existing tools (FTP clients, web-based file/directory managers, shell commands). They are already comfortable with the administrative tools. It's also much easier to build sophisticated and customized page management tools and scripts for specialized applications.

Finally, PmWiki is already structured such that the flat file structure can be easily replaced by a database if it ever proves necessary. However, even PmWiki sites with more than 40 000 pages function well in a flat file system without any noticeable performance problems.

PmWiki supports the ability to subdivide the wiki.d/ directory into separate subdirectories for each group, avoiding the "too large" directory problem. Check out the Cookbook:PerGroupSubDirectories for more information.

Comments:

Flat files are indeed much easier to manage and my experience shows that there is no problem at all for PmWiki. Still I had problems convincing my boss to use PmWiki since it is not using a "real" database. Ever thought of using subdirectories for each group like in Uploads? There are known issues on Solaris for directories containing more than 20.000 files. Uli

PmWiki already supports the ability to subdivide the wiki.d/ directory into separate subdirectories for each group, avoiding the "too large" directory problem. Contact me via email or pmwiki-users. --Pm

This is now specified in Cookbook:PerGroupSubDirectories. Thanks, Ben and Patrick! --Sproaticus

On a Linux based operating system, with a filesystem like ReiserFS which can handle directories with tons of file entries, performance should not be a problem and should even be better than using a database. -- Pouik

There is a lot of prejudice out there in favor of using database engines instead of flat files. Choosing which to use in a project ought to be similar to choosing what programming language to use. Some of the questions to ask are:


o Which choice fits the problem domain best (databases fit random queries against a very large set of records best, flat files fit Wikis best)

o What are the programmers familiar with; what do they like?

o What is available; what does the corporate culture allow; how much do they cost? -- David Spector

Personally, I like to store unstructured data in flat files. However, I do believe a database has its advantages for structured data. I felt this way when I was using other wikis (Tiki, Wikipedia, phpWiki..); I always thought of extending them to include flat files. So, how about a common API? -- Duncan Hsu

o PmWiki already has a common API, implemented via the PageStore class in pmwiki.php. Cookbook authors can create a class with the same interface as PageStore that saves pages in alternate locations such as a database. --Pm
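The general idea of a shared storage interface can be sketched as follows. This is not PmWiki's PageStore (which is PHP and whose methods are not shown on this page); `FilePageStore`, `SqlitePageStore` and their `read`/`write` methods are hypothetical names in Python, showing two backends the calling code cannot tell apart.

```python
import os
import sqlite3
import tempfile

class FilePageStore:
    """Keeps each page in its own text file (page names assumed to be safe filenames)."""
    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, name):
        return os.path.join(self.directory, name + ".txt")

    def write(self, name, text):
        with open(self._path(name), "w", encoding="utf-8") as f:
            f.write(text)

    def read(self, name):
        try:
            with open(self._path(name), encoding="utf-8") as f:
                return f.read()
        except FileNotFoundError:
            return None

class SqlitePageStore:
    """Same interface, but pages live in a database table."""
    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS pages (name TEXT PRIMARY KEY, text TEXT)")

    def write(self, name, text):
        self.conn.execute(
            "INSERT OR REPLACE INTO pages (name, text) VALUES (?, ?)", (name, text)
        )
        self.conn.commit()

    def read(self, name):
        row = self.conn.execute(
            "SELECT text FROM pages WHERE name = ?", (name,)
        ).fetchone()
        return row[0] if row else None

stores = [
    FilePageStore(tempfile.mkdtemp()),
    SqlitePageStore(sqlite3.connect(":memory:")),
]
results = []
for store in stores:
    # The calling code is identical for both backends.
    store.write("Main.HomePage", "Welcome!")
    results.append((store.read("Main.HomePage"), store.read("Main.Missing")))
print(results)
```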

I've got a question: wouldn't there be a problem with same-time multi-user access to a file? (I mean writing - losing other's changes possibly)

o That is one problem I guess. Another is the administration side of it. Of course I can dive into FTP and work with the flat files there, but I like an admin interface of restoring articles. Mainly because I have editors who are not so familiar with FTP as I am. --sjoerd

o PmWiki handles any locking necessary to make sure that multiple accesses to a file don't cause any changes to be lost. PmWiki also supports automatic merging of simultaneous edits. --Pm

I created an 8000 file wiki for fun and testing. Basic page handling is fine, no performance issues. Search is acceptable. However creating the .linkindex file from scratch is a problem. The host I run the site on (and my test-machine) has a time out of 30 seconds. I disabled the linkindex, however now backlinks ( pagelist link={$FullName}) are too slow. --BrBrBr

Re-enable the link index and run a few backlink searches (even if they time out). PmWiki will incrementally build the link index. Once the link index is built, everything will be fast and there won't be a big cost in keeping the link index up-to-date. --Pm


Another BIG advantage of flat files is that they are easy to edit directly. -- Babak

o Exactly! I know many scenarios where data-loss, caused by hardware or transfer failure (storage medium crashes, power dropouts and the likes), was easy to fix by simply using an editor on the (flatfile) server's commandline and changing back what was causing errors. I've never been able to do this with similar ease for MySQL (and in such cases hate my job). -- SomeSysAdmin

Maybe the reason flat files work so well is that a file system IS a hierarchical database -- William

Is a database more secure? That extra password protection needed to access MySQL databases must mean something... Right? -- Xen

o Then why have no sites running PMwiki with flatfiles (that I know of) ever been compromised? ;-) -- Julius

o If you can get access to PmWiki's flat files, you could also get access to the php script containing the database password. So it doesn't really provide any extra security. -- Andrew

Exactly. But one should never store the (non-flatfile) database password containing php in a web-server accessible location. Instead do an include and put the php somewhere outside of the web/doc root. -- Julius

I think the biggest disadvantage of using a flatfile system is that it takes the programmer too much time to design it and to keep it stable, especially when more and more new features are added to the project and more and more requirements come up. This also adds risk to users' data, as bugs are more likely to be introduced by program updates. It also makes compatibility problems harder to resolve. On the other hand, a flat file system does work more efficiently than a database in most situations. -- Adam

o I would have to disagree (with part of that). Programming something to speak to (and read from) MySQL for example can be just as painful, precisely because it is not your own code or design. That can be a huge disadvantage: You never know when an updated MySQL needs changed queries, when it will do what, if it will do what you need and so on. -- Julius

I think that this could be an endless debate because the line is often thin between advantage and disadvantage, imho the safe bet will always be to give the option and let people choose given their own needs, cheers. -- h3

o I don't think the line is that thin. With a separate database you will always have a much bigger chance of crashes and downtimes. You make yourself more dependent by needing yet another service to be running and backed up separately etc. Just count the times you see things don't work and give you MySQL errors online, I have rarely, if at all, seen that occur with flatfile databases. -- Ben

Many people already have a copy of MySQL running, so that isn't a problem. The MySQL problems are from sites that run too many or too slow queries for their hardware. Something as simple as retrieving a wiki page isn't going to have trouble like that.

More people don't have a copy of MySQL running. In fact, I know more people who don't run it on their servers, precisely because it is such a resource monster for its purpose: Merely some text-file storage system. -- Steven

Flat files have a very important advantage -- you can diff and merge pages with merge tools. With that you would be able to run more than one wiki site across all your computers and merge them periodically. I think lots of people need this function. At least, I switched to dukowiki from mediawiki just because of this. -- Edward

Databases are always on top of a filesystem -- At least all of the "real" databases store their data on a filesystem. They provide an abstraction layer for purposes such as authentication, transactions or simply convenience on different OSes, and have a common query syntax (SQL). Therefore the performance issue relies mainly on the following factors:

o Performance of the filesystem

o Efficient caching strategies (for data, queries, ...)

o Efficient internal file organization

o Efficient code (client and server) -- Heiko


Most file systems map files to hard sectors on a disk. Databases offer a level of virtualisation: the sectors can be on any disk or server. Result is you can use one server/disk for DB, another server for PHP and a third for web server. You can share out load and get better overall performance even in very heavy usage. Of course that may not be the goal of PmWiki, ;-). -- Peter

Well you can always use NFS if you want your files on another server. But in both cases, NFS or a DB, running them on another server is actually likely to increase your latency and not necessarily increase your throughput. The advantage of a separate DB is more apparent when you need more than one client accessing it at the same time, which, of course, you can do with NFS also. The DB might provide better locking mechanisms but they are not likely to be important to pmwiki (not writer heavy enough). How do you suggest running PHP on another server than your web server? And, whatever your solution for this, wouldn't this also be available without a DB? -- Martin Fick

Just to say. I prefer flatfiles in this case just because my home server is an MMX, but aren't SQL servers loaded in memory? Memory access time is much faster than HD, not to mention the really old ones (mine is 2GbATA100). Of course not all the pages should be loaded in memory all the time, but for the most accessed ones... Also, it is easier to provide a single download file with all the wikidata for the user who wants to have it offline. He will just need a way to read it... And my third point is that it is better for a wiki because no JOIN is needed.


Logical vs. Physical File Organisation

The physical storage of a file is how it is stored on the storage medium.

The logical storage of a file is how it looks to the user when it is being processed.

 

File Access Methods

A serial access file has data stored in the order in which it was written. New data goes to the end of the file. To read a record from the file it is necessary to read through all of the records from the start of the file until reaching the required record.

A sequential access file has data stored in the order of a key field.

An indexed sequential file stores data in the order of a key field, but also has an index holding the key field values and the address at which they are stored. This allows both sequential and direct access.

A direct access file is one where any record can be accessed without having to access other records first. This is also known as Random Access.

NB Direct Access Files can only be stored on Direct Access Media.
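The difference between these access methods can be sketched with fixed-length records. The example below is an illustration only: the record layout (a 4-byte key and a 20-byte name), the helper functions and the in-memory file are all assumptions chosen to keep it short.

```python
import io
import struct

RECORD = struct.Struct("<I20s")  # 4-byte key + 20-byte name = 24 bytes per record

def write_records(f, records):
    for key, name in records:
        f.write(RECORD.pack(key, name.encode("ascii")))

def read_serial(f, wanted_key):
    """Serial access: read from the start until the key is found."""
    f.seek(0)
    reads = 0
    while chunk := f.read(RECORD.size):
        reads += 1
        key, name = RECORD.unpack(chunk)
        if key == wanted_key:
            return name.rstrip(b"\0").decode("ascii"), reads
    return None, reads

def read_direct(f, position):
    """Direct access: jump straight to record number `position`."""
    f.seek(position * RECORD.size)
    key, name = RECORD.unpack(f.read(RECORD.size))
    return name.rstrip(b"\0").decode("ascii"), 1

records = [(i, f"name{i}") for i in range(100)]
f = io.BytesIO()  # stands in for a file on a direct access medium
write_records(f, records)

# An index maps each key to the position of its record, as in an indexed file.
index = {key: pos for pos, (key, _) in enumerate(records)}

serial_result = read_serial(f, 75)
direct_result = read_direct(f, index[75])
print(serial_result, direct_result)
```

Both calls find the same record, but the serial search reads 76 records to get there while the direct read needs one.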

 

Why are files stored on a tape cartridge always serial access?

 Advantages of Direct Access over Sequential Access

Records can be accessed in any chosen order

Records do not have to be put into any order when the file is created

Selected records can be accessed far more quickly from direct access files.


 

Advantages of Sequential Access over Direct Access

Sequential Files can be stored on Tape as well as disk

It is easier to write programs to handle sequential files

 

Advantages of Sequential Access over Serial Access

Updating master files with transaction files is made easier using sequential files since they are both in the same order to begin with
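A small sketch of why matching order helps: when both files are sorted by the same key, the update is a single pass through each. The record shapes used here, (key, balance) for the master file and (key, amount) for the transaction file, and the `apply_transactions` helper are assumptions for the example.

```python
def apply_transactions(master, transactions):
    """Update sorted (key, balance) records from sorted (key, amount) transactions."""
    updated, unmatched = [], []
    tx = iter(transactions)
    current = next(tx, None)
    for key, balance in master:
        # Both inputs are in key order, so neither is ever re-read.
        while current is not None and current[0] <= key:
            if current[0] == key:
                balance += current[1]
            else:
                unmatched.append(current)  # no master record has this key
            current = next(tx, None)
        updated.append((key, balance))
    # Anything left over has a key beyond the last master record.
    while current is not None:
        unmatched.append(current)
        current = next(tx, None)
    return updated, unmatched

master = [(101, 50), (102, 20), (105, 0)]
transactions = [(101, -10), (101, 5), (103, 7), (105, 30)]
updated, unmatched = apply_transactions(master, transactions)
print(updated, unmatched)
```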

What would be the best method of access to find one record in a large file - sequential or direct?

 

 

Why choose different methods of file access?

Some file organisations are better than others for particular tasks. These are some of the reasons why a particular file organisation may be chosen:

 

The number of records to be accessed - If not many records are to be accessed, direct access is more efficient. If most records are to be accessed during processing then sequential access is better. This file activity is measured by hit rate. (See later)

The size of the file - In large files sequential searches take a long time, so direct access is better. In small files time delay is less important, so sequential access is acceptable.


The type of application - Sequential access is usually suitable for batch applications, but on line applications usually require direct access to give a fast response time.

The type of storage media - Magnetic tape is a serial access medium so direct access cannot be used.

 

Relational Databases

 

Information is very valuable and must be organised so that it can be accessed as easily and quickly as possible. A database not only stores the data, but also organises the data and controls access to it with a program called the Database Management System (DBMS).

 

Flat files

The earliest computer data storage systems used flat files. A flat file is like information stored in a grid or table.

Each row in the table contains a record: information about a particular person or thing.

Each column in the file contains information on one field, for example Name, Type, and so on.

 

Primary key

No two records in a file can be the same or it will lead to confusion. For example, if two people have the same name, there must be some other means of identifying which record refers to which person.

Therefore each record has to have a unique identifier - something that makes it different from all the other records - called the primary key.

 

A person's surname cannot be used as a primary key because:

it may not be unique: there are many people with the name Smith

it may change (for example if the person marries), and you must never alter the primary key

it may be lengthy, so there is more chance of typing it incorrectly

the database software creates an index of key fields, and the shorter the key field, the smaller the index, so sorting and searching operations will be performed faster; a surname can be long

 

It is a good idea to create a special field to act as the primary key. Sometimes there is an obvious candidate, for example in a garage keeping a file on cars it repairs, the registration number would be an ideal primary key.
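As a rough illustration of the garage example, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name, column names and sample data are assumptions made for the example. It shows that two owners may share a surname, while a second record with the same registration number is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the registration number acts as the primary key.
conn.execute(
    "CREATE TABLE car (registration TEXT PRIMARY KEY, owner_surname TEXT)"
)
conn.execute("INSERT INTO car VALUES ('AB12 CDE', 'Smith')")
conn.execute("INSERT INTO car VALUES ('XY34 ZZZ', 'Smith')")  # same surname is allowed

try:
    # Same registration number as the first record: the key must be unique.
    conn.execute("INSERT INTO car VALUES ('AB12 CDE', 'Jones')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

car_count = conn.execute("SELECT COUNT(*) FROM car").fetchone()[0]
```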

 

Why are flat files inefficient?

In flat files data tends to be stored in several places. For example, in a school information system information about teachers may be stored on the file for classes, as well as on an administration file holding employment information.

This is very inefficient because repeating data wastes disk space. It could also lead to inconsistent data if the teacher's address was stored differently in the two files.

We could store the information more efficiently by using a database with two files: the class table and the teacher table.

Entity Relationship Diagram

This diagram shows the relationship between the teacher and the class. The diagram shows that a teacher can teach more than one class. It also shows that each class can only be taught by one teacher.

We can find out the name and address of each teacher from the teacher table.

 

How is a database different from a flat file?

There can be more than one table in a database. A flat file database has only one table. Each table in a database contains information about an entity, for example teacher, class, and so on.

 

What is a relational database?

In a relational database the tables are related. This means the tables are linked together in some way. For example, in the school database we can create a relationship between the teacher code field in the teacher table and the teacher code field in the class table.

In the class table, class code is the primary key as it uniquely identifies the class. The teacher code, however, is not unique in the class table, as one teacher can teach more than one class. By looking in the class table we can find out the teacher code of the teacher who takes that class. By looking in the teacher table we can find out more information about the teacher.
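The two-step lookup described above (class table first, then teacher table) can be sketched with Python's sqlite3 module. The table layouts, sample data and the function name teacher_for_class are assumptions for illustration; the class table is named school_class here simply to keep the name unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher (
    teacher_code TEXT PRIMARY KEY,
    name TEXT,
    address TEXT
);
CREATE TABLE school_class (
    class_code TEXT PRIMARY KEY,
    teacher_code TEXT REFERENCES teacher(teacher_code)
);
INSERT INTO teacher VALUES ('T1', 'Ms Patel', '1 High Street');
INSERT INTO school_class VALUES ('7A', 'T1');
INSERT INTO school_class VALUES ('8B', 'T1');
""")


def teacher_for_class(class_code):
    """Follow the teacher code from the class table into the teacher table."""
    return conn.execute(
        "SELECT t.name, t.address FROM school_class c "
        "JOIN teacher t ON t.teacher_code = c.teacher_code "
        "WHERE c.class_code = ?",
        (class_code,),
    ).fetchone()
```

The teacher's address is stored once, in the teacher table, even though the teacher takes two classes.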

 

Flat files or databases?

Flat files have been used by computers for many years. They are usually used for one particular specific purpose. For example, a company might maintain an employee file used to produce a pay slip. The personnel department might also have a file of employees' records, which holds some different data.

When information is held on separate files in different parts of an organisation, difficulties arise. Information on one file might be up-dated but not on the other, leading to inconsistencies. Most businesses now use databases to organise their data more efficiently and to give flexibility.

The file approach is program-oriented: the needs of the program determine how the data is stored. The database approach is data-oriented: the type of data determines how it is stored. This gives it a number of advantages over a file-based system.

 

Advantages of the database approach

Data independence. Any changes to the structure of a database, for example adding a field or a table, will not affect any of the programs that access the data. In a file-based system, a minor change in a file structure may require a lot of reprogramming of all the programs that access this file.

Data consistency. Each data item is stored only once, however many applications it is used for. There is no danger of an item, such as an employee’s change of address, being up-dated on one system and not on another. If this happened the data would not be consistent.

No data redundancy. Redundancy occurs when data is duplicated unnecessarily. In a file-based system, the same information may be held on several different files, wasting space and making up-dating more difficult.

More information available to users. In a database system, all information is stored together centrally. Authorised users have access to all this information. In a file-based system data is held in separate files in different departments, sometimes on incompatible systems.

Ease of use. The DBMS provides easy-to-use queries that enable users to get instant answers. In a file-based system a query would have to be specially written by a programmer.

Greater security and integrity of data. The DBMS will ensure that only authorised users are allowed access to the data. Different users can have different access privileges, depending on their needs. In a file-based system using a number of files it is difficult to control access.

 

 

Disadvantages of the database approach

Bigger computers and greater risk. A DBMS is a large program that may require a larger disk and a bigger, more powerful computer than a file-based system. Storing all the information in one database can be dangerous if the database system fails. All departments of the organisation will be affected, not just a single department. Complex procedures are required to ensure that lost data can be recovered.

Greater complexity. As databases have more uses, they get more complex and their design is more difficult. This requires considerable expertise and if not done well, the new system may fail to satisfy the user’s needs.

Possible inefficiency or poor performance. A tailor-made file-based system may be more efficient than a database.

 

NB

a table is another name for a file in a database

a relationship is a link between two tables

an entity is a subject about which data is stored

 

 

Types of relationship

A relational database consists of a number of entities or tables which are related in some way. Entities can be related in any one of three ways:

one-to-one

one-to-many

many-to-many

 

Examples

Each product in a supermarket has a bar code. The relationship between product name and its bar code is one-to-one.

One supermarket company has many stores. This is an example of a one-to-many relationship.

There are many different products, each on sale in many different stores. This is an example of a many-to-many relationship.

 

Database Normalisation

Normalisation means structuring the database in the most efficient manner. In a relational database a many-to-many relationship cannot be held directly between two tables, as this leads to ambiguities; it is normally broken down into one-to-many relationships.

Tables should be structured in such a way that:

there is no redundant data, that is, there is no unnecessary duplication

data is consistent throughout the database

many-to-many relationships are avoided

the structure of each table is flexible enough to allow you to enter as many or as few items as you want

the structure enables a user to make all kinds of complex queries relating data from different tables.
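One common way to avoid a direct many-to-many relationship is to add a third table that links the other two, so that each original table has a one-to-many relationship with the link table. The original text does not spell this technique out; the sketch below applies it to the supermarket example using Python's sqlite3 module, with hypothetical table names, column names and sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE store (store_id INTEGER PRIMARY KEY, town TEXT);
-- Link table: one row for each product/store pairing.
CREATE TABLE stock (
    product_id INTEGER REFERENCES product(product_id),
    store_id INTEGER REFERENCES store(store_id),
    PRIMARY KEY (product_id, store_id)
);
INSERT INTO product VALUES (1, 'Tea');
INSERT INTO product VALUES (2, 'Coffee');
INSERT INTO store VALUES (10, 'Leeds');
INSERT INTO store VALUES (20, 'York');
INSERT INTO stock VALUES (1, 10);
INSERT INTO stock VALUES (1, 20);
INSERT INTO stock VALUES (2, 10);
""")

# Which stores sell tea?
stores_selling_tea = [row[0] for row in conn.execute(
    "SELECT s.town FROM stock k JOIN store s ON s.store_id = k.store_id "
    "WHERE k.product_id = 1 ORDER BY s.town")]

# Which products are on sale in Leeds?
products_in_leeds = [row[0] for row in conn.execute(
    "SELECT p.name FROM stock k JOIN product p ON p.product_id = k.product_id "
    "WHERE k.store_id = 10 ORDER BY p.name")]
```

Each product name and each town is stored once, and queries in either direction go through the link table.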

 

Database administration

The database administrator (DBA) is the person responsible for maintaining the database. The administrator’s tasks include the following:

1. The design of the database and monitoring its performance. If problems arise, appropriate changes must be made to the database structure.

2. Keeping users informed of changes in the database structure that will affect them.

3. Maintenance of the data dictionary for the database.

4. Implementing access privileges: specifying what a user can access and/or change.

5. Protecting confidential information.

6. Allocating passwords to each user.

7. Providing training to users in how to access and use the database.

8. Ensuring adequate back-up procedures exist to protect data.

 

The Data Dictionary

The data dictionary is information about the database, such as:

what tables are included and the fields in these tables

the name and description of each data item

the characteristics of each data item, such as its length and data type

any restrictions on the value of certain fields

the relationships between data items

control information, such as who is allowed access to files and whether users can change data or only read it
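Many database systems let you query this kind of information directly. As a small illustration, and not a description of any system named in this document, the sketch below uses Python's sqlite3 module: SQLite records its tables in sqlite_master and reports field characteristics through PRAGMA table_info. The member table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member (member_id INTEGER PRIMARY KEY, "
    "surname TEXT NOT NULL, date_of_birth TEXT)"
)

# Which tables does the database contain?
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per field:
# (cid, name, type, notnull, dflt_value, pk)
fields = {
    r[1]: {"type": r[2], "required": bool(r[3]), "primary_key": bool(r[5])}
    for r in conn.execute("PRAGMA table_info(member)")
}
```

Here fields maps each field name to its data type, whether a value is required, and whether it is part of the primary key.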

 

The Database Management System (DBMS)

The DBMS is the program that provides an interface between the database and the user in order to make access to the data as simple as possible. It also has several other functions:

1. Data storage, retrieval and up-date. Users must be able to store, retrieve and up-date information as easily as possible. These users may not be computer experts and do not need to be aware of the internal structure of the database or how to set it up.

2. Creation and maintenance of the data dictionary.

3. Managing the facilities for sharing the database. Many databases need a multi-access facility. Two or more people must be able to access a record simultaneously and to up-date it without a problem.

4. Back-up and recovery. Information in the database must not be lost in the event of system failure.

5. Security. The DBMS must check passwords and allow appropriate privileges.

 

Database security

As we have already seen, data stored in a database is very valuable. Good security to prevent loss, theft or corruption of data is vital. Relational databases such as Microsoft Access and Paradox are multi-access databases. This means that on a network, more than one user may access the same database at the same time. How can the software cope with two or more users opening the same file and making alterations, and yet maintain the integrity of the database?

Relational databases provide different methods of database security:

 

1. The simplest method is to set a password for opening the database. Once set, a password must be entered whenever the database is opened. Only users who type the correct password will be allowed to open the database. The password will be encrypted so that it can’t be accessed simply by reading the database file. Once a database is open, all the features are available to the user. For a database on a stand alone computer, setting a password is normally sufficient protection.

2. A more flexible method of database security is called user-level security, which is similar to the sort of security found on networks. Users must type a username and password when they load the DBMS. The database administrator will allocate users to a group. Different groups will be allowed to see different parts of the database.
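The idea behind user-level security can be sketched in a few lines of Python. This is a simplified model and not how any particular DBMS implements it; the group names, user names, table names and the function is_allowed are all hypothetical.

```python
# Hypothetical groups and the privileges each group has on each table.
GROUP_PRIVILEGES = {
    "admin": {"member": {"read", "write"}, "payroll": {"read", "write"}},
    "counter_staff": {"member": {"read", "write"}},
    "auditor": {"member": {"read"}, "payroll": {"read"}},
}

# The administrator allocates each user to a group.
USER_GROUP = {"alice": "admin", "bob": "counter_staff", "carol": "auditor"}


def is_allowed(user, table, action):
    """Return True if the user's group has the given privilege on the table."""
    group = USER_GROUP.get(user)
    return action in GROUP_PRIVILEGES.get(group, {}).get(table, set())
```

For example, is_allowed("bob", "member", "write") is True, while is_allowed("bob", "payroll", "read") is False because the counter staff group has no privileges on the payroll table.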

 

Databases or a manual system?

Why should a video shop use a database? Could a manual system be better than a database system? Both systems record details of members and who has hired which video. (This could easily be done in a paper-based system by using a list of all videos and the name of the hirer written next to it.)

A manual system is cheaper, unlikely to break down and requires little training. However the computerised system will probably be better for the following reasons:

Management information is automatically gathered, for example details of each hiring, financial details, how many times a customer has hired a video and how many times a video has been hired. These figures can be used in preparation of accounts or to analyse which videos are most popular.

Better service to customers. Using a bar code reader to enter the video code and the member code is very quick. Queues at the counter will be shorter.

Details of members and videos can be found and printed quickly.

The names and addresses of members can be used for advertising purposes in a direct-mail shot. The database can be queried to come up with a list of people who haven’t hired a video for six months and a letter written offering them a discount if they hire a film this week. The letter can be personalised using the mail-merge from a word-processing package.

Similarly automatic reminders can be sent out to members who have not returned a video by the due date.

The database can be extended to include the member’s date of birth. The computer can be used to ensure that a member is old enough to hire, say, an 18-certificate video.
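Two of the tasks above, finding members who have not hired a video for six months and checking a member's age, can be sketched with Python's sqlite3 module. The table layouts, sample data and function names are assumptions made for illustration, and six months is approximated as 183 days.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE member (member_id INTEGER PRIMARY KEY, name TEXT, date_of_birth TEXT);
CREATE TABLE hire (video_code TEXT, member_id INTEGER, hired_on TEXT);
INSERT INTO member VALUES (1, 'A. Smith', '1980-05-01');
INSERT INTO member VALUES (2, 'B. Jones', '2010-09-15');
INSERT INTO hire VALUES ('V100', 1, '2024-01-10');
INSERT INTO hire VALUES ('V200', 2, '2024-06-20');
""")


def lapsed_members(today, days=183):
    """Names of members with no hire in the last `days` days (or none at all)."""
    cutoff = (today - timedelta(days=days)).isoformat()
    rows = conn.execute(
        "SELECT m.name FROM member m "
        "LEFT JOIN hire h ON h.member_id = m.member_id "
        "GROUP BY m.member_id "
        "HAVING MAX(h.hired_on) IS NULL OR MAX(h.hired_on) < ? "
        "ORDER BY m.name",
        (cutoff,),
    )
    return [r[0] for r in rows]


def old_enough(member_id, today, minimum_age=18):
    """Check the member's age on the given day against a minimum age."""
    dob = date.fromisoformat(conn.execute(
        "SELECT date_of_birth FROM member WHERE member_id = ?", (member_id,)
    ).fetchone()[0])
    # Subtract one year if this year's birthday has not happened yet.
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    return age >= minimum_age
```

The list returned by lapsed_members could then be fed into a mail-merge to produce the personalised letters.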