8/13/2019 PlanningDesigningYourDatabase DEL 20121214
Some keystrokes and menu items are different on a Mac from those used in Windows and Linux. The table below
gives some common substitutions for the instructions in this chapter. For a more detailed list, see the application
Help.
Windows or Linux | Mac equivalent | Effect
Tools > Options menu selection | LibreOffice > Preferences | Access setup options
Right-click | Control+click | Open a context menu
Ctrl (Control) | ⌘ (Command) | Used with other keys
F5 | Shift+⌘+F5 | Open the Navigator
F11 | ⌘+T | Open the Styles and Formatting window
Contents
Copyright
Note for Mac users
Introduction
General information
Specific information about this chapter
Specific information about Base
Java
Memory cached tables
Why some knowledge of database theory is necessary
Problems when shutting down computer or a crash occurs
Useful commands
Goals: planning for the database
Database descriptions
Text databases
Using text data sources with Base
Flat database
Using flat databases with Base
Relational databases
Using a relational database with Base
Planning for a flat or relational database
The plan: the outline
Design based upon the plan
Part 1: Purpose of our database and a design for it
Part 2: Type of database and a design for it
Database data and a structure for it
Fields
Field names
Field types
Field properties
Tables
Tables for list boxes
Flat databases
Relational databases
First normal form
Second normal form
Third normal form
Boyce-Codd normal form (BCNF)
Relationships between tables
Relationships
View
Forms
Queries
Principle #1: Determine the tables and fields we need
Principle #2: Sorting the information in the query
Principle #3: Determine the search conditions
Principle #4: Set the type of output (list, calculations)
Principle #5: Using Aliases to shorten or clarify headings
Principle #6: Review the query design as to whether it meets our requirements or not
Principle #7: Name and save the query
Principle #8: Save the database file
Reports
Introduction
General information
The Base Guide describes what you need to know in order to use the Base component of LO effectively. It begins by
describing a simple flat database in chapter 1. Then, beginning with chapter 2, we consider a more complex database
type: relational databases. Some principles described from chapter 2 onwards also apply to flat databases.
Chapter 2, Planning/Designing your Database, looks at the planning required when creating a database. This is one
of the most important parts of the creation process: it is the outline to be used when the database is created.
Chapter 3, Data Input and Removal, describes the beginning of the creation process. This chapter centers upon the
parts of the database with which we input and remove data: tables and forms. Wizards used to create tables and
forms are examined to reveal their principles and how to use them. We also examine the design dialogs used to
create tables and forms. These dialogs give you much more control over the creation of tables and forms than the
wizards do. These dialogs are also used to modify tables and forms.
Chapter 4, Data Output, continues the creation process. This chapter centers upon the parts of the database that
provide us with data output: the queries and reports. We describe the wizards used to create queries and reports.
The query section also contains the explanation of the query design dialog. This dialog contains two parts: Design
View and SQL View. The design view permits creation of more complex queries, and the SQL view permits an even
more complex query using the SQL language.
The report section shows how to use the report wizard and the report design view (the Report Builder dialog) to
create or edit reports.
Chapter 5, Exchanging Data, describes how to use Base with other components of LO: Calc, Impress, and Writer.
Base can provide data for documents created by these components, and Base can import data from documents created by
other components of LO.
Chapter 6, Customization of your Database, describes the customizing of the database design. This is a more detailed
look at the design of the individual parts: tables, queries, forms, and reports.
Chapter 7, More customizations, describes even more customization including macros, validation of data, generation
of calculated data, working with multiple forms, and working with other LO documents.
Chapter 8 discusses using Base at work. This includes moving to an external database server and using
database files from other sources, for example MySQL, Oracle, PostgreSQL, and MS Access.
Specific information about this chapter
How well a database works depends upon how well it is planned and designed. But the same can be said for creating
a drawing (Draw), presentation (Impress), a spreadsheet (Calc), or text document (Writer). Help from others and
one's own experiences provide information concerning how to do the planning and designing of any project.
This chapter is designed to give you a background in how to do these things if you need it. If you already
understand how to plan and design a database, you might not need to read this chapter. However, you might still
find the content a useful refresher course.
For those who are fairly new to databases, I recommend you follow the instructions of this chapter to plan and design a simple
database. Then use the next two chapters to create it. Practice using the database for a while. Then plan and design a
more complex database. This should give you even more confidence.
Planning and designing a database begins with seeking as much information about the proposed database as
possible. That means many, many questions need to be asked and answered. For more complex databases, the
answers will often lead to more questions.
This is only the beginning. Even as the database is constructed, surprises will occur. It is doubtful that any plan will
be perfect when it is first created. Changes will have to be made in the plan. As the changes are made, the database
data only requires a small fraction of that memory when the table is cached. This leads to a second advantage: when
the database file is first opened, there is less writing of data to RAM, so it takes less time to open the file. And there is
also a third advantage: the potential size of the database. You can work with databases whose size is hundreds of
megabytes. That is not always possible if all the database's data is written in memory.
There is one drawback to cached tables: search time. When a query is run, the desired data has to be written into
memory first. Only then will the query run. The HSQLDB User Guide contains some directions on how to design a
query for quicker results. It still will not be as quick as when the data is already in memory before the query is
started.
Why some knowledge of database theory is necessary
Base has a wizard with which you can create a new database file for your database. As long as you are only creating
a simple embedded database, this wizard is fairly straightforward. However, if you want to connect to an external
data source, some knowledge is necessary before you can accomplish that. This includes connecting to a
spreadsheet.
Base has a wizard for creating tables. It includes several suggested tables along with the suggested fields to put in the
table. Then the wizard takes you through steps which determine what the field types and properties should be. You
can accept the suggested field types in the wizard. But what if you think you might want to change a field's type
in some way? You could make some mistakes unless you know what you are doing. Otherwise you could just be
guessing. The same thing can be said about selecting the field properties. And you will be asked to select a Primary
Key. Do you know what this is, and why you need to have one?
Base also gives you the option to create a table using Design View. This requires you to name your fields and
define their field types and properties. The Primary Key also needs to be selected. Some knowledge of
database theory helps greatly when using this option.
Creating and working with Base databases is not hard if you know what you are doing. This is why we teach the
basics of database theory as we describe what to do while creating and working with these databases.
Note
The basics of database theory are the same for
different databases, but the application of the basics
is usually different. If you have a background in
these basics as applied in another database program,
I advise you to look for the differences between
how Base applies these basics and how the other
database program does. It might save you some time and
some headaches.
Problems when shutting down computer or a crash occurs
Caution
When Base is using the HSQLDB engine, some
special actions must be taken when Base is
shut down incorrectly. This includes a computer
crash or shutdown before Base is closed properly.
This caution should have caught your attention. I intended to do exactly that! If you follow the directions we give,
you will have a minimum of problems. If you do not follow them, you can have many more problems than you will
want.
The other components of LO have an auto recovery feature which you can configure in Tools > Options > Load/
Save > General. Base does not use this feature. Type something in Writer or enter data in Calc. If your computer
crashes after the length of time set by Autorecovery, the information you entered will be restored when LO has gone
through its recovery process.
During the recovery process LO will try to recover databases, but in most cases the changes created during your last
session cannot be recovered. The recovery process cannot recover opened forms and reports.
The backup options under Tools > Options > Load/Save > General can create backup files in the backup folder;
with Base, however, you need to create backup files manually. This requires using Tools > SQL to type commands. They
are mentioned in Useful commands.
The sage advice of "save and save often" applies especially to Base when working with an embedded database. We
explain the importance of doing this in Chapters 3 and 4 of the Base Guide: Data Input and Removal, and Data
Output respectively.
How often you save when using Base to connect to an external data source depends upon the source and what you
are doing. When creating or modifying a table, query, or report, you should follow the advice given for the data source
you are using.
Base will prompt you to save your data under specific circumstances. When you close a form after entering or
modifying data, this prompt appears unless the database has already saved this data.
Data can also be added to or modified in a table opened in the Data Sources window. This data also needs to be
saved. While there is a Save icon at the top of this window, it can be easy to forget to save. When you close the Data
Sources window, you will be prompted to save the data unless it has already been saved.
Tip
Ways to save data
Click the Save icon in the table or in the form you are working in.
Move to another record of a simple form.
Move to another record of the main form or sub form of a complex form.
Special care must be taken when saving any editing made to any table, query, form, or report. The same is the case
when creating these. Saving these changes to the database requires two steps. First you must save your changes in the
table, query, form, or report you are editing or creating. This writes what you have done to memory. Now these
changes need to be written to the database file itself. To do so, click the Save icon in the Standard toolbar of the
main database window. This window is labeled at the top. Its label contains the name of your database followed by
LibreOffice Base.
Any time you finish using a database, close it. If you have any data which has not been saved yet, you will be asked
if you want to save the changes. You will be asked the same question if you have not yet saved changes made to the
database structure. This means you may see two such dialog boxes in a row. In almost all circumstances, you will
want to answer yes in both boxes. The only circumstance I can think of in which you would not want to answer yes
is when you made a change that you do not want to keep and cannot reverse.
This is why you need to make sure you save your work on a regular basis. But what do you do if your computer
crashes, or the power goes off, or you shut down Base improperly, or someone shuts down the computer
improperly? You will lose any data or changes to the database since the last time you saved them.
If any of these problems happen while the database is being closed, you might lose the whole database. This is
especially true if Base is in the process of writing to the database when the power goes off.
Useful commands
Chapter 9 of the HSQLDB User Guide contains the commands HSQLDB uses and their syntax. Some of these are
different from those used in Calc. For example, enter =NOW() into a Calc cell and you get today's date (date and
time if the cell is formatted for both). If you want to use today's date in a query, you must use CURDATE(). This is
found in the Built-in Functions and Stored Procedures section of chapter 9.
The commands can be submitted directly to the database engine through the Tools > SQL... command window.
There are two useful commands which you can use when you want to force changes to be written to the file. Use these
commands when you have inserted a lot of data into tables and you want to avoid losing it.
The first is CHECKPOINT; the second is SHUTDOWN.
CHECKPOINT closes the database cleanly, then reopens it.
When used with the DEFRAG option, CHECKPOINT also shrinks the data to its minimal size.
SHUTDOWN closes the database open in Base; after that you need to close the ... .odb file and reopen it if you want
to continue your work. I mention here only one option to this command; others will be discussed in Chapter 9.
SHUTDOWN COMPACT rewrites the data and shrinks all files to the minimum size. This is a good time to create a
backup copy of your file. Never create a backup copy of an open .odb file; it will not contain all the data.
Goals: planning for the database
A database is a collection of data. It might be a text file, a flat database, or a relational database. Whatever type it is,
we must create a structure for the data before we can use any of it. This is a three step process. First a plan needs to
be made based upon the purpose of the database. Then a design is made based upon the plan. Finally, Base is used to
create the database structure as it was designed.
Begin with some general ideas of what you want to accomplish. Then break these goals down into smaller, more
specific goals. These specific goals may need to be even further divided. You continue doing this until you are
satisfied that the goals have become as specific as possible.
The first decision to make is what type of database should be used. Then, beginning with the stated, written goals,
determine the structure of that database. So we will begin by describing the three types of databases. Later we will
divide our goals into more specific goals until we have the structure we need.
Database descriptions
We will discuss three types of databases: text, flat, and relational. Each type serves its own purposes.
Text databases
Text database files are exactly that: files that contain text only. They have a defined structure so that their information
can be retrieved. These text files may be one of several different types. One common format is comma separated
value (CSV). Another common type is vCard. (The latter is used in some address books.) LDIF is another common
format for an address book. (The vCard and LDIF formats are similar in structure.) Whether a given database engine
can access the data in the file depends upon its ability to use the built-in structure in the given format.
A variety of file extensions are used for text databases depending upon the format. For example CSV uses *.csv, and
LDIF uses *.ldif. When creating a text file for data using a word processor, any file extension can be used as long as
the file has a consistent structure that can be recognized by Base.
This structure must contain field separators and text separators. If the file contains decimals, decimal separators must
be present. (Although the database wizard includes a possible thousands separator, Base does not seem to recognize
thousands separators.)
Figure 1 is a sample CSV file showing its format. The first row contains the names of the fields for this data source,
and the rest of the rows contain the data for these fields. All text is enclosed with double quotes, and the fields are
separated by commas.
Figure 1: Data source, CSV file
When Base accesses the CSV file, it creates a table.
Figure 2: Table for the CSV file
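The CSV layout described above (a first row of field names, text enclosed in double quotes, fields separated by commas) can be sketched outside Base with a few lines of Python. This is only an illustration of the file structure: the sample rows, the field names, and the read_records helper are assumptions made for the example, and Base does not use this code.

```python
import csv
import io

# Hypothetical sample in the layout described above: a header row of field
# names, all text enclosed in double quotes, and fields separated by commas.
SAMPLE = '''"First Name","Last Name","Primary Email"
"John","Smith","john.smith@example.com"
"Mary","Jones","mary.jones@example.com"
'''

def read_records(text):
    """Return the data rows as dictionaries keyed by the header field names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=",", quotechar='"')
    return list(reader)

records = read_records(SAMPLE)
for record in records:
    # Each row after the header becomes one record of the table.
    print(record["First Name"], record["Last Name"])
```

The field separator and the text separator are passed explicitly here because they are the two settings a reader of such a file has to agree on with whoever wrote it.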
Using text data sources with Base
Base can recognize and access many text data source files. Maintaining these data source files requires using a
separate program to enter, remove, or modify the text in the file or files. Base will create the queries and reports from
the text files but will not modify the file by adding, changing, or removing text.
Normally text data source files should be created and modified using the same program. This will limit the potential
for errors. Figure 1 is an example of this. This file could be modified using a word processor, but you might then
misplace the commas. This type of error might make the data unusable. However, using spreadsheet programs such
as LO Calc, you can make your modifications and then save the file using the CSV format. Then all the commas will
be placed right where they need to be.
A text data source (database) can contain more than one text file with the same file structure. This might be several
address books using the CSV format: one address book for general personal use, another for business use, and
another for a social organization. As long as they are all placed in the same folder, Base will access all the files with
the format you tell Base to use.
If you want to have more than one text database, you need to have separate folders for each one. Otherwise, you
may find yourself accessing text files from more than one database especially if the two text databases use the same
file format.
Caution
When modifying a text data source file, I suggest you
use the program used to create the data source files.
You can make errors that will corrupt the data you
see when accessing the file using Base.
Flat database
A flat database contains data just as a text data source does, but the flat database is structured in a different way.
Figures 1 and 2 show the differences. The data is organized into one or more tables. (This example has only one
table shown in Figure 2.) Then the data is further organized into columns with each column containing the same
kind of data. (The first column contains only the first names of people, the second column contains only the last
names of people, the third column contains display names, the fourth column contains only nicknames of people,
the fifth column contains the primary email addresses, and the sixth column contains the secondary email addresses.)
These are referred to as fields.
The data is even further organized by the rows. All the data of a row refers to a specific person, place, or thing.
Together they describe the specific person, place, or thing. (The data in the first row refers to John Smith.) These are
referred to as records.
Tables in a flat database are independent of each other. You work with one table at a time. Think of a flat database
with more than one table as being a collection of tables that together serve a general purpose. Each table provides a
part of that purpose. Each table has its own queries and reports that gather information for that one table. Each table
can have its own form.
Computer address books are flat databases. You can divide your addresses into groups (tables) according to your
needs. You can have the same address in different groups if they meet the requirements for more than one group.
One of these may contain only the person's name and business email; another may contain family information,
personal email, and social network information.
Consider a library database that includes book titles, authors' names, information on each author, and publication
information. When the library contains two or more books by the same author, the author's name is a duplication, as
is the author information.
As an example of a database requiring a large number of fields, consider a family database. Some of the fields are
the family member's first and last name (total of 2 fields). Additional fields are the spouse's first and last name. But
then we would need the first and last names for all the spouses. One of my siblings has been married three times (six
fields). So the total number of fields has increased to 8. Then we need the first and last names of the children. Well,
one of my siblings has five children (10 fields). So the total number of fields has gone to 18. A relational database
providing the same information requires only 11 fields. And with these 11 fields, a relational database will also
identify the stepchildren and to which parent they belonged before the marriage. Even more fields would be needed
to do this with a flat database.
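The field counts in the paragraph above follow from simple arithmetic: a single flat table needs one pair of name fields for the family member, plus one pair for every spouse and every child of the largest family it has to hold. A minimal sketch of that calculation, in which the function name and its parameters are hypothetical and chosen only for this illustration:

```python
def flat_field_count(max_spouses, max_children, name_fields=2):
    """Fields needed by one flat table that stores first and last names.

    The table needs name fields for the family member plus one set of name
    fields for each spouse and each child of the largest family it must hold.
    """
    return name_fields * (1 + max_spouses + max_children)

# Figures from the text: three spouses, then five children as well.
print(flat_field_count(max_spouses=3, max_children=0))  # 8
print(flat_field_count(max_spouses=3, max_children=5))  # 18
```

The count grows with the largest family in the table, even though most rows would leave many of those fields empty.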
Note
Base and Calc can create flat databases. Specifically, Calc
spreadsheets can be saved in dBase format. To save more
than one sheet in a spreadsheet, you have to access each
sheet and save it one sheet at a time. (If you have 3 sheets,
you have to save three times.) For more information on doing
this, see Chapter 5 of the Base Guide.
Using flat databases with Base
There is nothing really complex about planning for a flat database. You first determine the data to be used in the
database and what characteristics the data will have. The types of data and their characteristics will determine the
number of fields and the characteristics of each field. (Each field should have one characteristic that distinguishes it
from all the other fields.) You then name each field and define the characteristics for each field.
Determine if you will need only one or more than one table. Since tables in a flat database are independent from
each other, divide the fields into groups which have no defined relationships with the other groups.
Will the database be contained in one file (as in those databases created with Base) or will it be contained in two or
more files as in dBase or CSV files? If the latter, then all the files need to be placed in the same folder.
Some database programs will create only flat databases. Most can create relational databases as well as flat ones. A
relational database can relate data held in separate tables, while a flat database cannot. But unless the program
defines one or more relationships between data contained in separate tables of a database, it will be a flat one.
Relational databases
Like a flat database, a relational database contains one or more tables. Relational databases were originally designed to solve the
problems of duplication of data and multiplication of fields in a flat database.
A relational database has a more complex structure than does a flat one. Below are some of the general principles
that have been found to produce good design in relational databases.
Each table of the database must have one or more fields that uniquely identify each row of the table. For a
given value in one field or given sequence of values in two or more fields, there can only be one row
containing this value or sequence of values.
The tables of a database must meet specific criteria known as first, second, third, and fourth normal form. (These are
discussed in the design phase. See First Normal Form [reference will be placed here.])
Tip
value. A cell with a null value has no entry at all in
it. Anything entered into a cell is considered to be a
value. A cell is considered to be null until a value is
entered into it. Similarly, a cell that has had its
value deleted has gone from being not null to having
the null value.
2) The field or fields that uniquely determine a
row have a name: primary key. This term is defined and explained in chapter 3 of the Base
Guide, "Data Input and Removal."
Relationships are defined by linking one or more fields of one table with one or more fields of another table.
The link is made by virtue of the fact that the two fields have identical values. The field or fields from one of
the tables will be the primary key for that table. The other fields from the second table may or may not
include that table's primary key. They are called foreign keys. Since sometimes we may want to use the
primary keys of two tables to define our relationship, it is possible that a primary key may also be a foreign
key.
Tip
Relationships in a relational database are defined by
primary-foreign key pairs.
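The two rules above (a primary key value identifies exactly one row, and a foreign key must match a primary key value in the linked table) can be tried out in a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in engine, not HSQLDB or Base, and the author and book tables, their field names, and the rejected helper are all hypothetical, loosely based on the library example mentioned earlier.

```python
import sqlite3

# SQLite is used here only as a convenient stand-in; it is not the engine Base embeds.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.execute("CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE book ("
    " book_id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " author_id INTEGER NOT NULL REFERENCES author (author_id))"
)
conn.execute("INSERT INTO author VALUES (0, 'Jane Example')")
conn.execute("INSERT INTO book VALUES (0, 'First Book', 0)")
conn.execute("INSERT INTO book VALUES (1, 'Second Book', 0)")

def rejected(statement):
    """Return True if the engine refuses the statement as a key violation."""
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError:
        return True
    return False

# A second row with primary key 0 breaks the uniqueness rule.
duplicate_key = rejected("INSERT INTO author VALUES (0, 'Someone Else')")
# A book pointing at author 99 has no matching primary key to link to.
missing_parent = rejected("INSERT INTO book VALUES (2, 'Orphan Book', 99)")
print(duplicate_key, missing_parent)
```

Both books share the foreign key value 0, so the author's name is stored once rather than duplicated in every book row.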
Example of a relational database and the restraints required of it
We have family data that we want to put into a relational database. It begins with one table: the Family member table. It
contains the family member's first and last name, the spouse's first and last name, and the child's first and last name.
A sample table is shown below. (This table does not have a primary key
yet.)
Table 1: Family member table
FM First name | FM Last name | S First name | S Last name | C First name | C Last name
Sam | Silver | Martha | Silver | Bob | Silver
Wendy | Whitter | Dave | Whitter | Alice | Whitter
Next we complicate things because Sam has since divorced Martha and married Wilma. Sam and Wilma now have
one daughter, Sandra. (Martha has kept her married name.)
Rather than just adding two more fields for the new spouse (her first and last name), we will create a second table
containing the first and last name of the spouses as well as the child's first and last name. Then we will add the fields
needed to link these two tables. By doing so, we will be able to associate each family member with all of the spouses
they have had.
The new table will be named Spouse for clarity. To identify each row of the Family member table, we have added the
field Family member ID (the primary key) to the Family member table and removed the four fields referring to the
spouse and child. The Family member ID field contains distinct values so we can identify each family member by
the value of the Family member ID field.
The Spouse table contains the Spouse ID field (the primary key), the four fields removed from the Family member
table, and the Family member ID field (foreign key). This last field is used to link the Spouse table to the Family
member table. The first two rows of the Spouse table are both linked with the first row of the Family member table.
For these rows, the value of the Family member ID field is the same in both tables: 0. At the same time, individual spouses are
identified by the Spouse ID field. Using the Family member ID with the Spouse ID fields, we can determine who has
been married to whom and what children belong to each one of these marriages. (See Tables 2, 3, and 4.)
With the design we have now, we can list as many spouses for any given family member as we need. We do not
have to add any fields. Furthermore, we could add one or more fields such as the wedding date and date of birth.
Table 2: Modified Family member table
FM First name FM Last name Family member ID
Sam Silver 0
Wendy Whitter 1
Table 3: Spouse table
Spouse ID   S First name   S Last name   C First name   C Last name   Family member ID
0           Martha         Silver        Bob            Silver        0
1           Wilma          Silver        Sandra         Silver        0
2           Dave           Whitter       Alice          Whitter       1
Wendy Whitter has now had a second child, David, so we need to apply the same principles to the Spouse table that
we did to the Family member table. We will create a Child table containing the child's first and last names. We also
need a field to identify each of the children listed. I have used Child ID (the primary key). The Child table also
needs a field, Spouse ID (foreign key), to link the Child table to the Spouse table.
Table 4: Modified Spouse table
Spouse ID S First name S Last name Family member ID
0 Martha Silver 0
1 Wilma Silver 0
2 Dave Whitter 1
Table 5: Child table
Child ID C First name C Last name Spouse ID
0 Bob Silver 0
1 Sandra Silver 1
2 Alice Whitter 2
3 David Whitter 2
We have three tables in our database with 11 fields. With this structure, we can list all the family members, their spouses, and
their children regardless of how many times any of the family members have married, how many children were born in these
marriages, and how many step-children there may be.
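The three-table structure can be sketched in SQL. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than Base's own database engine, and the table and column names (family_member, spouse, child, and so on) are invented for the sketch. It builds the tables from Tables 2, 4, and 5, counts their fields, and checks that a child row pointing at a nonexistent spouse is refused.

```python
import sqlite3

# Assumption: SQLite stands in for Base's embedded engine; all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE family_member (
    family_member_id INTEGER PRIMARY KEY,
    fm_first_name    TEXT NOT NULL,
    fm_last_name     TEXT NOT NULL
);
CREATE TABLE spouse (
    spouse_id        INTEGER PRIMARY KEY,
    s_first_name     TEXT NOT NULL,
    s_last_name      TEXT NOT NULL,
    family_member_id INTEGER NOT NULL REFERENCES family_member(family_member_id)
);
CREATE TABLE child (
    child_id     INTEGER PRIMARY KEY,
    c_first_name TEXT NOT NULL,
    c_last_name  TEXT NOT NULL,
    spouse_id    INTEGER NOT NULL REFERENCES spouse(spouse_id)
);
""")

# Rows taken from Tables 2, 4, and 5.
conn.executemany("INSERT INTO family_member VALUES (?, ?, ?)",
                 [(0, "Sam", "Silver"), (1, "Wendy", "Whitter")])
conn.executemany("INSERT INTO spouse VALUES (?, ?, ?, ?)",
                 [(0, "Martha", "Silver", 0), (1, "Wilma", "Silver", 0),
                  (2, "Dave", "Whitter", 1)])
conn.executemany("INSERT INTO child VALUES (?, ?, ?, ?)",
                 [(0, "Bob", "Silver", 0), (1, "Sandra", "Silver", 1),
                  (2, "Alice", "Whitter", 2), (3, "David", "Whitter", 2)])

# Count the fields (columns) across the three tables.
field_count = sum(
    len(conn.execute(f"PRAGMA table_info({table})").fetchall())
    for table in ("family_member", "spouse", "child")
)

# A child whose Spouse ID matches no spouse violates the foreign key.
try:
    conn.execute("INSERT INTO child VALUES (4, 'Nobody', 'Nowhere', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(field_count, rejected)
```

Adding another spouse or child here is just another row; no new fields are needed, which is the point of the design.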
With a flat database, it would take 10 fields to list the family members, their one spouse, and one child. For each
additional spouse a family member may have, 2 more fields must be added. For each additional child, 2 more fields
must be added.
Figure 3 shows Sam Silver as the family member, with two spouses, Martha and Wilma, and one child, Bob, who belongs to
Sam and Martha. (A green arrow is on the left end of Martha's row.) Figure 4 shows that Sam and his second spouse, Wilma,
have had a child, Sandra, during their marriage. (The green arrow is on the left end of Wilma's row.)
Figure 3: Showing the child from one spouse for a family member
Figure 4: Showing a child from another spouse of the same family member
Figure 5 shows that Wendy has been married once and that two children have resulted from that marriage.
Figure 5: Showing two children from one marriage
Defined relationships in a relational database serve another purpose. They permit queries to be created using the
tables that have these relationships. So with this example, we can have a query that will use all three tables since all
three tables are related. This query would use the First name and Last name of the Family member table, the First name and
Last name of the Spouse table, and the First name and Last name of the Child table. By specifying the values for the
names from the Family member and Spouse tables, the query will tell us the names of the children who belong to this specific
couple. This will only work because of these two relationships: the Family member ID field pair, and the Spouse ID
field pair.
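A query of this kind can be sketched as a join across the three tables. The example is a minimal illustration, not Base's own query designer output: it assumes SQLite through Python's sqlite3 module, and the table names, column names, and the helper function children_of are hypothetical. The two ON clauses correspond to the two relationships named above.

```python
import sqlite3

# Assumption: SQLite stands in for Base's engine; names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE family_member (family_member_id INTEGER PRIMARY KEY,
                            first_name TEXT, last_name TEXT);
CREATE TABLE spouse (spouse_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,
                     family_member_id INTEGER REFERENCES family_member(family_member_id));
CREATE TABLE child (child_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,
                    spouse_id INTEGER REFERENCES spouse(spouse_id));
INSERT INTO family_member VALUES (0,'Sam','Silver'), (1,'Wendy','Whitter');
INSERT INTO spouse VALUES (0,'Martha','Silver',0), (1,'Wilma','Silver',0),
                          (2,'Dave','Whitter',1);
INSERT INTO child VALUES (0,'Bob','Silver',0), (1,'Sandra','Silver',1),
                         (2,'Alice','Whitter',2), (3,'David','Whitter',2);
""")

def children_of(fm_first, fm_last, s_first, s_last):
    """Return (first, last) names of the children of one specific couple."""
    return conn.execute("""
        SELECT c.first_name, c.last_name
        FROM family_member AS fm
        JOIN spouse AS s ON s.family_member_id = fm.family_member_id  -- Family member ID pair
        JOIN child  AS c ON c.spouse_id = s.spouse_id                 -- Spouse ID pair
        WHERE fm.first_name = ? AND fm.last_name = ?
          AND s.first_name = ? AND s.last_name = ?
        ORDER BY c.child_id
    """, (fm_first, fm_last, s_first, s_last)).fetchall()

print(children_of("Sam", "Silver", "Wilma", "Silver"))
print(children_of("Wendy", "Whitter", "Dave", "Whitter"))
```

Without the two key pairs there would be nothing to join on, and the query could not tell which child belongs to which marriage.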
Reports are based upon one table, view, or query, but in relational databases, a query can contain fields from more
than one table. In the above example, a report can be generated containing fields from all three tables: Family
member, Spouse, and Child. This is because the report can be based upon a query that contains the three tables.
Relational databases are the most common type of databases used today.
Using a relational database with Base
Planning for a relational database is very similar to planning for a flat database. You are still determining the tables,
fields, and the tables to which the fields belong. The main difference comes from the additional restrictions that a
relational database must have. They are necessary to define the relationships which exist among the fields of the
database.
Planning for a flat or relational database
Tip
Many of the principles taught here apply to both
types of databases. So, rather than mentioning both
all of the time, we will discuss flat databases. Then
we will discuss the additional things that apply to a
relational one.
Regardless of what type of database will be used, the planning for creating the database begins at the same point:
you have some data that you want to use to provide you with some information. So, you begin with the question:
"What do I want the database to do?" The answer to this question is the general goal for the database.
Planning for your database requires you to go from this general goal to more specific goals that will, when
completed, accomplish the general goal. In many cases, you will need to divide the specific goals into even more
specific goals. And you should continue this process until the specific goals cannot be further divided. When you
have finished your planning, you have an outline of what the database should do and what its structure should be.
This planning process involves asking and answering questions based upon what you already have. This is how you
divide your goals into smaller parts. This is true whether you are creating the plan by yourself or with
someone else.
Make sure you take enough time to ask and answer the questions thoroughly, or the plan might not be as complete
as it could have been. You might not notice this until later in the creation of the database. If this happens, you will
need to come back to your plan to make changes and also change the things you have done since creating the
original plan. It is common for people not to ask enough questions to make their plan complete. (When I have had
to make changes when creating a database it was often because I did not even think of some possibilities.)
Your plan needs to be in outline form. This permits you to list your main goal in general terms. These general terms
will be enough for the top level of the outline. As you divide each of these general terms, you create a second level.
Further division gives you additional levels of the outline.
You should create your plan using a word processor (such as Writer). You can review the higher and lower levels of
the outline to see the logic you have been using. This also permits you to add lines anywhere in the outline where they
are needed at any time, even long after the planning is done if the plan needs to be updated.
The outline of the plan can become rather complex because you will need to use several levels to get to the specifics.
If you are comfortable working with an outline containing six or more levels, then use the complete outline.
However, for most people, I suggest you break the outline into smaller parts. It should make your task much easier.
The outline can be divided into three parts: the purposes for the database, the type of database you will use, and the
data to be contained in the database. The first two parts are not very complex. However, describing the data to be
used can produce a very complex outline. In this part, you will be describing the properties of the data in detail.
The plan: the outline
Note
1) The outline for the database plan will contain
comments inline. Each of the comments will
precede the portion of the outline to which the
comments apply.
2) Each part will be fully developed to the lowest
level of the outline before the next part is
begun.
The top level of the outline contains the most important question: the purpose of the database. The answer may
contain only one item, or it may contain more than one (in which case you need a second level). On this level, you
ask what you want each of these items to do. The individual answers to these questions may contain one item or a
list. If an answer contains only one item, make sure the answer is detailed. If an answer contains a list, you go to a
third level using the same basic question as on the second level.
There is a pattern contained in the above paragraph. It is a pattern that you continue until all the lines of questions
end in single item answers. I have listed only two levels here and listed three levels in the outline shown below.
Some database plans require only one level, some require two, and others require three or more.
The answer to Question 2 in Part 1 should be very general in nature since Part 3 asks you to describe the data in very
specific terms. For example, data in a financial database will have to contain amounts, accounts, descriptions for
transactions, dates, and perhaps more. A library database will contain information about the books, information
about the people borrowing the books, and information about loans of books.
The answers to Question 3 in Part 1 should also be general in nature since Part 3 also asks you to describe the queries
and reports in specific terms including what fields will be used. The answers can be in the form of "I want the
database to tell me ..." Your answers to Question 3 will fill in the ellipsis (...).
Tip
Remember that your database plan needs to answer
questions that begin with the word "What." Your
database design answers questions beginning with
the word "How." It is not very hard while
developing your plan to drift from asking "What" to
asking "How."
Part 1: Purpose of the database
1) What is the purpose of the database?
   a) What do you want (item in the list) to do?
      i) What do you want (item in the list) to do?
         What do you want (item in the list) to do?
2) What general types of data will be in the database?
3) What information do you want from the database? (queries and reports)
   a) What information do you want to get from queries?
   b) What information do you want to get from printed reports?
Part 2: Type of database
Base works with two types of data sources: external ones that it accesses, and embedded databases that it has created.
Since text and spreadsheet data sources are somewhat different from others accessed by Base, we will consider them
separately from the others. (The Base wizard contains the list of data sources that Base can access as a drop-down list
on page 1.)
Question 1 contains mutually exclusive choices so you only have to answer the questions in the appropriate section.
Question 2 needs to be answered by people who use a network or are concerned with computer security.
1) What type of database file or files will you be using?
   a) Accessed database
      i) For text or spreadsheet data sources:
         What program will you use to create, modify, or delete fields and tables from your data source?
         Should a folder be created for the data source file or files? (A text data source will need to have a special folder.)
      ii) Other accessed databases:
         What type of database file or files will you be accessing? (For example: Access, MySQL, PostgreSQL, or Oracle)
         Should a folder be created for the data source file or files?
   b) Embedded databases
      i) Will you be using a flat or relational database?
      ii) Preference as to where to place the database file?
2) Will you be accessing the database file or files on your computer or over a network?
   a) Who should have access to the database?
   b) Will you want to have user names and passwords to control people's access?
Part 3: Data to be used in the database
This part of the outline is very important. This is where you define the attributes (characteristics) of your data. Each
attribute will be a field for the database. Once you have defined the attributes, you can separate them into tables
based on common relations that exist among the fields. If you are creating a relational database, the tables you create
in your plan will be modified and additional tables added as you design it based upon the plan outline you are
creating now.
Part 3a: Defining the attributes of the data
Tip
... in the Table Design dialog. (These choices are
discussed in more detail in the "Creating the
table" section of Chapter 3 of the Base Guide.)
1) What data will you be using?
2) What names will you give to your data, the names of the fields in the database?
3) What are the characteristics of this data?
a) Name the characteristic for each field. (Choices: tiny integer, big integer, image, binary (variable or
fixed), memo, fixed length text, number, decimal, integer, small integer, floating decimal, real
number, double accuracy, variable length text, variable length text (ignore case), Yes/No, date, time,
date/time, other)
b) What are the additional properties for each field?
i) Will you require that an entry be always made for the field?
ii) What is the length of the field? (The maximum number of characters that any entry for the
given field can have.)
iii) Do you want a default value for this field that will be entered for each record in the table? If so,
what is it?
iv) Is there a specific format that you want to use for this field? (For example, you can specify
currency format when selecting number or decimal as the field type, the date format, time
format, or the date stamp format.)
Part 3b: Planning the tables
Another thing that may determine some of the tables we will need: some fields will have a limited number of distinct
values. For each such field, we will create a table with a single field containing these distinct values. These will be
used to provide drop-down lists for these fields in the form(s) containing them. (We will use list boxes for the drop-
down lists.)
For example: A field for the days of a week. This has seven distinct values. Or, consider a field for the months of the
year. This will have 12 or 13 values. (Some non-Gregorian calendars have a leap month.)
It is possible that our initial list contains some "fields" that actually are not fields at all, but alternative values of a
single field. When this happens, list this field with the others. Name the table to be created for the field, and list the
distinct values with the name of the table.
For example: our budget database. The budget categories may seem like fields, but they are only distinct values for
specific budget levels. We will develop the plan for this database as our example after we define what a plan should
be.
The next step is to look at the fields to see if there are any relationships existing between them. We will be separating
the fields into tables based upon these relationships.
Note
A household inventory database might be a flat
database with one table. This permits queries to
provide detailed information needed for insurance
purposes and the value of the inventory. It might
also be a flat database with one table for each room
of the house. This permits queries or views that
display what is contained in each room. This might
be very helpful for a moving company. (A view
could also be created for each room displaying the
same information from a single table.)
1) What fields have a limited number of distinct values?
a) Name the table for each of these fields.
b) For each field, list the distinct values it has. For each field, list the field type and properties.
2) What fields are really distinct values for a specific field?
a) Name a table for each set of distinct values.
b) Name a field for each set (if it does not already exist) and list it with the fields of the database.
3) What are the relations that exist among the fields?
4) Do these relations agree with the purposes of our database?
a) Only if the tables suggested by the relations agree with the stated purposes of our database should we
use these tables.
Part 1 of the plan usually provides little specific information that can be used in the design phase, but it does provide
the general structure needed. It therefore places limits upon the other two parts as we work through them.
For example: the flat database described in Chapter 1 of the Base Guide, Introducing Base. Part 1 of that plan
requires that the database provide information that one's insurance agent could use to determine
what the insurance policy should be for the contents of the house. This limits the number of tables to just one.
Note
Designing a database is an art. This means that more than
one design may accomplish the goals listed in the database plan.
Tip
While designing a database, you should have a
separate copy of your plan that you can view. This
can be one or more sheets of paper or a text
document to be viewed on a monitor. If you are
designing it on a computer, you could have your
plan and design for your database on your monitor
at the same time (if it is big enough).
Part 1: Purpose of our database and a design for it
Part 1 provides the structure for the design's outline. So keep this part of the design in general terms. The more
specific parts of the design will be filled in in Parts 2 and 3.
Purpose of the database
The plan has a list of things to be obtained from the database. These will add structure to it as well as place
restrictions upon it. Design the general structure based on the answers to question 1 of Part 1 of the plan. Also design
the restrictions the same way.
1) What structure is required to accomplish each item listed in the answers to question 1a (including lower
levels) of Part 1 of the plan?
2) What restrictions are required because of these answers?
General types of data
These are described in greater detail in Part 3 of the design. In the plan, they served as an outline for all the data to
be used. Part 3 of the plan used this outline to develop all of the data needed. We will not have to include them in
Part 1 of the design.
General description of the information desired from the database
The general outline of the queries and reports is done here. In Part 3, these outlines are filled in based on the tables
created from the data.
Information from queries
This is the time to list the query or queries to be created, if any. Then we need to give each one of them some basic
structure: detailed or summary query. We should also list the general restrictions that apply to each query. Usually,
these restrictions will be in WHERE clauses, but summary queries can use the GROUP BY clause. A few of them use
the HAVING clause with GROUP BY. (This might not be obvious until you reach Part 3 of the design.)
1) Name the queries to be created.
2) Detailed queries: What restrictions are required using the WHERE clause?
3) Summary queries:
a) What restrictions require the WHERE clause?
b) What restrictions require the GROUP BY clause? Will the HAVING clause be required as well?
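The difference between the three clauses can be sketched with a small budget table. This is an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than in Base, the table budget and its columns are hypothetical, the rows are made-up sample data, and amounts are stored as whole cents to keep the sums exact.

```python
import sqlite3

# Assumption: SQLite stands in for Base's engine; table, columns, and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE budget (
    id INTEGER PRIMARY KEY, category TEXT, amount_cents INTEGER, reconciled INTEGER)""")
conn.executemany("INSERT INTO budget VALUES (?, ?, ?, ?)", [
    (0, "Auto", 1721, 1),
    (1, "Food", 1200, 0),
    (2, "Food", 850, 1),
    (3, "Auto", 3000, 1),
    (4, "Income", 3, 1),
])

# Detailed query: WHERE restricts which individual records are listed.
detailed = conn.execute("""
    SELECT id, category, amount_cents FROM budget
    WHERE reconciled = 1
    ORDER BY id
""").fetchall()

# Summary query: WHERE filters records first, GROUP BY then forms one row per category.
summary = conn.execute("""
    SELECT category, SUM(amount_cents) FROM budget
    WHERE reconciled = 1
    GROUP BY category
    ORDER BY category
""").fetchall()

# Summary query with HAVING: restricts the groups themselves, after the totals exist.
large = conn.execute("""
    SELECT category, SUM(amount_cents) FROM budget
    WHERE reconciled = 1
    GROUP BY category
    HAVING SUM(amount_cents) > 1000
    ORDER BY category
""").fetchall()

print(detailed)
print(summary)
print(large)
```

A restriction on single records belongs in WHERE; a restriction that can only be checked against a group total belongs in HAVING.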
Information from reports
We should look carefully at the report descriptions in the plan. If there is not a query or table upon which the report
can be created, a query or view must be created before the report can exist.
For all the reports that have tables or queries from which to get the data, we need to determine what type of report it
should be: static, or dynamic. You also have the choice of using the Data window (F4) to insert data into a Writer
document or Calc spreadsheet. For example, the report could take the form of a letter describing the status of
investments for the previous year.
Now that LibreOffice also has the Report Builder extension installed, more elaborate reports can be created with it
than with the report wizard. These reports can take the form of a text document or spreadsheet.
1) How will each report be generated? (Writer document, Calc spreadsheet, report wizard, or Report Builder?)
      Include a brief outline of the desired report.
      Report Builder: Will it be a spreadsheet or text document?
2) What query, table, or view will the report use? (It can only use one.)
   a) If no existing one will do, what is the basic structure of the needed query or view?
   b) Name the query, table, or view if there is one.
3) Answer for each report: will it be static or dynamic?
Part 2: Type of database and a design for it
The center of this part is the LO database document file (extension .odb). We may want the database to be embedded
within this file. Or we may want to use the file to connect to a database (an accessed database).
This raises even more questions to consider in the design. The type of database must be considered as well as the
location of the database when it is an accessed database. This is especially true when a local area network is involved.
Security also plays a part in the design. Base does not encrypt its file. But it can access databases which require a
password. If on a network, access can be controlled by use of passwords and access rights. All of these become a
part of the design.
Tip
When LO is on a network, the security issues
belong to the IT people. They will tell you what you
need to know when designing your database.
We must consider the input and output of the database as well. With embedded databases, LO can create both data
input (tables and forms) and data output (queries and reports). With external databases, LO cannot always add,
modify, or delete data. It can create data output. Text databases and spreadsheets are the two examples for which LO
cannot input or modify data. Neither can LO create a view for either. So, if you will be using either of these, you
must decide what program you will use to input data into the database.
1) Will the database be embedded within the database document file, or will this file be used to access the
database?
2) If the database is a text or spreadsheet, what will you use to add, modify, or remove data?
3) For other accessed databases:
   a) How do you access it?
      i) What database driver is needed?
      ii) What settings are required for the database driver?
   b) Where will the database be located? (same computer, network server)
      i) User and password required?
      ii) What is the path to the database on the network?
      iii) What are the access rights to the database?
Database data and a structure for it
Part 1 of the design contains the general outline of the desired database while Part 2 of the design contains the
outline of its type, its creation, and what is required to use it. Part 3 of the design describes it in very specific terms.
This includes defining how data will be added, modified, or removed (data input and removal). Tables, views, and
forms for it are defined where needed. And if a relational database will be used, the relationships between tables are
defined.
Part 3 also includes how the data will be used (data output). Queries and reports are defined including what fields
and their tables are used, what functions are needed (if any), what restrictions are required, and how the output data
should be sorted.
We begin with Part 3 of the plan. The latter lists the specific fields and their possible tables for the database. Then we look at
how we can use what we have, and we begin to ask questions again. When planning it, we asked ourselves what we wanted.
Now we ask another question: how do we do it?
Fields
It may seem as if we have all the information we need about our fields in Part 3 of the plan section. But we must now
take the same information and look at it from a different perspective. For example, some of the fields may not really
be fields but only field values. (This will be more obvious when we begin our design of the tables.)
We need to create one or more three-column lists to contain our fields. The first column contains the field names, the
second column the field types, and the third column the field properties. Thus each row contains a field name, its
field type, and its field properties.
In the plan, we have already divided the fields into proposed tables for it. We should now create a list of fields for
each proposed table so that we can place fields that share a relationship in the same table. It is also easier to spot
errors or missing fields within a smaller list per table than might be possible when we put all of the fields into a single
large list.
What fields refer to the same thing?
Field names
We have already defined and named our fields when we created our plan. At this point we should review our field
names for their usefulness.
Do we want to change any of the field names?
Do we want to add or remove any fields?
Field types
As with the field names, we need to review the field types we have given to our fields. We might need to make some
changes. If we added a field when we reviewed the field names, we also need to assign a field type to the new field.
Have all of our fields been given an appropriate field type? If not, correct the field type.
Field properties
These are divided into four parts: AutoValue and Auto-increment statement, or Entry required; Length or Length
and Decimal; Default value; and Format example.
AutoValue and Auto-increment statement
The AutoValue and Auto-increment statement properties apply only to the primary key of a table. The AutoValue
determines whether this field's values are automatically generated by the table or not. The Auto-increment statement
determines the value of the increments. (This statement is written in SQL.)
Entry required
For all fields of a table except for the primary key, you have a choice as to whether a value must be entered in every
record of the table or not. (This is one of the questions in Part 3a of the plan.) You need to look again at each one of
the fields to determine if the value for Entry required needs to be changed.
Length
Length describes the maximum number of characters contained in a field value. (Fields having a date field type do not
have this property.) Its purpose is to prevent unnecessary space requirements in the database. Yet you want to make
sure that each of your fields is long enough. For example, a field for six digit odometer readings does not need a
length more than 6 digits. But a database for a very large bank may require much longer fields. Use your own
judgment as to what is enough, and what is too much.
Associated with the Length property is the Decimal property. This one determines how many decimal places the field
value will have. This divides the length into two parts: the number of digits to the left of the decimal point, and the
number of digits to the right of it.
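The arithmetic behind Length and Decimal can be worked through in a few lines. This is only a sketch of the rule as described above (Length is the total number of digits, and Decimal takes its share from that total); the function names digit_split and largest_value are hypothetical and are not part of Base.

```python
from decimal import Decimal

def digit_split(length, decimal_places):
    """Split a field length into (digits left of the point, digits right of it)."""
    if decimal_places > length:
        raise ValueError("decimal places cannot exceed the field length")
    return length - decimal_places, decimal_places

def largest_value(length, decimal_places):
    """Largest value that fits: every available digit is a 9."""
    return Decimal(10 ** length - 1) / Decimal(10 ** decimal_places)

# A field with Length 10 and 2 decimal places.
print(digit_split(10, 2))
print(largest_value(10, 2))

# A six-digit odometer reading has no decimal places.
print(largest_value(6, 0))
```

Working out the largest value a field can hold is a quick way to check whether the chosen length is enough for the data you expect.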
Default value
You may want some fields to have a specific value until you change it. You specify what value you want the field to
have. This may be text, a number, or a date. Fields with the Yes/No field type permit three choices for the default
value: None, Yes, and No.
Format example
Field formatting is the last field property. It is tempting to apply all of the formatting that you can here. In most
cases, this should not be done. Only use essential formatting unless you are going to view the data in its table.
Field data is usually only seen in forms, queries, reports, and views. It is in these places that you should format the
field according to the appearance you want.
1) Primary key:
   a) Is the AutoValue set properly for the primary key of each table?
      i) Is it set to Yes when the Field type is Integer [INTEGER]?
      ii) Is it set to No when the Field type is not Integer [INTEGER]?
   b) If an Auto-increment statement is desired, use SQL to write it.
2) Entry required: Should a field have an entry in every record?
   a) If so, select Yes.
   b) If in doubt, select No.
3) Length:
   a) For non-decimal fields, select a length that is appropriate for the data the field will contain.
   b) For fields that contain decimals, enter the appropriate number of decimal places.
4) Default value: Do you want a default value for the field?
      If the answer is yes, enter the value.
      If the field is Yes/No (Boolean), select the value you will use.
Tables
In the planning phase, we placed the fields of the database into one or more tables based upon the relations that exist
between the fields. In the design phase, we will consider these more closely. It may require us to make some changes
to our tables: modifying some, and perhaps creating some.
Tables for list boxes
If you have identified one or more fields that have a limited number of distinct values, you have also named the
tables that will contain these values. We must design a table for each set of values. If you have not done so already,
for each field with distinct values, define its characteristics.
If when planning you listed one or more fields having a limited number of distinct values, you need to design a table
for each one of these fields. If so, you have a list of these fields, the table names associated with them, and their field
types and properties. All of these will be used to define the tables needed to create list boxes for these fields in the
forms.
List boxes are very useful for fields with a limited number of distinct values. We use them to create drop-down lists
in our forms. We also use them for consistency in our data entries for these fields.
List each field with distinct values, the table holding its values, the characteristics of the table's field, and its
distinct values.
Because of the structure we have placed upon each table used for a list box:
The table contains only one field.
This field is the primary key for the table.
You can only enter distinct values into this field.
Tip
In most cases, you will be using VARCHAR as the
field type for the field. Other field types can be
used, but this will be the most common one.
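A list-box table of this kind can be sketched as follows. The example assumes SQLite through Python's sqlite3 module instead of Base, and the tables weekday and appointment are hypothetical. The foreign key on appointment.day is an extra assumption of the sketch: in Base the list box on the form restricts the choices, whereas here the database itself is asked to refuse values that are not in the list.

```python
import sqlite3

# Assumption: SQLite stands in for Base's engine; both tables are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE weekday (
    name VARCHAR(9) PRIMARY KEY      -- the only field, and also the primary key
);
CREATE TABLE appointment (
    id  INTEGER PRIMARY KEY,
    day VARCHAR(9) NOT NULL REFERENCES weekday(name)
);
""")
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
conn.executemany("INSERT INTO weekday VALUES (?)", [(d,) for d in days])

def try_insert(sql, params):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# The primary key forces the values in the list to be distinct.
duplicate_ok = try_insert("INSERT INTO weekday VALUES (?)", ("Monday",))

# Entries drawn from the list are accepted; a misspelled one is not.
valid_ok = try_insert("INSERT INTO appointment (day) VALUES (?)", ("Friday",))
typo_ok = try_insert("INSERT INTO appointment (day) VALUES (?)", ("Firday",))

# The values a drop-down list would offer.
options = [row[0] for row in conn.execute("SELECT name FROM weekday")]
print(duplicate_ok, valid_ok, typo_ok, len(options))
```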
Flat databases
Flat databases contain tables that are independent of each other while relational databases have tables in which one
or more fields of a table are related to one or more fields of another table. In fact, a field of a table may be related to
another field of the same table. Because of this we design the tables of a flat database differently from those of a
relational database.
For a flat database, our table has already been designed when we designed the fields with their field type properties.
The only thing left to do is to create a table that implements these fields. But since a field is likely to have several
properties, we will divide these into individual properties as shown.
Before doing this, consider in what order you want each field to appear in its table. It may well be sensible to have
some consistency between the order of fields in a table and the form that contains it.
So, create a list like the one below for each table of the database. Begin by listing the fields in the order you have
now given them. For each table of the database, enter the characteristics of its fields as you have already defined
them. You will be using these lists when you create the database.
As you fill out these lists of field characteristics, you should review the fields you have defined.
Are there any more fields that might need to be added?
What are the characteristics of these fields?
Tables 6 and 7 are really one table. Each one contains part of the characteristics for the four sample fields.
Table 6: Sample Characteristics of the fields of a database table
Field Name   Field type   AutoValue   Auto-Increment Statement   Entry Required
ID           [INTEGER]    Yes         (none)                     Yes
Budget 1     [VARCHAR]                                           Yes
Amount       [DECIMAL]                                           Yes
Reconcile    [BOOLEAN]                                           Yes
Table 7: Sample Characteristics of the fields of a database table
Field Name Length Decimal places Default value Format Example
ID
Budget 1 50
Amount 10 2
Reconcile 0 No
Table 8: Sample table with values.
ID Budget 1 Amount Reconcile
0 Auto 17.21 Yes
1 Food 12.00 No
2 Income 0.03 Yes
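The characteristics listed in Tables 6 and 7 translate fairly directly into a table definition. The sketch below is an approximation under stated assumptions: it uses SQLite through Python's sqlite3 module, not Base's embedded engine, the table name budget is invented, and reading the Reconcile row of Table 7 as "default value No" is an interpretation. SQLite also does not enforce the declared length or number of decimal places, so those appear only as documentation of the design.

```python
import sqlite3

# Assumption: SQLite stands in for Base's engine; the table name is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE budget (
    "ID"        INTEGER PRIMARY KEY,              -- AutoValue: Yes
    "Budget 1"  VARCHAR(50)   NOT NULL,           -- Length 50, Entry required
    "Amount"    DECIMAL(10,2) NOT NULL,           -- Length 10, 2 decimal places
    "Reconcile" BOOLEAN       NOT NULL DEFAULT 0  -- assumed default: No
)
""")

# Rows from Table 8. Only the first ID is given; the rest are generated.
conn.execute('INSERT INTO budget ("ID", "Budget 1", "Amount", "Reconcile") '
             'VALUES (0, ?, ?, ?)', ("Auto", 17.21, 1))
conn.execute('INSERT INTO budget ("Budget 1", "Amount") VALUES (?, ?)',
             ("Food", 12.00))                     # Reconcile falls back to the default
conn.execute('INSERT INTO budget ("Budget 1", "Amount", "Reconcile") VALUES (?, ?, ?)',
             ("Income", 0.03, 1))

rows = conn.execute(
    'SELECT "ID", "Budget 1", "Reconcile" FROM budget ORDER BY "ID"').fetchall()

# Entry required: a record without an Amount is refused.
try:
    conn.execute('INSERT INTO budget ("Budget 1", "Amount") VALUES (?, ?)',
                 ("Rent", None))
    missing_amount_rejected = False
except sqlite3.IntegrityError:
    missing_amount_rejected = True

print(rows, missing_amount_rejected)
```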
Relational databases
These require a different method of design. While flat database tables are well defined from the time they became
part of the plan, relational database tables must meet more stringent requirements in the design phase. Fields may be
moved from one table to another. Additional tables may be needed and, in technical terms, the tables have to be
normalized. Relationships existing between tables have to be defined using one or more fields of one table and one
or more fields of another.
Normalized table
One that has been shown or modified to meet the requirements for first through Boyce-Codd normal form.
When physically representing these tables, we will use the column headings (field names). For example, Table 8 would look
like:
ID Budget 1 Amount Reconcile
When modifying or creating a table, we will use this to indicate what fields belong to it. Sometimes we may need to
move one or more fields from one table to another. Or we may need to create a table with a field identical to one in
another table. We must then use the same name for the field in both tables.
One reason why fields have to be moved or duplicated and new tables created is to conform the tables to the rules
required for relational databases. This is done over four levels called the first, second, third, and fourth normal form.
Note
While additional normal levels exist, relational
databases that have fourth normal form will run
very well. In many cases, adding another
normal form level does not improve how well
the database runs.
Microsoft, and perhaps some others, have combined
the original fourth normal form with an additional
requirement to produce what they define as fourth
normal form.
While there are precise definitions of the normal
forms, they are very technical in nature. Unless
you have a good mathematical background,
these definitions are very difficult, if not
impossible, to understand.
First normal form
Before making a table first normal form, we have to look for two possible characteristics in the columns and rows
that must be removed if they exist. Does the table contain two or more columns that are identical in nature? Does the
table contain two or more rows with the same value for one field but different values for another field? If the answer
to both questions is no, the table is first normal form.
These two characteristics are two different ways of looking at the same problem. For example, consider an address
book. It contains these fields: first name, last name, street address, city, state, zip code (postal code), and phone
number. One person listed in it has three phone numbers: a personal cell phone, his office cell phone number, and
his home phone number (land line). Another person listed has two addresses: where she lives and a post office box
(for her snail mail). This table would not be first normal form. One row of data contains three entries in the phone number field, and another row contains two entries in the street address field.
Table 9: Address Book not first normal form
First Name   Last Name   Street Address     City      State   Zip Code   Phone no.
Albert       James       345 First          Hart      IN      12345      (111) 111-1111
                                                                         (222) 222-2222
                                                                         (333) 333-3333
Wanda        Anderson    111 Main Apt. 14   Amherst   MA      15115      (100) 100-1010
                         P O Box 1991
We need to consider the primary key for the table because this is important. Originally, all the data referred to
specific people and could be divided into rows accordingly: one row for each person. As long as there is no
duplication of names, the fields First Name and Last Name can together be the composite primary key. If duplication
exists, an ID field can be used as the primary key. Then each individual's name is associated with a specific ID value.
We still have two rows in which one field has multiple values.
Table 10: Not first normal form
ID (PK)   First Name   Last Name   Street Address     City      State   Zip Code   Phone no.
1         Albert       James       345 First          Hart      IN      12345      (111) 111-1111
                                                                                   (222) 222-2222
                                                                                   (333) 333-3333
2         Wanda        Anderson    111 Main Apt. 14   Amherst   MA      15115      (100) 100-1010
                                   P O Box 1991
Another possibility is to associate each phone number with a specific ID value. In our case, we would also have to
associate each street address with a specific ID value. While we can determine what each of the addresses is for,
how do we know which phone number to use to reach a given phone? This possibility makes the situation too
complex. The more fields that have multiple entries, the more complex this possibility becomes. Furthermore, a field is
still needed to distinguish between the multiple entries of each field that has them.
Table 11: First normal form
F N L N S A City State Z C P no. ID (PK)
Tip
The new composite primary key will create additional
rows of data that contain repetitions, with the exception
of the field containing multiple values and the field
added because of these values.
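The repetition described in this tip can be made concrete with a short sketch. It assumes that each multi-valued entry is labeled with a type (the labels are the ones that appear in the Phone and Address tables later in this chapter) and that one row is produced for every combination of a person's phone and address, which matches the five rows of the modified Address Book table shown later. The function name to_single_valued_rows is hypothetical.

```python
from itertools import product

# Records with multi-valued fields, as in Table 10.
# The type labels (Cell, Home, Mail, ...) are assumed for illustration.
people = [
    {
        "ID": 1,
        "First Name": "Albert",
        "Last Name": "James",
        "phones": {
            "Cell": "(111) 111-1111",
            "Bus Cell": "(222) 222-2222",
            "Home": "(333) 333-3333",
        },
        "addresses": {"Home & mail": "345 First"},
    },
    {
        "ID": 2,
        "First Name": "Wanda",
        "Last Name": "Anderson",
        "phones": {"Home": "(100) 100-1010"},
        "addresses": {"Home": "111 Main Apt 14", "Mail": "P O 1991"},
    },
]


def to_single_valued_rows(records):
    """Return one row per (ID, Type of Phone, Type of Address) combination."""
    rows = []
    for person in records:
        combos = product(person["phones"].items(), person["addresses"].items())
        for (phone_type, phone), (address_type, street) in combos:
            rows.append(
                {
                    "ID": person["ID"],
                    "First Name": person["First Name"],  # repeated on every row
                    "Last Name": person["Last Name"],    # repeated on every row
                    "Type of Phone": phone_type,
                    "Phone no.": phone,
                    "Type of Address": address_type,
                    "Street Address": street,
                }
            )
    return rows


rows = to_single_valued_rows(people)
for row in rows:
    print(row)
```

Every field now holds a single value, but the names are repeated on each of a person's rows, which is the repetition that second normal form removes.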
Second normal form
For a table to be in second normal form, it must be in first normal form and the fields must be dependent upon the
entire primary key. Clearly, if the primary key consists of only one field, the table is in second normal form. So, look for tables that have composite primary keys. They are the only ones in first normal form that might not be in
second normal form.
The primary key determines the values for the rest of the fields in the table. With composite primary keys, one of the
fields in it may determine the values of some of the other fields by itself. That is when problems can occur:
repetitions in the rows of the table. Modifying the table to make it second normal form removes the repetitions.
We continue to use the address book, now in first normal form, as our example. It does have a composite primary key, so
we will check the fields in it to see if the table is in second normal form.
The field, Phone no., is determined by a combination of two fields: Type of Phone and ID. We do not need the
fields Type of Address, First Name, or Last Name to determine this field.
Similarly, the fields Street Address, City, State, and Zip Code are determined by a combination of two other fields:
ID and Type of Address. We do not need the other field in the composite primary key to determine the values of
these four fields.
Finally, the field, ID, determines First Name and Last Name by itself. So, this requires a third table.
Determinant
A field that can determine the values of another field in its table.
Tip
A primary key is a determinant since it determines
the values of all the other fields. When we have a
composite primary key, it may contain a field that is
also a determinant that determines the values of
some fields but not all of them. Having such a field
in a table prevents it from being second normal
form.
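One way to get a feel for determinants is to test sample data: if every value of a candidate determinant is always paired with the same value of another field, the sample is consistent with that dependency. The sketch below is only such a consistency check on a few rows and fields of the first normal form address book; the function name determines is hypothetical, and passing the check on sample data does not prove that the dependency holds in general.

```python
# A few rows and fields of the first normal form address book.
rows = [
    {"ID": 1, "First Name": "Albert", "Type of Phone": "Cell", "Phone no.": "(111) 111-1111"},
    {"ID": 1, "First Name": "Albert", "Type of Phone": "Bus Cell", "Phone no.": "(222) 222-2222"},
    {"ID": 1, "First Name": "Albert", "Type of Phone": "Home", "Phone no.": "(333) 333-3333"},
    {"ID": 2, "First Name": "Wanda", "Type of Phone": "Home", "Phone no.": "(100) 100-1010"},
]


def determines(rows, determinant_fields, target_field):
    """True if each combination of determinant values maps to one target value."""
    seen = {}
    for row in rows:
        key = tuple(row[field] for field in determinant_fields)
        # setdefault records the first target value seen for this key;
        # a later, different value means the fields are not a determinant.
        if seen.setdefault(key, row[target_field]) != row[target_field]:
            return False
    return True


id_gives_name = determines(rows, ["ID"], "First Name")
id_gives_phone = determines(rows, ["ID"], "Phone no.")
type_gives_phone = determines(rows, ["Type of Phone"], "Phone no.")
both_give_phone = determines(rows, ["ID", "Type of Phone"], "Phone no.")

print("ID -> First Name:", id_gives_name)
print("ID -> Phone no.:", id_gives_phone)
print("Type of Phone -> Phone no.:", type_gives_phone)
print("ID + Type of Phone -> Phone no.:", both_give_phone)
```

In this sample, ID alone is enough for First Name, while Phone no. needs both ID and Type of Phone.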
This means that our table contains three determinants: ID together with Type of Phone, ID together with Type of Address, and ID by itself. To make the table second normal form, we have to
create three tables, one for each determinant. Each of these tables contains the determinant as its primary key. The fields that
it determines are moved from the original table to this new one. The original table is modified. It still contains the complete composite primary key. It may contain other fields, but our example does not. When we describe the designing of the Budget
database, the tables in that example will contain more than the primary key.
Table 13: Phone table
ID(PK) Type of Phone (PK) Phone no.
1 Cell (111) 111-1111
1 Bus Cell (222) 222-2222
1 Home (333) 333-3333
2 Home (100) 100-1010
Table 14: Address table
ID (PK)   Type of Address (PK)   Street Address    City      State   Zip Code
1         Home & mail            345 First         Hart      IN      12345
2         Home                   111 Main Apt 14   Amherst   MA      15115
2         Mail                   P O 1991          Amherst   MA      15116-1991
Table 15: Name table
ID(PK) First Name Last Name
1 Albert James
2 Wanda Anderson
We have three new tables, so they should be examined to determine if they are first normal form. They are because
all the fields contain single values. But what about second normal form? (Two tables contain a composite primary
key.) For the Phone table, both fields of the primary key are required to determine the Phone no. field. For the
Address table, the same thing is true. So, these tables are both second normal form. The third table, Name table, is
second normal form because its primary key contains only one field.
The modified Address Book table contains only the fields of its composite primary key. With no other fields left to depend upon part of that key, it is also second normal form.
Perhaps some observations need to be made about the modified Address Book table. In all five rows, the First Name
and Last Name field values are determined by the ID field (the primary key). The first two columns come directly
from the first two columns of the Phone table. Similarly, the first and third columns come from the Address table.
Table 16: Modified Address Book table that is now second normal form
ID (PK& FK) Type of Phone (PK & FK) Type of Address (PK & FK)
1 Cell Home & mail
1 Bus Cell Home & mail
1 Home Home & mail
2 Home Home
2 Home Mail
To check for second normal form:
Does the table have only one field upon which all the other fields depend?
If yes, the table is second normal form.
If a table has a composite primary key, can any of the fields of this key determine the values of a table field?
If no, the table is second normal form.
Modifying a first normal form table to make it second normal form:
1) What fields in the table depend upon just part of its composite primary key?
2) Separate these fields into groups depending upon which part of the primary key they depend upon.
3) Create a table for each group of these fields. Move each group to its own table. Add to each table a copy
of the primary key components upon which its fields depend. (These primary key components
become the primary key for that table.)
4) Modify the original table so that it contains its primary key and only the fields that depend only upon it.
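The result of these steps for the address book can be sketched as table definitions. This example uses Python's sqlite3 module purely for illustration (it is not how tables are created in Base), and the choice to declare the foreign keys of the Address Book table against the Phone and Address tables is an assumption based on the PK and FK labels in Table 16.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE "Name" (
        "ID" INTEGER PRIMARY KEY,
        "First Name" TEXT NOT NULL,
        "Last Name"  TEXT NOT NULL
    );
    CREATE TABLE "Phone" (
        "ID" INTEGER NOT NULL,
        "Type of Phone" TEXT NOT NULL,
        "Phone no." TEXT NOT NULL,
        PRIMARY KEY ("ID", "Type of Phone")
    );
    CREATE TABLE "Address" (
        "ID" INTEGER NOT NULL,
        "Type of Address" TEXT NOT NULL,
        "Street Address" TEXT, "City" TEXT, "State" TEXT, "Zip Code" TEXT,
        PRIMARY KEY ("ID", "Type of Address")
    );
    CREATE TABLE "Address Book" (
        "ID" INTEGER NOT NULL,
        "Type of Phone" TEXT NOT NULL,
        "Type of Address" TEXT NOT NULL,
        PRIMARY KEY ("ID", "Type of Phone", "Type of Address"),
        FOREIGN KEY ("ID", "Type of Phone")
            REFERENCES "Phone" ("ID", "Type of Phone"),
        FOREIGN KEY ("ID", "Type of Address")
            REFERENCES "Address" ("ID", "Type of Address")
    );
    """
)

conn.executemany('INSERT INTO "Name" VALUES (?, ?, ?)',
                 [(1, "Albert", "James"), (2, "Wanda", "Anderson")])
conn.executemany('INSERT INTO "Phone" VALUES (?, ?, ?)',
                 [(1, "Cell", "(111) 111-1111"), (1, "Bus Cell", "(222) 222-2222"),
                  (1, "Home", "(333) 333-3333"), (2, "Home", "(100) 100-1010")])
conn.executemany('INSERT INTO "Address" VALUES (?, ?, ?, ?, ?, ?)',
                 [(1, "Home & mail", "345 First", "Hart", "IN", "12345"),
                  (2, "Home", "111 Main Apt 14", "Amherst", "MA", "15115"),
                  (2, "Mail", "P O 1991", "Amherst", "MA", "15116-1991")])
conn.executemany('INSERT INTO "Address Book" VALUES (?, ?, ?)',
                 [(1, "Cell", "Home & mail"), (1, "Bus Cell", "Home & mail"),
                  (1, "Home", "Home & mail"), (2, "Home", "Home"), (2, "Home", "Mail")])

# A row that points at a phone entry that does not exist should be refused.
try:
    conn.execute('INSERT INTO "Address Book" VALUES (?, ?, ?)', (2, "Cell", "Home"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

counts = {
    table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    for table in ("Name", "Phone", "Address", "Address Book")
}
print(counts, "bad row rejected:", rejected)
```

Each name, phone number, and address is now stored once, and the Address Book table holds only key values.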
Third normal form
To be third normal form, a table must be second normal form and its fields must depend only upon the primary key. Sometimes a field depends upon another field that is not the primary key nor part of a composite primary key.
Check list for third normal form:
Are there one or more fields that depend upon a field that is not the primary key?
If Yes, the table is not third normal form.
If No, the table is third normal form.
Asking this question of the Phone and Address Book tables results in a No answer, so these tables are third normal
form. However, the Address table will give a Yes answer to this question. It is not third normal form.
Because the Zip Code field depends upon three non-key fields (Street Address, City, and State), we will create a new table using them as its
composite primary key. We also move the Zip Code field to the new table.
Table 17: Zip Code table
Street Address (PK)   City (PK)   State (PK)   Zip Code
345 First             Hart        IN           12345
111 Main Apt 14       Amherst     MA           15115
P O 1991              Amherst     MA           15116-1991
Table 18: Modified Address table to make it third normal form
ID (PK)   Type of Address (PK)   Street Address (FK)   City (FK)   State (FK)
1         Home & mail            345 First             Hart        IN
2         Home                   111 Main Apt 14       Amherst     MA
2         Mail                   P O 1991              Amherst     MA
At this point it pays to consider what tables you now have. We began with a single Address Book table. This has
been modified twice: once to make it first normal form, and the second time to make it second normal form. We have also
created the Phone and Address tables. The latter has been modified to make it third normal form.
So we need to check two tables, Address Book and Phone, to make sure they are also third normal form. In both
tables, any non-key field depends only upon the table's primary key. So they are third normal form.
Modifying a second normal form table to make it third normal form:
1) Determine what fields depend upon one or more non-key fields.
2) List each of these non-key fields or groups of non-key fields.
3) Add to this list what fields depend upon each of them.
4) Create a table for each of the determinants you found.
5) Move the fields depending upon each determinant to the new table containing it.
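Applied to the Address table, these steps can be sketched in a few lines of Python. The function name split_out_zip_code is hypothetical, and the sketch assumes, as the text does, that Street Address, City, and State together determine the Zip Code.

```python
# Rows of the second normal form Address table (Table 14):
# ID, Type of Address, Street Address, City, State, Zip Code
address_rows = [
    (1, "Home & mail", "345 First", "Hart", "IN", "12345"),
    (2, "Home", "111 Main Apt 14", "Amherst", "MA", "15115"),
    (2, "Mail", "P O 1991", "Amherst", "MA", "15116-1991"),
]


def split_out_zip_code(rows):
    """Move Zip Code into its own table keyed by (Street Address, City, State)."""
    zip_table = {}
    modified_address = []
    for person_id, address_type, street, city, state, zip_code in rows:
        determinant = (street, city, state)
        # The same determinant must always give the same zip code;
        # otherwise the assumed dependency does not hold for this data.
        if zip_table.setdefault(determinant, zip_code) != zip_code:
            raise ValueError(f"conflicting zip codes for {determinant}")
        modified_address.append((person_id, address_type, street, city, state))
    return zip_table, modified_address


zip_table, modified_address = split_out_zip_code(address_rows)

# Looking the zip code up again rebuilds the original rows, so nothing was lost.
rebuilt = [row + (zip_table[row[2:5]],) for row in modified_address]

print(zip_table)
print(modified_address)
print(rebuilt == address_rows)
```

The final comparison is the useful check: the two smaller tables together still carry all the information of the original one.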
Summary of our example table
Table 19: Original Address Book table
ID (PK)   First Name   Last Name   Street Address     City      State   Zip Code   Phone no.
1         Albert       James       345 First          Hart      IN      12345      (111) 111-1111
                                                                                   (222) 222-2222
                                                                                   (333) 333-3333
2         Wanda        Anderson    111 Main Apt. 14   Amherst   MA      15115      (100) 100-1010
                                   P O Box 1991
Table 20: Modified Address Book table
ID (PK& FK) Type of Phone (PK & FK) Type of Address (PK & FK)
1 Cell Home & mail
1 Bus Cell Home & mail
1 Home Home & mail
2 Home Home
2 Home Mail
Table 21: Modified Address table
ID (PK)   Type of Address (PK)   Street Address (FK)   City (FK)   State (FK)
1         Home & mail            345 First             Hart        IN
2         Home                   111 Main Apt 14       Amherst     MA
2         Mail                   P O 1991              Amherst     MA
Table 22: Name table
ID(PK) First Name Last Name
1 Albert James
2 Wanda Anderson
Table 23: Phone table
ID(PK) Type of Phone (PK) Phone no.
1 Cell (111) 111-1111
1 Bus Cell (222) 222-2222
1 Home (333) 333-3333
2 Home (100) 100-1010
Table 24: Zip Code table
Street Address (PK) City (PK) State (PK) Zip Code
345 First Hart IN 12345
111 Main Apt 14 Amherst MA 15115
P O 1991 Amherst MA 15116-1991
Table 19 is the original Address Book with all of its flaws. The five following it are the tables we created as we
worked toward making them third normal form.
We will study the six tables to see what we can discover comparing Table 19 with the five tables following it. The
values for the First and Last Name fields are listed once in the Name table. The values for the Street Address, City,
and State fields are listed once in the modified Address table. The values for the Zip Code field are listed once in the
Zip Code table. And the values of the Phone no. field are listed once in the Phone table. As a result, fewer errors are
likely to be made when entering the data.
Table 20 is the modified Address Book table that looks much different from the original table. All the information in
the latter can be gotten from the modified table and the tables we created using a query or a form which contains data from all the tables.
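A query of this kind can be sketched with Python's sqlite3 module. This is an illustration only: in Base the query would be built in the query designer or in its own SQL view, and the foreign key declarations are left out here to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "Name" ("ID" INTEGER PRIMARY KEY, "First Name" TEXT, "Last Name" TEXT);
    CREATE TABLE "Phone" ("ID" INTEGER, "Type of Phone" TEXT, "Phone no." TEXT,
                          PRIMARY KEY ("ID", "Type of Phone"));
    CREATE TABLE "Address" ("ID" INTEGER, "Type of Address" TEXT,
                            "Street Address" TEXT, "City" TEXT, "State" TEXT,
                            PRIMARY KEY ("ID", "Type of Address"));
    CREATE TABLE "Zip Code" ("Street Address" TEXT, "City" TEXT, "State" TEXT,
                             "Zip Code" TEXT,
                             PRIMARY KEY ("Street Address", "City", "State"));
    CREATE TABLE "Address Book" ("ID" INTEGER, "Type of Phone" TEXT, "Type of Address" TEXT,
                                 PRIMARY KEY ("ID", "Type of Phone", "Type of Address"));
    """
)

# Data taken from Tables 20 to 24.
data = {
    "Name": [(1, "Albert", "James"), (2, "Wanda", "Anderson")],
    "Phone": [(1, "Cell", "(111) 111-1111"), (1, "Bus Cell", "(222) 222-2222"),
              (1, "Home", "(333) 333-3333"), (2, "Home", "(100) 100-1010")],
    "Address": [(1, "Home & mail", "345 First", "Hart", "IN"),
                (2, "Home", "111 Main Apt 14", "Amherst", "MA"),
                (2, "Mail", "P O 1991", "Amherst", "MA")],
    "Zip Code": [("345 First", "Hart", "IN", "12345"),
                 ("111 Main Apt 14", "Amherst", "MA", "15115"),
                 ("P O 1991", "Amherst", "MA", "15116-1991")],
    "Address Book": [(1, "Cell", "Home & mail"), (1, "Bus Cell", "Home & mail"),
                     (1, "Home", "Home & mail"), (2, "Home", "Home"), (2, "Home", "Mail")],
}
for table, table_rows in data.items():
    placeholders = ", ".join("?" * len(table_rows[0]))
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', table_rows)

# Follow the key values in Address Book out to every other table.
rows = conn.execute(
    """
    SELECT ab."ID", n."First Name", n."Last Name",
           p."Type of Phone", p."Phone no.",
           a."Type of Address", a."Street Address", a."City", a."State", z."Zip Code"
    FROM "Address Book" AS ab
    JOIN "Name"     AS n ON n."ID" = ab."ID"
    JOIN "Phone"    AS p ON p."ID" = ab."ID" AND p."Type of Phone" = ab."Type of Phone"
    JOIN "Address"  AS a ON a."ID" = ab."ID" AND a."Type of Address" = ab."Type of Address"
    JOIN "Zip Code" AS z ON z."Street Address" = a."Street Address"
                        AND z."City" = a."City" AND z."State" = a."State"
    ORDER BY ab."ID", p."Type of Phone", a."Type of Address"
    """
).fetchall()

for row in rows:
    print(row)
```

Each of the five rows returned combines a key row of the modified Address Book table with the matching name, phone, address, and zip code.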
Note
As an introduction into the next normal form, I will
state that these three tables above are also Boyce-Codd
normal form. You can check for yourself if you want
to do so.
Boyce-Codd normal form (BCNF)
There is another normal form that is similar to third normal form. It is known as the Boyce-Codd normal
form (BCNF). Most third normal form tables are also BCNF. The only time when this statement may not be true is
when the table contains composite candidate keys which overlap. When a candidate key contains one or more fields
that are also part of another candidate key, the keys overlap. This potential problem very seldom occurs.
Boyce-Codd normal form
A relation (table) is in Boyce-Codd Normal Form (BCNF) if every determinant is a candidate key.
Candidate key
One or more fields in a table that uniquely determines the other fields.
Tip
A table may have more than one candidate key, and
any one of these can become the designated primary
key. But remember that a candidate key can consist
of one or more fields of the table.
An example of BCNF: Consider a table, Enrollment, with these fields: student number, student name, course number,
course name, date-enrolled. We make these two assumptions: no two students have the same name, and no two
courses have the same name.
The candidate keys are these groups of fields: (student number, course number); (student name, course number); (student number,
course name); and (student name, course name). Any one of these four composite candidate keys can be used as the primary key.
Notice how these candidate keys overlap.
Student number Student name Course number Course name Date-enrolled
Student number determines Student name, Student name determines Student number, Course number determines
Course name, and Course name determines Course number. But none of these fields taken by itself is a candidate
key. So the table is not Boyce-Codd normal form.
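This reasoning can be checked mechanically. The sketch below lists the dependencies stated above, adds the assumption that a student number and course number together determine the date enrolled, and then finds the candidate keys by computing which fields each group of fields determines. The function names closure and candidate_keys are hypothetical.

```python
from itertools import combinations

FIELDS = ["Student number", "Student name", "Course number", "Course name", "Date-enrolled"]

# (determinant, determined fields). The last entry is an assumption
# made for this sketch; the others are stated in the text.
DEPENDENCIES = [
    ({"Student number"}, {"Student name"}),
    ({"Student name"}, {"Student number"}),
    ({"Course number"}, {"Course name"}),
    ({"Course name"}, {"Course number"}),
    ({"Student number", "Course number"}, {"Date-enrolled"}),
]


def closure(fields, dependencies):
    """Return every field that the given fields determine, directly or indirectly."""
    result = set(fields)
    changed = True
    while changed:
        changed = False
        for determinant, determined in dependencies:
            if determinant <= result and not determined <= result:
                result |= determined
                changed = True
    return result


def candidate_keys(fields, dependencies):
    """Return the smallest groups of fields that determine all the fields."""
    keys = []
    for size in range(1, len(fields) + 1):
        for combo in combinations(fields, size):
            group = set(combo)
            is_key = closure(group, dependencies) == set(fields)
            # Skip groups that merely add fields to a key already found.
            if is_key and not any(key < group for key in keys):
                keys.append(group)
    return keys


keys = candidate_keys(FIELDS, DEPENDENCIES)

# A determinant that does not determine the whole table breaks BCNF.
violations = [
    determinant
    for determinant, _ in DEPENDENCIES
    if closure(determinant, DEPENDENCIES) != set(FIELDS)
]

for key in keys:
    print("candidate key:", sorted(key))
for determinant in violations:
    print("determinant that is not a key:", sorted(determinant))
```

Under these assumptions the sketch finds the four overlapping candidate keys and reports each of the four single fields as a determinant that is not a key.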
To correct this problem, we modify the original table and create two new tables: Student and Course.
Student, Course, and the modified Enrollment tables (respectively):
Student number(PK) Student name
Course number(PK) Course name
Student number(PK&FK) Course number(PK&FK) Date-enrolled
To show that there is a problem and its correction, we add some data to the original Enrollment table and then to the
tables correcting the problem. When a student takes more than one course, the Student number and Student name
are repeated. Similarly, the Course number and Course name are repeated when more than one student takes the
course. When this table is made Boyce-Codd normal form, we eliminate this repetition.
Table 25: Original Enrollment table
Student number   Student name     Course number   Course name                         Date enrolled
1001             Sam Livingston   CS101           Introduction to Computers           August 8, 2015
1001             Sam Livingston   CS102           Introduction to Operating Systems   August 8, 2015
1002             Max Caprilla     CS101           Introduction to Computers           January 4, 2015
Table 26: Student table
Student number (PK)   Student name
1001                  Sam Livingston
1002                  Max Caprilla
Table 27: Course table
Course number (PK)   Course name
CS101                Introduction to Computers
CS102                Introduction to Databases
Table 28: modified Enrollment table
Student number(PK & FK) Course number(PK & FK) Date enrolled
1001 CS101 August 8, 2015
1001 CS102 August 8, 2015
1002 CS101 January 4, 2015
We enter data in the Student table one time for each student and in the Course table for each course. In the
Enrollment table, we enter data for the three fields for each course a student takes. When data is added or modified
for these tables, it only has to be done one time. For example, Sam Livingston legally changes his name to Walley
Habbernack. To modify our data, we only change his Student name field. When we do this, everywhere Student number, 1001, appears, the database correctly lists the Student name as Walley Habbernack. We do not have to
worry about changing his name for every course he has taken.
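The name change can be tried out with a small sketch using Python's sqlite3 module. It is an illustration only, not Base itself; the table contents follow Tables 26 to 28, with the dates stored as plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "Student" (
        "Student number" INTEGER PRIMARY KEY,
        "Student name"   TEXT NOT NULL
    );
    CREATE TABLE "Course" (
        "Course number" TEXT PRIMARY KEY,
        "Course name"   TEXT NOT NULL
    );
    CREATE TABLE "Enrollment" (
        "Student number" INTEGER NOT NULL REFERENCES "Student" ("Student number"),
        "Course number"  TEXT    NOT NULL REFERENCES "Course" ("Course number"),
        "Date enrolled"  TEXT    NOT NULL,
        PRIMARY KEY ("Student number", "Course number")
    );
    """
)
conn.executemany('INSERT INTO "Student" VALUES (?, ?)',
                 [(1001, "Sam Livingston"), (1002, "Max Caprilla")])
conn.executemany('INSERT INTO "Course" VALUES (?, ?)',
                 [("CS101", "Introduction to Computers"),
                  ("CS102", "Introduction to Databases")])
conn.executemany('INSERT INTO "Enrollment" VALUES (?, ?, ?)',
                 [(1001, "CS101", "August 8, 2015"),
                  (1001, "CS102", "August 8, 2015"),
                  (1002, "CS101", "January 4, 2015")])

# The name is stored in exactly one row, so one row is all that changes.
cursor = conn.execute(
    'UPDATE "Student" SET "Student name" = ? WHERE "Student number" = ?',
    ("Walley Habbernack", 1001),
)
rows_changed = cursor.rowcount

# Every enrollment of student 1001 now shows the new name.
report = conn.execute(
    """
    SELECT e."Student number", s."Student name", e."Course number"
    FROM "Enrollment" AS e
    JOIN "Student" AS s ON s."Student number" = e."Student number"
    ORDER BY e."Student number", e."Course number"
    """
).fetchall()

print("rows changed:", rows_changed)
for row in report:
    print(row)
```

Had the name been stored in the original Enrollment table, the same change would have needed one update per course taken.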
From my studies, tables with this problem have fields that are similar in nature. This leads to candidate keys that
overlap. In our example, Course name and Course number do this. Student number and Student name would seem
to do it also. However, two or more students can have the exact same name. (I once had a class with three boys
having identical names.) In this case, the field