Data Checking

Preview:

DESCRIPTION

Data Checking. Applied Database II. File structure. A database consists of a number of related files . A file consists of a number of related records . A record consists of a number of related fields . A field is a single data item consisting of a number of characters. - PowerPoint PPT Presentation

Citation preview

Data Checking

Applied Database II

File structure

• A database consists of a number of related files.

• A file consists of a number of related records.

• A record consists of a number of related fields.

• A field is a single data item consisting of a number of characters.

The database we will use on this course is

Microsoft Access, Microsoft FoxProTable structure

StudentID Surname Forename TYear TGroup Gender

2814 AHUJA Vishal 10 E M

2817 ASIJA Vandana 10 J F

2822 BENZIE Nicholas 10 E M

2823 BIBBY Carol 10 M F

2824 BOSWOOD Peter 10 G M

2827 CHAN Annjeanette 10 S F

2828 CHAN Anthony 10 G M

2829 CHAN Benson 10 S M

NB: In Access-

File = Table, Record = Row, Field = Column, Value = Cell

Types of Files• A Master file is a permanent file of all the data

needed by a business. Some of the fields may be regularly updated from transactions

• A Transaction file is a temporary file of all the transactions (items, bought sold, added, deleted etc) which have taken place in a given period.

• An Archive file is a file of data in long-term storage (usually for legal reasons). Archive files will contain all information about the past dealings of the business and would normally be stored on a different site in case of fire etc

The Data/Information Processing Cycle

• Acquisition/Origination, The collection of raw data from the outside world so it can be put into an information system

• Input, Putting the acquired data into the information system.

• Validation, Checking the quality of the data before it is processed

• Processing (manipulation), This is a key point. At this stage DATA is converted into INFORMATION.

• Storage, Unless you want to input the data every time you process it, it's sensible to store the data

• Retrieval, The reverse of storage

• Output, All the previous steps are useless unless you can see the results

• Communication, Sending data or information to another place. • Disposal, Deleting data or archiving data.

Data Validation

• Data validation checks that the data is sensible before it is processed.

• Methods used for validation are...– range check; This checks that the data lies within a

specified range of values.

Eg the month of a person's DOB should lie between 1 and 12

– presence check; This checks that important data is actually there and has not been missed out.

Eg Customers may be required to have their telephone numbers.

Data Validation (Cont.)

– check digits-- Used for numerical data. An extra digit is added to a number which is calculated from the digits

Eg The ISBN number on a book. The last digit is a check digit.

– batch totals--This checks for missing records. Numerical fields may be added together for all records in a batch

Eg Add the 'Total Cost' field of a number of transactions together.

– hash totals--This is just a batch total done on a meaningless field.

Eg Add the Telephone Numbers together for a number of Customers.

– Reasonableness, does it seem logical? Seventy year old women are not likely to be pregnant

– Consistency, died before being born?

– Format, numeric name?

– Text check – spelling, grammar – Completeness, all digits present

– Sequence, data is ordered?

– Timeliness, is the data out of date?

Data Validation (Cont.)

Data Verification

• Verification is used to prevent errors occurring when data is copied from one medium to another. (eg paper to disk, disk to disk, memory to disk)

• Methods used for verification...– Double keying-- Used to check for transcription errors.

The data is entered twice (by two different people).

– Visual check-- Checking for errors by looking through the data. Eg Proof-reading a typed document.

– Parity-- Used to check for transmission errors over networks or between memory and disk.

Validation in Spreadsheet

• Provide a Drop-down List of Options

(1) (2) (3)

• Validation Criteria Examples » Whole Number

– Decimal

» Date

Validation in Spreadsheet (Cont.)

• Validation Criteria Examples – Time

– Text Length

Validation in Spreadsheet (Cont.)

Validation on database

• Validation cannot ensure that the data on the database is correct, only that it is possible.

• What validation does is ensure that the data on the database follows certain business rules (semantics) of the database.

• Some of these rules are both general and absolute, ie

– the primary key of each row in a table must be non-null and unique (entity integrity)

– all foreign keys (references to other tables) must be valid (ie correspond to an existing primary key on the referenced table) or (sometimes, it depends on the semantics) wholly null

• Some of these rules are general about fields

– a numeric field must contain a number

– a date/time field must contain a valid date/time

• Some are specific to the logic of a particular field

– the quantity ordered on an order line must be a positive integer

– the date of an order cannot be earlier than the date on which it is created

– a gender field must be male or female

– the county field must contain one of the values from a valid list of counties (but beware if you ever have overseas customers!)

Validation on database (Cont.)

• Some of these field rules may be 'reasonableness' rather than absolute

– an active employee with a date born indicating an age of 120 or 3 is clearly wrong

– an active employee with a date born indicating an age of 85 or 13 is probably wrong (and should generate a warning rather than preventing the input)

• Some depend on more than one field, possibly on more than one table

– An order's delivery date cannot be any earlier than the order date

– If a customer is credit class 'cash only' then the customer's order payment type must be 'cash'

Validation on database (Cont.)

Validation in MS Access

• field type-- often forgotten but it does ensure number fields contain numeric values

• input mask-- The input mask is a regular expression describing valid input

• Default value-- This makes sure that there is a value

• Validation Rule-- This allows you to specify an expression for valid values (this must be a simple expression such as '>12 and < 66' and cannot involve other fields

• Validation Text The text displayed in a message box if the validation rule is broken

• Required-- Does the user have to input a value

• Allow Zero Length-- (Strings only) Can the field contain an empty string

• Indexed-- Is there an index on this column and, if so, does it allow duplicate values. Primary keys will be Indexed, no duplicates

Validation in MS Access (Cont.)

• Field level validation– In Visual Basic, TextBoxes can respond to the

following events (amongst others):

• KeyPress (generated when the user types an ascii character into the TextBox)

• LostFocus (generated when the user transfers the focus outside of the TextBox - e.g., to another TextBox, or to a button)

• Record level validation– Immediately prior to writing data to the database,

Visual Basic generates a "Validate" event on the data control. This happens whenever the current record is changed (e.g., by using MoveNext) and when the Update method is invoked.

Validation in VB/MS Access

Private Sub txtTitle_KeyPress (KeyAscii As Integer)

Dim char As Stringchar = Chr(KeyAscii)If char >= "0" And char <= "9"

ThenKeyAscii = 0

End IfEnd Sub

Validation in VB/MS Access (Cont.)

Private Sub txtDoB_LostFocus () Dim msg As String If txtDoB.Text <> "" And Not

IsDate(txtDoB.Text) Then msg = "Please enter a valid date,

or leave the field empty."

msgBox msg txtDoB.SetFocus

End IfEnd Sub

Validation in VB/MS Access (Cont.)

Record Level ValidationPrivate Sub Address_Validate (Action As Integer, Save As Integer) Dim msg As String If Not Address.RecordSet.EOF Then ' RecordSet is not empty, so validate current record

If Address.RecordSet.Fields("Surname") = "" Then msg = "Surname is a required field.

Please enter a Surname" MsgBox msg Action = vbDataActionCancel ' Cancel whatever action generated the event Save = False ' Do not write the record

to the databaseEnd If

End IfEnd Sub

Recommended