What search criteria would I use to search for James in 1881?

Name? Yes, allowing for spelling variations.

Age? Yes, allowing for some variance.

Occupation? No. Often changed over the decade.

Place of Birth? Yes and No. Information in 1881 censuses varies.

Marital status? No. Often changed over the decade.

Gender? Sure. It could be mis-coded, but is likely correct.
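The criteria above could translate into a query along the following lines. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the table census_1881, its columns and its sample rows are hypothetical, and it assumes purely for illustration that the James in question is the (Revd) James Doull aged 39 in the 1871 extract, so roughly 49 in 1881.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE census_1881 (id INTEGER PRIMARY KEY, first_name TEXT, "
    "surname TEXT, age INTEGER, gender TEXT, occupation TEXT)"
)
# Hypothetical 1881 rows, invented for illustration only.
conn.executemany(
    "INSERT INTO census_1881 (first_name, surname, age, gender, occupation) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("JAMES", "DOULL", 50, "M", "MINISTER"),
        ("JAMES", "DAULL", 48, "M", "FARMER"),
        ("JAMES", "DOULL", 12, "M", "SCHOLAR"),
        ("JANE", "DOULL", 59, "F", None),
    ],
)

def find_candidates(first_name, surname_pattern, expected_age, variance=2):
    """Search on name (with a wildcard), age (within a range) and gender only."""
    return conn.execute(
        "SELECT first_name, surname, age FROM census_1881 "
        "WHERE first_name = ? AND surname LIKE ? "
        "AND age BETWEEN ? AND ? AND gender = 'M' ORDER BY age",
        (first_name, surname_pattern,
         expected_age - variance, expected_age + variance),
    ).fetchall()

# 'D_ULL' matches DOULL and DAULL: '_' stands for exactly one character.
candidates = find_candidates("JAMES", "D_ULL", 49)
print(candidates)
```

Occupation and marital status are deliberately left out of the WHERE clause, since both often change over the decade.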

http://www.mysql.com/

Forget diamonds or dogs…MySQL: a true best friend

1871 Census extract:

1NF: What are our entities?

Family units?

One family per row, with a column per person (P1 to P4) for each attribute:

Attribute | P1 | P2 | P3 | P4
Name | Ann Leisk | Janet Leisk | Ann Leask | Jessie A Sinclair
Relate to HH | Head | Daur | Daur | Grandchild
M. Stat | Mar | Mar | |
Age | 57 | 30 | 13 | 2
Occ | Knitter, Seamstress | Seamstress | |
Born | Unst, SHI, SCT | Mid Yell, SHI, SCT | Lerwick, SHI, SCT | Lerwick, SHI, SCT

This works for a family of 4, but what about the Eunson family, which has 8 members?

We would have to add new columns, so no, Family units is no good as an entity.
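The problem with fixed per-person columns can be seen in a few lines. The following is a rough sketch, not the deck's own code: it uses Python's sqlite3 module in place of MySQL, and the table name family_unit and its columns p1 to p4 are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per family, with a fixed slot for each of four people.
conn.execute("CREATE TABLE family_unit (p1 TEXT, p2 TEXT, p3 TEXT, p4 TEXT)")

leisk = ["Ann Leisk", "Janet Leisk", "Ann Leask", "Jessie A Sinclair"]
eunson = ["Thomas Eunson", "Elizabeth Eunson", "John Eunson", "Jane Eunson",
          "Thomas Eunson", "Adam Eunson", "Margaret Eunson", "Janet Eunson"]

def try_insert(names):
    """Return True if the family fits the fixed columns, False otherwise."""
    placeholders = ", ".join("?" for _ in names)
    try:
        conn.execute(f"INSERT INTO family_unit VALUES ({placeholders})", names)
        return True
    except sqlite3.OperationalError:
        # More values than columns: the table would need new columns.
        return False

fits_leisk = try_insert(leisk)
fits_eunson = try_insert(eunson)
print(fits_leisk, fits_eunson)
```

The family of 4 fits; the family of 8 cannot be stored without changing the table definition.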

1NF: What are our entities?

Individuals?

Name | Relationship to HH | Marital Status | Age | Gender | Occupation | Born
Ann Leisk | HEAD | MAR | 57 | F | KNITTER, SEAMSTRESS | UNST, SHETLAND, SCOTLAND
Janet Leisk | DAUR | MAR | 30 | F | SEAMSTRESS | MID YELL, SHETLAND, SCOTLAND
Ann Leask | DAUR | | 13 | F | | LERWICK, SHETLAND, SCOTLAND
Jessie A Sinclair | GRAND CHILD | | 2 | F | | LERWICK, SHETLAND, SCOTLAND
Thomas Eunson | HEAD | MAR | 49 | M | FISHERMAN | DUNROSSNESS, SHETLAND, SCOTLAND
Elizabeth Eunson | WIFE | MAR | 51 | F | | DUNROSSNESS, SHETLAND, SCOTLAND
John Eunson | SON | UNM | 23 | M | FISHERMAN | DUNROSSNESS, SHETLAND, SCOTLAND
Jane Eunson | DAUR | UNM | 22 | F | KNITTER | DUNROSSNESS, SHETLAND, SCOTLAND
Thomas Eunson | SON | UNM | 17 | M | FISHERMAN | DUNROSSNESS, SHETLAND, SCOTLAND
Adam Eunson | SON | | 14 | M | BEACH BOY | DUNROSSNESS, SHETLAND, SCOTLAND
Margaret Eunson | DAUR | | 11 | F | SCHOLAR | DUNROSSNESS, SHETLAND, SCOTLAND
Janet Eunson | DAUR | | 6 | F | SCHOLAR | DUNROSSNESS, SHETLAND, SCOTLAND

Yes, individuals works. So, let’s give each individual a Primary Key.

1NF

ID No. | Name | Relationship to HH | Marital Status | Age | Gender | Occupation | Born
337864 | Ann Leisk | HEAD | MAR | 57 | F | KNITTER, SEAMSTRESS | UNST, SHETLAND, SCOTLAND
337865 | Janet Leisk | DAUR | MAR | 30 | F | SEAMSTRESS | MID YELL, SHETLAND, SCOTLAND
337866 | Ann Leask | DAUR | | 13 | F | | LERWICK, SHETLAND, SCOTLAND
337867 | Jessie A Sinclair | GRAND CHILD | | 2 | F | | LERWICK, SHETLAND, SCOTLAND
337622 | Thomas Eunson | HEAD | MAR | 49 | M | FISHERMAN | DUNROSSNESS, SHETLAND, SCOTLAND
337623 | Elizabeth Eunson | WIFE | MAR | 51 | F | | DUNROSSNESS, SHETLAND, SCOTLAND
337624 | John Eunson | SON | UNM | 23 | M | FISHERMAN | DUNROSSNESS, SHETLAND, SCOTLAND
337625 | Jane Eunson | DAUR | UNM | 22 | F | KNITTER | DUNROSSNESS, SHETLAND, SCOTLAND
337626 | Thomas Eunson | SON | UNM | 17 | M | FISHERMAN | DUNROSSNESS, SHETLAND, SCOTLAND
337627 | Adam Eunson | SON | | 14 | M | BEACH BOY | DUNROSSNESS, SHETLAND, SCOTLAND
337628 | Margaret Eunson | DAUR | | 11 | F | SCHOLAR | DUNROSSNESS, SHETLAND, SCOTLAND
337629 | Janet Eunson | DAUR | | 6 | F | SCHOLAR | DUNROSSNESS, SHETLAND, SCOTLAND

Primary Key (PK): ID No.
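What the Primary Key buys us can be sketched as follows. This is an illustrative example only, using Python's sqlite3 module rather than MySQL; the table and column names (individual, id_no and so on) are assumptions, while the two Thomas Eunson rows come from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE individual (id_no INTEGER PRIMARY KEY, name TEXT, "
    "relate_to_hh TEXT, age INTEGER)"
)
conn.execute("INSERT INTO individual VALUES (337626, 'Thomas Eunson', 'SON', 17)")
conn.execute("INSERT INTO individual VALUES (337622, 'Thomas Eunson', 'HEAD', 49)")

# Same name, different people: the PK tells them apart.
rows = conn.execute(
    "SELECT id_no, age FROM individual WHERE name = 'Thomas Eunson' ORDER BY id_no"
).fetchall()

# Re-using an existing PK value is rejected by the database.
try:
    conn.execute("INSERT INTO individual VALUES (337622, 'Someone Else', 'HEAD', 30)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(rows, duplicate_rejected)
```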

1NF: 3 Goals

Goal 1: No duplicated rows. Each row should have a PK.

Goal 2: Each cell contains only one value. No groups or comma-separated lists.

Goal 3: Any given column contains the same kind of data.
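Goals 1 and 2 lend themselves to a mechanical check. The function below is a small sketch under stated assumptions: check_1nf is a hypothetical helper, rows are plain Python dicts, and a comma inside a text cell is treated as a sign of more than one value. Goal 3 is a judgement about meaning and is not checked here.

```python
def check_1nf(rows, pk):
    """Report Goal 1 and Goal 2 problems in a list of dict rows."""
    problems = []
    seen = set()
    for row in rows:
        # Goal 1: every row needs a unique primary key.
        if row[pk] in seen:
            problems.append((row[pk], pk, "duplicate primary key"))
        seen.add(row[pk])
        # Goal 2: a comma suggests a list of values in one cell.
        for column, value in row.items():
            if isinstance(value, str) and "," in value:
                problems.append((row[pk], column, "more than one value"))
    return problems

rows = [
    {"id_no": 337864, "name": "Ann Leisk", "occupation": "KNITTER, SEAMSTRESS",
     "born": "UNST, SHETLAND, SCOTLAND"},
    {"id_no": 337627, "name": "Adam Eunson", "occupation": "BEACH BOY",
     "born": "DUNROSSNESS, SHETLAND, SCOTLAND"},
]
problems = check_1nf(rows, "id_no")
for problem in problems:
    print(problem)
```

Run against two rows of the Individual table, this flags the Occupation cell for Ann Leisk and the Born cell in both rows.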

1NF: New Entities, New Tables

Occ ID | Occupation
1 | BAKER
2 | BEACH BOY
3 | FISHERMAN
4 | FISHERMAN WIFE
5 | GENERAL SERV DOMESTIC
6 | GROCER
7 | KNITTER
8 | PHOTOGRAPHER
9 | SCHOLAR
10 | SEAMSTRESS

Primary Key (PK): Occ ID

Bpl ID | Birthplace
1 | DALMENY, LINLITHGOWSHIRE, SCOTLAND
2 | DUNROSSNESS, SHETLAND, SCOTLAND
3 | EAST INDIA, CEYLON
4 | FETLAR, SHETLAND, SCOTLAND
5 | LERWICK, SHETLAND, SCOTLAND
6 | LINLITHGOW, LINLITHGOWSHIRE, SCOTLAND
7 | MID YELL, SHETLAND, SCOTLAND
8 | NEW ZEALAND
9 | NORTHMAVINE, SHETLAND, SCOTLAND
10 | UNST, SHETLAND, SCOTLAND
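Moving occupations out of the Individual table means a cell such as KNITTER, SEAMSTRESS becomes two rows. The sketch below shows one possible way to do that split; split_occupations is a hypothetical helper, and it assumes a layout of one row per person per occupation, linked back to the individual by ID No.

```python
def split_occupations(individuals):
    """Turn one 'A, B' occupation cell into one row per occupation."""
    rows = []
    for id_no, occupation in individuals:
        if not occupation:
            continue  # no occupation recorded for this person
        for part in occupation.split(","):
            rows.append((id_no, part.strip()))
    return rows

individuals = [
    (337864, "KNITTER, SEAMSTRESS"),
    (337865, "SEAMSTRESS"),
    (337866, ""),
]
occupation_rows = split_occupations(individuals)
print(occupation_rows)
```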

1NF: 3 Goals for the new tables

Goal 1: No duplicated rows. Each row should have a PK.

Goal 2: Each cell contains only one value. No groups or comma-separated lists.

Goal 3: Any given column contains the same kind of data.

Birthplace fails Goal 2 and Goal 3: each cell holds parish, county and country data in a comma-separated list.
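Splitting a Birthplace cell into its parts might look like the sketch below. The helper split_birthplace is hypothetical, and it rests on a loudly stated assumption: a three-part value is parish, county, country, and anything else names only a country, taken from the last part (so EAST INDIA, CEYLON yields CEYLON).

```python
def split_birthplace(birthplace):
    """Split 'PARISH, COUNTY, COUNTRY' into separate fields."""
    parts = [p.strip() for p in birthplace.split(",")]
    if len(parts) == 3:
        parish, county, country = parts
    else:
        # Assumption: only a country is usable; take the last part.
        parish, county, country = None, None, parts[-1]
    return {"parish": parish, "county": county, "country": country}

print(split_birthplace("UNST, SHETLAND, SCOTLAND"))
print(split_birthplace("NEW ZEALAND"))
```

Real census strings are messier than this, so values that do not have three parts would normally be reviewed by hand.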

Parish ID | Parish
1 | DALMENY
2 | DUNROSSNESS
3 | FETLAR
4 | LERWICK
5 | LINLITHGOW
6 | MID YELL
7 | NORTHMAVINE
8 | UNST

County ID | County
1 | ABERDEEN
2 | ARGYLL
3 | LANARKSHIRE
4 | LINLITHGOWSHIRE
5 | MORAY
6 | PEEBLES
7 | RENFREWSHIRE
8 | SHETLAND

Country ID | Country
1 | CANADA
2 | CEYLON
3 | ENGLAND
4 | IRELAND
5 | NEW ZEALAND
6 | SCOTLAND
7 | WALES
8 | USA

Primary Key (PK): Parish ID, County ID, Country ID

BIRTHPLACE (Junction Table)

PARISH | COUNTY | COUNTRY
1 | 4 | 6
2 | 8 | 6
3 | 8 | 6
4 | 8 | 6
5 | 4 | 6
6 | 8 | 6
7 | 8 | 6
8 | 8 | 6

Foreign Key (FK): every column of the junction table
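How the junction table ties the three lookup tables together can be sketched as follows. This is an illustration under assumptions, not the deck's schema: it uses Python's sqlite3 module instead of MySQL, the lower-case table and column names are invented, and only a few of the rows above are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE parish  (parish_id  INTEGER PRIMARY KEY, parish  TEXT);
CREATE TABLE county  (county_id  INTEGER PRIMARY KEY, county  TEXT);
CREATE TABLE country (country_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE birthplace (
    parish_id  INTEGER REFERENCES parish(parish_id),
    county_id  INTEGER REFERENCES county(county_id),
    country_id INTEGER REFERENCES country(country_id)
);
INSERT INTO parish  VALUES (1, 'DALMENY'), (8, 'UNST');
INSERT INTO county  VALUES (4, 'LINLITHGOWSHIRE'), (8, 'SHETLAND');
INSERT INTO country VALUES (6, 'SCOTLAND');
INSERT INTO birthplace VALUES (1, 4, 6), (8, 8, 6);
""")

def birthplace_for(parish_id):
    """Rebuild the full birthplace text from the three foreign keys."""
    row = conn.execute(
        "SELECT p.parish, c.county, k.country FROM birthplace b "
        "JOIN parish p ON p.parish_id = b.parish_id "
        "JOIN county c ON c.county_id = b.county_id "
        "JOIN country k ON k.country_id = b.country_id "
        "WHERE b.parish_id = ?",
        (parish_id,),
    ).fetchone()
    return ", ".join(row)

print(birthplace_for(8))
```

Each name is stored once, and the original comma-separated text can still be rebuilt with a join when it is needed.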

Individual: PK ID No., Name, Relate to HH, M. Stat, Gender, Age, FK Parish ID, FK County ID, FK Country ID

Occupation: PK Occ ID, Occupation, FK ID No.

Parish: PK Parish ID, Parish

County: PK County ID, County

Country: PK Country ID, Country

Birthplace: FK Parish ID, FK County ID, FK Country ID

2NF: “Sub-entities”

Repeated bits of information should now be moved into their own tables.

Individual

ID No. | Name | Relate to HH | M. Stat | Age | Gender | Parish ID | County ID | Country ID
337622 | Thomas Eunson | HEAD | MAR | 49 | Male | 2 | 8 | 6
337623 | Elizabeth Eunson | WIFE | MAR | 51 | Female | 2 | 8 | 6
337624 | John Eunson | SON | UNM | 23 | Male | 2 | 8 | 6
337625 | Jane Eunson | DAUG | UNM | 22 | Female | 2 | 8 | 6
337626 | Thomas Eunson | SON | UNM | 17 | Male | 2 | 8 | 6
337627 | Adam Eunson | SON | | 14 | Male | 2 | 8 | 6
337628 | Margaret Eunson | DAUG | | 11 | Female | 2 | 8 | 6
337629 | Janet Eunson | DAUG | | 6 | Female | 2 | 8 | 6
337802 | (Revd) James Doull | HEAD | MAR | 39 | M | | | 2
337803 | Jane Doull | WIFE | MAR | 49 | F | 4 | 8 | 6
337804 | Mary Struthers | STEP-DAUR | UNM | 19 | F | 1 | 4 | 6
337805 | George Struthers | STEP-SON | | 14 | M | | | 5
337806 | Euphemia W Daull | DAUR | | 8 MO | F | 3 | 8 | 6
337864 | Ann Leisk | HEAD | MARRIED | 57 | F | 8 | 8 | 6
337865 | Janet Leisk | DAUGHTER | MARRIED | 30 | F | 6 | 8 | 6
337866 | Ann Leask | DAUGHTER | | 13 | F | 4 | 8 | 6
337867 | Jessie A Sinclair | GRAND CHILD | | 2 | F | 4 | 8 | 6

The information repeated in the Relate to HH, M. Stat and Gender columns moves into new tables.
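Replacing the repeated text with codes also irons out spelling variants such as DAUG, DAUR and DAUGHTER. The mapping below is a sketch: the numeric codes are the ones used in this deck's coded Individual table and Gender table (son and daughter share 301), but the dictionaries and the encode helper are hypothetical names, and treating an unrecognised gender as 3 (Unknown) is an assumption.

```python
RELATE_CODES = {"HEAD": 101, "WIFE": 201, "SON": 301, "DAUR": 301,
                "STEP-SON": 303, "STEP-DAUR": 303, "GRAND CHILD": 901}
# Spelling variants seen in the table, mapped to one canonical label.
RELATE_VARIANTS = {"DAUG": "DAUR", "DAUGHTER": "DAUR"}
MSTAT_CODES = {"MAR": 1, "MARRIED": 1, "UNM": 6}
GENDER_CODES = {"M": 1, "MALE": 1, "F": 2, "FEMALE": 2}

def encode(relate, mstat, gender):
    """Return (relate code, marital status code or None, gender code)."""
    relate = relate.upper()
    relate = RELATE_VARIANTS.get(relate, relate)
    return (
        RELATE_CODES[relate],
        MSTAT_CODES.get(mstat.upper()) if mstat else None,
        GENDER_CODES.get(gender.upper(), 3),  # 3 = Unknown
    )

print(encode("DAUGHTER", "MARRIED", "F"))
print(encode("SON", "", "Male"))
```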

Individual: PK ID No., Name, FK Relate to HH, FK M. Stat, FK Gender, Age, FK Parish ID, FK County ID, FK Country ID

Occupation: PK Occ ID, Occupation, FK ID No.

Parish: PK Parish ID, Parish

County: PK County ID, County

Country: PK Country ID, Country

Birthplace: FK Parish ID, FK County ID, FK Country ID

Relate to HH: PK HH ID, Relate to HH

M. Stat: PK M. Stat ID, M. Stat

Gender: PK Gender ID, Gender

Individual

ID No. | Name | Relate to HH | M. Stat | Age | Gender | Parish ID | County ID | Country ID
337622 | Thomas Eunson | 101 | 1 | 49 | 1 | 2 | 8 | 6
337623 | Elizabeth Eunson | 201 | 1 | 51 | 2 | 2 | 8 | 6
337624 | John Eunson | 301 | 6 | 23 | 1 | 2 | 8 | 6
337625 | Jane Eunson | 301 | 6 | 22 | 2 | 2 | 8 | 6
337626 | Thomas Eunson | 301 | 6 | 17 | 1 | 2 | 8 | 6
337627 | Adam Eunson | 301 | | 14 | 1 | 2 | 8 | 6
337628 | Margaret Eunson | 301 | | 11 | 2 | 2 | 8 | 6
337629 | Janet Eunson | 301 | | 6 | 2 | 2 | 8 | 6
337802 | (Revd) James Doull | 101 | 1 | 39 | 1 | | | 2
337803 | Jane Doull | 201 | 1 | 49 | 2 | 4 | 8 | 6
337804 | Mary Struthers | 303 | 6 | 19 | 2 | 1 | 4 | 6
337805 | George Struthers | 303 | | 14 | 1 | | | 5
337806 | Euphemia W Daull | 301 | | 8 MO | 2 | 3 | 8 | 6
337864 | Ann Leisk | 101 | 1 | 57 | 2 | 8 | 8 | 6
337865 | Janet Leisk | 301 | 1 | 30 | 2 | 6 | 8 | 6
337866 | Ann Leask | 301 | | 13 | 2 | 4 | 8 | 6
337867 | Jessie A Sinclair | 901 | | 2 | 2 | 4 | 8 | 6

3NF

1. If we delete a row, will we lose any data that other records might need?

Individual

ID No. | Name | Relate to HH | M. Stat | Age | Gender | Parish ID | County ID | Country ID
337802 | (Revd) James Doull | 101 | 1 | 39 | 1 | | | 2
337803 | Jane Doull | 201 | 1 | 49 | 2 | 4 | 8 | 6
337804 | Mary Struthers | 303 | 6 | 19 | 2 | 1 | 4 | 6
337805 | George Struthers | 303 | | 14 | 1 | | | 5
337806 | Euphemia W Daull | 301 | | 8 MO | 2 | 3 | 8 | 6

If we delete George Struthers's row, will we lose the information that the code for New Zealand is 5?

Country ID | Country
3 | ENGLAND
4 | IRELAND
5 | NEW ZEALAND

No, because it is entered in the Country table; the code in the Individual table is only a Foreign Key.
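The deletion test can be tried out directly. The following is a minimal sketch using Python's sqlite3 module in place of MySQL; the lower-case table and column names are assumptions, and only the rows needed for the question are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (country_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE individual (
    id_no INTEGER PRIMARY KEY, name TEXT,
    country_id INTEGER REFERENCES country(country_id)
);
INSERT INTO country VALUES (3, 'ENGLAND'), (4, 'IRELAND'), (5, 'NEW ZEALAND');
INSERT INTO individual VALUES (337805, 'George Struthers', 5);
""")

# Delete the only individual who refers to country code 5.
conn.execute("DELETE FROM individual WHERE id_no = 337805")

remaining = conn.execute("SELECT COUNT(*) FROM individual").fetchone()[0]
code_5 = conn.execute(
    "SELECT country FROM country WHERE country_id = 5"
).fetchone()[0]
print(remaining, code_5)
```

The individual is gone, but the meaning of code 5 survives in the Country table.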

3NF

2. If we add a row, could we accidentally make any data entry mistakes that would compromise our database’s integrity?

Individual

ID No. | Name | Relate to HH | M. Stat | Age | Gender | Parish ID | County ID | Country ID
337802 | (Revd) James Doull | 101 | 1 | 39 | 1 | | | 2
337803 | Jane Doull | 201 | 1 | 49 | 2 | 4 | 8 | 6
337804 | Mary Struthers | 303 | 6 | 19 | 2 | 1 | 4 | 6
337805 | George Struthers | 303 | | 14 | 1 | | | 5
337806 | Euphemia W Daull | 301 | | 8 MO | 2 | 3 | 8 | 6
330086 | Andrew Ross | 101 | 1 | 21 | 2 | | | 1

If we accidentally entered incorrect information into a row, would that impact any other part of the database?

Gender ID | Gender
1 | Male
2 | Female
3 | Unknown

No, because even if we code Dr Ross as female, the Gender table still holds the accurate code for male.
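The point can be checked with a foreign key constraint switched on. This is a sketch under assumptions: Python's sqlite3 module stands in for MySQL, the table and column names are invented, and the row for 'Test Person' with gender code 9 is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE gender (gender_id INTEGER PRIMARY KEY, gender TEXT);
CREATE TABLE individual (
    id_no INTEGER PRIMARY KEY, name TEXT,
    gender_id INTEGER REFERENCES gender(gender_id)
);
INSERT INTO gender VALUES (1, 'Male'), (2, 'Female'), (3, 'Unknown');
""")

# A miscoded but valid value is accepted; only this one row is wrong.
conn.execute("INSERT INTO individual VALUES (330086, 'Andrew Ross', 2)")

# A code that does not exist in the gender table is refused outright.
try:
    conn.execute("INSERT INTO individual VALUES (330087, 'Test Person', 9)")
    bad_code_rejected = False
except sqlite3.IntegrityError:
    bad_code_rejected = True

male = conn.execute("SELECT gender FROM gender WHERE gender_id = 1").fetchone()[0]
print(bad_code_rejected, male)
```

The mistake stays confined to one row, and correcting it is a single update of that row's code.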

3NF

3. Have we stored any values that we should calculate instead?

Individual

ID No. | Name | Relate to HH | M. Stat | Age | Gender | Parish ID | County ID | Country ID
337802 | (Revd) James Doull | 101 | 1 | 39 | 1 | | | 2
337803 | Jane Doull | 201 | 1 | 49 | 2 | 4 | 8 | 6
337804 | Mary Struthers | 303 | 6 | 19 | 2 | 1 | 4 | 6
337805 | George Struthers | 303 | | 14 | 1 | | | 5
337806 | Euphemia W Daull | 301 | | 8 MO | 2 | 3 | 8 | 6

In many circumstances we would calculate age because it is dynamic.

In the case of historic data like the census the recorded age is all we have, and it could have been mis-recorded, so we do not calculate the age in this case.
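For contrast, here is what calculating an age looks like when a date of birth is available. This is a small sketch: age_on is a hypothetical helper, and the dates are those of J.A. Ross in the Wound Incident table of this deck, on the assumption that its two-digit years mean 1921 and 1942.

```python
from datetime import date

def age_on(dob, on_date):
    """Whole years between a date of birth and a given date."""
    years = on_date.year - dob.year
    # Subtract one if the birthday has not yet been reached that year.
    if (on_date.month, on_date.day) < (dob.month, dob.day):
        years -= 1
    return years

# Assumes the two-digit years in the table mean 1921 and 1942.
dob = date(1921, 10, 25)
wounded = date(1942, 7, 2)
age_when_wounded = age_on(dob, wounded)
print(age_when_wounded)
```

Storing the date of birth and deriving the age avoids a stored value that goes stale; the census gives no date of birth, which is why its recorded age is kept as it is.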

One other thing

Separate names into separate columns

If, for example, you want to search for the density of a given surname across all of Scotland, you will have trouble doing so unless surnames are separated out from first names.

Individual

ID No. | First Name | Surname | Relate to HH | M. Stat | Age | Gender | Parish ID | County ID | Country ID
337622 | THOMAS | EUNSON | 101 | 1 | 49 | 1 | 2 | 8 | 6
337623 | ELIZABETH | EUNSON | 201 | 1 | 51 | 2 | 2 | 8 | 6
337624 | JOHN | EUNSON | 301 | 6 | 23 | 1 | 2 | 8 | 6
337625 | JANE | EUNSON | 301 | 6 | 22 | 2 | 2 | 8 | 6
337626 | THOMAS | EUNSON | 301 | 6 | 17 | 1 | 2 | 8 | 6
337627 | ADAM | EUNSON | 301 | | 14 | 1 | 2 | 8 | 6
337628 | MARGARET | EUNSON | 301 | | 11 | 2 | 2 | 8 | 6
337629 | JANET | EUNSON | 301 | | 6 | 2 | 2 | 8 | 6
337802 | (REVD) JAMES | DOULL | 101 | 1 | 39 | 1 | | | 2
337803 | JANE | DOULL | 201 | 1 | 49 | 2 | 4 | 8 | 6
337804 | MARY | STRUTHERS | 303 | 6 | 19 | 2 | 1 | 4 | 6
337805 | GEORGE | STRUTHERS | 303 | | 14 | 1 | | | 5
337806 | EUPHEMIA W | DAULL | 301 | | 8 MO | 2 | 3 | 8 | 6
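With the surname in its own column, counting how often each surname occurs is a single grouped query. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, with assumed lower-case column names and a handful of rows from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE individual (id_no INTEGER PRIMARY KEY, "
    "first_name TEXT, surname TEXT)"
)
conn.executemany("INSERT INTO individual VALUES (?, ?, ?)", [
    (337622, "THOMAS", "EUNSON"), (337623, "ELIZABETH", "EUNSON"),
    (337802, "(REVD) JAMES", "DOULL"), (337803, "JANE", "DOULL"),
    (337804, "MARY", "STRUTHERS"), (337806, "EUPHEMIA W", "DAULL"),
])

# Count people per surname, most common first.
surname_counts = conn.execute(
    "SELECT surname, COUNT(*) FROM individual GROUP BY surname "
    "ORDER BY COUNT(*) DESC, surname"
).fetchall()
print(surname_counts)
```

Note that DOULL and DAULL are counted separately, which is the spelling-variation problem again.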

Wound Incident

Wound ID | Soldier | Serial Number | Rank | Brigade | Commander | Regiment | DOB | Wounded | Wound
1 | J.A. Ross | 752054 | Major | 1st Armoured Brigade | Brig. R.C. Keller | 1st King's Dragoon Guards | 25-Oct-21 | 02-Jul-42 | Bullet in left buttock
2 | Daniel Tallman | 721004 | Lieutenant | 1st Armoured Brigade | Brig. R.C. Keller | 3rd Royal Tank Regiment | 04-Aug-14 | 15-Jul-42 | Burn on right thigh
3 | C. Ironfoundersson | 152301 | Second Lieutenant | 1st Armoured Brigade | Brig. R.C. Keller | 1st Royal Tank Regiment | 16-Sep-17 | 16-Jul-42 | Dislocated shoulder
4 | C. W. St. John Nobbs | 1841998 | Corporal | Cairo Cavalry Brigade | Brig. G.N. Todd | 7th Queen's Own Hussars | 25-Apr-10 | 23-Jul-42 | Gash on head, concussion
5 | Daniel Tallman | 721004 | Captain | 4th Armoured Brigade | Brig. R.C. Keller | 3rd Royal Tank Regiment | 04-Aug-14 | 28-Oct-42 | Sprained wrist
6 | C. Ironfoundersson | 152301 | Captain | 4th Armoured Brigade | Brig. R.C. Keller | 6th Royal Tank Regiment | 16-Sep-17 | 28-Oct-42 | Broken ankle
7 | J.A. Ross | 752054 | Captain | 1st Armoured Brigade | Brig. E.C.N. Custance | 1st King's Dragoon Guards | 25-Oct-21 | 22-Nov-42 | Bullet in right buttock

Soldiers from Crikey Village, Wellington County, England who were wounded while in service in the Western Desert Campaign with the 7th Armoured Division