View
40
Download
4
Category
Tags:
Preview:
DESCRIPTION
Statistical databases in theory and practice Part III: Designing statistical databases. Bo Sundgren 2008-02-11. Conceptual data model and relational data model in normalised form. Concept modelling. Define concepts and relations between them Conceptual models and data models - PowerPoint PPT Presentation
Citation preview
Statistical databases in theory and practice
Part III: Designing statistical databases
Bo Sundgren
2008-02-11
Conceptual data model and relational data model in normalised form.
PERSON
IdentifierHouseholdIdentifier*SexAgeEducationOccupationIncomeWealthHealth
ESTABLISHMENT
EstablishmentIdentifierOrganisationIdentifier*LocationKindOfActivityNumberOfEmployeesNetProfit
ORGANISATION
OrganisationIdentifierLocationOfHQ
BELONGS TO
HOUSEHOLD
IdentifierDwellingIdentifier*SizeStructureIncome
BELONGS TO
DWELLING
IdentifierLocationSizeStandardRent
LIVES IN
MIGRATIONEVENT
IdentifierPersonIdentifier*FromDwellingId*ToDwellingId*Time
OF
FROM TO
PERSONESTABLISH-
MENT
ORGANISATION
BELONGS TO
HOUSEHOLD
BELONGS TO
DWELLINGLIVES IN
MIGRATIONEVENT
OF
FROM TO
WORKS AT
EMPLOYMENT
EMPLOYMENT
PersonIdentifier*EstablishmentIdentifier*PercentOfFullTimeSalary
Concept modelling
• Define concepts and relations between them• Conceptual models and data models• Visualise models graphically
Rent-A-Video: first object graph
VideoFilm CustomerIsRentedBy
Rents
Rent-A-Video: elaborated object graph
VideoFilm CustomerIsRentedBy
Rents
FilmTitle
Rep
rese
nts
IsRep
resented
By
FilmId
Title
Category
Price
Actor*
Story
NumberOfCopies=
NumberOfRents=
FilmId
CopyNr
Rented?
NumberOfRents
CustomerId
Name
Address
Discount
Rental
CustomerId
FilmId
CopyNr
RentalNr
RentalDate
AgreedReturnDate
Returned?
ActualReturnDate
Rent-A-Video: further aspects
VideoFilm CustomerIsRentedBy
Rents
FilmTitle
Rep
rese
nts
IsRep
resented
By
IsRentedBy
Ren
ts
IsReservedBy
Res
erve
s
Relations between two object types
• one-to-one, symbolised by “arrow-to-arrow”
• one-to-many, symbolised by “arrow-to-fork”
• many-to-one, symbolised by “fork-to-arrow”
• many-to-many, symbolised by “fork-to-fork”
Note: The relation is usually not a flow relation!
(But you should tell what kind of relation it is.)
Object graphs: another example
PERSON
PersonId
HOUSEHOLD
HouseholdId
NumberOfPersons=
Income=
Sex
CO
NS
IST
S O
F
BE
LO
NG
S T
O
Income
Age
HomeMunicipality
PostalCode
HighestEducation
IS FATHER OF
IS MOTHER OF
Concept modelling: Exercises
• B2B– The customers of companies are companies– Companies have employees (persons)
• B2C– The customers of companies are consumers (persons)– Companies have employees (persons)
• B2B+B2C– The customers of companies are companies or
consumers (persons)– Companies have employees (persons)
Hint: There are two basic object types, COMPANY and PERSON in all three examples
Different roles of concept modelling
• Clarifying a small number of related concepts• Information model for an application
– defining meaning– basis for data design
• Corporate information model– for more efficient communication between people– basis for system integration
Concept model ---> Data model
FilmCopy CustomerIsRentedBy
Rents
FilmTitle
Re
pre
se
nts
IsR
ep
res
en
ted
By
FilmId
Title
Category
Price
Actor*
Story
NumberOfCopies=
NumberOfRents=
FilmId
CopyNr
Rented?
NumberOfRents
CustomerId
Name
Address
Discount
Rental
Rental date
AgreedReturnDate
Returned?
ActualReturnDate
CustomerId
FilmId
CopyNr
RentalNr
CopyNr Rented?FilmId
FilmId CopyNr CustomerId RentalNrRentals
FilmCopies
Name AddressCustomerIdCustomers
NumberOfRents
RentalDate
AgreedReturnDate
Returned?Actual
ReturnDate
Discount
FilmId Title Category Price StoryAgreed
ReturnDateReturned?
ActualReturnDate
ActorNameFilmIdActorsInFilms ActorsRoleInFilm
FilmTitles
CopyNr Rented?FilmId
FilmId CopyNr CustomerId RentalNrRentals
FilmCopies
Name AddressCustomerIdCustomers
NumberOfRents
RentalDate
AgreedReturnDate
Returned?Actual
ReturnDate
Discount
FilmId Title Category Price StoryAgreed
ReturnDateReturned?
ActualReturnDate
ActorNameFilmIdActorsInFilms ActorsRoleInFilm
FilmTitles
Concept model ---> Star/cube model
FilmCopy CustomerIsRentedBy
Rents
FilmTitle
Re
pre
se
nts
IsR
ep
res
en
ted
By
FilmId
Title
Category
Price
Actor*
Story
NumberOfCopies=
NumberOfRents=
FilmId
CopyNr
Rented?
NumberOfRents
CustomerId
Name
Address
Discount
Rental
Rental date
AgreedReturnDate
Returned?
ActualReturnDate
CustomerId
FilmId
CopyNr
RentalNr
OBJECT IN FOCUS
FilmTitle
Customer
PriceGroup
NumberOfRentsPerCopy
FilmId
Title
Category
NumberOfRents
CustomerId
Category
Area
Rental
Rental date
Delayed?
CustomerId
FilmId
RentalNr
FILMCATEGORY
CUSTOMER DIS-COUNT CATEGORY
Number of rentals of film copiesduring the year t by customer
discount category and film category
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
Star model for Data Warehouse
OBJECT IN FOCUS
FilmTitle
Customer
PriceGroup
NumberOfRentsPerCopy
FilmId
Title
Category
NumberOfRents
CustomerId
Category
Area
Rental
Rental date
Delayed?
CustomerId
FilmId
RentalNr
Multidimensional model (cube model)
FILMCATEGORY
CUSTOMER DIS-COUNT CATEGORY
Number of rentals of film copiesduring the year t by customer
discount category and film category
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
number ofrentals
Modelling the contents and structure of official statistics
Or: How to design ”correct” and globally consistent SDMX Data Structure Definitions
Or: Navigating in a space of statistical surveys of society
Or: Reality as a statistical construction
Bo Sundgren, Statistics SwedenICES-III, Montreal, June 18-21, 2007
What can a statistical agency do, in order to help a user - find potentially relevant statistical data? - judge the relevance of data retrieved?
• Provide overviews of available data
• Provide search tools
• Provide informative metadata
Conceptual navigation: contents exploration and searching for statistics
• A conceptual model of societyas reflected by official statistics
SOCIO-ECONOM IC PROCESSES,SUBPROCESSES, ACTIVITIES
ACTORSwith properties
- persons, households- organisations, institutions- enterprises ,establishm ents- ...
involvem entin different
roles
UTILITIES with properties
- resources: real, financial- products: com m odities , services , inform ation, ...- assets and liabilities- ...
SOCIETY
ACTORLIFE HISTORIES
(actor eigenprocesses)SUBHISTORIES,
”CASES”
UTILITYLIFE HISTORIES
(utility eigenprocesses)SUBHISTORIES,
”CASES”
BY AREA:
- agriculture- manufacturing- service industr- trade- transports- energy- construction- education- R&D- financial- information- culture&leisure- health- social services- judicial services- ...
BY SECTOR:
- public . central . regional . local- private . business . household
involvem entin different
roles
BY FUNCTION:
- production- consumption- investment- order/delivery- sales- supply- employment- environment side-effects- ...
SOCIO-ECONOMIC PROCESSES,SUBPROCESSES, ACTIVITIES
ACTORSwith properties
- persons, households- organisations, institutions- enterprises,establishments- ...
involvementin different
roles
UTILITIES with properties
- resources: real, financial- products: commodities, services, information, ...- assets and liabilities- ...
SOCIETY
ACTORLIFE HISTORIES
(actor eigenprocesses)SUBHISTORIES,
”CASES”
UTILITYLIFE HISTORIES
(utility eigenprocesses)SUBHISTORIES,
”CASES”
countable/measurable
COMPLEXOBJECTS
with properties
events, transactions,relationships, ”cases”, ...
DATA COLLECTIONAND AGGREGATION
PROCESSES
- done by respondents- done by agency
STATISTICAL DATA(micro, macro)
organised in multidimensionalstructures:
balance sheets,crosstabulations, cubes, etc
FURTHERPROCESSING
ANALYTICALPRODUCTS
BY AREA:
- agriculture- manufacturing- service industr- trade- transports- energy- construction- education- R&D- financial- information- culture&leisure- health- social services- judicial services- ...
BY SECTOR:
- public . central . regional . local- private . business . household
involvementin different
roles
BY FUNCTION:
- production- consumption- investment- order/delivery- sales- supply- employment- environment side-effects- ...
OBJECTRELATIONS
OBJECTRELATIONS
countable/measurable
BASIC OBJECTS(ACTORS)
with properties
countable/measurable
BASIC OBJECTS(UTILITIES)
with properties
Statistics Canada: Agents, Events, Things
Contents By Example (based on a simple generic model)
PERSONVARIABLE
VARIABLE
VARIABLE
VARIABLE
VARIABLE
x
m
>0
x
ORGANISATIONVARIABLE
VARIABLE
VARIABLE
VARIABLE
VARIABLE
x
p
<5
x
RESOURCEVARIABLE
VARIABLE
VARIABLE
VARIABLE
VARIABLE
g
PRODUCTVARIABLE
VARIABLE
VARIABLE
VARIABLE
VARIABLE
x
ACTIVITYVARIABLE
VARIABLE
VARIABLE
EVENTVARIABLE
VARIABLE
VARIABLE
RELATIONVARIABLE
VARIABLE
VARIABLE
x x
xx
Actors Utilities
Complexobjects
Everything ”clickable”OBJECT
VARIABLE
Lefthand click Righthand click
Select:- object- variable
Retrieve metadata:- definition- value set, classification- questionnaire- quality declaration- survey documentation
ProgramExecution.countunique(Provider.Id,
Program.Type,Program.Level,
Program.Orientation)
EducationProvider(Institution)
x Sector (Public/Private)
Teacher
x Sex
Provides
TeacherEngagement.count
x TeacherEdStatusx PartTimeStatus- PartTimeFraction.sum
EducationSystem(Utility)
- Country- Currency- CompulsoryEdBegAge- CompulsoryEdEndAge- CompulsoryEdLength- AcadYearBegMonth- AcadYearEndMonth
IsEngagedIn
EducationProgram(Utility)
- Name - Year - EntranceAge - Duration x Type x Level (ISCED97) x Grade x Orientation x PositionInDegreeStructure x FieldOfEducation
BelongsTo
Of
Pupil
x Sexx Agex CountryOfOriginx AttendedPrePrimary
PupilEnrolment.count
x PartTimeStatusx Repeaterx Completer/DropOutx CumulatedTime- PartTimeFraction.sum
IsEnrolledIn
Expenditure
x EducationalStatusx Sourcex Nature- Amount.sum
Funder(Actor)
x Sector (Public/Private/...)
Pays For
For
For
LEGEND:
one-to-many relationship
many-to-one relationship
one-to-one relationship
many-to-many relationship
x Variable: indicates that the ”Variable” variable has a classifying role
Object.count – indicates that ”Object” objects are counted
Variable.sum – indicates that the ”Variable” variable is summarised
reading direction
For
UNESCOmodelversion 1(to be revised)
Recommended