www.uis.unesco.org
Building SDMX Data Structure Definitions based on a generic conceptual model for contents
Experience with the joint Eurostat-Unesco-OECD education statistics questionnaire
Michael Bruneforth, UNESCO Institute for Statistics
Expert Group on SDMX
[email protected] May 10, 2007, UN, Geneva
Overview
The world of international education data collections
Why build a conceptual model
Steps to build the model
The model
From the model towards an SDMX data structure definition
The world of international education data collections
The UNESCO-UIS / OECD / EUROSTAT (UOE) Data Collection on Education Statistics
• Excel-based questionnaire, organized in 31 worksheets
• 47 countries, 14,000+ data points
• Changes: 2003
The World Education Indicators Project (WEI)
• Based on UOE instruments, extended by 10 worksheets
• 16 countries, 15,000+ data points
• Examples at www.uis.unesco.org/publications/wei2006
The UIS Survey
• PDF-based e-questionnaire infrastructure, plus paper form
• All remaining countries, 5,000+ data points
• Examples at www.uis.unesco.org -> current surveys
Instruments used in the system of international education data collections
Instruments used in the system of international education data collections (II)
[Diagram: the UOE, UIS and WEI instruments, with two links labelled "Can be transformed"]
Education Questionnaires: ever changing
1998
• Tables were introduced after ISCED 97 was adopted.
2000
• Redesign of finance tables.
2001 – 2005: ???
2005
• Major redesign: tables redesigned, some tables split or combined.
2006
• In ENRL8a, ENRL8b and ENRL8c, the Caribbean countries are now included with Latin America instead of Northern America.
2007
• In table ENRL-7, three new sub-categories, “unknown residence”, “unknown prior education” and “unknown citizenship”, have been added.
• In ENTR-2, a new row has been added to collect the typical age of entry.
• In GRAD-1 and GRAD-3, a new row has been added to collect the typical graduation age.
Why build a conceptual model?
Meta data
• Theoretical basis for describing data
• Visualization of data
• Validation of codes
Questionnaire design
• Improving internal consistency in questionnaires
• Maintaining the coding schemes:
» Avoiding random or ad-hoc data descriptions leading to inconsistent, incomprehensible systems
(we need discipline as much as a model!)
Why use a conceptual model as a basis for SDMX?
A model describes a universe of questionnaires
• Consistency across questionnaires
• Consistency across tables
• Consistency across statistical units
• Facilitates adaptation of SDMX to changes to tables
» Typically no/few keys need to be changed, most new data can be defined using existing keys
A model can be used to describe indicators and derived data
• SDMX exchange of results (->WorldBank, MDG)
A model can be transformed into/from data base definitions
• Use of existing meta data (efficiency)
• Avoid redundant information (less error prone)
• Basis to match national data to international SDMX definitions
Building the model
Step 1: Bo Sundgren’s analysis of the UIS Questionnaire
Step 2: Analysis of the relational database at UIS
Step 3: Correction / Expansion of Bo’s model
Step 4: Model verification 1: review of UOE questionnaires
Step 5a: Model verification 2: transformation of the UIS database model to the conceptual model, automated creation of full code list
Step 5b: Model verification 2: analysis of the relational database at OECD
Step 6: Creation of data structure definition based on existing meta data
Student
.count .partTimeFraction.sum x age x sex x origin (array) x previousEducation
is enrolled in -> Enrolment
- partTimeFraction x workmode x repeater x completer x entrant x (adjustment)
for -> ProgramExecution (Class, pedagogical unit, …)
.count x adult x grade x ISCED.field x location
of -> EducationProgram (Utility)
- startingAge - duration - Name x ISCED.level x ISCED.orientation x ISCED.destination x ISCED.degreePos
provided by -> Institutions (education provider; non-instructional institutions; …)
x sector x InstType
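The core of the model can be written down as plain data. The following is a minimal sketch, not the UIS implementation: the class names Entity and Relation and the helper all_dimensions are hypothetical, the "(adjustment)" breakdown is left out, and the target of "provides" is assumed to be ProgramExecution because the flattened diagram does not show it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """One box of the conceptual model (hypothetical representation)."""
    name: str
    measures: tuple = ()    # values that can be reported, e.g. "count"
    dimensions: tuple = ()  # breakdowns, written "x ..." on the slide


@dataclass(frozen=True)
class Relation:
    """A labelled link between two boxes."""
    source: str
    label: str
    target: str


ENTITIES = {
    e.name: e
    for e in (
        Entity("Student", ("count", "partTimeFraction.sum"),
               ("age", "sex", "origin", "previousEducation")),
        Entity("Enrolment", (),
               ("workmode", "repeater", "completer", "entrant")),
        Entity("ProgramExecution", ("count",),
               ("adult", "grade", "ISCED.field", "location")),
        Entity("Institutions", (), ("sector", "InstType")),
        Entity("EducationProgram", (),
               ("ISCED.level", "ISCED.orientation",
                "ISCED.destination", "ISCED.degreePos")),
    )
}

RELATIONS = (
    Relation("Student", "is enrolled in", "Enrolment"),
    Relation("Enrolment", "for", "ProgramExecution"),
    Relation("ProgramExecution", "of", "EducationProgram"),
    # Assumption: the slide text does not show which box "provides" points to.
    Relation("Institutions", "provides", "ProgramExecution"),
)


def all_dimensions():
    """Every breakdown in the model, qualified by the box it belongs to."""
    return [f"{e.name}.{d}" for e in ENTITIES.values() for d in e.dimensions]
```

Calling all_dimensions() gives the 18 qualified breakdowns from which the descriptions of individual data points are built.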
Example 1: Students and Repeater
ENRL3: NUMBER OF STUDENTS AND REPEATERS (ISC123) IN GENERAL PROGRAMMES BY LEVEL OF EDUCATION, SEX AND GRADE
Columns 1-6: PRIMARY (ISC 1), all educational programmes; LOWER SECONDARY (ISC 2), all general programmes; UPPER SECONDARY (ISC 3), all general programmes; listed once under "Number of students" and once under "Number of repeaters"
TOTAL PUBLIC AND PRIVATE INSTITUTIONS, TOTAL FULL-TIME AND PART-TIME
Total males and females, grade groups
A1 Total: All grade groups (within ISC-Level) (A2toA12) X
A2 Grade 1 (within ISC-Level) (A14+A26)
A3 Grade 2 (within ISC-Level) (A15+A27)
A4 Grade 3 (within ISC-Level) (A16+A28)
A5 Grade 4 (within ISC-Level) (A17+A29)
A6 Grade 5 (within ISC-Level) (A18+A30)
A7 Grade 6 (within ISC-Level) (A19+A31)
A8 Grade 7 (within ISC-Level) (A20+A32)
A9 Grade 8 (within ISC-Level) (A21+A33)
A10 Grade 9 (within ISC-Level) (A22+A34)
A11 Grade 10 (within ISC-Level) (A23+A35)
A12 Grade unknown (within ISC-Level) (A24+A36)
Males, grade groups
A13 Total: All grade groups (within ISC-Level) (A14toA24)
A14 Grade 1 (within ISC-Level)
A15 Grade 2 (within ISC-Level) Y
A16 Grade 3 (within ISC-Level)
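The bracketed expressions in the row labels, such as (A2toA12) and (A14+A26), state which rows should add up to the row in question. The sketch below shows one way such a check could be evaluated. The function names are hypothetical, all values are assumed to be plain numbers (missing value codes are not handled), and the figures in the usage note are invented.

```python
import re


def referenced_rows(formula):
    """Expand a check formula such as 'A2toA12' or 'A14+A26' into row labels."""
    match = re.fullmatch(r"([A-Z])(\d+)to\1(\d+)", formula)
    if match:
        letter = match.group(1)
        first, last = int(match.group(2)), int(match.group(3))
        return [f"{letter}{n}" for n in range(first, last + 1)]
    return formula.split("+")


def check_row(values, row, formula):
    """Return True when the row equals the sum of the rows named in its formula."""
    return values[row] == sum(values[r] for r in referenced_rows(formula))
```

With invented figures, check_row({"A2": 100, "A14": 52, "A26": 48}, "A2", "A14+A26") holds, while changing A26 to 47 makes the check fail.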
Student
.count .partTimeFraction.sum x age x sex x origin (array) x previousEducation
Enrolment
- partTimeFraction x workmode x repeater x completer x entrant x (adjustment)
ProgramExecution(Class, pedagogical unit, …)
.count x adult x grade x ISCED.field x location
Institutions(education provider; non-instructional institutions; …)
x sector x InstType
is enrolled in
for
of
provides
EducationProgram(Utility)
- startingAge - duration - Name
x ISC.level: 2 x ISC.orientation: General x ISCED.destination x ISCED.degreePos
Count of lower secondary general students
Student
.count .partTimeFraction.sum x age
x sex: Male x origin (array) x previousEducation
Enrolment
- partTimeFraction x workmode
x repeater: Yes x completer x entrant x (adjustment)
ProgramExecution(Class, pedagogical unit, …)
.count x adult
x grade: 2 x ISCED.field x location
Institutions(education provider; non-instructional institutions; …)
x sector x InstType
EducationProgram(Utility)
- startingAge - duration - Name
x ISC.level: 2 x ISC.orientation: General x ISCED.destination x ISCED.degreePos
is enrolled in
for
of
provides
Count of male lower secondary general repeaters in grade 2
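The two cells of Example 1 differ only in the breakdowns they fix. A minimal sketch of that idea follows. It assumes that every breakdown not mentioned defaults to "Total", and it uses the ISCED.level spelling of the generic model. The function data_point and the qualified names are hypothetical.

```python
DIMENSIONS = (
    "Student.age", "Student.sex", "Student.origin", "Student.previousEducation",
    "Enrolment.workmode", "Enrolment.repeater", "Enrolment.completer",
    "Enrolment.entrant",
    "ProgramExecution.adult", "ProgramExecution.grade",
    "ProgramExecution.ISCED.field", "ProgramExecution.location",
    "EducationProgram.ISCED.level", "EducationProgram.ISCED.orientation",
    "EducationProgram.ISCED.destination", "EducationProgram.ISCED.degreePos",
    "Institutions.sector", "Institutions.InstType",
)


def data_point(measure, selection):
    """Describe one cell: a measure plus a value for every breakdown."""
    unknown = sorted(set(selection) - set(DIMENSIONS))
    if unknown:
        raise KeyError(f"not in the model: {unknown}")
    point = {"measure": measure}
    for dim in DIMENSIONS:
        # Assumption: breakdowns that are not selected default to "Total".
        point[dim] = selection.get(dim, "Total")
    return point


LOWER_SECONDARY_GENERAL = {
    "EducationProgram.ISCED.level": "2",
    "EducationProgram.ISCED.orientation": "General",
}

# Count of lower secondary general students
cell_x = data_point("Student.count", LOWER_SECONDARY_GENERAL)

# Count of male lower secondary general repeaters in grade 2
cell_y = data_point("Student.count", {
    **LOWER_SECONDARY_GENERAL,
    "Student.sex": "Male",
    "Enrolment.repeater": "Yes",
    "ProgramExecution.grade": "2",
})

changed = sorted(d for d in DIMENSIONS if cell_x[d] != cell_y[d])
```

Here changed lists Enrolment.repeater, ProgramExecution.grade and Student.sex: the second cell needs no new breakdowns, only different values for existing ones.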
Example 2: Students and Classes
CLASS1: AVERAGE CLASS SIZE BY LEVEL OF EDUCATION AND BY TYPE OF INSTITUTIONS
Column 1: PRIMARY EDUCATION (ISC 1), all regular programmes
Column 2: LOWER SECONDARY SCHOOLS (ISC 2), all general programmes
TOTAL: Public and private institutions
A1 Average class size
A2 Number of students
A3 Number of classes
Public institutions
A4 Average class size
A5 Number of students X
A6 Number of classes Y
Count of lower secondary general classes and students, class size
Student
.count .partTimeFraction.sum x age x sex x origin (array) x previousEducation
Enrolment
- partTimeFraction x workmode x repeater x completer x entrant x (adjustment)
ProgramExecution(Class, pedagogical unit, …)
.count x adult x grade x ISCED.field x location
Institutions(education provider; non-instructional institutions; …)
x sector x InstType
is enrolled in
for
of
provides
EducationProgram(Utility)
- startingAge - duration - Name
x ISC.level: 2 x ISC.orientation: General x ISCED.destination x ISCED.degreePos
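In the model, the number of students is Student.count and the number of classes is ProgramExecution.count for the same selection, so the average class size is a derived value. A minimal sketch, assuming the usual definition (students divided by classes); the function name is hypothetical and the figures in the usage note are invented.

```python
def average_class_size(number_of_students, number_of_classes):
    """Derive the average class size from the two collected counts.

    Returns None when a count is missing or when there are no classes.
    """
    if number_of_students is None or number_of_classes is None:
        return None
    if number_of_classes == 0:
        return None
    return number_of_students / number_of_classes
```

For example, 480 students in 20 classes give an average class size of 24.0.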
Example 3: New entrants
ANNUAL INTAKE BY LEVEL OF EDUCATION AND PROGRAMME DESTINATION
TOTAL PUBLIC AND PRIVATE INSTITUTIONS, TOTAL FULL-TIME AND PART-TIME, Total males and females
Columns 1-5: UPPER SECONDARY (ISCED 3); POST-SECONDARY NON-TERTIARY (ISCED 4); TERTIARY (ISCED 5A, ISCED 5B, ISCED 6)
Enrolment
A1 Total number of students enrolled (ENRL1, row A1) (A2+A3+A4)
Of which:
A2 New entrants (B1)
A3 Re-entrants
A4 Continuing students
New entrants
B1 New entrants (from A2) (B2+B3)
Of which:
B2 With previous education at the other tertiary level
B3 Without any previous education at the tertiary level
Count of new entrants to tertiary 5B with previous tertiary education
Student
.count .partTimeFraction.sum x age x sex x origin (array)
x previousEducation: 5A
Enrolment
- partTimeFraction x workmode x repeater x completer
x entrant: New entrant x (adjustment)
ProgramExecution(Class, pedagogical unit, …)
.count x adult x grade x ISCED.field x location
Institutions(education provider; non-instructional institutions; …)
x sector x InstType
is enrolled in
for
of
provides
EducationProgram(Utility)
- startingAge - duration - Name
x ISC.level: 5 x ISC.orientation: General
x ISCED.destination: B x ISCED.degreePos
Student
.count .partTimeFraction.sum x age x sex x origin (array) x previousEducation
Enrolment
- partTimeFraction x workmode x repeater x completer x entrant x (adjustment)
EducationStaff(Teacher, …)
.partTimeFraction.sum x age x sex x training
Engagement
.count - partTimeFraction x workmode x engagementType
Institutions(education provider; non-instructional institutions; …)
x sector x InstType
EducationProgram(Utility)
- startingAge - duration - Name x ISCED.level x ISCED.orientation x ISCED.destination x ISCED.degreePos
EducationSystem(Utility)
- compulsoryEducationStart - compulsoryEducationEnd - academicYearBeginn - academicYearEnd - financialYearBeginn - country - currency
Funder(Governments, private entities, …)
x sector
Expenditure
.amount.sum x nature
is enrolled in
belongs to
for
isEngagedIn
for
of
spends on
belongs to
transfers to/spends on
receives transfer
receives transfer
ProgramExecution(Class, pedagogical unit, …)
.count x adult x grade x location
provides
Institutions(education provider; non-instructional institutions; …)
x sector x InstType
Funder(Governments, private entities, …)
x sector
Expenditure
.amount.sum x nature
Households
spends on
transfers to/spends on
receives transfer
receives transfer
receives transfer
transfers to/spends on
Principles for generating the detailed model for individual data points
Use existing meta data
Avoid capturing questionnaire information more than once
Ensure consistency with existing systems
Generate the detailed model for individual data points
Cell ID: 1052108025 (ENTR1-B2:4, Version 2005 to 2099)
Object: Student: Count
-age: Total -sex: Female -origin: All -previousEducation: 5A
enrolled in -> Object: Enrolment
-workmode: Total -repeater: NO -completer: NO -entrant: First time entrant
for -> Object: ProgramExecution
-adult: Total -grade: Total -ISCED.field: Total -location: Total
of -> Object: EducationProgram
-ISCED.level: 5 -ISCED.orientation: Total -ISCED.destination: B -ISCED.degreePos: First
provided by -> Object: Institution
-sector: Total -InstType: Instructional
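A cell description like this one can be flattened into a single key by joining the values of all breakdowns in a fixed order, which is the general shape of an SDMX series key. The sketch below is an illustration only: the order of the breakdowns, the qualified names and the way labels are turned into codes are assumptions, not the codes used in the UIS data structure definition.

```python
import re

# Values copied from the description of cell 1052108025 (ENTR1-B2:4).
CELL = {
    "Student.age": "Total",
    "Student.sex": "Female",
    "Student.origin": "All",
    "Student.previousEducation": "5A",
    "Enrolment.workmode": "Total",
    "Enrolment.repeater": "NO",
    "Enrolment.completer": "NO",
    "Enrolment.entrant": "First time entrant",
    "ProgramExecution.adult": "Total",
    "ProgramExecution.grade": "Total",
    "ProgramExecution.ISCED.field": "Total",
    "ProgramExecution.location": "Total",
    "EducationProgram.ISCED.level": "5",
    "EducationProgram.ISCED.orientation": "Total",
    "EducationProgram.ISCED.destination": "B",
    "EducationProgram.ISCED.degreePos": "First",
    "Institution.sector": "Total",
    "Institution.InstType": "Instructional",
}

# Assumption: the key follows the order in which the breakdowns are listed.
KEY_ORDER = tuple(CELL)


def code(label):
    """Turn a label into an upper-case code without spaces (hypothetical rule)."""
    return re.sub(r"[^A-Z0-9]+", "_", label.upper()).strip("_")


def series_key(cell, order=KEY_ORDER):
    """Join the coded values of all breakdowns with dots."""
    return ".".join(code(cell[dim]) for dim in order)
```

Because the key is derived from the existing meta data, the same function applies to every cell of every table, and nothing has to be typed in a second time.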
The basis: the UIS meta data (relational database description)
Example: UIS meta data (relational database codes, XML version)
What is needed beyond the model to get a complete data structure definition?
The data structure definition has to cope with data points that are collected more than once.
• The total number of primary students is collected in ENRL1a, ENRL1, ENRL3, ENRL4 and CLASS1.
The data structure definition has to cope with adjustments to the coverage of data.
• The count of students is collected with coverage adjusted to that of the expenditure data.
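Once every cell is described by the model, data points that are collected more than once show up as cells sharing the same description. The sketch below only detects such groups and does not decide how the data structure definition should handle them. The function name and the key string are hypothetical; the table names are those listed above.

```python
from collections import defaultdict


def collected_more_than_once(cells):
    """Group (table, key) pairs by key and keep keys found in several tables."""
    tables_by_key = defaultdict(list)
    for table, key in cells:
        tables_by_key[key].append(table)
    return {key: tables for key, tables in tables_by_key.items()
            if len(tables) > 1}


# Hypothetical key standing for "total number of primary students".
PRIMARY_TOTAL = "STUDENT.COUNT.ISCED1.TOTAL"

CELLS = [
    ("ENRL1a", PRIMARY_TOTAL),
    ("ENRL1", PRIMARY_TOTAL),
    ("ENRL3", PRIMARY_TOTAL),
    ("ENRL4", PRIMARY_TOTAL),
    ("CLASS1", PRIMARY_TOTAL),
    ("ENRL3", "STUDENT.COUNT.ISCED2.TOTAL"),  # collected once in this sample
]

duplicates = collected_more_than_once(CELLS)
```

In this sample, duplicates holds one entry: the primary total, with the five tables in which it is collected.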
Questions, comments?
Education content: Michael Bruneforth ([email protected])
IT: Brian Buffett ([email protected])
Thanks