ESCWA SDMX Workshop Session: SDMX and Data


ESCWA SDMX Workshop

Session: SDMX and Data

Session Objectives

• At the end of this session you will:
– Know the SDMX model of a data structure definition
– Understand the techniques to identify the structure of data
– Identify the concepts in a simple data set
– Be able to develop simple data structure definitions using SDMX tools

Data Set

Data Set: Structure

Data Set Structure

• Computers need to know the structure of data in terms of:
– Concepts
– Code Lists
– Dimensionality
– Additional metadata

First: Identify the Concepts

• A concept is a unit of knowledge created by a unique combination of characteristics (SDMX Information Model)

Unit Multiplier

Unit

Topic

Time/Frequency

Country

Stock/Flow

Data Set Structure: Concepts

Data Set Structure: Code Lists

Code Lists

TOPIC
A Brady Bonds
B Bank Loans
C Debt Securities

COUNTRY
AR Argentina
MX Mexico
ZA South Africa

STOCK/FLOW
1 Stock
2 Flow

CONCEPTS

Topic

Country

Flow

Concepts

16457

Q,ZA,B,1,1999-06-30=16457

Data Makes Sense

Data Set Structure: Defining Multi-dimensional Structures

• Comprises
– Concepts that identify the observation value (Dimensions)
– Concepts that add additional metadata about the observation value (Attributes)
– Concept that is the observation value (Measure)
– Any of these may be coded, text, date/time, number, etc. (Representation)

Data Set Structure: Concept Usage

Unit Multiplier (Attribute)
Unit (Attribute)
Topic (Dimension)
Time/Frequency (Dimension)(Dimension)
Country (Dimension)
Stock/Flow (Dimension)
Observation (Measure)

Data Structure Definition

[Diagram: Data Structure Definition model]
• A Data Structure Definition comprises a Key, a Group Key, Attributes and Measures
– Key (Dimensions): concepts that identify the observation
– Group Key: concepts that identify groups of keys
– Attributes: concepts that add metadata
– Measures: concepts that are observed phenomenon
• Dimensions, Attributes and Measures each take their semantic from a Concept (for example Topic, Country, Flow)
• Each has a format (Representation), which is Coded or Non-coded
• A Coded representation has a code list (for example TOPIC: A Brady Bonds, B Bank Loans, C Debt Securities)

Data Makes Sense

16457

Q,ZA,B,1,1999-06-30=16457

Frequency,Country,Topic,Stock/Flow,Time=Observation

Quarterly, South Africa, Bank Loans, Stocks, 2nd quarter 1999
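The decoding of this key can be sketched in a few lines of Python. This is a minimal illustration, not an SDMX tool: the `CODE_LISTS` and `DIMENSIONS` names and the `decode` helper are hypothetical, and the Frequency entry (Q = Quarterly) is inferred from the decoded text above. The other codes come from the code lists shown in this session.

```python
# Code lists from the session's example (the Frequency entry is inferred
# from the decoded text "Quarterly").
CODE_LISTS = {
    "Frequency": {"Q": "Quarterly"},
    "Country": {"AR": "Argentina", "MX": "Mexico", "ZA": "South Africa"},
    "Topic": {"A": "Brady Bonds", "B": "Bank Loans", "C": "Debt Securities"},
    "Stock/Flow": {"1": "Stock", "2": "Flow"},
}

# Order of the dimensions in the key; Time is non-coded (a date).
DIMENSIONS = ["Frequency", "Country", "Topic", "Stock/Flow", "Time"]


def decode(record: str) -> dict:
    """Split 'key=value' and replace each code with its label."""
    key, value = record.split("=")
    codes = key.split(",")
    if len(codes) != len(DIMENSIONS):
        raise ValueError("key does not match the dimensionality")
    decoded = {}
    for concept, code in zip(DIMENSIONS, codes):
        code_list = CODE_LISTS.get(concept)
        # Coded concepts are looked up; non-coded ones are kept as-is.
        decoded[concept] = code_list[code] if code_list else code
    decoded["Observation"] = int(value)
    return decoded


result = decode("Q,ZA,B,1,1999-06-30=16457")
print(result)
```

Without the code lists and the order of the dimensions, the same string is just a sequence of characters; with them, each position has a concept and each code has a meaning.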

Identifying Concepts

• Identifying Concepts - Sources
– Existing data set tables
  • From website
  • From applications
– Data Collection Instruments
  • Questionnaires
  • Excel spreadsheets
– Regulations, Handbooks, User Guides
  • Labour Statistics Convention, 1985 (No. 160), Recommendation, 1985 (No. 170)
  • Council Regulation No: 311/76/EEC of 09/02/1976; OJ: L039 of 14/02/1976; Compilation of statistics on foreign workers
– Database Tables
– Existing Data Structure Definitions
  • From other organisations

Identify Concepts – from website

Source: FAO proof of concept project

Measurement = 1,000 Kg

Concepts

Reference Region

Commodity

Frequency and Time

Observation Value

Measure Type

Unit and Unit Multiplier

Measurement = 1,000 Kg

Exercise: Identify Concept Role

Concept Role: Reminder

• Dimensions
– Are the concepts that identify the observation value
• Attributes
– Are the concepts that add additional metadata about the observation value
• Measure
– Is the concept that is the observation value

Concepts

Reference Region

Commodity

Frequency and Time

Observation Value

Measure Type

Unit and Unit Multiplier

Measurement = 1,000 Kg

Exercise: Concept Role

Reference Region (Dimension)
Commodity (Dimension)
Frequency and Time (Dimensions)
Observation Value (Measure)
Measure Type (Dimension)
Unit and Unit Multiplier (Attributes)

Measurement = 1,000 Kg

Data Set and Structure

Dimension Concept

FREQ

REF_AREA_REG

COMMODITY

MEASURE_TYPE

TIME

Measure Concept

OBS_VALUE

Attribute Concept

OBS_STATUS

OBS_CONF

UNIT

UNIT_MULTIPLIER
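One way to picture this structure is as a mapping from each concept to its role. The sketch below is illustrative only: the `ROLES` dictionary, the `split_record` helper and the sample codes and values are hypothetical, while the concept identifiers are those listed above.

```python
# Role of each concept, as listed above.
ROLES = {
    "FREQ": "dimension",
    "REF_AREA_REG": "dimension",
    "COMMODITY": "dimension",
    "MEASURE_TYPE": "dimension",
    "TIME": "dimension",
    "OBS_VALUE": "measure",
    "OBS_STATUS": "attribute",
    "OBS_CONF": "attribute",
    "UNIT": "attribute",
    "UNIT_MULTIPLIER": "attribute",
}


def split_record(record: dict) -> dict:
    """Group the fields of one observation by concept role."""
    parts = {"dimension": {}, "measure": {}, "attribute": {}}
    for concept, value in record.items():
        role = ROLES[concept]  # KeyError if the concept is not in the structure
        parts[role][concept] = value
    return parts


# Hypothetical observation; the codes and values are invented for illustration.
sample = {
    "FREQ": "A", "REF_AREA_REG": "XX", "COMMODITY": "C1",
    "MEASURE_TYPE": "M1", "TIME": "2005", "OBS_VALUE": 120.0,
    "UNIT": "KG", "UNIT_MULTIPLIER": "3",
}
parts = split_record(sample)
print(parts["dimension"])  # the concepts that identify the observation
print(parts["measure"])    # the observed value
print(parts["attribute"])  # additional metadata about the observation
```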

Identify/Define Code Lists

• Purpose of a Code List
– Constrains the value domain of concepts when used in a structure like a data structure definition
– Defines a shortened language independent representation of the values
– Gives semantic meaning to the values, possibly in multiple languages

• Agreeing on harmonised code lists is the most difficult aspect of defining a data structure definition
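The three purposes of a code list can be illustrated with a small sketch. It assumes a hypothetical `CodeList` class and a hypothetical identifier `CL_TOPIC`; the French labels are added for the example. Only the codes and English names of the TOPIC list come from this session.

```python
from dataclasses import dataclass, field


@dataclass
class CodeList:
    """A code list: language-independent codes with labels per language."""
    id: str
    labels: dict = field(default_factory=dict)  # code -> {language: label}

    def is_valid(self, code: str) -> bool:
        # The code list constrains the value domain of the concept.
        return code in self.labels

    def label(self, code: str, lang: str = "en") -> str:
        # The label gives the code its meaning in the requested language.
        return self.labels[code][lang]


# English names are from the TOPIC example; French labels are illustrative.
cl_topic = CodeList("CL_TOPIC", {
    "A": {"en": "Brady Bonds", "fr": "Obligations Brady"},
    "B": {"en": "Bank Loans", "fr": "Prêts bancaires"},
    "C": {"en": "Debt Securities", "fr": "Titres de créance"},
})

print(cl_topic.is_valid("B"))     # True
print(cl_topic.is_valid("Z"))     # False
print(cl_topic.label("B", "fr"))
```

The code "B" stays the same whatever the language of the reader; only the label changes.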

Code Lists Required

Source: FAO proof of concept project

Reference Region

Commodity

Frequency

Measure Type

Unit and Unit Multiplier

Measurement = 1,000 Kg

Code Lists

Code Lists

Code Lists (CL_)

For Time Series the SDMX Cross Domain Concepts recommend all observations have a status code (Concept = OBS_STATUS) and a confidentiality code (Concept = OBS_CONF)

Data Structure Definition - Reminder

[Diagram: Data Structure Definition model, as before: Key (Dimensions), Group Key, Attributes and Measures take their semantic from a Concept and have a format (Representation), which is Coded (has code list) or Non-coded]

Data Structure Definition - Agriculture

[Diagram: Data Structure Definition model applied to AGRICULTURE_COMMODITY]
• Data Structure Definition: AGRICULTURE_COMMODITY
• Dimensions: FREQ, REF_AREA_REG, COMMODITY, MEASURE_TYPE, TIME
– Code lists: CL_FREQ, CL_AREA_CTY, CL_COMMODITY, CL_MEASURE_ELEMENT
• Attributes: OBS_STATUS, OBS_CONF, UNIT, UNIT_MULT
– Code lists: CL_OBS_STATUS, CL_OBS_CONF, CL_UNIT, CL_UNIT_MULT
• Measures: OBS_VALUE
• Representation: Coded (Code List) or Non-coded

© Metadata Technology

SDMX and Data Formats

Exercise: Identify Concepts


Exercise: Identify Concepts – from collection instrument

Source: UNESCO Institute for Statistics

Data Entry - Table 2.1

Source: UNESCO Institute for Statistics

Data Entry - Table 2.2

Source: UNESCO Institute for Statistics


Exercise: Identify Dimension Concepts – from website

Source: International Labour Organization

Identify Concepts: Table 2A

Source: International Labour Organization

Identify Concepts: Table 2B

Source: International Labour Organization

Identify Concepts: Table 2C

Source: International Labour Organization

Identify Concepts: Table 2D

Source: International Labour Organization

Identify Concepts: Table 2E

Source: International Labour Organization

Dimension Concept

Identify Concepts: Table 2A
Reference Area
Sex
Time Period
Frequency
Measure Type

Identify Concepts: Table 2B
Economic Activity
Measure Type

Identify Concepts: Table 2C
Occupation
Measure Type

Identify Concepts: Table 2D
Status in Employment
Measure Type

Identify Concepts: Table 2E
Measure Type

Exercise: Identify Concepts – from collection instrument

Source: UNESCO Institute for Statistics

Time

Reference Area

Dimension Concepts - Tables 2.1/2.2

Source: UNESCO Institute for Statistics

Education Level

Sex

Institution Type

Measure Type

Work Mode

Programme Orientation


Labor Statistics: Data Structure Definition (Incomplete)

Education Statistics: Data Structure Definition (Incomplete)

Dimension Concept                           Representation
Frequency (FREQ)                            CL_FREQ
Reference Area (REF_AREA)                   CL_REF_AREA
Education level (EDUC_LEVEL)                CL_EDUCATLVTYP
Sex (SEX)                                   CL_SEX
Programme Orientation (PROG_ORIENTATION)    CL_PROG_ORIENTATION
Institution Type (INSTITUTION_TYPE)         CL_INSTITUTION_TYPE
Work Mode (WORK_MODE)                       CL_WORK_MODE
Measure Type (MEASURE_TYPE)                 CL_MEASURE_TYPE
Time (TIME)                                 Date/Time

Measure Concept                             Representation
Observation Value (OBS_VAL)                 Numeric

Education Statistics: Data Structure Definition (Incomplete)

Attribute Concept                         Assignment Status   Attachment    Representation
Observation Status (OBS_STATUS)           M(andatory)         Observation   CL_OBS_STATUS
Observation Confidentiality (OBS_CONF)    C(onditional)       Observation   CL_OBS_CONF
Unit (UNIT)                               M                   Series        CL_UNIT
Unit Multiplier (UNIT_MULTIPLIER)         M                   Series        CL_UNIT_MULT
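The assignment status and attachment level of these attributes lend themselves to a simple check. The sketch below is an illustration under stated assumptions: the `ATTRIBUTES` table mirrors the attribute definitions above, but the `missing_mandatory` helper and the sample values are hypothetical.

```python
# (assignment status, attachment level) per attribute, as defined above:
# M = mandatory, C = conditional.
ATTRIBUTES = {
    "OBS_STATUS": ("M", "Observation"),
    "OBS_CONF": ("C", "Observation"),
    "UNIT": ("M", "Series"),
    "UNIT_MULTIPLIER": ("M", "Series"),
}


def missing_mandatory(level: str, reported: dict) -> list:
    """Return the mandatory attributes of the given attachment level
    that are absent from the reported values."""
    return [
        name
        for name, (status, attachment) in ATTRIBUTES.items()
        if attachment == level and status == "M" and name not in reported
    ]


# Hypothetical values: the series reports a unit but no multiplier,
# and the observation carries a status but no confidentiality code.
series_attrs = {"UNIT": "PERSONS"}
obs_attrs = {"OBS_STATUS": "A"}

print(missing_mandatory("Series", series_attrs))    # ['UNIT_MULTIPLIER']
print(missing_mandatory("Observation", obs_attrs))  # []
```

The observation passes even without OBS_CONF because that attribute is conditional, whereas the series fails because both of its attributes are mandatory.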

Identify Concepts from User Guide