Variance estimation on SILC based indicators - GESIS · Variance estimation on SILC based...

3rd EU-LFS/EU-SILC European User Conference 1

Variance estimation on SILC based indicators

Emilio Di Meglio – Eurostat emilio.di-meglio@ec.europa.eu

Guillaume Osier – STATEC guillaume.osier@statec.etat.lu

EU SILC is a sample survey

This means that when looking at indicators we should also

take a look at accuracy measures

EU SILC is a complex survey

Naive methods are not directly applicable

We present the first results of variance estimation using

linearization techniques

Our main message today

Why variance estimation?

Requested by regulation

Quality report

Compliance

Requested by users

Policy relevance of indicators

Requested by researchers

Current quality precision requirements

According to Reg.1982/2003, the X and L (initial sample) data

are to be based on a nationally representative probability

sample of the population residing in private households…

Representative probability samples shall be achieved both for

households and for individual persons in the target population.

The sampling frame and methods of sample selection should

ensure that every individual and household in the target

population is assigned a known and non-zero probability of

selection.

Reg. 1177/2003 defines the minimum effective sample

sizes to be achieved.

Minimum effective sample size

Households Persons aged 16+

Country Cross-sectional Longitudinal Cross-sectional Longitudinal

BE 4 750 3 500 8 750 6 500

BG 4 500 3 500 10 000 7 500

DK 4 250 3 250 7 250 5 550

DE 8 250 6 000 14 500 10 500

DEFF (Kish, 1965)

Definition: "The ratio of the variance under the given sample

design, to the variance under a simple random sample of the

same size"

Importance: a tool to measure the efficiency of your complex

sample design

Calculation based on at-risk-of poverty rate

What influences variance?

Variability of the phenomenon

Sample size

Indicator value

Sampling design

Imputation

Calibration

Weighting

Main challenges for EU SILC

Difficulty to find the « best » possible method for

variance estimation

– Different designs (flexibility)

– Missing information

– Debate on methods ongoing

Differentiate the needs: accuracy estimates for policy

usage and accuracy estimates for researchers.

Sampling design by country (2010)

Simple random sampling Malta

Stratified simple random sampling Luxembourg, Austria*

Stratified simple random sampling

from former participants of micro

census

Germany

Sampling

dwellings/

addresses

Stratified multi-stage sampling

Czech Republic, Spain, France, Hungary,

Latvia, The Netherlands, Poland, Portugal,

Romania, United Kingdom

Stratified simple random sampling Cyprus, Slovakia Sampling

households Stratified multi-stage sampling Belgium, Bulgaria, Greece, Ireland, Italy

Simple random or systematic

sampling Denmark, Iceland, Sweden, Norway

Stratified simple random or

systematic sampling Estonia, Lithuania

Stratified two-phase sampling Finland

Sampling

individuals

Stratified two-stage sampling Slovenia

Sample design variables

DB050: primary strata

DB060: primary sampling units

DB062: secondary sampling units

DB070: order of selection of primary sampling units

DB030: household ID

Our objective

Resampling taking into account all the possible elements

coming from 32 countries would be extremely computationally

and resource intensive

Variance estimation methods balancing between scientific

accuracy and administrative considerations (time, cost,

simplicity) are the only viable solution

Aim: to quickly provide to users and policy makers standard

errors for the SILC-based indicators, particularly the AROPE,

its components and its main breakdowns.

The proposed approach

We have considered different methods: bootstrap,

Jacknife, linearisation

We carried out comparative experiments on a limited

number of countries and results are similar

We chose to work with linearisation (ultimate cluster

approach proposed by Net-SILC2) that can provide

acceptable results given the constraints we face

The approach was discussed at the Workshop accuracy

(Net-SILC2) and validated by the SILC WG

The method (synthesis)

Linearization is a technique based on the use of linear approximation to reduce non-linear statistics to a linear form, justified by asymptotic properties of the estimator (Särndal et al, 1992 ; Deville, 1999 ; Wolter, 2006 ; Osier, 2009)

The "ultimate cluster" approach (Särndal et al, 1992) is a simplification consisting in calculating the variance taking into account only variation among Primary Sampling Unit (PSU) totals

This method requires first stage sampling fractions to be small which is nearly always the case.

This method allows a great flexibility and simplifies the calculations of variances.

It can also be generalized to calculate variance of the differences of one year to another (Berger, 2004 , 2010 ).

Applicable with the main statistical packages (SAS, R, STATA)

The method and the results

The described methodology has been applied in Eurostat for running an estimation of variance and confidence intervals

For AROPE and its subcomponents for 3 age groups (0-17, 18-64, 65+) and gender breakdowns

For estimating variance of net changes.

We have used the SAS procedures SURVEYMEANS and SURVEYFREQ that allow to specify the survey design

According to the characteristics and availability of data for different countries we have used different variables to specify strata and cluster information.

The method and the results

We have used SAS PROC Surveyfreq (linearization) adapting strata and cluster parameters according to the following groups:

GROUP 1: BE, BG, CZ, IE, EL, ES, FR, IT, LV, HU, NL, PL, PT, RO, SI, UK, HR Strata=DB050 Cluster=DB060

GROUP 2 DE, EE, CY, LT, LU, AT, SK, FI, CH Strata=DB050 cluster=DB030

GROUP 3 DK IS MT NO SE Cluster=DB030

Results on AROPE

For 6 countries 95% Confidence Interval for AROPE equal or smaller that ±1.0% (CZ, IT, SI, DE, FI, SE)

For 11 countries 95% Confidence Interval for AROPE between ± 1% and ±1.5% (ES, HU, PL, UK, EE, AT, SK, CH, DK, IS, NO)

For 8 countries 95% Confidence Interval for AROPE between ±1.5% and ±2% (BE, BG, EL, LV, NL, PT, CY, MT)

For 4 countries 95% Confidence Interval for AROPE larger than ±2% (IE, RO, LT, HR)

Complete results in EU-SILC quality report

Results, example

Member

Indicator

Standards

Error (%)

CI 95%

Lower bound

CI 95%

Upper bound

EU27 16.4 0.14 16.08 16.64

BE 14.6 0.74 13.13 16.06

BG 20.7 0.85 19.03 22.35

CZ 9.0 0.44 8.14 9.86

IE 16.1 0.98 14.13 17.98

Measurement of net changes

To measure the significance of the evolution of social indicators

Example:

When the At-risk-of-poverty or social exclusion rate for Cyprus

goes from 22.9% in 2010 to 23.5% in 2011, are we able to say

that this change is significant?

Exercise already done for: AROPE, AROPE(0-

17),ARP,ARP(65+), SMD, VLWI, IWP, UMNC

Problem statement

t= x/y

Absolute change: ∆= t2-t1

AIM: Estimation of variance of change ∆

Major problem: temporal correlations between indicators

Var (∆)= var(t1)+var(t2)-2corr(t1,t2)sqr(var(t1)var(t2))

NET-SILC2: multivariate linear regression approach

(Berger and Priam, Statistics Canada Symposium, 2010 code SAS developed by G. Osier)

Algorithm used

1. Preparation of the data

2. Aggregation at PSU level (ultimate cluster PSU approach)

3. SE estimation for the X estimator at T0

4. 1-3 for the X estimator at T1

5. SE estimation for changes in X estimators using the multivariate

regression approach:

Responses variables: 4 totals

Regressors:

1. Stratification dummy variables

2. Rotation variable at T0 (dummy variable which specifies which PSUs are

observed at T0)

3. Rotation variable at T1

Output

Country

(2010)

(2011)

Difference

2011 - 2010

(% points)

Standard

(% points)

Margin of

(% points)

= 1.96*SE

Significance

of change

BE 20.8 21 0.1 0.076 0.1 N

BG 41.6 49.1 7.5 0.726 1.4 Y

CY 22.9 23.5 0.5 0.605 1.2 N

DK 18.3 18.9 0.5 0.448 0.9 N

Conclusion and future plans

The methodology is of relatively simple application

It can be considered as a good compromise between scientific

soundness and feasibility under current constraints.

SILC based indicators in the current implementation can be

considered as having an overall acceptable accuracy;

The next steps consist in still improving these calculations by asking

Member States to provide the necessary information where missing.

Dissemination of further information to users under investigation.

Variance estimation on SILC based indicators - GESIS · Variance estimation on SILC based...

Documents

SILC, YOUR PASSPORT TO FRANCE

SILC 4315 sept2016 jh

Nevada SILC Policies and Procedures Manualadsd.nv.gov/uploadedFiles/adsdnvgov/content/Boards/… · Web viewNevada SILC Policies and Procedures Manual Rev. 03/02/18 Nevada SILC Policies

Nevada SILC Policies and Procedures Manualadsd.nv.gov/uploadedFiles/adsdnvgov/content/Boards/SILC... · Web view1.SILC Fact Sheet 2.SILC booklet brochure 3.Membership Checklist of

Data Handling in EU-SILC - GESIS: Home...Data Handling in EU -SILC 11 3 Households in the EU-SILC The EU-SILC household data file contains extensive information about characteristics

Variance Estimation in EU-SILC Survey Mārtiņš Liberts Central Statistical Bureau of Latvia

Catálogo Silc (suspensão)

SILC - Arizona State University | ASU

Quality of EU-SILC data - Uni Trier · Quality of EU-SILC data Version: 2011 ... 2.1.2 Equivalised household size ... EU-SILC sample designs for 2007

Silc Fulll Report

EU-SILC 2019: Salient Indicators · At 20.1 per cent, the at-risk-of-poverty or social exclusion rate (AROPE) increased by 1.1 percentage points when compared to EU-SILC 2018. AROPE

SILC Working Draft

The SILC Project (SiLC: S ilicon tracking for the I nternational L inear C ollider)

The SILC Project (SiLC:Silicontracking for the InternationalLinearCollider) The SILC Project (SiLC: Silicon tracking for the International Linear Collider)

Welcome to SILC. Outline Introduction to SILC Typical Day What to expect in each class? What do we expect from students, parents, volunteers? SILC Calendar

Measuring change with the Belgian Survey on …...Measuring change with the Belgian Survey on Income and Living Conditions (SILC): taking account of the sampling variance Report Financed

SILC – Children’s report

5 raisons de préférer SILC

SILC-NET at ILRU Presents…

1 SiLC Silicon Sensors: Status of different activities Thomas Bergauer for the SiLC Collaboration April 25th, 2007 Outline: SiLC “Sensor Baseline” DESY