Sylvie Huet


Sylvie Huet. Modelling from data: an experience in modelling rural demography. Laboratoire d’Ingénierie pour les Systèmes Complexes. From data to models, Cergy-Pontoise, 27-28 June 2013. Context: demography in rural municipalities. Evolution of rural areas in Europe?


www.irstea.fr

To better assert its missions, Cemagref becomes Irstea

Modelling from data: an experience in modelling rural demography

From data to models, Cergy-Pontoise, 27-28 June 2013

Sylvie Huet

Laboratoire d’Ingénierie pour les Systèmes Complexes


Context: demography in rural municipalities

Evolution of rural areas in Europe?

Coupling demography and residential mobility of people in order to study their evolution at a very local scale: the municipality level

In-migration

Demography

Residential mobility

Out-migration


Context: demography in rural municipalities

Coupling microsimulation and agent-based modelling

No integrated theory exists, so we extract and use data to build a globally coherent theory through a dynamic modelling approach at the individual level and at the municipality level

Decision support in demography generally uses microsimulation modelling (O’Donoghue, 2001; Li and O’Donoghue, 2012).

Space and residential mobility

A first application to the population of the Cantal (French region) (Huet et al. 2012a, 2012b).


Problem

1. An interesting motivation
2. A well-identified overall modelling choice
3. A marvellous applied research question

But data! As a constraint, as theories, as results…

“No fuss, no blah-blah, just results”


Summary: everything through the prism of data

1. Finding and taking a census of the data
2. Choosing data for dynamic modelling


Finding data

1. Can’t build a specific survey: the problem is too large

2. Can’t use a reweighted sample of individuals: not enough of them and too difficult to access

Tenacity… and then


Finding and taking a census of the data

At first, we had nothing…

and finally we have too much!

Enquêtes générations 1988, 1998, …
Histoires familiales
Histoire de vie 2003
Distribution des salaires (INSEE)
Enquête Emploi
Labour Force Survey
Household Panel
Corine Land Cover
Recensements agricoles 1988, 1998, 2005
Inventaire Communal 1998, 1998
Base permanente des Equipements
Recensements 1990, 1999, 2006, …
Tables de mobilité 1999
Enquête logements
Revenus des ménages
ISSP sens du travail
SIRENE entreprises
Finances Communales
DGF
Réseau chambre d’hôtes
Taxes de séjour
SITADEL (logements)

Confusion


Turning confusion into results

DATA, MODEL, LINKING

Criteria to choose


Summary


1. Finding and taking a census of the data
2. Choosing data for dynamic modelling


Criteria to choose among all the data?

1. Quantity of work

2. People and ideas

3. Building the various dynamics (and their couplings)

4. Calibrating and validating the model 


1. Criteria: time and cognitive costs!

The criteria we don’t really talk about, linked to the quantity of work:

• Cost in terms of investigating the data sources
• Ease of use with statistical tools, and representativeness
• Possible reuse of generic objects and dynamics in other countries


What a costly approach!

For every possible source, this requires studying:

• The list of questions
• The list of variables (not necessarily the direct answers to questions)
• The list of modalities for each variable
• Representativeness at various scales, for various populations…
• The hidden/overarching model and theories, which must be understood

Laborious, difficult, not valued,… not publishable, not a research problem, too long to explain…

Many people always use the same survey, just as we always use the same tools or the same methods


2. Criteria: working with people and ideas

In interdisciplinary work, the criteria you don’t think of a priori:

• Understandable for involved people (and comparable with other models)

• Working with research partners

A compromise to decide on

Or who you are not going to understand


Criteria: working with researchers and ideas


• The existing/chosen data were not collected under the hypotheses of their theories: misunderstanding, disagreement

• Some, especially modellers, don’t usually use data

• Some, especially modellers, have difficulty understanding what individual-based modelling means

Why not use the wages?


3. Criteria: building the various dynamics

To build the various dynamics (and their couplings)

• Possible interconnectivity of various sources

Example: jointly using the LFS and the Census, which both give the “same” activity sectors and socio-professional categories, makes it possible to define the employment offer at the municipality level (Census) and the way an individual chooses a job and changes it (LFS)
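The linkage in the example above can be sketched as follows. This is only a minimal illustration, assuming both sources have already been recoded to the same (activity sector, socio-professional category) labels. The names `census_jobs`, `lfs_transitions`, `job_offer` and `draw_next_job`, and all the numbers, are hypothetical and are not taken from the Census, the LFS or the model presented here.

```python
import random

# Hypothetical Census extract: number of jobs per municipality, keyed by the
# shared (activity sector, socio-professional category) pair.
census_jobs = {
    "municipality_A": {("agriculture", "farmer"): 40, ("services", "employee"): 25},
    "municipality_B": {("services", "employee"): 120, ("industry", "worker"): 60},
}

# Hypothetical LFS-derived transition probabilities between the same pairs.
lfs_transitions = {
    ("agriculture", "farmer"): {("agriculture", "farmer"): 0.95,
                                ("services", "employee"): 0.05},
    ("services", "employee"): {("services", "employee"): 0.90,
                               ("industry", "worker"): 0.10},
    ("industry", "worker"): {("industry", "worker"): 0.92,
                             ("services", "employee"): 0.08},
}


def job_offer(municipality):
    """Employment offer at the municipality level (Census side)."""
    return census_jobs[municipality]


def draw_next_job(current_job, municipality, rng):
    """Draw an individual's next job (LFS side), restricted to the job
    types actually offered in the municipality (Census side)."""
    offer = job_offer(municipality)
    candidates = {job: p for job, p in lfs_transitions[current_job].items()
                  if offer.get(job, 0) > 0}
    if not candidates:
        return None  # no compatible job is offered locally
    jobs = list(candidates)
    weights = [candidates[job] for job in jobs]
    return rng.choices(jobs, weights=weights, k=1)[0]
```

The shared category pair is what makes the two sources connectable: the Census side filters what exists locally, the LFS side weights how individuals change jobs.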


Criteria: building the various dynamics

• Problem of statistical representation (example of low-density areas, which represent a small part of the population: 39% rural or peri-urban areas)

Census: one of the rare data sources at the local level, where theories and/or knowledge are also rare

Example in Cantal: the number of farmers; no problem accessing housing but a problem accessing services

European Household Panel or National Census?


Criteria: building the various dynamics starting from wrong data


With the wrong data, in the sense of irrelevant or inconvenient, but chosen for their capacity to “reveal” a relevant dynamic

The number of in- and out-migrants has this property, since it links all the processes related to mobility, starting from the decision to move

Choosing a decision to move: “checking model”

(Figure: two decision-to-move formulas based on nbSizes, the second involving an exponential; values shown: 17025 and 17075.)

Old people move too much for a decision based only on the size of the current dwelling

Family reasons are the most cited reasons for the decision to move (impact on needed size)


Assessing the chosen decision to move

LITERATURE (statistical analysis from data): (Debrand and Taffin 2006) note that moving decreases with age

But also that the move to a larger dwelling is much more common than the move to a smaller one


And finally we can also reproduce the critical values, and more simply, by deciding to move with a lower probability when the need is to decrease the size of the residence
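The asymmetric rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the decision function of the model: `p_increase` and `p_decrease` are hypothetical parameters (to be calibrated, with `p_decrease` lower than `p_increase`), and dwelling sizes are assumed to be comparable integers such as a number of rooms.

```python
import random


def decides_to_move(current_size, needed_size, p_increase, p_decrease, rng):
    """Return True if the household decides to move.

    The household only considers moving when the needed size differs from
    the current size, and it does so with a lower probability (p_decrease)
    when it needs a smaller dwelling than when it needs a larger one.
    """
    if needed_size == current_size:
        return False  # current dwelling fits: no size-based reason to move
    p = p_increase if needed_size > current_size else p_decrease
    return rng.random() < p
```

For instance, `decides_to_move(4, 2, p_increase=0.5, p_decrease=0.1, rng=random.Random(0))` draws a move with the lower probability, since the need is to decrease the size.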


Choosing dynamics to ensure consistency (in case again of wrong data)


Counterintuitive choices are made to ensure consistency between endogenous submodels, parameterised through calibration, and exogenous submodels, parameterised from data.

Example: in the residential mobility model, people can migrate out of the region if and only if they have found a new place of residence inside the region!

=> only because we only know the probability of leaving the region versus moving inside the region (i.e. the problem of the unknown decision to move)
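The counterintuitive ordering above can be sketched as follows. This is a hedged illustration only: `find_residence_in_region` is a hypothetical callback standing for the endogenous housing search, and `p_leave_region` is assumed to be the data-derived probability of leaving the region rather than moving inside it, given that a move happens.

```python
import random


def residential_move(household, find_residence_in_region, p_leave_region, rng):
    """Return 'stay', 'move_inside' or 'move_outside' for one household.

    Out-migration is only drawn once a residence has been found inside the
    region, because the data only give the share of moves that leave the
    region, not the decision to move itself.
    """
    residence = find_residence_in_region(household)
    if residence is None:
        return "stay"  # nothing found: no move at all, not even out-migration
    if rng.random() < p_leave_region:
        return "move_outside"
    return "move_inside"
```

With this ordering, the number of out-migrants stays consistent with the number of moves produced by the calibrated housing search.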

 


4. Criteria: calibration and validation


To calibrate (finding out the parameters of the dynamics chosen through the checking-model procedure) and validate the model:

• Temporal continuity of the definitions and availability, including for the initial state (e.g. 1990, 1999, 2006, dwelling size…)

• Relevance of the spatial scale at which the data are available

• Critical indicators about the temporal evolution, especially related to “initially” unknown dynamics

Example for Cantal…


The Cantal: data for calibration

1999

2000-2006


A DECREASING POPULATION BUT AN INCREASING MIGRATORY BALANCE (switching during the period) AND A DECREASING NATURAL BALANCE


The Cantal: data for calibration

decreasing municipalities: red

increasing municipalities: blue

2000-2006


WITH A LARGE HETEROGENEITY OF THE TENDENCIES AT LOCAL LEVEL


The Cantal: data for calibration

1999

2000-2006


WITH A STRONG SPATIAL CONSTRAINT

A LOT OF MOVES DESPITE A WEAK MIGRATORY BALANCE

(Figure values: 133459, 116461, 17025, 17075, 9814, 11905; periods 1990-1999 and 2000-2006.)


An almost impossible calibration despite the data and because of the data


Aim at respecting the tendency (not only the absolute difference from the various measures over time). What is a small overall distance worth if the tendency is not the same? A combination of all the tendencies is almost impossible to obtain… This requires a quasi-continuous loop of rebuilding the model.

Small distance but bad tendency
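The difference between a small distance and a correct tendency can be sketched as follows. This is a minimal illustration with hypothetical values, not the calibration criterion actually used: the function names and the two series are assumptions, and the tendency is reduced here to the sign of each successive change.

```python
def mean_abs_distance(simulated, observed):
    """Mean absolute difference between simulated and observed values."""
    return sum(abs(s - o) for s, o in zip(simulated, observed)) / len(observed)


def tendencies(series):
    """Sign of each successive change: +1 increase, -1 decrease, 0 stable."""
    return [(b > a) - (b < a) for a, b in zip(series, series[1:])]


def same_tendency(simulated, observed):
    """True only if both series move in the same direction at every step."""
    return tendencies(simulated) == tendencies(observed)


# Hypothetical indicator measured at three dates: the observed series
# decreases while the simulated one increases, yet the two stay close.
observed = [100.0, 98.0, 96.0]
simulated = [97.0, 98.0, 99.0]

distance = mean_abs_distance(simulated, observed)
agreement = same_tendency(simulated, observed)
```

Here the mean distance is small relative to the values, while `same_tendency` is false: a calibration driven by distance alone would accept a run whose evolution goes the wrong way.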


A never-ending validation

Too much data, in a way… How to choose what restricts the validation process? I don’t know at this stage.

Similarly to the calibration problem, you can never be satisfied, since you have a lot of data: almost all the data you did not retain for building the initialisation or the dynamics


Synthesis at this point of my study of what data brings into the dynamic modelling at low level of large systems


Finally, very difficult to use as a predictive tool, even though microsimulation models (built from data) are usually built for this purpose and are considered reliable since they propose a consistent theory extracted from data

Much more useful (probably even more than classical theoretical approaches or discrete choice models) for learning about how dynamics compose, since they consider many coupled dynamics (instead of hypothesising that they are negligible): the checking-dynamics procedure

Data challenges the interdisciplinary work (instead of simplifying)!


What a richness and a nightmare!
