24
Introduction Latent class model Estimation Implementation Reflection Estimating a latent class model Jasper Knockaert DCA Workshop 23 June 2017 Jasper LCM Estimation

Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

  • Upload
    vuduong

  • View
    215

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Estimating a latent class model

Jasper Knockaert

DCA Workshop23 June 2017

Jasper LCM Estimation

Page 2: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

1 Introduction

2 Latent class model

3 Estimation

4 Implementation

5 Reflection

Jasper LCM Estimation

Page 3: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

SpecificationLocal optima

Classical multinomial logit choice model:

Uin = � · Xin + ✏in

Mixed logit model: � distributed

Uin = �n · Xin + ✏in

Latent class model: special case of distributionA vector �c for each class cClass membership is latentClass membership model defined by probability mass functionP5(c |Xn; �,⌃µ)Choice probability for alternative y :

X

c

hP5(c |X ⇤

n ; �,⌃µ) · P4(yn|U1n, . . . ,Ujn;�c ,⌃✏)i

Main advantage of latent class: closed-form expression forchoice likelihood

Jasper LCM Estimation

Page 4: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

SpecificationLocal optima

Main issue with latent class model is abundance of local optima:var description 1-class 2-class 3-class 4-class

# estimations 11 132 1048 3038s # converged estimations 11 132 1034 2858p # unique optima 1 16 91 614H stopping criterion 0.00049 0.00033 0.000012 0.0096for each latent class 10 utility function coefficients and 1 class constant (with the constant for one classarbitrarily fixed to 0)

22174 revealed preference observations by 544 respondents (work with Stefanie Peer)

Jasper LCM Estimation

Page 5: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

IntroductionAlgorithmHeuristicsIllustration

Estimation with local optima:Heuristic optimisation:

Calculate LL in more or less randomly sampled starting points("informed" grid search)Identify most promising starting pointRun classical optimisation (i.e. biogeme)Hole and Yoo (2017) The use of heuristic optimizationalgorithms to facilitate maximum simulated likelihoodestimation of random parameter logit models, Appl. Statist.

population-based heuristic optimization algorithms: DE andPSO algorithms

My approach: use random initialisation and run optimisationfrom each starting point.

Jasper LCM Estimation

Page 6: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

IntroductionAlgorithmHeuristicsIllustration

Estimation with local optima:Heuristic optimisation:

Calculate LL in more or less randomly sampled starting points("informed" grid search)Identify most promising starting pointRun classical optimisation (i.e. biogeme)Hole and Yoo (2017) The use of heuristic optimizationalgorithms to facilitate maximum simulated likelihoodestimation of random parameter logit models, Appl. Statist.

population-based heuristic optimization algorithms: DE andPSO algorithms

My approach: use random initialisation and run optimisationfrom each starting point.

Jasper LCM Estimation

Page 7: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

IntroductionAlgorithmHeuristicsIllustration

random initialise modelmodel

specification

estimate model

compare estimationto protomodels

stoppingcriterion

met?

stop

no

yes

Generic methodology forrandom initialisationChallenge is in clustering theestimation results aroundoptimaHeuristic as stoppingcriterium

Jasper LCM Estimation

Page 8: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

IntroductionAlgorithmHeuristicsIllustration

To cluster we define a similarity indicator:

S(�, �) =Y

k=1...K

2

6641 � erf|�k � �k |r2⇣t2�k

+ t2�k

3

775

Stopping criterium:

H =

p

p + 1

�s

Jasper LCM Estimation

Page 9: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

IntroductionAlgorithmHeuristicsIllustration

Illustration of optima found 2-class version of the RP model inprevious slide (132 properly converged estimations)

optimum LL #estimations1 �52599.378 132 �52601.516 253 �52628.548 214 �52677.798 15 �52909.474 196 �52910.33 127 �52915.372 148 �52932.691 19 �52934.631 310 �52935.529 111 �52936.441 1212 �52936.482 113 �52938.413 214 �52973.077 115 �53038.479 416 �53042.439 2

Jasper LCM Estimation

Page 10: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Random initialisation: bash script substitutes random numbersin model template; results are saved as genuine pythonbiogememodel file so reproducible

Estimation: pythonbiogemeComparison: a python script patches the estimation htmloutput, and compares it to "protomodels" in sqlite database;after comparison the sqlite database is updated; support forbehaviourally identical permutations of models (e.g. unorderedlatent classes)Evaluation of stopping criterium: python script readsparameters from sqlite databaseParallelisation is possible using a network share for the files(sqlite database supports concurrency); I implemented PBSsupport for managing parallel estimations on a clusterSupporting tools, e.g. to rebuild sqlite database but also tovisualise clustering (with gnuplot)

Jasper LCM Estimation

Page 11: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Random initialisation: bash script substitutes random numbersin model template; results are saved as genuine pythonbiogememodel file so reproducibleEstimation: pythonbiogeme

Comparison: a python script patches the estimation htmloutput, and compares it to "protomodels" in sqlite database;after comparison the sqlite database is updated; support forbehaviourally identical permutations of models (e.g. unorderedlatent classes)Evaluation of stopping criterium: python script readsparameters from sqlite databaseParallelisation is possible using a network share for the files(sqlite database supports concurrency); I implemented PBSsupport for managing parallel estimations on a clusterSupporting tools, e.g. to rebuild sqlite database but also tovisualise clustering (with gnuplot)

Jasper LCM Estimation

Page 12: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Random initialisation: bash script substitutes random numbersin model template; results are saved as genuine pythonbiogememodel file so reproducibleEstimation: pythonbiogemeComparison: a python script patches the estimation htmloutput, and compares it to "protomodels" in sqlite database;after comparison the sqlite database is updated; support forbehaviourally identical permutations of models (e.g. unorderedlatent classes)

Evaluation of stopping criterium: python script readsparameters from sqlite databaseParallelisation is possible using a network share for the files(sqlite database supports concurrency); I implemented PBSsupport for managing parallel estimations on a clusterSupporting tools, e.g. to rebuild sqlite database but also tovisualise clustering (with gnuplot)

Jasper LCM Estimation

Page 13: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Random initialisation: bash script substitutes random numbersin model template; results are saved as genuine pythonbiogememodel file so reproducibleEstimation: pythonbiogemeComparison: a python script patches the estimation htmloutput, and compares it to "protomodels" in sqlite database;after comparison the sqlite database is updated; support forbehaviourally identical permutations of models (e.g. unorderedlatent classes)Evaluation of stopping criterium: python script readsparameters from sqlite database

Parallelisation is possible using a network share for the files(sqlite database supports concurrency); I implemented PBSsupport for managing parallel estimations on a clusterSupporting tools, e.g. to rebuild sqlite database but also tovisualise clustering (with gnuplot)

Jasper LCM Estimation

Page 14: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Random initialisation: bash script substitutes random numbersin model template; results are saved as genuine pythonbiogememodel file so reproducibleEstimation: pythonbiogemeComparison: a python script patches the estimation htmloutput, and compares it to "protomodels" in sqlite database;after comparison the sqlite database is updated; support forbehaviourally identical permutations of models (e.g. unorderedlatent classes)Evaluation of stopping criterium: python script readsparameters from sqlite databaseParallelisation is possible using a network share for the files(sqlite database supports concurrency); I implemented PBSsupport for managing parallel estimations on a cluster

Supporting tools, e.g. to rebuild sqlite database but also tovisualise clustering (with gnuplot)

Jasper LCM Estimation

Page 15: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Random initialisation: bash script substitutes random numbersin model template; results are saved as genuine pythonbiogememodel file so reproducibleEstimation: pythonbiogemeComparison: a python script patches the estimation htmloutput, and compares it to "protomodels" in sqlite database;after comparison the sqlite database is updated; support forbehaviourally identical permutations of models (e.g. unorderedlatent classes)Evaluation of stopping criterium: python script readsparameters from sqlite databaseParallelisation is possible using a network share for the files(sqlite database supports concurrency); I implemented PBSsupport for managing parallel estimations on a clusterSupporting tools, e.g. to rebuild sqlite database but also tovisualise clustering (with gnuplot)

Jasper LCM Estimation

Page 16: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.01 0.1 1 10 100 1000 10000

Sim

ilarit

y in

dica

tor

Difference in loglikelihood

LConly_2_join_wide (std_check: True)

Jasper LCM Estimation

Page 17: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Does it actually matter?

optimum LL #estimations value of traveltimehigh low

1 �52599.378 13 47.2 10.52 �52601.516 25 42.9 10.83 �52628.548 21 40.3 11.14 �52677.798 1 40.6 16.35 �52909.474 19 30.6 10.5

optimum LL #estimations value of morning SDEhigh low

1 �52599.378 13 43.4 3.02 �52601.516 25 37.8 3.23 �52628.548 21 32.7 3.44 �52677.798 1 46.7 2.85 �52909.474 19 11.9 5.0

Jasper LCM Estimation

Page 18: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Does it actually matter?

optimum LL #estimations value of traveltimehigh low

1 �52599.378 13 47.2 10.52 �52601.516 25 42.9 10.83 �52628.548 21 40.3 11.14 �52677.798 1 40.6 16.35 �52909.474 19 30.6 10.5

optimum LL #estimations value of morning SDEhigh low

1 �52599.378 13 43.4 3.02 �52601.516 25 37.8 3.23 �52628.548 21 32.7 3.44 �52677.798 1 46.7 2.85 �52909.474 19 11.9 5.0

Jasper LCM Estimation

Page 19: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Does it actually matter?

optimum LL #estimations value of traveltimehigh low

1 �52599.378 13 47.2 10.52 �52601.516 25 42.9 10.83 �52628.548 21 40.3 11.14 �52677.798 1 40.6 16.35 �52909.474 19 30.6 10.5

optimum LL #estimations value of morning SDEhigh low

1 �52599.378 13 43.4 3.02 �52601.516 25 37.8 3.23 �52628.548 21 32.7 3.44 �52677.798 1 46.7 2.85 �52909.474 19 11.9 5.0

Jasper LCM Estimation

Page 20: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Some observations (and caveats)The presence of local optima is wildly variable; some latentclass models only have one optimum even with three of fourclasses

Sampling matters (interval, distribution)The cutoff points (for the algorithm heuristics) is also variable;(e.g. H = 0.01 seems a fine value but with very non-linear"Utility" functions a much lower value is needed)Why suboptima?

Analytical feature of LC? Conditions?Numerical feature? (I use high-precision version ofpythonbiogeme)Misspecification?

Report estimation statistics!

Jasper LCM Estimation

Page 21: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Some observations (and caveats)The presence of local optima is wildly variable; some latentclass models only have one optimum even with three of fourclassesSampling matters (interval, distribution)

The cutoff points (for the algorithm heuristics) is also variable;(e.g. H = 0.01 seems a fine value but with very non-linear"Utility" functions a much lower value is needed)Why suboptima?

Analytical feature of LC? Conditions?Numerical feature? (I use high-precision version ofpythonbiogeme)Misspecification?

Report estimation statistics!

Jasper LCM Estimation

Page 22: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Some observations (and caveats)The presence of local optima is wildly variable; some latentclass models only have one optimum even with three of fourclassesSampling matters (interval, distribution)The cutoff points (for the algorithm heuristics) is also variable;(e.g. H = 0.01 seems a fine value but with very non-linear"Utility" functions a much lower value is needed)

Why suboptima?Analytical feature of LC? Conditions?Numerical feature? (I use high-precision version ofpythonbiogeme)Misspecification?

Report estimation statistics!

Jasper LCM Estimation

Page 23: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Some observations (and caveats)The presence of local optima is wildly variable; some latentclass models only have one optimum even with three of fourclassesSampling matters (interval, distribution)The cutoff points (for the algorithm heuristics) is also variable;(e.g. H = 0.01 seems a fine value but with very non-linear"Utility" functions a much lower value is needed)Why suboptima?

Analytical feature of LC? Conditions?Numerical feature? (I use high-precision version ofpythonbiogeme)Misspecification?

Report estimation statistics!

Jasper LCM Estimation

Page 24: Estimating a latent class model - Transport and Mobility ...transp-or.epfl.ch/dcaworkshop/2017/Knockaert.pdf · Introduction Latent class model Estimation Implementation Reflection

IntroductionLatent class model

EstimationImplementation

Reflection

Some observations (and caveats)The presence of local optima is wildly variable; some latentclass models only have one optimum even with three of fourclassesSampling matters (interval, distribution)The cutoff points (for the algorithm heuristics) is also variable;(e.g. H = 0.01 seems a fine value but with very non-linear"Utility" functions a much lower value is needed)Why suboptima?

Analytical feature of LC? Conditions?Numerical feature? (I use high-precision version ofpythonbiogeme)Misspecification?

Report estimation statistics!

Jasper LCM Estimation