16 Partha Lahiri

8/6/2019 16 Partha Lahiri

1/31

SMALL DOMAIN PROPORTION ESTIMATION: AN ADAPTIVE BAYESIAN APPROACH

Partha Lahiri

JPSM, University of Maryland, College Park, USA

[Based on joint work with Ms. Benmei Liu]


2/31

Examples

Estimation of batting averages of major league baseball players (Efron and Morris, 1975)

Small Area Income and Poverty Estimation (SAIPE)

Survey of drug use in Nebraska


3/31

Borrowing Strength:

Relevant Source of Information
  Census data
  Administrative information
  Related surveys

Method of Combining Information
  Choices of good small area models
  Use of a good statistical methodology


4/31

A Basic Area Level Model

$P_i$: true proportion for area $i$

$p_i$: direct design-based estimate for area $i$

$x_i$: a vector of known auxiliary variables

Model: For $i = 1, \ldots, m$,

Level 1: $\hat{\theta}_i = g(p_i) \mid \theta_i \sim \text{ind. } N(\theta_i, \psi_i)$

Level 2: $\theta_i = g(P_i) \sim \text{ind. } N(x_i^T \beta, \sigma^2)$


5/31

The sampling variances $\psi_i$ are assumed to be known.

Carter and Rolph (1974): $g(p_i) = \arcsin(\sqrt{p_i})$, $\psi_i = \frac{1}{4 n_i}$

Efron and Morris (1975): $g(p_i) = \sqrt{n_i} \arcsin(2 p_i - 1)$, $\psi_i = 1$

Fay and Herriot (1979): $g(p_i) = \log(p_i)$, $\psi_i$ estimated by a generalized variance function (GVF)

SAIPE: $g(p_i) = p_i$ for state level estimation of the proportion of poor school-age children, and $g(Y_i) = \log(Y_i)$ for county level poverty counts of school-age children
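The variance-stabilizing role of the arcsine transformations can be checked with a first-order (delta-method) calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes a binomial sampling variance p(1 - p)/n for the direct estimate, which ignores any design effect.

```python
import math

def carter_rolph(p, n):
    """Return (g(p), sampling variance) for g(p) = arcsin(sqrt(p))."""
    return math.asin(math.sqrt(p)), 1.0 / (4.0 * n)

def efron_morris(p, n):
    """Return (g(p), sampling variance) for g(p) = sqrt(n) * arcsin(2p - 1)."""
    return math.sqrt(n) * math.asin(2.0 * p - 1.0), 1.0

def delta_method_variance(p, n):
    """First-order variance of arcsin(sqrt(p_hat)).

    Assumption (not from the slides): Var(p_hat) = p(1 - p)/n.
    """
    derivative = 1.0 / (2.0 * math.sqrt(p * (1.0 - p)))  # d/dp arcsin(sqrt(p))
    return derivative ** 2 * p * (1.0 - p) / n

# The approximate variance does not depend on p: it is always 1/(4n).
for p in (0.1, 0.3, 0.5):
    print(p, delta_method_variance(p, n=45), 1.0 / (4 * 45))
```

Because arcsin(2p - 1) = 2 arcsin(sqrt(p)) - pi/2, multiplying by sqrt(n) scales the approximate variance 1/(4n) by 4n, which gives the unit sampling variance in the Efron and Morris form.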


6/31

Comments:

The model is simple and does not require knowledge of detailed design information (e.g., PSU identifiers), which may not be available in a public-use file.

The resulting empirical best predictor (EBP) of $\theta_i$ is design-consistent.

The EBP method is extendable to specified non-normal distributions for the sampling and random effects.


7/31

For unspecified non-normality of the sampling and random effects, one can use EBLUP [Lahiri and Rao, 1995], certain adaptive methods [Li and Lahiri, 2007; Fabrizi and Trivisano, 2007], or linear EB [Ghosh and Lahiri, 1987].

Known sampling variances $\psi_i$: GVF-type methods are generally used. These methods usually do not consider small area effects, and the uncertainty in estimating the sampling variances is not included in the EBP.

In some situations, standard estimates [REML, ML, ANOVA, etc.] of the model variance $\sigma^2$ can be zero.


8/31

When the estimate of $\sigma^2$ is zero, EBLUP reduces to the regression synthetic estimate. One way to avoid the problem is to use the ADM or AML estimates [Morris, 1987; Li and Lahiri, 2007].

The transformation $g(\cdot)$ is motivated by a Taylor series argument and is used primarily to stabilize the variance. A direct modeling of the direct estimates is possible, but this is likely to lead to a non-linear non-normal mixed model.

A simple back transformation is often used to obtain the estimate of $P_i$. The optimum property of the BP is lost by such a back transformation.
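The collapse to the regression synthetic estimate can be seen from the standard shrinkage form of the best predictor under a normal area level model: a weighted average of the direct estimate and the regression value, with weight B_i = (sampling variance)/(sampling variance + model variance) on the regression value. The sketch below is a minimal illustration; the function and argument names (`psi_i` for the sampling variance, `sigma2` for the model variance) are hypothetical labels chosen here, and the regression value is taken as given rather than estimated.

```python
def shrinkage_factor(psi_i, sigma2):
    """Weight B_i placed on the regression synthetic value."""
    if psi_i <= 0.0 or sigma2 < 0.0:
        raise ValueError("need psi_i > 0 and sigma2 >= 0")
    return psi_i / (psi_i + sigma2)

def best_predictor(theta_hat_i, synthetic_i, psi_i, sigma2):
    """Shrink the direct estimate toward the regression synthetic value."""
    b = shrinkage_factor(psi_i, sigma2)
    return (1.0 - b) * theta_hat_i + b * synthetic_i

# Equal variances: the predictor sits halfway between the two inputs.
print(best_predictor(0.80, 0.50, psi_i=0.04, sigma2=0.04))
# Zero model variance: all weight goes to the synthetic value.
print(best_predictor(0.80, 0.50, psi_i=0.04, sigma2=0.0))
```

With a zero model variance the weight B_i equals 1 for every area, so the direct estimates are ignored entirely, which is why strictly positive variance estimates are attractive.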


9/31

Measures of uncertainty and confidence interval problems are quite challenging, and the theory rests on asymptotics.

Hierarchical Bayes implementation of the basic area level model provides exact inference at the expense of specifying priors for the hyperparameters.


10/31

Estimation of Small Area Proportions: Two Basic Area Models

Ref: Liu, Lahiri and Kalton (2007)

Model 1:

Level 1: $p_i \mid P_i \sim \text{ind } N\left(P_i, \frac{P_i (1 - P_i)}{n_i} \mathrm{DEFF}_i\right)$

Level 2: $g(P_i) \sim \text{ind } N(x_i^T \beta, \sigma^2)$

Model 2:

Level 1: $p_i \mid P_i \sim \text{ind Beta}\left(P_i, \frac{P_i (1 - P_i)}{n_i} \mathrm{DEFF}_i\right)$

Level 2: $g(P_i) \sim \text{ind } N(x_i^T \beta, \sigma^2)$
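If the Beta distribution in Model 2 is read as being specified by its mean and variance, mirroring the normal distribution in Model 1 (an interpretation assumed here, not stated on the slide), the usual shape parameters follow by moment matching. The sketch below uses hypothetical function names and requires the sample size to exceed the design effect so that both shapes are positive.

```python
def beta_shapes_from_moments(mean, variance):
    """Return Beta shape parameters (a, b) with the given mean and variance."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    k = mean * (1.0 - mean) / variance - 1.0  # k = a + b
    if k <= 0.0:
        raise ValueError("variance too large for a Beta with this mean")
    return mean * k, (1.0 - mean) * k

def model2_beta_shapes(P, n, deff):
    """Shapes for a Beta with mean P and variance P(1 - P) * deff / n."""
    variance = P * (1.0 - P) * deff / n
    return beta_shapes_from_moments(P, variance)

a, b = model2_beta_shapes(P=0.2, n=50, deff=2.0)
print(a, b)
```

Under this reading a + b = n/DEFF - 1, so a larger design effect corresponds to a less concentrated Beta distribution.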


11/31

Comments

Both EBP [Jiang and Lahiri 2002; Chatterjee and Lahiri 2008] and the Bayesian implementation [Liu, Lahiri and Kalton 2007] of the above models are possible.

Level 1 modeling could be problematic in the presence of a sizable number of zeroes for small areas.

$\mathrm{DEFF}_i = \frac{\sum_h W_{ih}^2 P_{ih} (1 - P_{ih}) / n_{ih}}{P_i (1 - P_i) / n_i}$; $W_{ih} = N_{ih} / N_i$; $N_i = \sum_h N_{ih}$; $n_i = \sum_h n_{ih}$


12/31

$P_{ih}$ is the population proportion for stratum $h$ in area $i$.

The design effect $\mathrm{DEFF}_i$ is a function of the $P_{ih}$, which are unknown.

If $P_{ih} \approx P_i$, then $\mathrm{DEFF}_i \approx \mathrm{deff}_i = n_i \sum_h W_{ih}^2 / n_{ih}$.

DEFF estimation typically requires a synthetic assumption, and the variability due to estimation of the DEFF is not accounted for (other refs: papers by Rao and You; Folsom and Singh).
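The design effect for a stratified sample within one area, and its approximation when the stratum proportions are close to the area proportion, can be sketched as follows. The function names are hypothetical; inputs are lists over the strata h of one area (population counts N_h, sample sizes n_h, proportions P_h), and the area proportion is taken as the weighted stratum mean.

```python
def stratum_weights(N_h):
    """W_h = N_h / N, with N the sum of the stratum population counts."""
    total = sum(N_h)
    return [N / total for N in N_h]

def design_effect(N_h, n_h, P_h):
    """Stratified variance of the direct estimate over the binomial variance."""
    W = stratum_weights(N_h)
    n = sum(n_h)
    P = sum(w * p for w, p in zip(W, P_h))  # area proportion as weighted mean
    numerator = sum(w ** 2 * p * (1.0 - p) / k for w, p, k in zip(W, P_h, n_h))
    return numerator / (P * (1.0 - P) / n)

def approx_design_effect(N_h, n_h):
    """deff = n * sum_h W_h^2 / n_h, valid when all P_h are close to P."""
    W = stratum_weights(N_h)
    n = sum(n_h)
    return n * sum(w ** 2 / k for w, k in zip(W, n_h))

# Proportional allocation gives a design effect of about 1.
print(approx_design_effect([100, 300], [10, 30]))
# Equal allocation across unequal strata inflates it.
print(approx_design_effect([100, 300], [20, 20]))
```

The approximation needs only the stratum population and sample counts, which is what makes it usable when the stratum proportions are unknown.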


13/31

A Unit Level Model:

Level 1: $y_{ik} \mid \theta_i \sim \text{ind Bernoulli}(\theta_i)$

Level 2: $\text{logit}(\theta_i) = x_i' \beta + v_i$,

where $v_i \sim \text{iid } N(0, \sigma^2)$, $i = 1, \ldots, m$.

Ref:

MacGibbon and Tomberlin (1989)

Malec, Sedransk, Moriarity and LeClere (1997)

Malec, Sedransk and Tompkins (1993)

Malec, Davis and Cao (1999)

Ghosh et al. (1998)


14/31

An Illustration of Platykurtic Random Effects:

Study Population: 2002 natality public-use data file that contains records on all births (4,024,378) occurring within the United States in 2002.

For $i = 1, \ldots, 51$:

$P_i$: true low birth weight rate

$x_{i1}$: proportion of mothers of age < 15 yr

$x_{i2}$: proportion of cases where the newborn is the first child in the family


15/31

Fit Level 2: $\text{logit}(P_i) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + v_i$

Obtain the residuals.

Analyze the residuals for possible departure from normality.

The $p$-value from the Kolmogorov-Smirnov test is 0.0436.
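The Kolmogorov-Smirnov statistic behind such a check is the largest gap between the empirical distribution of the residuals and a fitted normal distribution. The sketch below computes only the statistic, with a hypothetical function name; the slides do not say how the reported p-value was obtained, and when the mean and standard deviation are estimated from the same residuals the classical KS reference distribution is only approximate.

```python
from statistics import NormalDist, mean, stdev

def ks_statistic_normal(residuals):
    """Largest distance between the empirical CDF and a fitted normal CDF."""
    fitted = NormalDist(mean(residuals), stdev(residuals))
    ordered = sorted(residuals)
    n = len(ordered)
    distance = 0.0
    for k, x in enumerate(ordered, start=1):
        cdf = fitted.cdf(x)
        # Compare against the empirical CDF just after and just before x.
        distance = max(distance, k / n - cdf, cdf - (k - 1) / n)
    return distance

# Hypothetical residuals placed at normal quantiles: the statistic is small.
normal_like = [NormalDist().inv_cdf((k - 0.5) / 51) for k in range(1, 52)]
print(ks_statistic_normal(normal_like))
```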


16/31

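The exponential power (EP) distribution

Ref: Box and Tiao (1973)

$f_{EP}(x \mid \mu, \sigma, \psi) = \frac{c_1}{\sigma} \exp\left\{ -\left| c_0^{1/2} \frac{x - \mu}{\sigma} \right|^{1/\psi} \right\}$, $-\infty < x < +\infty$,

where $\mu \in \mathbb{R}$, $\sigma \in \mathbb{R}^+$, $\psi \in (0, 1]$,

$c_0 = \Gamma(3\psi) / \Gamma(\psi)$, $c_1 = c_0^{1/2} / (2 \psi \Gamma(\psi))$.

$\mu$: location

$\sigma$: scale

The excess of kurtosis is: $\frac{\Gamma(\psi) \Gamma(5\psi)}{\Gamma(3\psi)^2} - 3$.

Normal: $\psi = 0.5$; Platykurtic: $\psi < 0.5$; Leptokurtic: $\psi > 0.5$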
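A short numerical check makes the role of the shape parameter concrete. The sketch below assumes the Box and Tiao style parametrization with c0 = Gamma(3 psi)/Gamma(psi) and c1 = sqrt(c0)/(2 psi Gamma(psi)), under which sigma is the standard deviation; the function names are hypothetical.

```python
import math

def ep_constants(psi):
    """Return (c0, c1) for the exponential power density."""
    c0 = math.gamma(3.0 * psi) / math.gamma(psi)
    c1 = math.sqrt(c0) / (2.0 * psi * math.gamma(psi))
    return c0, c1

def ep_pdf(x, mu=0.0, sigma=1.0, psi=0.5):
    """Density (c1/sigma) * exp(-|sqrt(c0) (x - mu)/sigma|^(1/psi))."""
    c0, c1 = ep_constants(psi)
    z = math.sqrt(c0) * (x - mu) / sigma
    return (c1 / sigma) * math.exp(-abs(z) ** (1.0 / psi))

def ep_excess_kurtosis(psi):
    """Gamma(psi) Gamma(5 psi) / Gamma(3 psi)^2 - 3."""
    return math.gamma(psi) * math.gamma(5.0 * psi) / math.gamma(3.0 * psi) ** 2 - 3.0

for psi in (0.2, 0.5, 1.0):
    print(psi, ep_excess_kurtosis(psi))
```

At psi = 0.5 the density coincides with the normal density and the excess kurtosis vanishes; values below 0.5 give negative excess kurtosis and values above 0.5 give positive excess kurtosis.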


17/31

NCHS Data Analyses Cont.

$v_i \sim EP(0, \sigma, \psi)$

Assume that the components of $(\sigma, \psi)$ are independent and i) $\psi \sim \text{Unif}(0, 1)$ and ii) $\sigma \sim \text{Unif}(0, K)$, where $K$ is a large positive number.

The posterior mean of $\psi$ is 0.2. The one-sided 95% credible interval is (0, 0.473), which does not include the normal case ($\psi = 0.5$).

Among several alternatives, the EP model with $\psi = 0.2$ fits the data best in terms of the smallest DIC (Ref on DIC: Spiegelhalter et al., 2002).


18/31

Figure 1. Normal Q-Q plot of the random effect $v_i$ (axes: Theoretical Quantiles vs. Sample Quantiles)

Figure 2. Normal Q-Q plot of randomly generated platykurtic data (axes: Theoretical Quantiles vs. Sample Quantiles)


19/31

Figure 3. Posterior density of kurtosis psi (x-axis: psi; y-axis: Density)


20/31

Our Unit Level Model for Estimating Small Area Proportions

For $i = 1, \ldots, m$:

Level 1: $y_{ik} \mid \theta_i \sim \text{ind Bernoulli}(\theta_i)$

Level 2: $\text{logit}(\theta_i) = x_i' \beta + v_i$

Two approaches for modeling the area specific random effects $v_i$:

Option 1: Assume that the kurtosis of $v_i$ is known; e.g., $v_i \sim \text{iid } N(0, \sigma^2)$ - Bernoulli-Logit-Normal model


21/31

Option 2: Assume that the distribution of $v_i$ is a member of a class of distributions that covers a wide range of kurtosis values and let the data determine the unknown kurtosis; e.g., assume $v_i \sim EP(0, \sigma, \psi)$ and estimate $\psi$ just as we would estimate the other hyperparameters of the hierarchical model - Bernoulli-Logit-EP model


22/31

Comments:

In an unpublished work in the early 90s, Lahiri and Rao considered a robust extension of the Battese-Harter-Fuller (1988) model using EP.

Fabrizi and Trivisano (2007): robust extensions to the Fay-Herriot model for continuous data using EP.

Li and Lahiri (2007) considered a super-population model chosen adaptively from the well-known Box-Cox class of transformations.


23/31

Simulated Data Analysis

Two aims:

When $v_i$ is non-normal, how effective is the Bernoulli-Logit-EP model relative to the Bernoulli-Logit-Normal?

When $v_i$ is indeed normal, what is the effect of overparametrization for the Bernoulli-Logit-EP model?


24/31

$m = 100$, $n_i = 5$, $x_i' \beta = \mu = 0$.

We consider two cases: $\psi = 0.2$ (platykurtic) and $\psi = 0.5$ (normal).

For each of the two cases, one sample was generated from the model: $\text{logit}(\theta_i) = \mu + v_i$, $v_i \sim EP(0, \sigma = 0.1, \psi)$, $i = 1, \ldots, m$, and $y_{ij} \sim \text{Bernoulli}(\theta_i)$, $j = 1, \ldots, n_i$, $i = 1, \ldots, m$.
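A data set of this kind could be generated as sketched below. This is not the authors' code: it assumes the EP parametrization in which the scale is the standard deviation, and it draws EP variates through the fact that, for a density proportional to exp(-|z|^(1/psi)), the quantity |z|^(1/psi) follows a Gamma(psi, 1) distribution. Function names, the seed, and the default arguments are hypothetical.

```python
import math
import random

def draw_ep(rng, sigma, psi, mu=0.0):
    """One draw from EP(mu, sigma, psi) via a gamma variate and a random sign."""
    c0 = math.gamma(3.0 * psi) / math.gamma(psi)
    g = rng.gammavariate(psi, 1.0)           # |z|^(1/psi) ~ Gamma(psi, 1)
    z = g ** psi                             # |z|
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return mu + sign * sigma * z / math.sqrt(c0)

def simulate(m=100, n_i=5, mu=0.0, sigma=0.1, psi=0.2, seed=1):
    """Area effects, success probabilities, and binary responses."""
    rng = random.Random(seed)
    v = [draw_ep(rng, sigma, psi) for _ in range(m)]
    theta = [1.0 / (1.0 + math.exp(-(mu + v_i))) for v_i in v]  # inverse logit
    y = [[1 if rng.random() < t else 0 for _ in range(n_i)] for t in theta]
    return v, theta, y

v, theta, y = simulate()
print(len(y), len(y[0]), sum(map(sum, y)))
```

Setting `psi=0.5` in the same function gives the normal case, so the two scenarios differ only in the shape parameter.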


25/31

Priors for the hyperparameters: i) $f(\mu) \propto 1$; ii) $\sigma \sim \text{Unif}(0, K)$; and iii) $\psi \sim \text{Unif}(0, 1)$.

We computed HB estimates for the two models using WinBUGS. For each WinBUGS run, three independent chains were used. For each chain, a burn-in of 1,000 samples was discarded and 4,000 samples were kept after burn-in. The resultant 12,000 MCMC samples after burn-in were then used to compute the posterior means and percentiles for each HB model based on each sample dataset. The potential scale reduction factor $\hat{R}$ was used as the primary measure for convergence (see Gelman and Rubin, 1992).
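The potential scale reduction factor compares the variation between chains with the variation within chains. The sketch below implements the basic textbook form, without the degrees-of-freedom correction of the original Gelman and Rubin paper, and is not necessarily the variant that WinBUGS reports; the function name is hypothetical, and the input is a list of equal-length chains of post-burn-in draws for one scalar parameter.

```python
import math
from statistics import mean, variance

def potential_scale_reduction(chains):
    """Basic R-hat: sqrt(pooled variance estimate / within-chain variance)."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(chain) != n for chain in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(chain) for chain in chains]
    within = mean([variance(chain) for chain in chains])  # W
    between = n * variance(chain_means)                   # B
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

# Two chains stuck around different values give a factor far above 1.
print(potential_scale_reduction([[0, 1, 2, 3], [10, 11, 12, 13]]))
```

Values close to 1 suggest that the chains are sampling from the same distribution; a common working rule is to continue sampling when the factor is noticeably above 1.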


26/31

Let $P_i^{HB}$ denote an HB estimator of $P_i$, and let $P_{i,q}^{HB}$ denote the $q$th percentile of the posterior distribution of $P_i$. To evaluate the two HB models, the following evaluation statistics for each HB estimator are calculated:

Average squared deviation (ASD), $\mathrm{ASD} = \frac{1}{m} \sum_{i=1}^{m} (P_i^{HB} - P_i)^2$

Average absolute deviation (AAD), $\mathrm{AAD} = \frac{1}{m} \sum_{i=1}^{m} |P_i^{HB} - P_i|$

Average squared relative deviation (ASRD), $\mathrm{ASRD} = \frac{1}{m} \sum_{i=1}^{m} \left( (P_i^{HB} - P_i) / P_i \right)^2$


27/31

Average absolute relative deviation (AARD), $\mathrm{AARD} = \frac{1}{m} \sum_{i=1}^{m} |P_i^{HB} - P_i| / P_i$

Average length of the 95% credible interval (ALCI), $\mathrm{ALCI} = \frac{1}{m} \sum_{i=1}^{m} (P_{i,.975}^{HB} - P_{i,.025}^{HB})$
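The five evaluation statistics (ASD, AAD, ASRD, AARD, ALCI) are simple averages over the m areas and can be computed together. The sketch below uses a hypothetical function name and made-up inputs: lists of HB estimates, true proportions, and the 2.5th and 97.5th posterior percentiles, one entry per area.

```python
def evaluation_statistics(estimates, truths, lower, upper):
    """Return ASD, AAD, ASRD, AARD and ALCI as a dictionary."""
    m = len(truths)
    if not (len(estimates) == len(lower) == len(upper) == m) or m == 0:
        raise ValueError("all inputs must be non-empty and of equal length")
    errors = [e - t for e, t in zip(estimates, truths)]
    return {
        "ASD": sum(d * d for d in errors) / m,
        "AAD": sum(abs(d) for d in errors) / m,
        "ASRD": sum((d / t) ** 2 for d, t in zip(errors, truths)) / m,
        "AARD": sum(abs(d) / t for d, t in zip(errors, truths)) / m,
        "ALCI": sum(u - l for l, u in zip(lower, upper)) / m,
    }

# Illustrative values only, not taken from the study.
stats = evaluation_statistics(
    estimates=[0.1, 0.3], truths=[0.2, 0.2],
    lower=[0.05, 0.2], upper=[0.15, 0.5],
)
print(stats)
```

The relative measures divide by the true proportion, so they require every true proportion to be nonzero.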


28/31

Table 1: Ratios of ASD, AAD, ASRD, AARD, and ALCI for the two models (Normal/EP) using the simulated data

DGP              ASD    AAD    ASRD   AARD   ALCI
EP(0, 0.1, 0.2)  1.258  1.106  1.259  1.106  1.100
N(0, 0.1)        1.064  1.038  1.058  1.033  1.017


29/31

Real Data Analysis

Data Source: 2002 Natality public-use data

We drew 6 sets of samples of size $n = 4,526$ using simple random sampling within states from the finite population.

The state level sample sizes $n_i$ ranged from 7 (for small states such as Vermont) to 690 (for California).

Using each sampled data set, we computed the HB estimates for each model using the two auxiliary variables.


30/31

The prior assumptions for the hyperparameters are the same as we used earlier.

To evaluate the two HB models, the five evaluation statistics were computed for each HB estimator.

The numbers in the table consistently show that the Bernoulli-Logit-EP model works better than the Bernoulli-Logit-Normal model in terms of the five evaluation statistics.


31/31

Table 3: ASD, AAD, ASRD, AARD, and ALCI for the HB estimators using real data

Sample  Model   ASD      AAD      ASRD     AARD     ALCI
1       EP      0.00021  0.01168  0.00283  0.15695  0.05410
1       Normal  0.00021  0.01176  0.00285  0.15809  0.05920
2       EP      0.00007  0.00653  0.00102  0.08779  0.06544
2       Normal  0.00010  0.00746  0.00133  0.09948  0.06952
3       EP      0.00012  0.00812  0.00139  0.10284  0.04663
3       Normal  0.00013  0.00853  0.00154  0.10776  0.05214
4       EP      0.00070  0.01846  0.00988  0.25118  0.12736
4       Normal  0.00087  0.02061  0.01241  0.28247  0.13299
5       EP      0.00043  0.01699  0.00569  0.22286  0.10696
5       Normal  0.00063  0.02057  0.00810  0.26668  0.12456
6       EP      0.00086  0.02238  0.01121  0.28965  0.12820
6       Normal  0.00147  0.02994  0.01876  0.38553  0.14330
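The Table 3 entries can be compared in the same Normal/EP ratio form used for the simulated data. The sketch below copies the values from Table 3 and computes the ratio for every sample and statistic; the variable and function names are hypothetical, and the ratios are computed from the rounded table entries, so they are only approximate.

```python
STATS = ("ASD", "AAD", "ASRD", "AARD", "ALCI")

# Values copied from Table 3: sample -> model -> (ASD, AAD, ASRD, AARD, ALCI).
TABLE3 = {
    1: {"EP": (0.00021, 0.01168, 0.00283, 0.15695, 0.05410),
        "Normal": (0.00021, 0.01176, 0.00285, 0.15809, 0.05920)},
    2: {"EP": (0.00007, 0.00653, 0.00102, 0.08779, 0.06544),
        "Normal": (0.00010, 0.00746, 0.00133, 0.09948, 0.06952)},
    3: {"EP": (0.00012, 0.00812, 0.00139, 0.10284, 0.04663),
        "Normal": (0.00013, 0.00853, 0.00154, 0.10776, 0.05214)},
    4: {"EP": (0.00070, 0.01846, 0.00988, 0.25118, 0.12736),
        "Normal": (0.00087, 0.02061, 0.01241, 0.28247, 0.13299)},
    5: {"EP": (0.00043, 0.01699, 0.00569, 0.22286, 0.10696),
        "Normal": (0.00063, 0.02057, 0.00810, 0.26668, 0.12456)},
    6: {"EP": (0.00086, 0.02238, 0.01121, 0.28965, 0.12820),
        "Normal": (0.00147, 0.02994, 0.01876, 0.38553, 0.14330)},
}

def normal_to_ep_ratios(table):
    """Ratio Normal/EP for each sample and statistic; above 1 favors EP."""
    return {
        sample: {
            stat: normal / ep
            for stat, normal, ep in zip(STATS, rows["Normal"], rows["EP"])
        }
        for sample, rows in table.items()
    }

for sample, ratios in normal_to_ep_ratios(TABLE3).items():
    print(sample, {stat: round(value, 3) for stat, value in ratios.items()})
```

Every ratio is at least 1, with the only tie being the ASD for sample 1 at the reported precision.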
