8/6/2019 16 Partha Lahiri
1/31
SMALL DOMAIN PROPORTION ESTIMATION: AN ADAPTIVE BAYESIAN APPROACH
Partha Lahiri
JPSM, University of Maryland, College Park, USA
[Based on joint work with Ms. Benmei Liu]
Examples
Estimation of batting averages of major league baseball players (Efron and Morris, 1975)
Small Area Income and Poverty Estimation (SAIPE)
Survey of drug use in Nebraska
Borrowing Strength:
Relevant Source of Information
  Census data
  Administrative information
  Related surveys
Method of Combining Information
  Choices of good small area models
  Use of a good statistical methodology
A Basic Area Level Model
$P_i$: true proportion for area $i$
$p_i$: direct design-based estimate for area $i$
$x_i$: a vector of known auxiliary variables
Model: For $i = 1, \ldots, m$,
Level 1: $\hat{\theta}_i = g(p_i) \sim$ ind. $N(\theta_i, D_i)$
Level 2: $\theta_i = g(P_i) \sim$ ind. $N(x_i^T \beta, \sigma^2)$
The sampling variances $D_i$ are assumed to be known.
Carter and Rolph (1974): $g(p_i) = \arcsin(\sqrt{p_i})$, $D_i = \frac{1}{4 n_i}$
Efron and Morris (1975): $g(p_i) = \sqrt{n_i}\,\arcsin(2 p_i - 1)$, $D_i = 1$
Fay and Herriot (1979): $g(p_i) = \log(p_i)$, $D_i$ estimated by GVF
SAIPE: $g(p_i) = p_i$ for state level estimation of the proportion of poor school-age children and $g(Y_i) = \log(Y_i)$ for county level poverty counts of school-age children
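The arcsine choice can be checked with a short delta-method calculation. The sketch below is only an illustration under simple binomial sampling; the function names are hypothetical and not from the slides. It shows that the approximate variance of $\arcsin(\sqrt{p_i})$ is $1/(4 n_i)$ whatever the true proportion is, which is why the sampling variance can be treated as known.

```python
import math


def arcsine_transform(p, n):
    """Return (g(p), approximate sampling variance) for g(p) = arcsin(sqrt(p)).

    Assumes p is a direct estimate from n binomial trials (an assumption of
    this sketch, not a statement about any particular survey design).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.asin(math.sqrt(p)), 1.0 / (4.0 * n)


def delta_method_variance(p, n):
    """Delta-method variance of arcsin(sqrt(p_hat)) under binomial sampling."""
    # d/dp arcsin(sqrt(p)) = 1 / (2 * sqrt(p * (1 - p)))
    derivative = 1.0 / (2.0 * math.sqrt(p * (1.0 - p)))
    return derivative ** 2 * p * (1.0 - p) / n


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5):
        # The variance is the same for every p: 1 / (4 * 50) = 0.005
        print(p, delta_method_variance(p, 50))
```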
Comments:
The model is simple and does not require knowledge of detailed design information (e.g., PSU identifiers), which may not be available in a public-use file.
The resulting empirical best predictor (EBP) of $\theta_i = g(P_i)$ is design-consistent.
The EBP method is extendable to specified non-normal distributions for the sampling and random effects.
For unspecified non-normality of the sampling and random effects, one can use EBLUP [Lahiri and Rao, 1995], certain adaptive methods [Li and Lahiri, 2007; Fabrizi and Trivisano, 2007], or linear EB [Ghosh and Lahiri, 1987].
Known sampling variances $D_i$: GVF-type methods are generally used. These methods usually do not consider small area effects, and the uncertainty in estimating the sampling variances is not included in the EBP.
In some situations, standard estimates [REML, ML, ANOVA, etc.] of the model variance $\sigma^2$ can be zero.
When the estimate of the model variance $\sigma^2$ is zero, EBLUP reduces to the regression synthetic estimate. One way to avoid the problem is to use the ADM or AML estimates [Morris, 1987; Li and Lahiri, 2007].
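The reduction to the synthetic estimate is easy to see from the usual shrinkage form of the best predictor on the transformed scale. The sketch below is a minimal illustration, assuming the basic area level model with a known sampling variance (written `sampling_var` here) and a supplied value of the model variance (`model_var`); the function name and arguments are hypothetical.

```python
def best_predictor(theta_hat, synthetic, sampling_var, model_var):
    """Shrinkage form of the best predictor under the basic area level model.

    theta_hat    : direct estimate on the transformed scale, g(p_i)
    synthetic    : regression synthetic estimate x_i' beta
    sampling_var : sampling variance for the area (assumed known)
    model_var    : model variance (assumed supplied, e.g. an estimate)
    """
    if sampling_var <= 0:
        raise ValueError("sampling_var must be positive")
    if model_var < 0:
        raise ValueError("model_var must be non-negative")
    # Weight given to the synthetic part; equals 1 when model_var is 0.
    shrinkage = sampling_var / (sampling_var + model_var)
    return (1.0 - shrinkage) * theta_hat + shrinkage * synthetic


if __name__ == "__main__":
    # With a zero model variance the direct estimate is ignored entirely.
    print(best_predictor(0.8, 0.5, 0.04, 0.0))
    # With equal variances the predictor sits halfway between the two.
    print(best_predictor(0.8, 0.5, 0.04, 0.04))
```

A strictly positive variance estimate, which is what the ADM and AML approaches aim for, keeps some weight on the direct estimate in every area.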
The rationale behind the transformation $g(\cdot)$ rests on the Taylor series argument, and the transformation is used primarily to stabilize the variance. Direct modeling of the direct estimates is possible, but this is likely to lead to a non-linear, non-normal mixed model.
A simple back transformation is often used to obtain the estimate of $P_i$. The optimality property of the BP is lost by such a back transformation.
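A small worked case may help show why the back transformation loses optimality. The sketch assumes, purely for illustration, a log transformation and a normal posterior for the transformed parameter with mean `post_mean` and variance `post_var`; both function names are hypothetical. Under that assumption the mean on the original scale is the lognormal mean, which differs from the naive back-transformed value.

```python
import math


def naive_back_transform(post_mean):
    """Apply the inverse of g = log directly to the predictor on the log scale."""
    return math.exp(post_mean)


def mean_on_original_scale(post_mean, post_var):
    """Mean of exp(theta) when theta is normal(post_mean, post_var).

    This is the lognormal mean; the normality is an assumption of this sketch.
    """
    return math.exp(post_mean + 0.5 * post_var)


if __name__ == "__main__":
    m, v = math.log(0.15), 0.09
    naive = naive_back_transform(m)
    exact = mean_on_original_scale(m, v)
    # The ratio exact / naive equals exp(v / 2), so the gap grows with v.
    print(naive, exact, exact / naive)
```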
Measures of uncertainty and confidence intervals are quite challenging problems, and the theory rests on asymptotics.
Hierarchical Bayes implementation of the basic area level model provides exact inference at the expense of specifying priors for the hyperparameters.
Estimation of Small Area Proportions: Two Basic Area Models
Ref: Liu, Lahiri and Kalton (2007)
Model 1:
Level 1: $p_i \mid P_i \sim$ ind $N\left(P_i, \frac{P_i(1 - P_i)}{n_i}\,\mathrm{DEFF}_i\right)$
Level 2: $g(P_i) \sim$ ind $N(x_i^T \beta, \sigma^2)$
Model 2:
Level 1: $p_i \mid P_i \sim$ ind Beta$\left(P_i, \frac{P_i(1 - P_i)}{n_i}\,\mathrm{DEFF}_i\right)$
Level 2: $g(P_i) \sim$ ind $N(x_i^T \beta, \sigma^2)$
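Model 2 appears to specify the Beta distribution through its mean and variance. One way to turn that into the usual two shape parameters is sketched below, assuming the mean is the true proportion and the variance is the binomial variance inflated by the design effect; the function names and the resulting "effective sample size" parameterization are assumptions of this sketch rather than something stated on the slide.

```python
def beta_shape_parameters(P, n, deff):
    """Shape parameters (a, b) of a Beta with mean P and variance P(1-P)*deff/n."""
    if not 0.0 < P < 1.0:
        raise ValueError("P must lie strictly between 0 and 1")
    effective_n = n / deff  # sample size deflated by the design effect
    if effective_n <= 1.0:
        raise ValueError("n / deff must exceed 1 for a valid Beta distribution")
    a = P * (effective_n - 1.0)
    b = (1.0 - P) * (effective_n - 1.0)
    return a, b


def beta_mean_and_variance(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    total = a + b
    return a / total, a * b / (total ** 2 * (total + 1.0))


if __name__ == "__main__":
    a, b = beta_shape_parameters(P=0.2, n=100, deff=2.0)
    # Recovers mean 0.2 and variance 0.2 * 0.8 * 2 / 100 = 0.0032
    print(a, b, beta_mean_and_variance(a, b))
```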
Comments
Both EBP [Jiang and Lahiri 2002; Chatterjee and Lahiri 2008] and the Bayesian implementation [Liu, Lahiri and Kalton 2007] of the above models are possible.
Level 1 modeling could be problematic in the presence of a sizable number of zeroes for small areas.
$\mathrm{DEFF}_i = \frac{\sum_h W_{ih}^2 P_{ih}(1 - P_{ih})/n_{ih}}{P_i(1 - P_i)/n_i}$; $W_{ih} = N_{ih}/N_i$; $N_i = \sum_h N_{ih}$; $n_i = \sum_h n_{ih}$
$P_{ih}$ is the population proportion for stratum $h$ in area $i$.
The design effect $\mathrm{DEFF}_i$ is a function of the $P_{ih}$, which are unknown.
If $P_{ih} \approx P_i$, then $\mathrm{DEFF}_i \approx \mathrm{deff}_i = n_i \sum_h W_{ih}^2 / n_{ih}$.
DEFF estimation typically requires a synthetic assumption, and the variability due to estimation of the DEFF is not accounted for (other refs: papers by Rao and You; Folsom and Singh).
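The approximate design effect depends only on stratum population sizes and stratum sample sizes, so it can be computed directly. The sketch below is an illustration with a hypothetical function name and made-up stratum sizes; it implements $n_i \sum_h W_{ih}^2 / n_{ih}$ with $W_{ih} = N_{ih}/N_i$.

```python
def approximate_deff(stratum_pop_sizes, stratum_sample_sizes):
    """Approximate design effect for one area when stratum proportions are equal.

    stratum_pop_sizes    : N_ih for each stratum h in the area
    stratum_sample_sizes : n_ih for each stratum h in the area
    """
    if len(stratum_pop_sizes) != len(stratum_sample_sizes):
        raise ValueError("need one sample size per stratum")
    if any(n_ih <= 0 for n_ih in stratum_sample_sizes):
        raise ValueError("every stratum needs a positive sample size")
    N_i = sum(stratum_pop_sizes)
    n_i = sum(stratum_sample_sizes)
    return n_i * sum(
        (N_ih / N_i) ** 2 / n_ih
        for N_ih, n_ih in zip(stratum_pop_sizes, stratum_sample_sizes)
    )


if __name__ == "__main__":
    # Proportional allocation gives a design effect of 1.
    print(approximate_deff([1000, 3000], [10, 30]))
    # Equal allocation across unequal strata inflates it.
    print(approximate_deff([1000, 3000], [20, 20]))
```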
A Unit Level Model:
Level 1: $y_{ik} \mid \pi_i \sim$ ind. Bernoulli($\pi_i$);
Level 2: $\mathrm{logit}(\pi_i) = x_i'\beta + v_i$,
where $v_i \sim$ iid $N(0, \sigma^2)$, $i = 1, \ldots, m$.
Ref:
MacGibbon and Tomberlin (1989)
Malec, Sedransk, Moriarity and LeClere (1997)
Malec, Sedransk and Tompkins (1993)
Malec, Davis and Cao (1999)
Ghosh et al. (1998)
An Illustration of Platykurtic Random Effects:
Study Population: 2002 natality public-use data file that contains records on all births (4,024,378) occurring within the United States in 2002.
For $i = 1, \ldots, 51$:
$P_i$: true low birth weight rate
$x_{i1}$: proportion of mothers of age < 15 yr
$x_{i2}$: proportion of cases where the newborn is the first child in the family
Fit
Level 2: $\mathrm{logit}(P_i) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + v_i$
Obtain the residuals.
Analyze the residuals for possible departure from normality.
The $p$-value from the Kolmogorov-Smirnov test is 0.0436.
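The exponential power (EP) distribution
Ref: Box and Tiao (1973)
$f_{EP}(x \mid \mu, \sigma, \psi) = \frac{c_1}{\sigma} \exp\left\{ -\left| \frac{c_0^{1/2}(x - \mu)}{\sigma} \right|^{1/\psi} \right\}$, $-\infty < x < +\infty$,
where $\mu \in R$, $\sigma \in R^+$, $\psi \in (0, 1]$,
$c_0 = \Gamma(3\psi)/\Gamma(\psi)$, $c_1 = c_0^{1/2}/(2\psi\,\Gamma(\psi))$.
$\mu$: location
$\sigma$: scale
The excess of kurtosis is:
$\kappa = \frac{\Gamma(\psi)\,\Gamma(5\psi)}{\Gamma(3\psi)^2} - 3$.
Normal: $\psi = 0.5$; Platykurtic: $\psi < 0.5$; Leptokurtic: $\psi > 0.5$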
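The role of the shape parameter can be checked numerically. The sketch below is an illustration only; it assumes the Box and Tiao form of the EP density with location `mu`, scale `sigma`, and a shape parameter `psi` in (0, 1], and all function names are hypothetical. At a shape value of 0.5 the density coincides with the normal density and the excess kurtosis is zero; values below 0.5 give negative excess kurtosis (platykurtic) and values above 0.5 give positive excess kurtosis (leptokurtic).

```python
import math


def ep_constants(psi):
    """Normalizing constants c0 and c1 of the EP density for shape psi."""
    if not 0.0 < psi <= 1.0:
        raise ValueError("psi must lie in (0, 1]")
    c0 = math.gamma(3.0 * psi) / math.gamma(psi)
    c1 = math.sqrt(c0) / (2.0 * psi * math.gamma(psi))
    return c0, c1


def ep_pdf(x, mu, sigma, psi):
    """Density of the exponential power distribution at x."""
    c0, c1 = ep_constants(psi)
    z = math.sqrt(c0) * (x - mu) / sigma
    return (c1 / sigma) * math.exp(-abs(z) ** (1.0 / psi))


def ep_excess_kurtosis(psi):
    """Excess kurtosis of the EP distribution as a function of psi."""
    return (math.gamma(psi) * math.gamma(5.0 * psi)
            / math.gamma(3.0 * psi) ** 2 - 3.0)


if __name__ == "__main__":
    for psi in (0.2, 0.5, 1.0):
        print(psi, ep_excess_kurtosis(psi))
```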
NCHS Data Analyses Cont.
$v_i \sim EP(0, \sigma, \psi)$
Assume that the components of $(\psi, \sigma)$ are independent and i) $\psi \sim$ Unif(0, 1) and ii) $\sigma \sim$ Unif(0, K), where K is a large positive number.
The posterior mean of $\psi$ is 0.2.
The one-sided 95% credible interval is (0, 0.473), which does not include the normal case ($\psi = 0.5$).
Among several alternatives, the EP model with $\psi = 0.2$ fits the data best in terms of the smallest DIC (Ref on DIC: Spiegelhalter et al., 2002).
Figure 1. Normal Q-Q plot of the random effect $v_i$ (x-axis: Theoretical Quantiles; y-axis: Sample Quantiles)
Figure 2. Normal Q-Q plot of randomly generated platykurtic data (x-axis: Theoretical Quantiles; y-axis: Sample Quantiles)
Figure 3. Posterior density of kurtosis psi (x-axis: psi, from 0.0 to 1.0; y-axis: Density)
Our Unit Level Model for Estimating Small Area Proportions
For $i = 1, \ldots, m$
Level 1: $y_{ik} \mid \pi_i \sim$ ind. Bernoulli($\pi_i$);
Level 2: $\mathrm{logit}(\pi_i) = x_i'\beta + v_i$
Two approaches for modeling the area specific random effects $v_i$:
Option 1: Assume that the kurtosis of $v_i$ is known; e.g., $v_i \sim$ iid $N(0, \sigma^2)$ - Bernoulli-Logit-Normal Model
Option 2: Assume that the distribution of $v_i$ is a member of a class of distributions that covers a wide range of kurtosis values and let the data determine the unknown kurtosis; e.g., assume $v_i \sim EP(0, \sigma, \psi)$ and estimate $\psi$ just as we would estimate the other hyperparameters of the hierarchical model - Bernoulli-Logit-EP model
Comments:
In an unpublished work in the early 90s, Lahiri and Rao considered a robust extension of the Battese-Harter-Fuller (1988) model using EP.
Fabrizi and Trivisano (2007): robust extensions to the Fay-Herriot model for continuous data using EP.
Li and Lahiri (2007) considered a super-population model chosen adaptively from the well-known Box-Cox class of transformations.
Simulated Data Analysis
Two aims:
When $v_i$ is non-normal, how effective is the Bernoulli-Logit-EP model relative to the Bernoulli-Logit-Normal?
When $v_i$ is indeed normal, what is the effect of overparametrization for the Bernoulli-Logit-EP model?
$m = 100$, $n_i = 5$, $x_i'\beta = \mu = 0$.
We consider two cases: $\psi = 0.2$ (platykurtic) and $\psi = 0.5$ (normal).
For each of the two cases, one sample was generated from the models: $\mathrm{logit}(\pi_i) = \mu + v_i$, $v_i \sim EP(0, \sigma = 0.1, \psi)$, $i = 1, \ldots, m$, and $y_{ij} \sim$ Bernoulli($\pi_i$), $j = 1, \ldots, n_i$, $i = 1, \ldots, m$.
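A data-generating step of this kind could be sketched as follows. This is not the authors' code: the function names, the seed, and the gamma representation used to draw EP variates are assumptions of the sketch. It relies on the fact that, for the Box and Tiao EP density with shape `psi`, the standardized absolute deviation raised to the power 1/psi follows a Gamma(psi, 1) distribution, and it treats the value 0.1 as the EP scale parameter.

```python
import math
import random


def draw_ep(rng, mu, sigma, psi):
    """One draw from EP(mu, sigma, psi) via a gamma representation."""
    c0 = math.gamma(3.0 * psi) / math.gamma(psi)
    g = rng.gammavariate(psi, 1.0)  # |sqrt(c0) (x - mu) / sigma|^(1/psi)
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return mu + sign * sigma * g ** psi / math.sqrt(c0)


def simulate_bernoulli_logit_ep(m=100, n_i=5, mu=0.0, sigma=0.1, psi=0.2, seed=1):
    """Simulate true area proportions and unit-level binary outcomes."""
    rng = random.Random(seed)
    true_props, samples = [], []
    for _ in range(m):
        v = draw_ep(rng, 0.0, sigma, psi)         # area random effect
        prob = 1.0 / (1.0 + math.exp(-(mu + v)))  # inverse logit
        y = [1 if rng.random() < prob else 0 for _ in range(n_i)]
        true_props.append(prob)
        samples.append(y)
    return true_props, samples


if __name__ == "__main__":
    props, samples = simulate_bernoulli_logit_ep()
    direct = [sum(y) / len(y) for y in samples]  # direct estimates per area
    print(props[:3], direct[:3])
```

Setting `psi=0.5` in the same function gives the normal case, so both scenarios can be generated from one routine.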
Priors for the hyperparameters: i) $f(\mu) \propto 1$; ii) $\sigma \sim$ Unif(0, K), and iii) $\psi \sim$ Unif(0, 1).
We computed HB estimates for the two models using WinBUGS. For each WinBUGS run, three independent chains were used. For each chain, 1,000 burn-in samples were produced, followed by 4,000 samples after burn-in. The resultant 12,000 MCMC samples after burn-in were then used to compute the posterior means and percentiles for each HB model based on each sample dataset. The potential scale reduction factor $\hat{R}$ was used as the primary measure for convergence (see Gelman and Rubin, 1992).
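For readers who want to see what the convergence measure computes, here is a minimal sketch of the basic potential scale reduction factor for a single scalar parameter, given several chains of equal length. It follows the standard between-chain and within-chain variance comparison; the function name is hypothetical, and this is not the WinBUGS implementation, which may differ in detail.

```python
import statistics


def potential_scale_reduction(chains):
    """Basic potential scale reduction factor for one scalar parameter.

    chains : list of equal-length lists of post-burn-in draws, one per chain.
    """
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    if within == 0:
        raise ValueError("within-chain variance is zero")
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n  # pooled variance estimate
    return (pooled / within) ** 0.5


if __name__ == "__main__":
    # Two chains stuck in different regions give a value far above 1.
    print(potential_scale_reduction([[0, 1, 0, 1], [10, 11, 10, 11]]))
```

Values close to 1 suggest that the chains are sampling from the same distribution; values well above 1 suggest that more iterations are needed.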
Let $P_i^{HB}$ denote an HB estimator of $P_i$, and let $P_{i,q}^{HB}$ denote the $q$th percentile of the posterior distribution of $P_i$. To evaluate the two HB models, the following evaluation statistics for each HB estimator are calculated:
Average squared deviation (ASD), $ASD = \frac{1}{m} \sum_{i=1}^{m} (P_i^{HB} - P_i)^2$
Average absolute deviation (AAD), $AAD = \frac{1}{m} \sum_{i=1}^{m} |P_i^{HB} - P_i|$
Average squared relative deviation (ASRD), $ASRD = \frac{1}{m} \sum_{i=1}^{m} \left((P_i^{HB} - P_i)/P_i\right)^2$
Average absolute relative deviation (AARD), $AARD = \frac{1}{m} \sum_{i=1}^{m} |P_i^{HB} - P_i| / P_i$
Average length of the 95% credible interval (ALCI), $ALCI = \frac{1}{m} \sum_{i=1}^{m} (P_{i,.975}^{HB} - P_{i,.025}^{HB})$
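The five evaluation statistics (ASD, AAD, ASRD, AARD, ALCI) translate directly into code. The sketch below assumes four equal-length lists are available: the true proportions, the HB point estimates, and the 2.5th and 97.5th posterior percentiles. The function name and the toy numbers in the usage example are hypothetical.

```python
def evaluation_statistics(truth, estimates, lower, upper):
    """Compute ASD, AAD, ASRD, AARD and ALCI over m small areas.

    truth     : true proportions (must be non-zero for the relative measures)
    estimates : HB point estimates
    lower     : 2.5th posterior percentiles
    upper     : 97.5th posterior percentiles
    """
    m = len(truth)
    if m == 0 or not (len(estimates) == len(lower) == len(upper) == m):
        raise ValueError("all inputs must have the same non-zero length")
    errors = [est - p for est, p in zip(estimates, truth)]
    return {
        "ASD": sum(e * e for e in errors) / m,
        "AAD": sum(abs(e) for e in errors) / m,
        "ASRD": sum((e / p) ** 2 for e, p in zip(errors, truth)) / m,
        "AARD": sum(abs(e) / p for e, p in zip(errors, truth)) / m,
        "ALCI": sum(u - l for l, u in zip(lower, upper)) / m,
    }


if __name__ == "__main__":
    stats = evaluation_statistics(
        truth=[0.1, 0.2],
        estimates=[0.12, 0.18],
        lower=[0.05, 0.10],
        upper=[0.20, 0.30],
    )
    print(stats)
```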
Table 1: Ratios of ASD, AAD, ASRD, AARD, and ALCI for the two models (Normal/EP) using the simulated data
DGP              ASD    AAD    ASRD   AARD   ALCI
EP(0, 0.1, 0.2)  1.258  1.106  1.259  1.106  1.100
N(0, 0.1)        1.064  1.038  1.058  1.033  1.017
Real Data Analysis
Data Source: 2002 Natality public-use data
We drew 6 sets of samples of size n = 4,526 using simple random sampling within states from the finite population.
The state level sample sizes $n_i$ ranged from 7 (for small states such as Vermont) to 690 (for California).
Using each sampled dataset, we computed the HB estimates for each model using the two auxiliary variables.
The prior assumptions for the hyperparameters are the same as those used earlier.
To evaluate the two HB models, the five evaluation statistics were computed for each HB estimator.
The numbers in the table consistently show that the Bernoulli-Logit-EP model works better than the Bernoulli-Logit-Normal model in terms of the five evaluation statistics.
Table 3: ASD, AAD, ASRD, AARD and ALCI for the HB estimators using real data
Sample  Model   ASD      AAD      ASRD     AARD     ALCI
1       EP      0.00021  0.01168  0.00283  0.15695  0.05410
1       Normal  0.00021  0.01176  0.00285  0.15809  0.05920
2       EP      0.00007  0.00653  0.00102  0.08779  0.06544
2       Normal  0.00010  0.00746  0.00133  0.09948  0.06952
3       EP      0.00012  0.00812  0.00139  0.10284  0.04663
3       Normal  0.00013  0.00853  0.00154  0.10776  0.05214
4       EP      0.00070  0.01846  0.00988  0.25118  0.12736
4       Normal  0.00087  0.02061  0.01241  0.28247  0.13299
5       EP      0.00043  0.01699  0.00569  0.22286  0.10696
5       Normal  0.00063  0.02057  0.00810  0.26668  0.12456
6       EP      0.00086  0.02238  0.01121  0.28965  0.12820
6       Normal  0.00147  0.02994  0.01876  0.38553  0.14330
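Table 1 summarizes the simulated data through Normal/EP ratios, and the same ratios can be formed from the Table 3 values for the real data. The sketch below simply re-enters the numbers printed in Table 3 and divides them; the variable and function names are hypothetical. A ratio of 1 or more means the EP model did at least as well on that statistic.

```python
STATS = ("ASD", "AAD", "ASRD", "AARD", "ALCI")

# Values copied from Table 3, keyed by sample number and model.
TABLE_3 = {
    1: {"EP": (0.00021, 0.01168, 0.00283, 0.15695, 0.05410),
        "Normal": (0.00021, 0.01176, 0.00285, 0.15809, 0.05920)},
    2: {"EP": (0.00007, 0.00653, 0.00102, 0.08779, 0.06544),
        "Normal": (0.00010, 0.00746, 0.00133, 0.09948, 0.06952)},
    3: {"EP": (0.00012, 0.00812, 0.00139, 0.10284, 0.04663),
        "Normal": (0.00013, 0.00853, 0.00154, 0.10776, 0.05214)},
    4: {"EP": (0.00070, 0.01846, 0.00988, 0.25118, 0.12736),
        "Normal": (0.00087, 0.02061, 0.01241, 0.28247, 0.13299)},
    5: {"EP": (0.00043, 0.01699, 0.00569, 0.22286, 0.10696),
        "Normal": (0.00063, 0.02057, 0.00810, 0.26668, 0.12456)},
    6: {"EP": (0.00086, 0.02238, 0.01121, 0.28965, 0.12820),
        "Normal": (0.00147, 0.02994, 0.01876, 0.38553, 0.14330)},
}


def normal_to_ep_ratios(table):
    """Return {sample: {statistic: Normal / EP}} for every sample in the table."""
    return {
        sample: {
            name: normal / ep
            for name, normal, ep in zip(STATS, models["Normal"], models["EP"])
        }
        for sample, models in table.items()
    }


if __name__ == "__main__":
    for sample, ratios in normal_to_ep_ratios(TABLE_3).items():
        print(sample, {name: round(value, 3) for name, value in ratios.items()})
```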