Austral. J. Statist., 17 (1), 1975, 12-21
A NOTE ON OPTIMUM STRATIFICATION IN SAMPLING WITH VARYING PROBABILITIES
RAVINDRA SINGH Punjab Agricultural University, Ludhiana, India
Summary
Singh and Sukhatme [4] considered the problem of optimum stratification on an auxiliary variable x when the units from the different strata are selected with probability proportional to the value of the auxiliary variable and the sample sizes for the different strata are determined by the Neyman allocation method. The present paper considers the same problem for the proportional and equal allocation methods. Rules for finding approximately optimum strata boundaries for these two allocation methods are given. The relative efficiency of these allocation methods with respect to the Neyman allocation is also investigated. The performance of equal allocation is found to be better than that of proportional allocation and practically equivalent to that of the Neyman allocation.
1. Introduction
Let the population under study be divided into L strata and a stratified sample of size n be drawn from this population, the sample drawn from the h-th stratum being of size $n_h$ so that $\sum n_h = n$. It is assumed that the units from the different strata are drawn with probabilities proportional to the values of the auxiliary variable x and with replacement (ppswr). It is well known that if the study variable y is highly positively correlated with the variable x and the regression line of y on x passes (or nearly passes) through the origin, then selecting the units with probability proportional to x provides a more efficient estimate of the population mean $\bar{Y}$ than the simple mean estimate $\bar{y}$. It is easy to see that an unbiased stratified estimate of the mean $\bar{Y}$ is given by
$$\bar{y}_{st} = N^{-1} \sum_{h=1}^{L} \hat{y}_h, \tag{1.1}$$
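where $\hat{y}_h$ is the usual unbiased estimate of the h-th stratum total $Y_h$ and is based on the $n_h$ units drawn from that stratum. If, for the h-th stratum, $y_{hi}$ is the value of the study variable y for the i-th unit, $x_{hi}$ is the corresponding value of the auxiliary variable x and $X_h$ denotes the stratum total for x, the variance of the estimate $\bar{y}_{st}$ is given by
$$V(\bar{y}_{st}) = N^{-2} \sum_{h=1}^{L} \sigma_h^2 / n_h, \tag{1.2}$$
where
$$\sigma_h^2 = \sum_{i} \frac{x_{hi}}{X_h} \left( \frac{y_{hi} X_h}{x_{hi}} - Y_h \right)^2, \tag{1.3}$$
the sum extending over all units of the h-th stratum.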
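The estimator and its within-stratum variance can be illustrated with a short Python sketch. The function names, the toy data and the use of `random.choices` for the ppswr draws are assumptions made for this example only; the variance function uses the standard with-replacement pps form, with $p_i = x_{hi}/X_h$.

```python
import random

def pps_total_estimate(y, x, n_h, rng):
    """Estimate a stratum total of y from n_h draws made with
    probability proportional to x, with replacement."""
    X_h = sum(x)
    draws = rng.choices(range(len(x)), weights=x, k=n_h)
    # Each drawn unit contributes y_i / p_i, where p_i = x_i / X_h.
    return sum(y[i] * X_h / x[i] for i in draws) / n_h

def stratified_pps_mean(strata, sizes, seed=0):
    """strata: list of (y_values, x_values); sizes: list of n_h."""
    rng = random.Random(seed)
    N = sum(len(y) for y, _ in strata)
    total = sum(pps_total_estimate(y, x, n_h, rng)
                for (y, x), n_h in zip(strata, sizes))
    return total / N

def pps_stratum_variance(y, x):
    """sigma_h^2 = sum_i p_i * (y_i / p_i - Y_h)^2 with p_i = x_i / X_h."""
    X_h, Y_h = sum(x), sum(y)
    return sum((xi / X_h) * (yi * X_h / xi - Y_h) ** 2 for yi, xi in zip(y, x))

# Toy data (hypothetical): y is exactly proportional to x in both strata.
strata = [([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]), ([8.0, 10.0], [4.0, 5.0])]
estimate = stratified_pps_mean(strata, sizes=[2, 2])
variances = [pps_stratum_variance(y, x) for y, x in strata]
```

In this toy case y is exactly proportional to x, so every draw reproduces the stratum total: the estimate equals the true mean whatever units are drawn, and both stratum variances vanish. This is the situation in which selection with probability proportional to x is most effective.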
Manuscript received April 30, 1974; revised September 4, 1974.
Whatever the method of allocation, the variance is clearly a function of the strata boundaries. The question of whether the relationship between y and x can also be profitably utilized to further increase the accuracy of the estimate of the population mean $\bar{Y}$, by using techniques such as stratification on the same variable x, was considered by Singh and Sukhatme [4]. For this purpose they considered the Neyman method of sample allocation and obtained a method of determining the approximately optimum strata boundaries (AOSB) for this allocation. In practice the use of Neyman allocation requires advance knowledge of the strata variances $\sigma_h^2$ (h = 1, 2, ..., L) (defined in (1.3)), which is in general not available. It is, therefore, important to examine the possibility of having some other equally efficient allocation method which does not require any information about the strata parameters. One such procedure is the equal allocation method. In this paper we examine the performance of this allocation relative to the Neyman allocation. To widen the scope of the present investigation we have also considered proportional allocation, which is very popular with survey statisticians.
2. Minimal Equations and their Approximate Solutions
Under the superpopulation model (2.1) of [4] it can easily be seen that the expected value of the variance in (1.2) with proportional and equal allocations of the sample to the different strata is given by
$$V_P(\bar{y}_{st}) = n^{-1} \sum_{h=1}^{L} W_h (\mu_{hx}\mu_{h\theta} - \mu_{hc}^2) \tag{2.1}$$
and
$$V_E(\bar{y}_{st}) = L n^{-1} \sum_{h=1}^{L} W_h^2 (\mu_{hx}\mu_{h\theta} - \mu_{hc}^2) \tag{2.2}$$
where $\theta(x) = (c^2(x) + \varphi(x))/x$, $\mu_{h\theta}$ etc. denote the expected values of the corresponding functions of x in the h-th stratum, and $W_h$ is the proportion of the population falling in that stratum.
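A minimal numerical sketch of (2.1) and (2.2) is given below. It evaluates the stratum moments by a midpoint rule and returns $nV_P$ and $nV_E$ for a given set of boundaries. The helper names and the example model ($c(x) = x$, $\varphi(x) = 0.25$, a uniform density on (1, 2) and the boundary 1.4427) are hypothetical choices made only for illustration.

```python
def stratum_moments(f, c, phi, lo, hi, steps=2000):
    """Midpoint-rule values of W_h, mu_hx, mu_htheta and mu_hc for (lo, hi)."""
    w = (hi - lo) / steps
    W = Sx = Sth = Sc = 0.0
    for k in range(steps):
        t = lo + (k + 0.5) * w
        ft = f(t) * w
        W += ft
        Sx += t * ft
        Sth += (c(t) ** 2 + phi(t)) / t * ft   # theta(t) = (c^2 + phi) / t
        Sc += c(t) * ft
    return W, Sx / W, Sth / W, Sc / W

def expected_variances(f, c, phi, bounds):
    """Return (n*V_P, n*V_E) for boundaries a = x_0 < x_1 < ... < x_L = b."""
    L = len(bounds) - 1
    n_vp = n_ve = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        W, mx, mth, mc = stratum_moments(f, c, phi, lo, hi)
        s2 = mx * mth - mc ** 2
        n_vp += W * s2              # proportional allocation, as in (2.1)
        n_ve += L * W ** 2 * s2     # equal allocation, as in (2.2)
    return n_vp, n_ve

# Hypothetical example: c(x) = x, phi(x) = 0.25, uniform density on (1, 2).
f = lambda t: 1.0
c = lambda t: t
phi = lambda t: 0.25
n_vp, n_ve = expected_variances(f, c, phi, [1.0, 1.4427, 2.0])
one_stratum = expected_variances(f, c, phi, [1.0, 2.0])
```

For this example $nV_P$ comes out at roughly 0.2525. With a single stratum the two allocations coincide, so the two returned values agree.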
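Proceeding on the lines of [4], it is easy to see that the minimal equations giving the optimum strata boundaries corresponding to the minimum of the variance in (2.1) are, for i = h + 1 and h = 1, 2, ..., L-1,
$$\begin{aligned} &(\theta(x_h)\mu_{hx} + x_h\mu_{h\theta} - \mu_{hx}\mu_{h\theta}) - [c^2(x_h) - (c(x_h) - \mu_{hc})^2] \\ &\quad = (\theta(x_h)\mu_{ix} + x_h\mu_{i\theta} - \mu_{ix}\mu_{i\theta}) - [c^2(x_h) - (c(x_h) - \mu_{ic})^2] \end{aligned} \tag{2.3}$$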
and their approximate solutions are given by the following cum. $\sqrt[3]{p_4(x)}$ rule:
cum. $\sqrt[3]{p_4(x)}$ rule: If the function $p_4(x) = ((c'(x))^2 - \theta'(x)) f(x)$ is bounded and possesses first two derivatives for all x in (a, b), then for a given value of L taking equal intervals on the cum. $\sqrt[3]{p_4(x)}$ yields approximately optimum strata boundaries.
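The rule can be sketched numerically as follows: accumulate the cube root of $p_4$ over (a, b) and cut the cumulative total into L equal parts. The function names, the midpoint rule with linear interpolation, and the example model ($c(x) = x$, $\varphi(x) = 0.25$, uniform density on (1, 2)) are assumptions of this sketch, which also assumes that $p_4$ keeps one sign on (a, b).

```python
import math

def cube_root(v):
    """Real cube root that also handles negative arguments."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def cum_rule_boundaries(g, a, b, L, steps=20000):
    """Interior boundaries cutting the cumulative integral of |g| over
    (a, b) into L equal parts (midpoint rule plus linear interpolation).
    Assumes g keeps one sign on (a, b)."""
    w = (b - a) / steps
    cum = [0.0]
    for k in range(steps):
        cum.append(cum[-1] + abs(g(a + (k + 0.5) * w)) * w)
    bounds, j = [], 0
    for h in range(1, L):
        target = cum[-1] * h / L
        while cum[j + 1] < target:
            j += 1
        frac = (target - cum[j]) / (cum[j + 1] - cum[j])
        bounds.append(a + (j + frac) * w)
    return bounds

# Hypothetical model: c(x) = x, phi(x) = 0.25, uniform density f(x) = 1 on (1, 2).
# Then theta(x) = x + 0.25 / x and (c'(x))^2 - theta'(x) = 0.25 / x^2.
def p4(x):
    return (0.25 / x ** 2) * 1.0

two_strata = cum_rule_boundaries(lambda x: cube_root(p4(x)), 1.0, 2.0, L=2)
three_strata = cum_rule_boundaries(lambda x: cube_root(p4(x)), 1.0, 2.0, L=3)
```

In this example the cumulative function is proportional to $x^{1/3}$, so the boundaries can also be obtained in closed form, which gives a check on the numerical routine.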
Similarly, in the case of equal allocation the minimal equations are obtained as
$$W_h[\theta(x_h)\mu_{hx} + x_h\mu_{h\theta} - 2c(x_h)\mu_{hc}] = W_i[\theta(x_h)\mu_{ix} + x_h\mu_{i\theta} - 2c(x_h)\mu_{ic}] \tag{2.4}$$
where i = h + 1 and h = 1, 2, ..., L-1. The approximate solutions to these equations are obtained by the cum. $f\sqrt{\varphi}$ rule given below.
cum. $f\sqrt{\varphi}$ rule: If the function $f(x)\sqrt{\varphi(x)}$ is bounded and possesses first two derivatives for all x in (a, b), then for a given value of L taking equal intervals on the cum. $f(x)\sqrt{\varphi(x)}$ yields approximately optimum strata boundaries.
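When only a frequency table of x is available, the cum. $f\sqrt{\varphi}$ rule can be applied to class frequencies in the familiar cumulative manner. The sketch below is one possible frequency-table version, with linear interpolation inside the class in which each cut falls. The function name, the ten equal classes on (1, 2) and the choice $\varphi(x) = 0.25x$ are hypothetical.

```python
import math

def cum_f_sqrt_phi_boundaries(limits, freqs, phi, L):
    """limits: class limits (length M + 1); freqs: class frequencies (length M).
    Returns L - 1 boundaries at equal intervals on cum f*sqrt(phi),
    interpolating linearly inside the class where each cut falls."""
    mids = [(lo + hi) / 2 for lo, hi in zip(limits, limits[1:])]
    weights = [fr * math.sqrt(phi(z)) for fr, z in zip(freqs, mids)]
    total = sum(weights)
    bounds, i, cum = [], 0, 0.0
    for h in range(1, L):
        target = total * h / L
        # Move forward until the class containing the cut point is reached.
        while cum + weights[i] < target:
            cum += weights[i]
            i += 1
        frac = (target - cum) / weights[i]
        bounds.append(limits[i] + frac * (limits[i + 1] - limits[i]))
    return bounds

# Hypothetical frequency table: ten equal classes on (1, 2), equal frequencies.
limits = [1.0 + 0.1 * k for k in range(11)]
freqs = [10] * 10
boundary = cum_f_sqrt_phi_boundaries(limits, freqs, lambda x: 0.25 * x, L=2)
flat = cum_f_sqrt_phi_boundaries(limits, freqs, lambda x: 0.25, L=2)
```

With a constant $\varphi$ and equal class frequencies the rule reduces to equal-width strata, while a $\varphi$ increasing in x pushes the boundary to the right of the midpoint.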
3. The Relative Efficiency
In this section we shall find approximate expressions for the relative efficiencies of proportional and equal allocations in comparison to the Neyman allocation. It will be assumed that when following any particular allocation method the strata boundaries are obtained by following the corresponding stratification rule. Thus, for example, the relative efficiency for equal allocation will be given by
$$\text{R.E.} = V_N(\bar{y}_{st}) / V_E(\bar{y}_{st}), \tag{3.1}$$
where the variance in the numerator corresponds to the Neyman allocation and is based on strata obtained by using the stratification rules given in [4], and the variance in the denominator corresponds to equal allocation and is based on strata obtained by using the cum. $f\sqrt{\varphi}$ rule given in this paper. For finding the expressions for the relative efficiency we shall make use of the following two lemmas.
Lemma 3.1. If $(x_{h-1}, x_h)$ are the boundaries of the h-th stratum and $K_h = x_h - x_{h-1}$, then
[Display illegible in the source; it ends with a factor $(1 + O(K_h^2))$.]
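Lemma 3.2. If $(x_{h-1}, x_h)$ are the boundaries of the h-th stratum and $K_h = x_h - x_{h-1}$, then
$$W_h(\mu_{hx}\mu_{h\theta} - \mu_{hc}^2) = \int_{x_{h-1}}^{x_h} \varphi(t) f(t)\,dt + \frac{1}{12}\Big[\int_{x_{h-1}}^{x_h} \sqrt[3]{p_4(t)}\,dt\Big]^3 (1 + O(K_h^2)), \tag{3.3}$$
where $p_4(t) = ((c'(t))^2 - \theta'(t)) f(t)$.
Now for the strata boundaries obtained by the cum. $\sqrt[3]{p_4(x)}$ rule we have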
$$\int_{x_{h-1}}^{x_h} \sqrt[3]{p_4(t)}\,dt = \frac{1}{L}\int_a^b \sqrt[3]{p_4(t)}\,dt. \tag{3.4}$$
Therefore, if the terms of order $O(m^4)$, $m = \sup_{(a,b)}(K_h)$, are neglected, we have approximately
$$n V_P(\bar{y}_{st}) = \int_a^b \varphi(t) f(t)\,dt + (12L^2)^{-1}\Big(\int_a^b \sqrt[3]{p_4(t)}\,dt\Big)^3. \tag{3.5}$$
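The approximation (3.5) is easy to evaluate numerically once $\varphi$, f and $p_4$ are specified. The following sketch does so with a midpoint rule; the function names and the example model (uniform density on (1, 2), $\varphi(t) = 0.25$ and $p_4(t) = 0.25/t^2$, which corresponds to $c(t) = t$) are assumptions chosen for illustration.

```python
import math

def integrate(fn, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of fn over (a, b)."""
    w = (b - a) / steps
    return sum(fn(a + (k + 0.5) * w) for k in range(steps)) * w

def approx_n_vp(f, phi, p4, a, b, L):
    """Right-hand side of (3.5): first term plus the cube-root term / (12 L^2)."""
    first = integrate(lambda t: phi(t) * f(t), a, b)
    # Real cube root of p4(t), keeping the sign of p4(t).
    root = lambda t: math.copysign(abs(p4(t)) ** (1.0 / 3.0), p4(t))
    return first + integrate(root, a, b) ** 3 / (12 * L ** 2)

# Hypothetical model: uniform density on (1, 2), phi = 0.25, p4(t) = 0.25 / t^2.
f = lambda t: 1.0
phi = lambda t: 0.25
p4 = lambda t: 0.25 / t ** 2
values = {L: approx_n_vp(f, phi, p4, 1.0, 2.0, L) for L in range(2, 7)}
```

In this example the second term is positive and decreases like $L^{-2}$, so the approximate variance falls towards $\int \varphi f\,dt = 0.25$ as the number of strata grows.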
Also from the relation (4.3) of [4] we have
Thus we get from (3.5) and (3.6) the relative efficiency of proportional allocation as
If the frequency distribution of x is available in the form of M classes and the functional forms of $\varphi(t)$, $p_2(t)$ etc. are known, then one can get the approximate value of R.E. for any L by finding the values of A, B, C and D from the frequency table. For this purpose we can approximate A by $\sum_{i=1}^{M} f_i \sqrt{\varphi(z_i)}$, where $z_i$ is the mid value of the i-th class interval, and similar expressions can be used for B, C and D.
In the case of equal allocation the approximately optimum strata boundaries (AOSB) are obtained by taking equal intervals on the cum. $f(x)\sqrt{\varphi(x)}$; therefore,
$$\int_{x_{h-1}}^{x_h} f(t)\sqrt{\varphi(t)}\,dt = \frac{1}{L}\int_a^b f(t)\sqrt{\varphi(t)}\,dt. \tag{3.8}$$
If the effect of the differences between the AOSB for the Neyman and equal allocations on the value of $\int_{x_{h-1}}^{x_h} \sqrt[3]{p_4(t)}\,dt$ is neglected, the value of this term for the AOSB for equal allocation can be taken approximately as $\int_a^b \sqrt[3]{p_4(t)}\,dt / L$. Then we get from Lemma 3.2 and relation (3.8), after dropping the terms involving higher powers of the strata widths,
(3.9)
and the variance for equal allocation becomes
From the relations (3.6) and (3.10) it is seen that the variances for the two allocations are approximately the same. The two allocations are, therefore, approximately equally efficient for large values of L in the situation considered in this paper. This observation is also supported by the numerical investigation made in the next section.
4. Numerical Investigation
For the numerical investigation in this paper we consider the same four densities for x which were considered by Singh and Sukhatme [4]. The truncation of the exponential and right normal densities is also the same as in [4]. In this case also the AOSB are obtained by using the frequency distribution versions of the different stratification rules. For calculating the expected value of $x^{-1}$ (needed when g = 0) in the different strata we have used numerical integration methods for the exponential and normal distributions. Each stratum width in these cases was divided into 100 equal intervals. Since in [4] the values of $nV_N(\bar{y}_{st})$, the variance for Neyman allocation, used expected values of $x^{-1}$ calculated numerically by dividing the strata widths into a number of equal intervals different from 100, these variances were recalculated on the basis of 100 equal-width divisions of the strata widths.
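The within-stratum expectation of $x^{-1}$ can be approximated along the lines described above by dividing the stratum into 100 equal intervals. The sketch below uses a midpoint rule, which is an assumption (the particular integration method is not stated in the text); the function name, the stratum (1, 1.5) and the density $e^{-t}$ are likewise hypothetical.

```python
import math

def stratum_mean_inverse(f, lo, hi, parts=100):
    """Approximate E[1/x] within (lo, hi) for density f, using midpoint
    sums over `parts` equal intervals of the stratum width."""
    w = (hi - lo) / parts
    num = den = 0.0
    for k in range(parts):
        t = lo + (k + 0.5) * w
        num += f(t) / t   # contributes to the integral of f(t) / t
        den += f(t)       # contributes to the integral of f(t)
    return num / den

# Check against a case with a closed form: uniform density on (1, 1.5).
uniform_value = stratum_mean_inverse(lambda t: 1.0, 1.0, 1.5)
exact_uniform = math.log(1.5) / 0.5

# Hypothetical exponential-type density on the same stratum.
expo_value = stratum_mean_inverse(lambda t: math.exp(-t), 1.0, 1.5)
```

For the uniform case the 100-interval value agrees closely with the closed form $\log(1.5)/0.5$, which suggests that 100 divisions are adequate for strata of this width.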
The following four tables give the strata boundaries, variances and relative efficiencies of the proportional and equal allocations for the three values of g (i.e. 0, 1 and 2) and for the four density functions. The relative efficiency of the equal allocation is found to be generally higher than that of the proportional allocation. The minimum value of the relative efficiency of equal allocation is found to be 99.93 per cent, and hence under the situations considered one can safely use equal allocation of the sample to the different strata, with the cum. $f\sqrt{\varphi}$ rule for finding the corresponding AOSB, in place of the Neyman allocation. The minimum value of the relative efficiency of proportional allocation is 82.36 per cent. The proportional allocation, therefore, does not qualify for recommendation in place of the Neyman or the equal allocation. The relative efficiency of the proportional allocation is also found to decrease as the value of g increases. In some cases the relative efficiencies are found to be more than 100 per cent. This observation has also been made earlier by Singh and Parkash [5] and Singh [6]. It happens because the variances used in calculating the relative efficiencies are based on approximate solutions of the minimal equations and not on the exact solutions. Also, the relative efficiency for these forms of $c(x)$ and $\varphi(x)$ does not depend on the correlation value.
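The remark on relative efficiencies above 100 per cent can be made concrete with a small sketch. It assumes the standard form $(\sum W_h\sigma_h)^2/n$ for the variance under Neyman allocation (the expression actually used in [4] is not reproduced here) and evaluates all three allocations on one common set of strata; the function name and the inputs are hypothetical.

```python
import math

def relative_efficiencies(W, s2):
    """Percentage R.E. of proportional and equal allocation relative to
    Neyman allocation, all evaluated on one common set of strata.
    W: stratum proportions; s2: within-stratum variances sigma_h^2."""
    L = len(W)
    neyman = sum(w * math.sqrt(v) for w, v in zip(W, s2)) ** 2
    proportional = sum(w * v for w, v in zip(W, s2))
    equal = L * sum(w * w * v for w, v in zip(W, s2))
    return 100 * neyman / proportional, 100 * neyman / equal

# Hypothetical strata: proportions and within-stratum variances.
re_p, re_e = relative_efficiencies([0.2, 0.3, 0.5], [4.0, 1.0, 0.25])
balanced = relative_efficiencies([0.5, 0.5], [1.0, 1.0])
```

By the Cauchy-Schwarz inequality neither ratio can exceed 100 when the same strata are used for all three allocations. Values above 100 per cent can therefore arise only when each allocation is evaluated on its own approximately optimum boundaries.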
In the case of proportional allocation no boundaries are given in the tables for g = 1 because in this case any set of boundaries is optimum.
TABLE 1
AOSB and Percentage Relative Efficiency (R.E.): Rectangular Distribution
[Table body illegible in the scanned source. Columns: g; L; $nV_N(\bar{y}_{st})$ for Neyman allocation; AOSB, $nV_P(\bar{y}_{st})$ and R.E. for proportional allocation; AOSB, $nV_E(\bar{y}_{st})$ and R.E. for equal allocation.]
TABLE 2
AOSB and Percentage Relative Efficiency (R.E.): Right Triangular Distribution
[Table body illegible in the scanned source. Columns: g; L; $nV_N(\bar{y}_{st})$ for Neyman allocation; AOSB, $nV_P(\bar{y}_{st})$ and R.E. for proportional allocation; AOSB, $nV_E(\bar{y}_{st})$ and R.E. for equal allocation.]
TABLE 3
AOSB and Percentage Relative Efficiency (R.E.): Exponential Distribution
[Table body illegible in the scanned source. Columns: g; L; $nV_N(\bar{y}_{st})$ for Neyman allocation; AOSB, $nV_P(\bar{y}_{st})$ and R.E. for proportional allocation; AOSB, $nV_E(\bar{y}_{st})$ and R.E. for equal allocation.]
TABLE 4
AOSB and Percentage Relative Efficiency (R.E.): Normal Distribution
[Table body illegible in the scanned source. Columns: g; L; $nV_N(\bar{y}_{st})$ for Neyman allocation; AOSB, $nV_P(\bar{y}_{st})$ and R.E. for proportional allocation; AOSB, $nV_E(\bar{y}_{st})$ and R.E. for equal allocation.]
Acknowledgement
The author is grateful to the referee for certain valuable suggestions.
References
[1] Singh, Ravindra, and Sukhatme, B. V. (1969). "Optimum stratification." Ann. Inst. Statist. Math., 21, 515-528.
[2] Singh, Ravindra (1971). "An expression for the variance of the estimate of mean in stratified simple random sampling." Mimeograph, Department of Mathematics and Statistics, Punjab Agricultural University, Ludhiana (Punjab).
[3] Singh, Ravindra, and Sukhatme, B. V. (1972). "A note on optimum stratification." Jour. Ind. Soc. Ag. Statist., 24, 91-98.
[4] Singh, Ravindra, and Sukhatme, B. V. (1972). "Optimum stratification in sampling with varying probabilities." Ann. Inst. Statist. Math., 24, 485-494.
[5] Singh, Ravindra, and Dev Parkash. "Optimum stratification for equal allocation." Ann. Inst. Statist. Math. (To appear.)
[6] Singh, Ravindra. "A note on optimum stratification for equal allocation with ratio and regression methods of estimation." Jour. Ind. Soc. Ag. Statist. (Submitted for publication.)