Double Sampling Ratio-product Estimator of a Finite Population Mean in Sample Surveys

This article was downloaded by: [University of Newcastle (Australia)] On: 27 August 2014, At: 22:21. Publisher: Taylor & Francis. Informa Ltd Registered in England and Wales Registered Number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Applied Statistics. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/cjas20

Double Sampling Ratio-product Estimator of a Finite Population Mean in Sample Surveys. Housila P. Singh (a) & Mariano Ruiz Espejo (b). (a) School of Studies in Statistics, Vikram University, India; (b) Departamento de Matemáticas Fundamentales, UNED, Madrid, Spain. Published online: 14 Feb 2007.

To cite this article: Housila P. Singh & Mariano Ruiz Espejo (2007) Double Sampling Ratio-product Estimator of a Finite Population Mean in Sample Surveys, Journal of Applied Statistics, 34:1, 71-85, DOI: 10.1080/02664760600994562

To link to this article: http://dx.doi.org/10.1080/02664760600994562


Double Sampling Ratio-product Estimator of a Finite Population Mean in Sample Surveys

HOUSILA P. SINGH* & MARIANO RUIZ ESPEJO**

*School of Studies in Statistics, Vikram University, India, **Departamento de Matemáticas Fundamentales, UNED, Madrid, Spain

ABSTRACT It is well known that two-phase (or double) sampling is of significant use in practice when the population parameter(s) (say, population mean $\bar{X}$) of the auxiliary variate x is not known. Keeping this in view, we have suggested a class of ratio-product estimators in two-phase sampling with its properties. The asymptotically optimum estimators (AOEs) in the class are identified in two different cases with their variances. Conditions for the proposed estimator to be more efficient than the two-phase sampling ratio, product and mean per unit estimator are investigated. Comparison with single phase sampling is also discussed. An empirical study is carried out to demonstrate the efficiency of the suggested estimator over conventional estimators.

KEY WORDS: Auxiliary variate, double sampling ratio and product estimators, finite population mean, study variate

Introduction

One of the major developments in sample surveys over the last five decades is the use of an auxiliary variable x, correlated with the study variable y, in order to obtain estimates of the population total or mean of the study variable. Various estimation procedures in sample surveys need advance knowledge of some auxiliary variable $x_i$, which is then used to increase the precision of estimates. For example, the classical ratio and product estimators require advance knowledge of the population mean $\bar{X}$ of the auxiliary variable x. When the population mean $\bar{X}$ is unknown, it is sometimes estimated from a preliminary large sample on which only the auxiliary characteristic x is observed. The value of $\bar{X}$ in the estimator is then replaced by its estimate. A smaller second-phase sample of the variate of interest (study variate) y is then taken. This technique, known as double sampling or two-phase sampling, is especially appropriate if the $x_i$ values are easily accessible and much cheaper to collect than the $y_i$ values; see Sitter (1997) and Hidiroglou & Sarndal (1998). Neyman (1938) was the first to give the concept of double sampling, in connection with collecting information on the strata sizes in stratified sampling.

Journal of Applied Statistics, Vol. 34, No. 1, 71–85, January 2007

Correspondence Address: Housila P. Singh, School of Studies in Statistics, Vikram University, Ujjain - 456010, M. P., India. Email: [email protected]

0266-4763 Print/1360-0532 Online/07/010071–15 © 2007 Taylor & Francis. DOI: 10.1080/02664760600994562


The use of double sampling is necessary if the x-value is obtained by performing a non-destructive experiment whereas, to obtain a y-value of a unit, a destructive experiment has to be performed; see Unnikrishan & Kunte (1995). However, there are several situations in practice where double sampling can be employed effectively. Snedecor & King (1942) have mentioned the application of a double sampling procedure by Goodman for determination of corn yield. Goodman found that it was easier and much cheaper to count the number of ears of corn in a given unit area than to harvest the yield and obtain the dry weight of kernels. The high cost of making dry weight determinations led to the use of double sampling, in which ears would be counted and measured in many fields but harvested in only a portion of these. Thus, taking advantage of the correlation between the study variate y (the dry weight of kernels) and the auxiliary variate x (length × diameter of the ear) and using the regression technique, the dry weight of kernels per ear can be estimated (see Singh, 1968).

Sukhatme (1962) has advocated that double sampling is usually used when the number of units required to give the desired precision on different items is widely different. This procedure is also used when it is proposed to utilize the information gathered in the first phase as auxiliary information in order to improve the precision of the information to be gathered in the second phase. For example, in a survey to estimate the production of lime crop based on orchards as sampling units, a comparatively larger sample is taken to obtain the acreage under the crop while the yield rate is obtained from only a sub-sample of the orchards selected for determining acreage estimators for estimating population mean $\bar{Y}$ of the study variable y in two-phase sampling.

For more applications of the double sampling method, the reader is referred to Sukhatme & Koshal (1959), Yates (1960), Singh & Singh (1965), Singh (1968), Seth et al. (1968), Chand (1975), Mukerjee et al. (1987), Dorfman (1994), Rao & Sitter (1995), York et al. (1995), Prasad et al. (1996), Pitt et al. (1996), Breslow & Holubkov (1997), Singh & Ruiz Espejo (2000) and Barnett et al. (2001).

We further note that the ratio method of estimation (or the product method of estimation, see Robson (1957) and Murthy (1964)) yields a more efficient estimator than the simple unbiased estimator provided the correlation coefficient between the study variate y and the auxiliary variate x has a high positive value (or a high negative value). Further, the ratio estimator is most effective, and is as efficient as the regression estimator, when the relationship between the study variate y and the auxiliary variate x is linear through the origin and the variance of y is proportional to x. However, in many practical situations the line does not pass through the vicinity of the origin. Keeping this fact in view, we have made an effort to improve these estimators in double sampling. Throughout, samples have been drawn by the method of simple random sampling without replacement (SRSWOR).

For estimating the population mean $\bar{Y}$ of the study variate y, Singh & Ruiz Espejo (2003) recently considered an estimator of the ratio-product type given by

$$\bar{y}_{RP} = \bar{y}\left[k\,\frac{\bar{X}}{\bar{x}} + (1-k)\,\frac{\bar{x}}{\bar{X}}\right]$$

where $\bar{y}$ and $\bar{x}$ are the sample means of y and x respectively, based on a sample of size n out of the population of N units, $\bar{X}$ is the known population mean of x, and k is set equal to $k = (1 + C)/2$ with $C = \rho C_y/C_x$ obtained from judgement, past data or a pilot sample survey; $\rho$ is the correlation coefficient between y and x, and $C_y$ and $C_x$ are the coefficients of variation of y and x respectively. For the details of the above estimator the reader is referred to Singh & Ruiz Espejo (2003).

In this paper we have studied the properties of the above estimator $\bar{y}_{RP}$ in the case of double sampling. Numerical illustrations are given in support of the present study.


The Double Sampling Estimator

When the population mean $\bar{X}$ of x is not known, a first-phase sample of size $n_1$ is drawn from the population, on which only the x-characteristic is measured in order to furnish a good estimate of $\bar{X}$. Then a second-phase sample of size n is drawn on which both the variates y and x are measured. Let $\bar{x}_1 = (1/n_1)\sum_{i=1}^{n_1} x_i$ denote the sample mean of x based on the first-phase sample of size $n_1$. Then the two-phase sampling (or double sampling) estimator is given by

$$\bar{y}_{RP}^{(d)} = \bar{y}\left[k\,\frac{\bar{x}_1}{\bar{x}} + (1-k)\,\frac{\bar{x}}{\bar{x}_1}\right] \tag{1}$$

where k is determined so as to minimize the variance of $\bar{y}_{RP}^{(d)}$.
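As a minimal sketch of how equation (1) can be evaluated on data, the function below computes the double sampling ratio-product estimate from a first-phase sample of x and a second-phase sample of (y, x). The function name, argument names and toy numbers are illustrative assumptions, not part of the paper.

```python
from statistics import fmean


def ratio_product_double_sampling(y2, x2, x1, k):
    """Evaluate equation (1): y_bar * [k * x1_bar / x_bar + (1 - k) * x_bar / x1_bar].

    y2, x2 : second-phase observations of y and x on the same n units
    x1     : first-phase observations of x (n1 units)
    k      : weight on the ratio component (k = 1 gives the double sampling
             ratio estimator, k = 0 the double sampling product estimator)
    """
    if len(y2) != len(x2):
        raise ValueError("y2 and x2 must be measured on the same units")
    y_bar = fmean(y2)    # second-phase mean of y
    x_bar = fmean(x2)    # second-phase mean of x
    x1_bar = fmean(x1)   # first-phase mean of x, standing in for the unknown X_bar
    return y_bar * (k * x1_bar / x_bar + (1 - k) * x_bar / x1_bar)


if __name__ == "__main__":
    # Hypothetical toy data, chosen only to make the arithmetic easy to follow.
    x1 = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
    x2 = [5.0, 7.0, 9.0]
    y2 = [10.0, 15.0, 17.0]
    for k in (0.0, 0.5, 1.0):
        print(k, ratio_product_double_sampling(y2, x2, x1, k))
```

Because the estimator is linear in k, any choice of k gives a weighted combination of the ratio-type and product-type estimates computed from the same two samples.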

Defining

$$e_0 = (\bar{y} - \bar{Y})/\bar{Y}, \quad e_1 = (\bar{x} - \bar{X})/\bar{X} \quad \text{and} \quad e_1' = (\bar{x}_1 - \bar{X})/\bar{X}$$

we have

$$\bar{y}_{RP}^{(d)} = \bar{Y}(1 + e_0)\{k(1 + e_1')(1 + e_1)^{-1} + (1-k)(1 + e_1)(1 + e_1')^{-1}\}. \tag{2}$$

We now assume that $|e_1| < 1$ and $|e_1'| < 1$, so that we may expand $(1 + e_1)^{-1}$ and $(1 + e_1')^{-1}$ as series in powers of $e_1$ and $e_1'$. Expanding, multiplying out and retaining terms in the e's up to the second degree, we obtain

$$\bar{y}_{RP}^{(d)} = \bar{Y}\{k(1 + e_0 - e_1 + e_1' - e_0e_1 + e_1^2 + e_0e_1' - e_1e_1' + \dots) + (1-k)(1 + e_0 + e_1 - e_1' - e_0e_1' + e_0e_1 + e_1'^2 - e_1e_1' + \dots)\}$$

or

$$\bar{y}_{RP}^{(d)} - \bar{Y} \cong \bar{Y}\{e_0 + e_1 - e_1' - e_0e_1' + e_0e_1 - e_1e_1' + e_1'^2 + k(e_1^2 - e_1'^2 - 2e_0e_1 + 2e_0e_1' - 2e_1 + 2e_1')\}. \tag{3}$$

Taking expectation in equation (2) and noting that

$$E(e_0) = E(e_1) = E(e_1') = 0$$

and that the expectations of the second degree terms are of order $n^{-1}$, we obtain

$$E\{\bar{y}_{RP}^{(d)}\} = \bar{Y} + O(n^{-1})$$

Thus the bias of the estimator $\bar{y}_{RP}^{(d)}$ is of order $n^{-1}$, and hence its contribution to the mean square error will be of order $n^{-2}$.

To find the bias and variance of $\bar{y}_{RP}^{(d)}$, let

$$C_y^2 = S_y^2/\bar{Y}^2, \quad C_x^2 = S_x^2/\bar{X}^2 \quad \text{and} \quad \rho = S_{yx}/(S_y S_x)$$


where

$$S_y^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{Y})^2, \quad S_x^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{X})^2$$

and

$$S_{yx} = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{Y})(x_i - \bar{X}).$$

The following two cases will be considered separately.

Case I. When the second-phase sample of size n is a subsample of the first-phase sample of size $n_1$.

Case II. When the second-phase sample of size n is drawn independently of the first-phase sample of size $n_1$.

The case where the second sample is drawn independently of the first was considered by Bose (1943).

Case I

Bias, Variance and Optimum k

In Case I, we have

$$\left.\begin{aligned}
E(e_0) &= E(e_1) = E(e_1') = 0 \\
E(e_0^2) &= \frac{1-f}{n}\,C_y^2 \\
E(e_1^2) &= \frac{1-f}{n}\,C_x^2 \\
E(e_1'^2) &= \frac{1-f_1}{n_1}\,C_x^2 \\
E(e_0e_1) &= \frac{1-f}{n}\,C\,C_x^2 \\
E(e_0e_1') &= \frac{1-f_1}{n_1}\,C\,C_x^2 \\
E(e_1e_1') &= \frac{1-f_1}{n_1}\,C_x^2
\end{aligned}\right\} \tag{4}$$

where $f = n/N$, $f_1 = n_1/N$, and $C = \rho C_y/C_x$.

Substituting equation (4) and noting that $E(e_0) = E(e_1) = E(e_1') = 0$ in (3), we get the bias of $\bar{y}_{RP}^{(d)}$ to the first degree of approximation as

$$B\{\bar{y}_{RP}^{(d)}\} = \frac{1-f^*}{n}\,\bar{Y}\,C_x^2\,\{C + k(1-2C)\} \tag{5}$$

where $f^* = n/n_1$. Thus, $B\{\bar{y}_{RP}^{(d)}\}$ in equation (5) is zero if

$$k = \frac{C}{2C-1}.$$



Thus the estimator $\bar{y}_{RP}^{(d)}$ with $k = C/(2C-1)$ is almost unbiased. From equation (5), it also follows that the bias in $\bar{y}_{RP}^{(d)}$ is negligible if the sample size n is sufficiently large.

The variance of $\bar{y}_{RP}^{(d)}$, up to terms of order $n^{-1}$, is

$$V\{\bar{y}_{RP}^{(d)}\} \cong \bar{Y}^2 E[e_0^2 + (1-2k)\{(1-2k)(e_1 - e_1')^2 + 2(e_0e_1 - e_0e_1')\}]. \tag{6}$$

Taking the expectation in equation (6) and using the results in equation (4), we obtain the variance of $\bar{y}_{RP}^{(d)}$ to terms of order $n^{-1}$ as

$$V\{\bar{y}_{RP}^{(d)}\}_I = \bar{Y}^2\left[\frac{1-f}{n}\,C_y^2 + \frac{1-f^*}{n}\,(1-2k)\,C_x^2\,(1-2k+2C)\right] \tag{7}$$

which is minimized when

$$k = \frac{1+C}{2} = k_0 \text{ (say)} \tag{8}$$

Substituting equation (8) in equation (1) we get the 'asymptotically optimum estimator' (AOE) as

$$\bar{y}_{RP}^{(d_0)} = \frac{\bar{y}}{2}\left[(1+C)\,\frac{\bar{x}_1}{\bar{x}} + (1-C)\,\frac{\bar{x}}{\bar{x}_1}\right]$$

Putting equation (8) in equations (5) and (7) we get the bias and variance of $\bar{y}_{RP}^{(d_0)}$ respectively as

$$B\{\bar{y}_{RP}^{(d_0)}\}_I = \frac{1-f^*}{2n}\,\bar{Y}\,C_x^2\,\{1 + C(1-2C)\}$$

and

$$V\{\bar{y}_{RP}^{(d_0)}\}_I = S_y^2\left[\frac{1-f}{n}(1-\rho^2) + \frac{1-f_1}{n_1}\,\rho^2\right]$$

which is the same as the variance of the linear regression estimator $\bar{y}_{dlr} = \bar{y} + b(\bar{x}_1 - \bar{x})$ in two-phase sampling, where b is the sample regression coefficient of y on x.
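The Case I formulas can be checked numerically. The sketch below codes the bias in equation (5), the variance in equation (7) (written as $(1-2k)(1-2k+2C)$, which reproduces equations (9) and (10) at k = 1 and k = 0) and the optimum in equation (8). Function names are illustrative assumptions, and the population mean passed in the demo is a hypothetical placeholder.

```python
def case1_factors(N, n, n1):
    """Return (1-f)/n, (1-f1)/n1 and (1-f*)/n with f = n/N, f1 = n1/N, f* = n/n1."""
    return (1 - n / N) / n, (1 - n1 / N) / n1, (1 - n / n1) / n


def case1_bias(k, Y_bar, Cx, C, N, n, n1):
    """First-order bias of the estimator in Case I, equation (5)."""
    _, _, d = case1_factors(N, n, n1)
    return d * Y_bar * Cx ** 2 * (C + k * (1 - 2 * C))


def case1_variance(k, Y_bar, Cy, Cx, rho, N, n, n1):
    """First-order variance of the estimator in Case I, equation (7)."""
    a, _, d = case1_factors(N, n, n1)
    C = rho * Cy / Cx
    return Y_bar ** 2 * (a * Cy ** 2 + d * Cx ** 2 * (1 - 2 * k) * (1 - 2 * k + 2 * C))


def case1_optimum_k(Cy, Cx, rho):
    """Optimum k0 = (1 + C) / 2 from equation (8)."""
    return (1 + rho * Cy / Cx) / 2


if __name__ == "__main__":
    # rho, Cy, Cx, N, n, n1 as listed for population 1; Y_bar = 1.0 is a placeholder.
    pars = dict(Y_bar=1.0, Cy=2.005, Cx=1.59775, rho=0.93, N=120, n=20, n1=50)
    k0 = case1_optimum_k(pars["Cy"], pars["Cx"], pars["rho"])
    for k in (0.0, 0.5, 1.0, k0):
        print(round(k, 5), case1_variance(k, **pars))
```

At k = k0 the variance returned by `case1_variance` coincides with the regression-type expression for the AOE, which is a convenient sanity check on any implementation.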

Comparison with Ratio Estimator in Double Sampling

For k = 1, the estimator $\bar{y}_{RP}^{(d)}$ in equation (1) reduces to the usual double sampling ratio estimator

$$\bar{y}_R^{(d)} = \bar{y}\,\frac{\bar{x}_1}{\bar{x}}$$

The variance of $\bar{y}_R^{(d)}$ can be obtained by putting k = 1 in equation (7) as

$$V\{\bar{y}_R^{(d)}\} = \bar{Y}^2\left[\frac{1-f}{n}\,C_y^2 + \frac{1-f^*}{n}\,C_x^2\,(1-2C)\right]. \tag{9}$$



From equations (7) and (9) we have

$$V\{\bar{y}_{RP}^{(d)}\} - V\{\bar{y}_R^{(d)}\} = -4\,\frac{1-f^*}{n}\,\bar{Y}^2 C_x^2\,\{k(1-k+C) - C\}$$

which is negative if

$$\text{either } 1 < k < C \quad \text{or} \quad C < k < 1$$

or equivalently,

$$\min(C, 1) < k < \max(C, 1)$$

or equivalently,

$$|k - k_0| < |1 - k_0|$$

where $k_0 = (1 + C)/2$.
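The three equivalent statements of the efficiency condition can be compared directly. The following sketch (function names are assumptions made for illustration) evaluates the sign-carrying factor of the variance difference and the two interval forms, so that one can confirm on a grid of k values that they agree away from the boundary points.

```python
def variance_difference_factor(k, C):
    """Factor of V(RP) - V(R) in Case I that carries the sign: -4 * {k(1 - k + C) - C}.

    The remaining factor, (1 - f*)/n * Y_bar^2 * Cx^2, is positive whenever n < n1.
    """
    return -4 * (k * (1 - k + C) - C)


def between_C_and_one(k, C):
    """Interval form: min(C, 1) < k < max(C, 1)."""
    return min(C, 1) < k < max(C, 1)


def closer_to_optimum_than_ratio(k, C):
    """Distance form: |k - k0| < |1 - k0| with k0 = (1 + C) / 2."""
    k0 = (1 + C) / 2
    return abs(k - k0) < abs(1 - k0)


if __name__ == "__main__":
    C = 1.1671  # value of C listed for population 1
    for k in (0.9, 1.05, 1.1, 1.2):
        print(k, variance_difference_factor(k, C) < 0,
              between_C_and_one(k, C), closer_to_optimum_than_ratio(k, C))
```

The same pattern applies to the product and mean per unit comparisons, with the reference point 1 replaced by 0 or 1/2 respectively.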

Comparison with Product Estimator in Double Sampling

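For k = 0, the estimator $\bar{y}_{RP}^{(d)}$ in (1) reduces to the usual double sampling product estimator $\bar{y}_P^{(d)}$ for $\bar{Y}$:

$$\bar{y}_P^{(d)} = \bar{y}\,\frac{\bar{x}}{\bar{x}_1}$$

The variance of $\bar{y}_P^{(d)}$ can be obtained by putting k = 0 in equation (7) as

$$V\{\bar{y}_P^{(d)}\} = \bar{Y}^2\left[\frac{1-f}{n}\,C_y^2 + \frac{1-f^*}{n}\,C_x^2\,(1+2C)\right]. \tag{10}$$

From equations (7) and (10) we have

$$V\{\bar{y}_{RP}^{(d)}\} - V\{\bar{y}_P^{(d)}\} = -4\,\bar{Y}^2\,\frac{1-f^*}{n}\,C_x^2\,k(1-k+C),$$

which is negative if

$$\text{either } 0 < k < 1 + C \quad \text{or} \quad 1 + C < k < 0$$

or equivalently,

$$\min(0, 1 + C) < k < \max(0, 1 + C)$$

or equivalently,

$$|k - k_0| < |k_0|.$$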



Comparison with Mean Per Unit Estimator

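The variance of the sample mean $\bar{y}$ under the SRSWOR sampling scheme is given by

$$V(\bar{y}) = \frac{1-f}{n}\,\bar{Y}^2 C_y^2 \tag{11}$$

From equations (7) and (11) we have

$$V\{\bar{y}_{RP}^{(d)}\} - V(\bar{y}) = \bar{Y}^2\,\frac{1-f^*}{n}\,C_x^2\,(1-2k)(1-2k+2C)$$

which is negative if

$$\text{either } \tfrac{1}{2} < k < \tfrac{1}{2} + C \quad \text{or} \quad \tfrac{1}{2} + C < k < \tfrac{1}{2}$$

or equivalently,

$$\min\left(\tfrac{1}{2}, \tfrac{1}{2} + C\right) < k < \max\left(\tfrac{1}{2}, \tfrac{1}{2} + C\right)$$

or equivalently,

$$|k - k_0| < \left|k_0 - \tfrac{1}{2}\right|.$$

Case II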

In Case II, we have

$$\left.\begin{aligned}
E(e_0) &= E(e_1) = E(e_1') = 0 \\
E(e_0^2) &= \frac{1-f}{n}\,C_y^2 \\
E(e_1^2) &= \frac{1-f}{n}\,C_x^2 \\
E(e_1'^2) &= \frac{1-f_1}{n_1}\,C_x^2 \\
E(e_0e_1) &= \frac{1-f}{n}\,C\,C_x^2 \\
E(e_0e_1') &= E(e_1e_1') = 0
\end{aligned}\right\} \tag{12}$$

Taking the expectation of equation (3) and using the results in equation (12), we get the bias of $\bar{y}_{RP}^{(d)}$ up to terms of order $n^{-1}$ as

$$B\{\bar{y}_{RP}^{(d)}\}_{II} = \bar{Y} C_x^2\left[\frac{1-f}{n}\{C + k(1-2C)\} + \frac{1-f_1}{n_1}(1-k)\right] \tag{13}$$



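which will vanish if

$$k = \frac{\dfrac{1-f_1}{n_1} + \dfrac{1-f}{n}\,C}{\dfrac{1-f_1}{n_1} + \dfrac{1-f}{n}\,(2C-1)}$$

Taking the expectation of equation (6) and using the results in equation (12), we get the variance of $\bar{y}_{RP}^{(d)}$ up to terms of order $n^{-1}$ as

$$V\{\bar{y}_{RP}^{(d)}\} = \bar{Y}^2\left[\frac{1-f}{n}\,C_y^2 + (1-2k)\left\{(1-2k)\left(\frac{1-f}{n} + \frac{1-f_1}{n_1}\right) + 2\,\frac{1-f}{n}\,C\right\}C_x^2\right] \tag{14}$$

which is minimum when

$$k = \frac{1+\theta C}{2} = k_0^* \text{ (say)} \tag{15}$$

where

$$\theta = \frac{\dfrac{1-f}{n}}{\dfrac{1-f}{n} + \dfrac{1-f_1}{n_1}}.$$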

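Substitution of equation (15) into equation (1) yields the 'AOE' as

$$\bar{y}_{RP}^{(d_0^*)} = \frac{\bar{y}}{2}\left[(1+\theta C)\,\frac{\bar{x}_1}{\bar{x}} + (1-\theta C)\,\frac{\bar{x}}{\bar{x}_1}\right]$$

Putting equation (15) in equations (13) and (14) we get the bias and variance of $\bar{y}_{RP}^{(d_0^*)}$ respectively as

$$B\{\bar{y}_{RP}^{(d_0^*)}\}_{II} = \frac{\bar{Y} C_x^2}{2}\left[\frac{1-f}{n}\{1 + \theta C(1-2C)\} + \frac{1-f_1}{n_1}(1-\theta C)\right]$$

and

$$V\{\bar{y}_{RP}^{(d_0^*)}\}_{II} = \frac{1-f}{n}\,S_y^2\,(1-\theta\rho^2)$$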
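A short numerical sketch of the Case II quantities follows. It codes $\theta$ with $(1-f_1)/n_1$ in the denominator, the variance in equation (14) and the optimum in equation (15); the function names are illustrative assumptions. For population 1 (N = 120, n = 20, n1 = 50) the resulting optimum is close to the value listed in the populations table, up to rounding of the inputs.

```python
def theta(N, n, n1):
    """theta = [(1-f)/n] / [(1-f)/n + (1-f1)/n1], with f = n/N and f1 = n1/N."""
    a = (1 - n / N) / n
    b = (1 - n1 / N) / n1
    return a / (a + b)


def case2_variance(k, Y_bar, Cy, Cx, rho, N, n, n1):
    """First-order variance of the estimator in Case II, equation (14)."""
    a = (1 - n / N) / n
    b = (1 - n1 / N) / n1
    C = rho * Cy / Cx
    bracket = (1 - 2 * k) * (a + b) + 2 * a * C
    return Y_bar ** 2 * (a * Cy ** 2 + (1 - 2 * k) * bracket * Cx ** 2)


def case2_optimum_k(Cy, Cx, rho, N, n, n1):
    """Optimum k0* = (1 + theta * C) / 2 from equation (15)."""
    return (1 + theta(N, n, n1) * rho * Cy / Cx) / 2


if __name__ == "__main__":
    # rho, Cy, Cx, N, n, n1 as listed for population 1; Y_bar = 1.0 is a placeholder.
    pars = dict(Cy=2.005, Cx=1.59775, rho=0.93, N=120, n=20, n1=50)
    k_star = case2_optimum_k(**pars)
    print(theta(120, 20, 50), k_star, case2_variance(k_star, Y_bar=1.0, **pars))
```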

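Ignoring the finite population correction in equation (14), the variance of $\bar{y}_{RP}^{(d)}$ is given by

$$V\{\bar{y}_{RP}^{(d)}\}_{II} = \bar{Y}^2\left[\frac{1}{n}\,C_y^2 + (1-2k)\left\{(1-2k)\left(\frac{1}{n} + \frac{1}{n_1}\right) + \frac{2}{n}\,C\right\}C_x^2\right] \tag{16}$$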



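which is minimized for

$$k = \frac{1+\theta^* C}{2} = k_0^{**} \text{ (say)} \tag{17}$$

where $\theta^* = n_1/(n + n_1)$.

Substitution of equation (17) in equation (1) yields the 'AOE' as

$$\bar{y}_{RP}^{(d_0^{**})} = \frac{\bar{y}}{2}\left[(1+\theta^* C)\,\frac{\bar{x}_1}{\bar{x}} + (1-\theta^* C)\,\frac{\bar{x}}{\bar{x}_1}\right] \tag{18}$$

Putting equation (17) in equation (16) we get the variance of $\bar{y}_{RP}^{(d_0^{**})}$ as

$$V\{\bar{y}_{RP}^{(d_0^{**})}\}_{II} = \frac{S_y^2}{n}\,(1-\theta^*\rho^2). \tag{19}$$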
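The step from equation (16) to equations (17) and (19) can be spelled out as a short worked derivation. The shorthand $u = 1 - 2k$ and $A = 1/n + 1/n_1$ is introduced here only for this sketch and does not appear in the original.

```latex
\begin{aligned}
V(u) &= \bar{Y}^2\left[\frac{C_y^2}{n} + \left(A u^2 + \frac{2C}{n}\,u\right) C_x^2\right],
  \qquad u = 1 - 2k,\quad A = \frac{1}{n} + \frac{1}{n_1}, \\
\frac{dV}{du} &= \bar{Y}^2 C_x^2\left(2Au + \frac{2C}{n}\right) = 0
  \;\Longrightarrow\; u = -\frac{C}{nA} = -\theta^* C,
  \qquad \theta^* = \frac{1/n}{A} = \frac{n_1}{n + n_1}, \\
k &= \frac{1 - u}{2} = \frac{1 + \theta^* C}{2} = k_0^{**}, \\
V_{\min} &= \bar{Y}^2\left[\frac{C_y^2}{n} - \frac{\theta^* C^2 C_x^2}{n}\right]
  = \frac{S_y^2}{n}\left(1 - \theta^*\rho^2\right),
  \qquad \text{since } C^2 C_x^2 = \rho^2 C_y^2 \text{ and } \bar{Y}^2 C_y^2 = S_y^2 .
\end{aligned}
```

Since $A > 0$, the quadratic in $u$ is convex, so the stationary point is indeed a minimum.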

Comparison with Ratio Estimator in Double Sampling

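Putting k = 1, the estimator $\bar{y}_{RP}^{(d)}$ at equation (1) reduces to the ratio estimator $\bar{y}_R^{(d)}$. Thus putting k = 1 in equation (16) we get the variance of $\bar{y}_R^{(d)}$ to the first degree of approximation as

$$V\{\bar{y}_R^{(d)}\} = \bar{Y}^2\left[\frac{1}{n}\{C_y^2 + C_x^2(1-2C)\} + \frac{1}{n_1}\,C_x^2\right]$$

and so

$$V\{\bar{y}_{RP}^{(d)}\}_{II} - V\{\bar{y}_R^{(d)}\} = \bar{Y}^2 C_x^2\left[\left(\frac{1}{n} + \frac{1}{n_1}\right)\{(1-2k)^2 - 1\} + \frac{4C}{n}(1-k)\right]$$

is negative if

$$\text{either } \theta^* C < k < 1 \quad \text{or} \quad 1 < k < \theta^* C$$

or equivalently,

$$\min(\theta^* C, 1) < k < \max(\theta^* C, 1)$$

or equivalently,

$$|k - k_0^{**}| < |1 - k_0^{**}|.$$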



Comparison with Product Estimator in Double Sampling

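Setting k = 0 in equation (16) we get the variance of $\bar{y}_P^{(d)}$ to the first degree of approximation as

$$V\{\bar{y}_P^{(d)}\} = \bar{Y}^2\left[\frac{1}{n}\{C_y^2 + C_x^2(1+2C)\} + \frac{1}{n_1}\,C_x^2\right] \tag{20}$$

From equations (16) and (20) we have

$$V\{\bar{y}_{RP}^{(d)}\}_{II} - V\{\bar{y}_P^{(d)}\} = -4\,\bar{Y}^2 C_x^2\left[\left(\frac{1}{n} + \frac{1}{n_1}\right)k(1-k) + \frac{kC}{n}\right]$$

which is negative if

$$\text{either } 0 < k < 1 + \theta^* C \quad \text{or} \quad 1 + \theta^* C < k < 0$$

or equivalently,

$$\min(0, 1 + \theta^* C) < k < \max(0, 1 + \theta^* C)$$

or equivalently,

$$|k - k_0^{**}| < |k_0^{**}|.$$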

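Comparison with Mean Per Unit Estimator $\bar{y}$

Ignoring the fpc, the variance of $\bar{y}$ under SRSWOR is given by

$$V(\bar{y}) = \frac{1}{n}\,\bar{Y}^2 C_y^2 \tag{21}$$

From equations (16) and (21) we have

$$V\{\bar{y}_{RP}^{(d)}\}_{II} - V(\bar{y}) = \bar{Y}^2 C_x^2\left[(1-2k)^2\left(\frac{1}{n} + \frac{1}{n_1}\right) + 2(1-2k)\,\frac{C}{n}\right]$$

which is negative if

$$\text{either } \tfrac{1}{2} + \theta^* C < k < \tfrac{1}{2} \quad \text{or} \quad \tfrac{1}{2} < k < \tfrac{1}{2} + \theta^* C$$

or equivalently,

$$\min\left(\tfrac{1}{2}, \tfrac{1}{2} + \theta^* C\right) < k < \max\left(\tfrac{1}{2}, \tfrac{1}{2} + \theta^* C\right)$$

or equivalently,

$$|k - k_0^{**}| < \left|k_0^{**} - \tfrac{1}{2}\right|.$$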



Comparison with Linear Regression Estimator

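With the fpc ignored, the variance of the linear regression estimator in double sampling, $\bar{y}_{dlr} = \bar{y} + b(\bar{x}_1 - \bar{x})$, is given by Cochran (1963) as

$$V(\bar{y}_{dlr}) = \frac{S_y^2}{n}\left[1 - \frac{n_1 - n}{n_1}\,\rho^2\right] = V\{\bar{y}_{RP}^{(d_0)}\}_I \tag{22}$$

From equations (19) and (22) we have

$$V(\bar{y}_{dlr}) - V\{\bar{y}_{RP}^{(d_0^{**})}\}_{II} = \frac{S_y^2\,\rho^2\,n}{n_1(n + n_1)} > 0$$

This shows that the variance of the 'AOE' $\bar{y}_{RP}^{(d_0^{**})}$ is always less than that of $\bar{y}_{dlr}$. Thus the AOE $\bar{y}_{RP}^{(d_0^{**})}$ in Case II is uniformly more efficient than the AOE $\bar{y}_{RP}^{(d_0)}$ in Case I.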
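The closed-form gap between the two variances is easy to verify numerically. The sketch below (function names are assumptions for illustration; the fpc is ignored throughout, as in the text) computes the regression-estimator variance of equation (22), the Case II AOE variance of equation (19), and the stated difference.

```python
def var_regression_double(Sy2, rho, n, n1):
    """Equation (22): (Sy^2 / n) * [1 - ((n1 - n) / n1) * rho^2], fpc ignored."""
    return Sy2 / n * (1 - (n1 - n) / n1 * rho ** 2)


def var_aoe_case2(Sy2, rho, n, n1):
    """Equation (19): (Sy^2 / n) * (1 - theta* rho^2), with theta* = n1 / (n + n1)."""
    theta_star = n1 / (n + n1)
    return Sy2 / n * (1 - theta_star * rho ** 2)


def variance_gap(Sy2, rho, n, n1):
    """Stated difference: Sy^2 * rho^2 * n / (n1 * (n + n1))."""
    return Sy2 * rho ** 2 * n / (n1 * (n + n1))


if __name__ == "__main__":
    # Hypothetical inputs; Sy2 is an arbitrary population variance.
    Sy2, rho, n, n1 = 4.0, 0.84, 20, 50
    lhs = var_regression_double(Sy2, rho, n, n1) - var_aoe_case2(Sy2, rho, n, n1)
    print(lhs, variance_gap(Sy2, rho, n, n1))
```

The gap shrinks as $n_1$ grows relative to n, so the advantage of the independent-sample design is largest when the first-phase sample is only moderately larger than the second.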

Remark 4.1

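Following Singh & Ruiz Espejo (2003), one can define the estimators based on estimated optimum values under Cases I and II respectively as

$$\hat{\bar{y}}_{RP}^{(d_0)} = \frac{\bar{y}}{2}\left[(1+\hat{C})\,\frac{\bar{x}_1}{\bar{x}} + (1-\hat{C})\,\frac{\bar{x}}{\bar{x}_1}\right]$$

and

$$\hat{\bar{y}}_{RP}^{(d_0^*)} = \frac{\bar{y}}{2}\left[(1+\theta\hat{C})\,\frac{\bar{x}_1}{\bar{x}} + (1-\theta\hat{C})\,\frac{\bar{x}}{\bar{x}_1}\right]$$

where $\hat{C} = (s_{yx}/s_x^2)(\bar{x}_1/\bar{y}) = b/\hat{R}$,

$$s_{yx} = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x}), \quad s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \quad \text{and} \quad \hat{R} = \bar{y}/\bar{x}_1.$$

It can be easily shown to the first degree of approximation that

$$V\{\hat{\bar{y}}_{RP}^{(d_0)}\} = V\{\bar{y}_{RP}^{(d_0)}\} \quad \text{(under Case I)}$$

and

$$V\{\hat{\bar{y}}_{RP}^{(d_0^*)}\} = V\{\bar{y}_{RP}^{(d_0^*)}\} \quad \text{(under Case II)}$$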
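The estimated-optimum version for Case I can be sketched as follows: $\hat{C}$ is computed from the second-phase sample moments and the first-phase mean, then plugged into the AOE form. Function names and the toy data are illustrative assumptions; the Case II version would differ only in multiplying $\hat{C}$ by $\theta$.

```python
from statistics import fmean


def estimated_C(y2, x2, x1):
    """C_hat = (s_yx / s_x^2) * (x1_bar / y_bar) = b / R_hat, as in Remark 4.1."""
    n = len(y2)
    y_bar, x_bar, x1_bar = fmean(y2), fmean(x2), fmean(x1)
    s_yx = sum((y - y_bar) * (x - x_bar) for y, x in zip(y2, x2)) / (n - 1)
    s_xx = sum((x - x_bar) ** 2 for x in x2) / (n - 1)
    b = s_yx / s_xx            # sample regression coefficient of y on x
    r_hat = y_bar / x1_bar     # estimated ratio R_hat
    return b / r_hat


def estimated_aoe_case1(y2, x2, x1):
    """(y_bar / 2) * [(1 + C_hat) * x1_bar / x_bar + (1 - C_hat) * x_bar / x1_bar]."""
    c_hat = estimated_C(y2, x2, x1)
    y_bar, x_bar, x1_bar = fmean(y2), fmean(x2), fmean(x1)
    return y_bar / 2 * ((1 + c_hat) * x1_bar / x_bar + (1 - c_hat) * x_bar / x1_bar)


if __name__ == "__main__":
    # Hypothetical toy data.
    x1 = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
    x2 = [5.0, 7.0, 9.0]
    y2 = [10.0, 15.0, 17.0]
    print(estimated_C(y2, x2, x1), estimated_aoe_case1(y2, x2, x1))
```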

Comparison with Single Phase Sampling

In this section the comparisons between double and single-phase sampling are made for fixed cost. We shall consider Cases I and II separately.

Case I. In this case let us consider the following cost function

$$c = c_1 n + c_2 n_1 \tag{23}$$

where c equals the total cost of the survey and $c_1$ and $c_2$ are the costs per unit of collecting information on y and x respectively.



In this case, ignoring the fpc, we write the variance of the AOE $\bar{y}_{RP}^{(d_0)}$ as

$$V = \frac{V_1}{n} + \frac{V_2}{n_1}$$

where $V_1 = S_y^2(1-\rho^2)$ and $V_2 = S_y^2\rho^2$.

The optimum values of n and $n_1$ for fixed cost c, which minimize the variance V subject to the cost function in equation (23), are given by

$$\left.\begin{aligned}
n_{opt} &= \frac{c\sqrt{V_1/c_1}}{\sqrt{V_1c_1} + \sqrt{V_2c_2}} = \frac{c\sqrt{(1-\rho^2)/c_1}}{\sqrt{c_1(1-\rho^2)} + \rho\sqrt{c_2}} \\
n_{1\,opt} &= \frac{c\sqrt{V_2/c_2}}{\sqrt{V_1c_1} + \sqrt{V_2c_2}} = \frac{c\,\rho\sqrt{1/c_2}}{\sqrt{c_1(1-\rho^2)} + \rho\sqrt{c_2}}
\end{aligned}\right\}$$

The variance of $\bar{y}_{RP}^{(d_0)}$ corresponding to the optimal double sampling estimator is

$$V_{opt}\{\bar{y}_{RP}^{(d_0)}\} = \frac{1}{c}\left(\sqrt{c_1V_1} + \sqrt{c_2V_2}\right)^2 = \frac{S_y^2}{c}\left(\sqrt{c_1(1-\rho^2)} + \rho\sqrt{c_2}\right)^2 \tag{24}$$
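The optimal allocation for Case I can be sketched in a few lines. The function below returns the real-valued optimum sample sizes and the minimum variance of equation (24); in practice the sizes would be rounded to integers. The function name and the cost figures in the demo are hypothetical, and a positive $\rho$ is assumed so that $\sqrt{V_2} = \rho S_y$.

```python
from math import sqrt


def optimal_allocation_case1(c, c1, c2, Sy2, rho):
    """Minimise V1/n + V2/n1 subject to the cost function c = c1*n + c2*n1.

    V1 = Sy^2 (1 - rho^2) and V2 = Sy^2 rho^2, fpc ignored.
    Returns (n_opt, n1_opt, v_opt) as real numbers.
    """
    v1 = Sy2 * (1 - rho ** 2)
    v2 = Sy2 * rho ** 2
    denom = sqrt(v1 * c1) + sqrt(v2 * c2)
    n_opt = c * sqrt(v1 / c1) / denom
    n1_opt = c * sqrt(v2 / c2) / denom
    v_opt = denom ** 2 / c  # equation (24)
    return n_opt, n1_opt, v_opt


if __name__ == "__main__":
    # Hypothetical budget and unit costs.
    n_opt, n1_opt, v_opt = optimal_allocation_case1(c=1000.0, c1=10.0, c2=1.0, Sy2=4.0, rho=0.9)
    print(n_opt, n1_opt, v_opt)
```

Two identities make useful checks: the returned sizes exhaust the budget exactly, and plugging them back into $V_1/n + V_2/n_1$ reproduces the closed-form minimum.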

Case II. In Case II, we assume that x is measured on $n^* = n + n_1$ units and y on n units. Following Srivastava (1970) we shall consider a simple cost function

$$c = c_1 n + c_2^* n^* \tag{25}$$

where $c_1$ and $c_2^*$ denote the costs per unit of observing y and x values respectively.

The variance of $\bar{y}_{RP}^{(d_0^{**})}$ at equation (19) can now be written as

$$V^* = \frac{V_1}{n} + \frac{V_2^*}{n^*} \tag{26}$$

where $V_2^* = S_y^2\rho^2 = V_2$.

To obtain the optimum allocation of the sample between phases for a fixed cost c, we minimize equation (26) subject to the condition (25). It is easily found that this minimum is attained for

$$\frac{n}{n^*} = \left(\frac{V_1 c_2^*}{V_2 c_1}\right)^{1/2} = \left(\frac{c_2^*(1-\rho^2)}{c_1\rho^2}\right)^{1/2}$$

Thus the minimum variance corresponding to these optimum values of n and $n_1$ is given by

$$V_{opt}\{\bar{y}_{RP}^{(d_0^{**})}\} = \frac{S_y^2}{c}\left\{\sqrt{(1-\rho^2)c_1} + \rho\sqrt{c_2^*}\right\}^2 \tag{27}$$

Had all the resources been diverted towards the study of character y only, we would have the optimum sample size

$$n^{**} = c/c_1$$



Thus, the variance of the sample mean $\bar{y}$ for a given fixed cost c in the case of a large population is given by

$$V_{opt}(\bar{y}) = \frac{c_1}{c}\,S_y^2 \tag{28}$$

Case I. From equations (24) and (28), the proposed double sampling strategy would be profitable as long as

$$V_{opt}\{\bar{y}_{RP}^{(d_0)}\} < V_{opt}(\bar{y})$$

or equivalently,

$$\frac{c_2}{c_1} < \frac{\left(1 - \sqrt{1-\rho^2}\right)^2}{\rho^2}$$
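The Case I profitability condition reduces to a threshold on the cost ratio that depends only on $\rho$. A minimal sketch, with illustrative function names and hypothetical costs, and assuming $\rho \neq 0$:

```python
from math import sqrt


def max_cost_ratio_case1(rho):
    """Upper bound on c2/c1: (1 - sqrt(1 - rho^2))^2 / rho^2. Requires rho != 0."""
    return (1 - sqrt(1 - rho ** 2)) ** 2 / rho ** 2


def double_sampling_pays_case1(c1, c2, rho):
    """True when the optimum double sampling variance (24) is below (28)."""
    return c2 / c1 < max_cost_ratio_case1(rho)


if __name__ == "__main__":
    for rho in (0.5, 0.7, 0.9):
        print(rho, max_cost_ratio_case1(rho))
    # Hypothetical costs: measuring x is ten times cheaper than measuring y.
    print(double_sampling_pays_case1(c1=10.0, c2=1.0, rho=0.9))
```

The threshold rises with $|\rho|$, so a weakly correlated auxiliary variable must be very cheap to observe before double sampling is worthwhile.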

Table 1. Description of the populations (a)

| Population | Source | Study variate y | Auxiliary variate x | N | n | n1 |
|---|---|---|---|---|---|---|
| 1 | Sukhatme & Chand (1977) | Bushels of apples harvested in 1964 | Apple trees bearing age in 1964 | 120 | 20 | 50 |
| 2 | Sukhatme & Chand (1977) | Bushels of apples harvested in 1964 | Bushels of apples harvested in 1959 | 120 | 20 | 50 |
| 3 | Srivastava (1971) | Yield per plant | Height of the plant | 50 | 8 | 20 |
| 4 | Srivastava (1971) | Yield per plant | Base diameter | 50 | 8 | 20 |
| 5 | Tripathi (1980) | Persons in services | Educated persons | 225 | 40 | 100 |
| 6 | Tripathi (1980) | Persons in services | Persons employed | 225 | 40 | 100 |
| 7 | Steel & Torrie (1960) | Log of leaf burn in secs | Nitrogen percentage | 30 | 8 | 18 |
| 8 | Steel & Torrie (1960) | Log of leaf burn in secs | Chlorine percentage | 30 | 8 | 18 |
| 9 | Dobson (1990) | Total calories percentage from carbohydrate | Body weight | 20 | 5 | 12 |
| 10 | Dobson (1990) | Total calories percentage from carbohydrate | Age | 20 | 5 | 12 |

Table 2. Description of the populations (b)

| Population | ρ | Cy | Cx | C | Optimum value k0 = (1 + C)/2 | Optimum value k0* = (1 + θC)/2 |
|---|---|---|---|---|---|---|
| 1 | 0.930 | 2.00500 | 1.59775 | 1.1671 | 1.08355 | 0.95598 |
| 2 | 0.840 | 2.00500 | 1.44699 | 1.1639 | 1.08195 | 0.95473 |
| 3 | 0.7418 | 0.23833 | 0.09198 | 1.9221 | 1.46105 | 1.2474 |
| 4 | 0.5677 | 0.23833 | 0.11265 | 1.2011 | 1.10055 | 0.96709 |
| 5 | 0.720 | 1.35277 | 0.24495 | 3.9763 | 2.48815 | 2.06514 |
| 6 | 0.520 | 1.35277 | 0.46904 | 1.4998 | 1.24990 | 1.09035 |
| 7 | −0.7177 | 0.48031 | 0.17776 | −1.9392 | −0.46960 | −0.28041 |
| 8 | −0.4996 | 0.48031 | 0.74933 | −0.3202 | 0.3399 | 0.37114 |
| 9 | −0.4074 | 0.20174 | 0.15033 | −0.5467 | 0.22664 | 0.27782 |
| 10 | −0.0591 | 0.20174 | 0.27678 | −0.0431 | 0.47845 | 0.48237 |



Case II. From equations (27) and (28) it is obtained that the double sampling estimator $\bar{y}_{RP}^{(d_0^{**})}$ yields less variance than that of the sample mean $\bar{y}$ for the same fixed cost if

$$\rho^2 > \frac{4c_1c_2^*}{(c_1 + c_2^*)^2}$$

Empirical Study

To observe the relative performance of different estimators of $\bar{Y}$ we consider ten natural population data sets. The description of the populations is given in Tables 1 and 2.

We have computed the percentage relative efficiency of different estimators with respect to $\bar{y}$, and this is shown in Table 3.
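As a rough sketch of how such efficiencies can be computed, the function below takes the percentage relative efficiency to be $100 \times V(\bar{y})/V(\text{estimator})$, which is an assumption here since the text does not spell out the formula, and uses the Case I variance in equation (7) with the fpc retained. With the parameters listed for population 1 it gives values close to the Case I entries reported for the ratio estimator and the AOE; small differences can arise from rounding of the inputs.

```python
def pre_case1(k, Cy, Cx, rho, N, n, n1):
    """Percent relative efficiency of the Case I estimator with weight k versus y_bar.

    Uses V(y_bar) from equation (11) and the Case I variance from equation (7);
    the common factor Y_bar^2 cancels in the ratio.
    """
    a = (1 - n / N) / n    # (1 - f) / n
    d = (1 - n / n1) / n   # (1 - f*) / n
    C = rho * Cy / Cx
    v_mean = a * Cy ** 2
    v_rp = a * Cy ** 2 + d * Cx ** 2 * (1 - 2 * k) * (1 - 2 * k + 2 * C)
    return 100 * v_mean / v_rp


if __name__ == "__main__":
    pop1 = dict(Cy=2.005, Cx=1.59775, rho=0.93, N=120, n=20, n1=50)
    k0 = (1 + pop1["rho"] * pop1["Cy"] / pop1["Cx"]) / 2
    print("ratio (k = 1):", pre_case1(1.0, **pop1))
    print("AOE (k = k0):", pre_case1(k0, **pop1))
```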

Table 3 shows that there is considerable gain in efficiency from using the suggested estimators $\bar{y}_{RP}^{(d_0)}$ (or $\hat{\bar{y}}_{RP}^{(d_0)}$) and $\bar{y}_{RP}^{(d_0^*)}$ (or $\hat{\bar{y}}_{RP}^{(d_0^*)}$) over the conventional estimators $\bar{y}$, $\bar{y}_R^{(d)}$ and $\bar{y}_P^{(d)}$, except for the data set of population 10, where the estimators $\bar{y}$, $\bar{y}_{RP}^{(d_0)}$ (or $\hat{\bar{y}}_{RP}^{(d_0)}$) and $\bar{y}_{RP}^{(d_0^*)}$ (or $\hat{\bar{y}}_{RP}^{(d_0^*)}$) are almost equally efficient. This is due to the poor correlation between y and x. It is further observed that the estimator $\bar{y}_{RP}^{(d_0^*)}$ (or $\hat{\bar{y}}_{RP}^{(d_0^*)}$) is more efficient than $\bar{y}_{RP}^{(d_0)}$ (or $\hat{\bar{y}}_{RP}^{(d_0)}$) for all data sets. Thus, it is preferred to use the proposed estimators $\bar{y}_{RP}^{(d_0)}$ (or $\hat{\bar{y}}_{RP}^{(d_0)}$) and $\bar{y}_{RP}^{(d_0^*)}$ (or $\hat{\bar{y}}_{RP}^{(d_0^*)}$).

References

Barnett, V., Haworth, J. & Smith, T. M. F. (2001) A two-phase sampling scheme with applications to auditing or sed quis custodiet ipsos custodes. Journal of the Royal Statistical Society Series A, 164, pp. 407–422.

Bose, C. (1943) Note on the sampling error in the method of double sampling, Sankhya, 6, pp. 329–330.

Breslow, N. E. & Holubkov, R. (1997) Maximum likelihood estimation of logistic regression parameters under two-phase outcome-dependent sampling, Journal of the Royal Statistical Society (Series B), 59, pp. 447–461.

Chand, L. (1975) Some ratio-type estimators based on two or more auxiliary variables. PhD Thesis, Ames, Iowa State University.

Cochran, W. G. (1963) Sampling Techniques, 2nd edn (New York: Wiley).

Dobson, A. J. (1990) An Introduction to Generalized Linear Models, 1st edn (New York: Chapman & Hall).

Dorfman, A. H. (1994) A note on variance estimation for the regression estimator in double sampling, Journal of the American Statistical Association, 89, pp. 137–140.

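Table 3. Percent relative efficiencies

| Population | $\bar{y}$ | $\bar{y}_R^{(d)}$ Case I | $\bar{y}_R^{(d)}$ Case II | $\bar{y}_P^{(d)}$ Case I | $\bar{y}_P^{(d)}$ Case II | $\bar{y}_{RP}^{(d_0)}$ or $\hat{\bar{y}}_{RP}^{(d_0)}$ Case I | $\bar{y}_{RP}^{(d_0^*)}$ or $\hat{\bar{y}}_{RP}^{(d_0^*)}$ Case II |
|---|---|---|---|---|---|---|---|
| 1 | 100.00 | 256.42 | 303.14 | * | * | 265.10 | 309.04 |
| 2 | 100.00 | 199.18 | 220.41 | * | * | 203.26 | 223.09 |
| 3 | 100.00 | 143.39 | 161.57 | * | * | 164.76 | 174.82 |
| 4 | 100.00 | 128.83 | 133.23 | * | * | 129.91 | 133.46 |
| 5 | 100.00 | 119.95 | 128.06 | * | * | 160.84 | 168.95 |
| 6 | 100.00 | 121.27 | 126.25 | * | * | 124.58 | 127.05 |
| 7 | 100.00 | * | * | 142.59 | 156.51 | 163.99 | 170.83 |
| 8 | 100.00 | * | * | * | * | 123.31 | 125.14 |
| 9 | 100.00 | * | * | 104.21 | * | 114.83 | 115.47 |
| 10 | 100.00 | * | * | * | * | 100.27 | 100.29 |

*Data not applicable and percent relative efficiency less than 100%.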



Hidiroglou, M. A. & Sarndal, C. E. (1998) Use of auxiliary information for two-phase sampling, Survey Methodology, 24, pp. 11–20.

Mukerjee, R., Rao, T. J. & Vijayan, K. (1987) Regression type estimator using multiple auxiliary information, Australian Journal of Statistics, 29, pp. 244–254.

Murthy, M. N. (1964) Product method of estimation, Sankhya (Series A), 26, pp. 294–307.

Neyman, J. (1938) Contribution to the theory of sampling human populations, Journal of the American Statistical Association, 33, pp. 101–116.

Pitt, D. G., Glover, G. R. & Jones, R. H. (1996) Two-phase sampling of woody and herbaceous plant communities using large-scale aerial photographs, Canadian Journal of Forestry Research, 26, pp. 509–524.

Prasad, B., Singh, R. S. & Singh, H. P. (1996) Some chain ratio-type estimators for ratio of two population means using two auxiliary characters in two phase sampling, Metron, 54, pp. 95–113.

Rao, J. N. K. & Sitter, R. R. (1995) Variance estimation under two phase sampling with application to imputation for missing data, Biometrika, 82, pp. 453–460.

Robson, D. S. (1957) Applications of multivariate polykays to the theory of unbiased ratio-type estimation, Journal of the American Statistical Association, 52, pp. 511–522.

Seth, G. R., Sukhatme, B. V. & Manwani, A. H. (1968) Sample surveys on fruit crops, in: Contributions in Statistics and Agricultural Sciences, pp. 181–190 (New Delhi: Indian Society of Agricultural Statistics).

Singh, D. (1968) Double sampling and its application in agriculture, in: Contributions in Statistics and Agricultural Sciences, pp. 213–226 (New Delhi: Indian Society of Agricultural Statistics).

Singh, D. & Singh, B. D. (1965) Double sampling for stratification on successive occasions, Journal of the American Statistical Association, 60, pp. 784–792.

Singh, H. P. & Ruiz Espejo, M. (2000) An improved class of chain regression estimators in two phase sampling, Statistics & Decisions, 18, pp. 205–218.

Singh, H. P. & Ruiz Espejo, M. (2003) On linear regression and ratio-product estimation of a finite population mean, The Statistician, 52, pp. 59–67.

Sitter, R. R. (1997) Variance estimation for the regression estimator in two-phase sampling, Journal of the

Snedecor, G. W. & King, A. J. (1942) Recent developments in sampling for agricultural statistics, Journal of the

Srivastava, S. K. (1970) A two-phase sampling estimator in sample surveys, Australian Journal of Statistics, 2, pp. 23–27.

Srivastava, S. K. (1971) Generalized estimator for the mean of a finite population using multiauxiliary information, Journal of the American Statistical Association, 66, pp. 404–407.

Steel, R. G. D. & Torrie, J. H. (1960) Principles and Procedures of Statistics (New York: McGraw-Hill).

Sukhatme, B. V. (1962) Some ratio-type estimators in two-phase sampling, Journal of the American Statistical Association, 57, pp. 628–632.

Sukhatme, B. V. & Chand, L. (1977) Multivariate ratio type estimators, Proceedings of the Social Statistics Section of the American Statistical Association, pp. 927–931.

Sukhatme, B. V. & Koshal, R. S. (1959) A contribution to double sampling, Journal of the Indian Society of Agricultural Statistics, 11, pp. 128–144.

Tripathi, T. P. (1980) A general class of estimators of population ratio, Sankhya (Series C), 42, pp. 63–75.

Unnikrishan, N. K. & Kunte, S. (1995) Optimality of an analogue of Basu's estimator under a double sampling design, Sankhya (Series B), 57, pp. 103–111.

Yates, F. (1960) Sampling Methods for Censuses and Surveys, 2nd edn (London: Charles Griffin).

York, I., Madigan, D., Heuch, I. & Lie, R. T. (1995) Birth defects registered by double sampling: a Bayesian approach incorporating covariates and model uncertainty, Applied Statistics, 44, pp. 227–242.

