
ON SOME SAMPLING STRATEGIES AND

THEIR OPTIMALITY

THESIS SUBMITTED

TO THE

UNIVERSITY OF LUCKNOW

FOR THE

DEGREE OF DOCTOR OF PHILOSOPHY

STATISTICS

BY

ARCHANA SHUKLA

M.Sc.

Under the Supervision of

PROF. SHEELA MISRA

DEPARTMENT OF STATISTICS

LUCKNOW UNIVERSITY

LUCKNOW

SEPTEMBER, 2013

Certificate

This is to certify that the thesis entitled “On Some Sampling

Strategies and their Optimality” submitted for the degree of Doctor of

Philosophy in Statistics to the Department of Statistics, University of

Lucknow, Lucknow (India) is a record of genuine and original research

work carried out by Ms. Archana Shukla, under my guidance and

supervision. The content of the thesis, in full or in part, has not been

submitted to any other institute or University for the award of any other

degree or diploma.

Date:

Prof. Sheela Misra

Department of Statistics

University of Lucknow

Lucknow -226007

INDIA

Declaration

I, Archana Shukla, hereby declare that the work submitted in this

thesis entitled “On Some Sampling Strategies and their Optimality”

for the award of Doctor of Philosophy in Statistics to the Department of

Statistics, University of Lucknow, Lucknow (India) is my own genuine

and original research work.

It contains no material previously published or written by another person

nor material which has been accepted for the award of any other degree

or diploma of the University or other Institute of higher learning except

where due acknowledgement has been made in the text.

Date: Ms. Archana Shukla

Department of Statistics

University of Lucknow

Lucknow -226007

INDIA

ACKNOWLEDGEMENT

This research work has been a great and unique experience of immense

value for me, and it has been my greatest privilege to work under the

guidance of my supervisor, Prof. Sheela Misra, to whom I owe my heartiest

respect and regard. I am deeply indebted to her for her invaluable and

selfless guidance and inspiration; without her perseverance my endeavor

would not have reached its culmination. Her genuine concern and

affectionate attitude always kept me motivated to accomplish this work

promptly. The precious time that she spent in organizing and critically going

through the manuscript, and in bringing this thesis to its present form, is

gratefully acknowledged.

Words cannot express my heartfelt thanks to Prof. R. K. Singh,

because this work would not have been possible without the inspirational

guidance and continuous support from him. My special thanks go to my uncle

Mr. S.S. Misra and my friend Dr. Ashish Kumar Shukla for their

cooperation, encouragement and motivation throughout the period of my work.

It is very difficult for a researcher to complete the work without the

cooperation of surrounding people. I am greatly thankful to Prof. S.K. Pandey

(Head) and all the teachers, Department of Statistics, University of

Lucknow, Lucknow for their blessings, cooperation and good wishes. The

cooperation and help of the staff of the Department of Statistics, University

of Lucknow, Lucknow are gratefully acknowledged.

Words cannot express my heartfelt thanks to my beloved mother Smt.

Prabha Shukla and father Sri R. K. Shukla for their kind cooperation,

support, enthusiastic love and encouragement which enabled me to

accomplish the present endeavor. My sincere thanks are due to all my family

members, my elder brother, Bhabhi ji, Didi and Jija ji, my younger brother

and their kids for their moral support, unconditional love and affection.

Whatever I have accomplished so far was possible with the blessings,

cooperation and immense help of my well-wisher Dr. A.K. Mishra, to whom I

express my heartiest and sincere thanks.

My special thanks go to all other well-wishers, colleagues and relatives

whose names could not be given here due to paucity of space. Last but not

least, I am thankful to all the persons and places contributing directly or

indirectly to this journey of knowledge and wisdom.

I shall always be indebted to the Almighty God for showing me the

way constantly, especially whenever I was in desperate need.

Date: (Archana Shukla)

Index

BONAFIDE CERTIFICATE

DECLARATION

ACKNOWLEDGEMENT

CHAPTER No-1: INTRODUCTION AND REVIEW OF LITERATURE

CHAPTER No-2: AN IMPROVEMENT IN THE MEAN PER UNIT ESTIMATOR OF POPULATION MEAN UTILIZING KNOWN COEFFICIENT OF VARIATION

CHAPTER No-3: ON ESTIMATION OF POPULATION MEAN USING REGRESSION APPROACH WITH KNOWN COEFFICIENT OF VARIATION

CHAPTER No-4: ESTIMATION OF POPULATION MEAN USING KNOWN COEFFICIENT OF VARIATION

CHAPTER No-5: AN IMPROVEMENT IN LINEAR REGRESSION ESTIMATOR OF FINITE POPULATION MEAN USING KNOWN COEFFICIENT OF VARIATION

CHAPTER No-6: AN IMPROVED REGRESSION TYPE ESTIMATOR OF POPULATION MEAN USING AUXILIARY INFORMATION

CHAPTER No-7: AN IMPROVED SEPARATE REGRESSION-TYPE ESTIMATOR OF POPULATION MEAN

CHAPTER No-8: A GENERALIZED CLASS OF SEPARATE REGRESSION-TYPE ESTIMATORS FOR THE ESTIMATION OF FINITE POPULATION MEAN

CHAPTER No-9: ON ESTIMATION OF VARIANCE OF MEAN FOR THE REGRESSION ESTIMATOR UNDER STRATIFIED RANDOM SAMPLING

BIBLIOGRAPHY

Published Papers

[1] “On Estimation of Population Mean using Regression Approach With

Known Coefficient of Variation”, Journal of Combinatorics Information

and System Sciences, Volume 37 (2012), No. 1-2, pp. 61-37.

[2] “An Improved Regression Type Estimator of Population Mean Using

Auxiliary Information”, International Journal of Statistics and Analysis,

Volume 2, Number 4 (2012), pp. 483-488.

[3] “An Improvement in the Mean Per Unit Estimator of Population Mean Utilizing

Known Coefficient of Variation”, Proceedings of the II National Conference

on Statistical Inference, Sampling Techniques and Related Areas,

Department of Statistics and Operations Research, Aligarh Muslim

University, Aligarh – 202002, February 11 – 12, 2012.

[4] “On the Regression Estimation of Population Mean using known

Coefficient of Variation”, Journal of Indian Society of Agricultural

Statistics, to appear in December 2013.

[5] “An Improvement in Linear Regression Estimator of Finite

Population Mean using known Coefficient of Variation”, International

Journal of Agricultural and Statistical Sciences (accepted).

Chapter-I

Introduction and Review of

Literature


1.1 Introduction

Sampling is the technique of selecting a part of an aggregate population

to represent the whole, and the sample obtained thereby is expected to

be a true representative of the whole population. It is most frequently used in

surveys. The purpose of a sample survey is to obtain information about the

population. By population we understand the group of units defined according

to the aims and objectives of the survey. Thus the population may consist of

all the fields under a specified crop as in area and yield surveys, all the

agricultural holdings larger than a specified size as in agricultural surveys,

or all the households having four or more children as in socio-economic

surveys. Of course, the population may also refer to the human beings of a

whole country or of a particular sector of the country. The information that

we seek about the population is usually the total number of units, such as the

number of farms in a state growing corn; aggregate values of various

characteristics per unit, such as the average size of a household; and

proportions of units possessing specified attributes, such as the proportion of

households having income above a certain level.

A sampling method is a scientific and objective procedure for selecting

units from a population to provide a sample. It also provides procedures for

estimating the results that would be obtained if a comparable survey were

taken of all the units in the population.

If we use random sampling numbers for drawing random samples, we

need not construct a miniature population. Also, the numbering of the sampling

units can be done in any convenient manner.

Secondly, since the randomization of the numbers is done once and for all, the

tedious process of randomizing the miniature population each time before

the next drawing is made is not necessary. Any part of the series can be used

for a random sample of numbers, and the problem is simply to interpret these

numbers in terms of individuals of the population.

The simplest and most commonly used type of probability sampling is

simple random sampling. In this kind of sampling, each member of the

population has the same probability of being included in the sample. Simple

random sampling may be with or without replacement.
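
The two variants of simple random sampling can be sketched in a few lines of Python. This is only an illustration: the frame of 20 numbered units, the sample size and the function names are assumptions introduced here, not part of the thesis.

```python
import random

def srs_without_replacement(population, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def srs_with_replacement(population, n, seed=None):
    """Draw n units independently; a unit may appear more than once."""
    rng = random.Random(seed)
    return rng.choices(population, k=n)

units = list(range(1, 21))  # hypothetical frame of N = 20 numbered units
wor = srs_without_replacement(units, 5, seed=1)
wr = srs_with_replacement(units, 5, seed=1)
```

Numbering the units in any convenient order and drawing random numbers, as described earlier, is exactly what the two calls above do.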

When the population is heterogeneous, one uses stratified random

sampling. In stratified random sampling, before drawing the random sample,

one divides the population into several strata or sub-populations which are

relatively homogeneous within themselves and the means of which are as widely

different as possible. A sample is then drawn from each stratum according to

one of several allocation plans (equal, proportional, Neyman or optimum).
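
As a rough sketch of two of the allocation plans named above, the following Python functions compute proportional and Neyman allocations from the textbook formulas n_h = n N_h / N and n_h = n N_h S_h / Σ N_h S_h. The stratum sizes and standard deviations are hypothetical, and in practice the fractional results would be rounded to whole units.

```python
def proportional_allocation(n, sizes):
    """n_h = n * N_h / N: sample sizes proportional to stratum sizes."""
    total = sum(sizes)
    return [n * N_h / total for N_h in sizes]

def neyman_allocation(n, sizes, sds):
    """n_h = n * N_h * S_h / sum(N_h * S_h): more units go to variable strata."""
    weights = [N_h * S_h for N_h, S_h in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]

sizes = [500, 300, 200]    # hypothetical stratum sizes N_h
sds = [10.0, 20.0, 40.0]   # hypothetical stratum standard deviations S_h
prop = proportional_allocation(100, sizes)
ney = neyman_allocation(100, sizes, sds)
```

With these illustrative figures the smallest stratum receives the largest Neyman share because it is the most variable.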

Stratified random sampling is preferable to simple random sampling on

a number of counts. (a) In many situations stratified sampling will be

administratively more convenient. In taking a sample of villages from the

whole of West Bengal, we may take the districts as strata. This will facilitate

field work, since the existing administrative set-up at the district level may be

used for this purpose. (b) Again, stratified sampling will be more representative

in the sense that here we can ensure that some individuals from each of the sub-

populations (strata) will be included in the sample. (c) Stratified sampling,

moreover, has the merit of supplying not only an estimate for the population as

a whole, but also separate estimates (with estimates of their standard errors) for

the individual strata. (d) Finally, a portion of the variability, identifiable

as between-strata variance, is eliminated in stratified random sampling. If the

between-strata variance is large, the within-strata variance, which provides the

estimate of error, will be small compared with the variance for the whole

population. That is why we try to make each particular stratum as homogeneous

as possible, while making the strata as different from each other as possible.
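
The argument in point (d) rests on splitting the total sum of squares into a within-strata part and a between-strata part. A minimal Python check of that identity, using three small made-up strata, might look like this:

```python
def sum_of_squares_decomposition(strata):
    """Return (total, within, between) sums of squares for a list of strata."""
    all_values = [y for stratum in strata for y in stratum]
    grand_mean = sum(all_values) / len(all_values)
    total = sum((y - grand_mean) ** 2 for y in all_values)
    within = 0.0
    between = 0.0
    for stratum in strata:
        mean_h = sum(stratum) / len(stratum)
        within += sum((y - mean_h) ** 2 for y in stratum)
        between += len(stratum) * (mean_h - grand_mean) ** 2
    return total, within, between

# Hypothetical population: homogeneous strata with widely different means
strata = [[2, 3, 4], [10, 11, 12], [20, 21, 22]]
total, within, between = sum_of_squares_decomposition(strata)
```

Here almost all of the variability lies between the strata, so the within-strata variability that drives the error of a stratified estimate is small.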



1.1.1 Advantages of the Sampling Methods

Our knowledge, our attitudes and our actions are based to a very

large extent on observations obtained through samples. This is equally true

in everyday life and in scientific research. A person’s opinion of an

institution, a person or an issue is usually formed from a sample of

experience, as is the account of a tourist who spends a short time in a foreign

country and then proceeds to write a book about its people. In the same way,

sample information helps in making and reforming programmes and policies.

Though results based on a much smaller sample differ from the census

results obtained by studying the whole of the population, sampling

strategies provide a method for finding and estimating the error in the

estimation of population parameters in an efficient way. Optimal sampling

strategies not only provide an efficient way of collecting data and estimating

population parameters but also reduce the non-sampling errors, which are

larger in a census.

Moreover, most of the time, in science and the humanities alike, we lack the

time and resources to study more than a part of the phenomena that might

advance our knowledge, making it almost compulsory to resort to sampling

strategies.

1.1.2 Uses of Sample Survey

Sampling can be used in a variety of ways. However, it is mostly used in

all kinds of surveys all over the world. Depending upon the objectives of the

survey and the purposes for which the data may be used, sample surveys can be

broadly classified into three categories: descriptive, analytical or both

descriptive and analytical.

In descriptive surveys, the object is usually to obtain some descriptive

measures with respect to the characteristics of the entire population under

study. Such surveys are very common and are required for national planning

and socio-economic development, to collect data on agricultural production and

utilization of land and water resources, industrial production,

unemployment and size of the labour force, wholesale and retail prices, income

and expenditure per household, numbers of literate persons and school-going

children, and so on. On the other hand, the object in analytical surveys is to

obtain descriptive information for different subgroups of the population in

order to test hypotheses concerning possible relationships between the

subgroups. For example, in labour force surveys one would be interested not

only in knowing the average number of hours worked per day and the wages

paid but also whether men work longer hours than women and whether they

receive higher wages than women for the same type of work. Sampling

methods are also used in population censuses. In fact, except for certain basic

information required in respect of every individual, data on various items such

as occupation, parentage, marriage, fertility, income, migration and housing are

collected on a sampling basis. Sampling methods are also used to provide

counter-checks and to speed up tabulation and publication of results.

Sampling methods are used extensively in business and industry to

increase operational efficiency. They play an important role in problems

encountered in market research such as estimating the size of readership of

news-magazines and newspapers or finding the reactions of consumers to new

products recently introduced in the market. They are also used to ascertain the

opinions or attitudes of the public to certain issues in which they are interested.

Surveys carried out for such purposes are often termed ‘Opinion Poll’ surveys.

Sampling is also used widely in purely experimental investigations, as in

blood tests, the determination of the quality of milk, the response of

different crops to fertilizers, or the chemical composition of soils.

1.2 Estimation in sampling theory

In statistics, survey sampling describes the process of selecting a sample

of elements from a target population in order to conduct a survey. There are

many problems in life which force us to think and find their solutions.

A survey may refer to many different types or techniques of observation, but in

the context of survey sampling it most often involves a questionnaire used to

measure the characteristics and/or attitudes of people. The different ways of

contacting members of a sample once they have been selected are the subject

of survey data collection.

The purpose of sampling is to reduce the cost and/or the amount of work

that it would take to survey the entire target population. Survey samples can be

broadly divided into two types: probability samples and non-probability

samples. Only surveys based on probability samples can be used to create

mathematically sound statistical inferences about a larger target population.

Inferences from probability-based surveys may still suffer from many types of

bias.

Surveys that are not based on probability sampling have no way of

measuring their bias or sampling error. Surveys based on non-probability

samples are not externally valid. They can only be said to be representative of

the people that have actually completed the survey.

Put another way, if a probability-based survey of the United States

household population finds that 59% of its respondents support a piece of

legislation, there is mathematical reason to believe that the proportion of all

the persons living in households in the United States who support this piece of

legislation is close to 59% (within the margin of error). If a non-probability

survey conducted in the United States finds that 59% of its respondents

support a piece of legislation, that is the only conclusion that can be drawn; no

statement about the target population can be made.
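
To make the phrase "within the margin of error" concrete, the sketch below computes an approximate 95% margin of error for the 59% figure under simple random sampling. The sample size of 1000 is an assumption chosen purely for illustration, the normal approximation is used, and the finite population correction is ignored.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion (normal approximation)."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)

p_hat = 0.59
n = 1000  # assumed number of respondents, not given in the text
moe = margin_of_error(p_hat, n)
lower, upper = p_hat - moe, p_hat + moe
```

Under these assumptions the margin is about three percentage points on either side of 59%.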

In the literature there are several estimators for estimating population

parameters and various criteria for judging their performance. An important

approach is to restrict attention to the class of unbiased estimators and to

choose the one with minimum variance. If biased estimators are admitted, we

search for the estimator having minimum mean square error.
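
The two criteria are linked by a standard identity. Writing t for an estimator of a parameter θ and B(t) for its bias (symbols introduced here for illustration), the mean square error splits into variance plus squared bias:

```latex
\mathrm{MSE}(t) \;=\; E\,(t-\theta)^{2} \;=\; V(t) + \{B(t)\}^{2},
\qquad B(t) = E(t) - \theta .
```

For an unbiased estimator B(t) = 0, so minimum mean square error reduces to minimum variance; a biased estimator is preferable only when its squared bias is more than offset by a reduction in variance.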

1.3 Use of Auxiliary Variable

Auxiliary information can be used at the design stage, the sampling stage or

the estimation stage, leading to a considerable improvement in the precision of

estimators of the population parameter under study. It may be used at the

selection stage, for example by selecting a sample with probability

proportional to some auxiliary character (usually taken as the size of the

unit); at the stage of stratifying the population and selecting samples of

appropriate sizes from each of the strata so constructed; or at the

estimation stage.
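
Selection with probability proportional to size can be sketched as follows. The code draws units with replacement and applies the usual with-replacement estimator of the population total, the mean of y_i / p_i over the draws (often called the Hansen-Hurwitz estimator). The size measures, the study values and the function names are hypothetical.

```python
import random

def pps_sample_with_replacement(sizes, n, seed=None):
    """Select n unit indices with probability proportional to size, with replacement."""
    rng = random.Random(seed)
    return rng.choices(range(len(sizes)), weights=sizes, k=n)

def pps_total_estimate(y, sizes, sample):
    """Estimate the population total as the mean of y_i / p_i, where p_i = x_i / X."""
    total_size = sum(sizes)
    return sum(y[i] * total_size / sizes[i] for i in sample) / len(sample)

sizes = [10, 20, 30, 40]       # hypothetical size measure x_i
y = [3 * s for s in sizes]     # study variable exactly proportional to size
sample = pps_sample_with_replacement(sizes, 3, seed=7)
estimate = pps_total_estimate(y, sizes, sample)
```

When y is exactly proportional to the size measure, as in this artificial example, every draw gives the same value of y_i / p_i and the estimate equals the true total whatever sample is drawn; this is the ideal case that motivates the method.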

In most of the developments concerning the use of auxiliary information

in the estimation of parameters in survey sampling, it is typically assumed that

all the observations on selected units in the sample are available. This may not

hold true in many practical situations encountered in sample surveys and some

observations may be missing for various reasons such as unwillingness of some

selected units to supply the desired information, accidental loss of information

caused by unknown factors, failure on the part of investigator to gather correct

information.

Auxiliary information has been in use in sample surveys since the

development of the theory and application of modern sample surveys.

Information on auxiliary variable which is highly correlated with the variable

under study is readily available in many sample surveys and can be used to

improve the sampling design. In modern sampling theory, the works of Bowley

(1926)[3] and Neyman (1934, 1938) are the foundation stones, dealing with

stratified random sampling and putting forward a theoretical criticism of

non-random (purposive) sampling. Their works may perhaps be referred to as the

initial works in the history of sample surveys utilizing auxiliary information.



Watson (1937)[104] and Cochran (1961, 1977)[9-10] were among the first to

make use of auxiliary information in devising estimation procedures

leading to improvement in the precision of estimation. Hansen and Hurwitz

(1953)[29] were the first to suggest the use of auxiliary information in

selecting the units with varying probabilities.

There are three popular ways of utilizing auxiliary information at the

estimation stage, viz., the ratio, product and regression methods. For the case

where the auxiliary characteristic is negatively correlated with the

characteristic under study, Robson (1957)[60] and later Murthy (1964)[43]

proposed the product estimator and also developed an unbiased product

estimator. Srivastava and Bhatnagar (1981)[91] analyzed their properties, while

Singh (1967)[89] considered almost unbiased product estimators. Srivastava and

Bhatnagar (1981)[91] proposed an estimator which has no constraint of prior

knowledge regarding the population mean of the auxiliary characteristic.
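
The three methods can be illustrated with their usual textbook forms: the ratio estimator ȳ X̄ / x̄, the product estimator ȳ x̄ / X̄, and the linear regression estimator ȳ + b (X̄ − x̄), where X̄ is the known population mean of x and b is the sample regression coefficient. The sketch below uses made-up data and is not one of the estimators proposed in this thesis.

```python
def mean(values):
    return sum(values) / len(values)

def ratio_estimator(y, x, x_bar_pop):
    """y_bar * X_bar / x_bar; suited to positively correlated y and x."""
    return mean(y) * x_bar_pop / mean(x)

def product_estimator(y, x, x_bar_pop):
    """y_bar * x_bar / X_bar; suited to negatively correlated y and x."""
    return mean(y) * mean(x) / x_bar_pop

def regression_estimator(y, x, x_bar_pop):
    """y_bar + b * (X_bar - x_bar), with b the sample regression coefficient."""
    y_bar, x_bar = mean(y), mean(x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return y_bar + (s_xy / s_xx) * (x_bar_pop - x_bar)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]   # hypothetical sample in which y = 2x exactly
X_BAR = 3.0                # assumed known population mean of x

ratio = ratio_estimator(y, x, X_BAR)
product = product_estimator(y, x, X_BAR)
regression = regression_estimator(y, x, X_BAR)
```

Because y is exactly proportional to x in this artificial sample, the ratio and regression estimators both return 2 X̄, while the product estimator, intended for negative correlation, moves in the wrong direction.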

The auxiliary variable x may be the study variable y itself, observed

on some previous occasion, in the case of repetitive surveys. The information

on the auxiliary character may be known in advance from past data, a pilot

survey or experience; it may be collected while the survey is going on without

increasing the cost of the survey; or a part of the resources may be devoted to

collecting such information. In the literature there are several estimators for

estimating population parameters and various criteria for judging their

performance. An important approach is to restrict attention to the class of

unbiased estimators and to choose the one with minimum variance. If biased

estimators are admitted, we search for the estimator having minimum mean

square error. In large scale sample surveys,

we often collect data on more than one auxiliary character and some of these

may be correlated with y.

Various strategies have been proposed for estimating a finite population

mean under a superpopulation model that links the variable of interest to one

or more auxiliary variables. Brewer (1963)[4], Royall (1970, 1971, 1976)[62-64],

and others have adapted linear model prediction theory to the finite population

situation and have derived the best linear unbiased (BLU) predictor. Cassel,

Sarndal, and Wretman (1976, 1977)[6-7] and Sarndal (1982)[85] have proposed a

generalized regression (GREG) predictor that is Asymptotically Design

Unbiased (ADU). Brewer (1979)[5] suggested a predictor that blends aspects of

the BLU and GREG predictors and retains the ADU property, but is restricted

to the case of a single auxiliary variable. Certain ADU predictors involving

several auxiliary variables are suggested by Isaki and Fuller (1982)[32].

There are several instances where the mean is proportional to the standard

deviation and consequently the coefficient of variation is known although the

mean and standard deviation may not be known. Some such situations may be

seen in Snedecor (1946)[84], Hald (1952)[27], Davies and Goldsmith (1976)[12]

and Gleser and Healy (1976)[19]. The well known Weber’s law of

psychophysics (see Guilford (1975)[20], chapter 2) provides instances where

the coefficient of variation is known, and one such example is also given in

Singh (1998)[80].

Sometimes, simple a priori information in the form of the coefficient of

variation is available to experimenters in the fields of biology, agriculture,

psychometrics, etc. Owing to their long association with the experimental

material, experimenters may have at their disposal quite accurate

information concerning the coefficient of variation. This information

is frequently used to plan experiments and to estimate sample size,

average, total, etc. (see also Searls (1964)[72]). Further

supporting explanation regarding stable and consistent information about the

coefficient of variation may be seen in Cochran (1977[9], 3rd edition) on pages

77 and 79 of chapter 4. A good description of knowledge of the

coefficient of variation is also given in Sukhatme et al. (1984)[99] on page 42.



1.4 Review and Development of the relevant literature

Statistics is a science of making decisions in the face of uncertainty and

sampling provides a technique to guess the reality in an efficient way to help us

in decision making. In any decision process in applied sciences and many other

fields of practical importance, we need sample observations to make inferences

about the unknown parameters. In decision making based on sample

observations, it is assumed that the sample observations are free from

observational, measurement or other errors such as response errors, an

assumption which may not hold true in actual practice. Samples containing

these kinds of errors invalidate our inference procedures, giving erroneous

conclusions about the parameters of interest in our study. For example, in

regression theory measurement errors invalidate the application of the least

squares estimator of the vector of regression coefficients in a general linear

regression model to the extent that the least squares estimator does not even

remain consistent. To make the least squares estimator consistent, some

procedures have been developed, but more efficient procedures than the earlier

ones are still to be developed. In the presence of measurement errors, very

little work has been done in applied sciences also. Some papers by Misra et al.

(2004)[47], Maneesha and Singh (2001)[49] and Shalabh (1999)[77] may be

seen in this regard. Though some work has been done in this direction, a lot

of work still remains to be done.
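
The loss of consistency under measurement error can be seen in a small simulation. The sketch assumes the simplest case, a single regressor observed with independent additive error, in which the least squares slope converges to β σ²ₓ / (σ²ₓ + σ²ᵤ) rather than to β. All numerical settings are illustrative assumptions.

```python
import random

def ols_slope(x, y):
    """Least squares slope of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return s_xy / s_xx

rng = random.Random(0)
n, beta = 20000, 2.0
x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]            # var(x) = 1
y = [beta * xt + rng.gauss(0.0, 1.0) for xt in x_true]
x_observed = [xt + rng.gauss(0.0, 1.0) for xt in x_true]    # error variance = 1

slope_clean = ols_slope(x_true, y)       # close to beta = 2
slope_noisy = ols_slope(x_observed, y)   # close to beta * 1 / (1 + 1) = 1
```

Increasing n does not bring the second slope any closer to β, which is what failure of consistency means here.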

For the estimation of parameters like mean, total, ratio, proportion,

variance, etc., estimation procedures like ratio, product, difference and

regression are available in papers of some authors and in books by Cochran

(1977)[9], Sukhatme et al. (1984)[99] and Misra et al. (2003)[43]. Some authors

developed classes of estimators depending upon optimum values and studied

their properties. Little work has been done in developing classes of estimators

depending on estimated optimum values; hence, using auxiliary

information, some efficient classes or sub-classes of estimators with enhanced

practical utility may be further developed for the estimation of parameters of

interest. Papers by Srivastava and Jhajj (1981)[92] and Singh and Singh

(1984)[78] may be seen regarding optimum values and estimated optimum values.

There are several authors who have suggested estimators using some

known population parameters of auxiliary variable(s) and enhanced the

efficiency of existing estimators. These include Das and Tripathi (1981)[16],

Srivastava and Jhajj (1980, 1983)[93, 95], Wu (1985)[105], and Prasad and Singh

(1990, 1992)[52-53]. Upadhyaya and Singh (1999)[102] have suggested a class of

estimators in simple random sampling. Kadilar and Cingi (2003)[36] and

Shabbir and Gupta (2005)[81] extended these estimators to stratified

random sampling. Singh et al. (2012)[75] have suggested a general family of

ratio-type estimators in systematic sampling. Tailor et al. (2012)[101] have

recently proposed a dual to ratio-cum-product estimator using known parameters

of auxiliary variables. Kadilar and Cingi (2005)[37] and Shabbir and Gupta

(2006)[82] have suggested new ratio estimators in stratified sampling to improve

the efficiency of the estimators. Koyuncu and Kadilar (2008)[40] have proposed

families of estimators for estimating the population mean in stratified random

sampling by considering the estimators proposed in Searls (1964) and

Khoshnevisan et al. (2007)[39]. Singh and Vishwakarma (2008)[79] have

suggested a family of estimators using transformation in stratified random

sampling. Koyuncu and Kadilar (2009)[41] and Chaudhary et al. (2012)[11] have

suggested a family of factor-type estimators of the population mean in stratified

random sampling under non-response using an auxiliary variable.

1.5 Problems Discussed in the Present Research Work

In sampling theory, a strategy deals with the determination of a

combination of selection and estimation procedures so as to make inferences

about the population with the least amount of error and consequently minimize

the loss which might be associated with such error. There is a great variety of

techniques for using auxiliary information in order to obtain improved sampling

designs and efficient estimators for some of the most common population

parameters. Moreover, the coefficient of variation is a relatively stable

quantity, and information about the population coefficient of variation is often

easily available for both the auxiliary and the study variable; this can be

utilized for improving upon the existing estimators without increasing the cost

involved. In the present research work we have proposed and studied improved

estimation procedures under Simple Random Sampling without replacement and

Stratified Random Sampling by using auxiliary information. We have also obtained

optimum estimators/optimum classes of estimators and found that they retained

their properties even after being estimated through sample values.

In this research work we have also proposed improved estimators

when information on the coefficients of variation of the study variable (y) and

the auxiliary variable (x) is available; this increases the efficiency of the

estimator. The increase in efficiency over existing/earlier estimators is shown

theoretically as well as through numerical illustrations. Regression-type

estimators have been explored in great detail as this improves the estimator’s

efficiency significantly. This has been shown through derivation up to the first

order of approximation, and numerical examples where such estimators can be

used have been discussed and shown to achieve enhanced efficiency. In one of

the chapters the problem of estimation of the population variance has been

discussed; an estimator for the same is proposed in the presence of

information about an auxiliary variable, and its properties are studied along

with a numerical illustration.

The present research work, divided into nine chapters, contains the

introduction, a review of the relevant literature and different proposed

estimators for the estimation of the finite population mean and variance under

Simple Random Sampling without replacement and Stratified Random Sampling. The

chapter-wise details of the present thesis are as follows:

Chapter I, “Introduction and Review of Literature”, consists of an

introduction to sampling procedures, their need, uses and advantages. It

also contains a review of the literature, the use of auxiliary variables, and a

summary of the problems discussed in different chapters of the present thesis.

In Chapter II, “An Improvement in the Mean Per Unit Estimator of

Population Mean Utilizing Known Coefficient of Variation”, an improved

estimator of the finite population mean is proposed, assuming the coefficient

of variation of the study variable y to be known, under simple random sampling

without replacement. The proposed estimator increases the efficiency of the

existing mean per unit estimator considered by Sukhatme P. V. (1984) in the

sense of having lesser MSE. The optimum class of estimators is

also obtained. Further, for greater practical utility, the proposed optimum

estimator based on the estimated optimum value of the characterizing scalar has

also been obtained and is shown to retain the same efficiency as the former

class. A numerical illustration is also given to support the theoretical

conclusions.

Chapter III, “On Estimation of Population Mean Using Regression

Approach with Known Coefficient of Variation”, deals with the estimation of

the finite population mean using the regression approach with known coefficients

of variation. The mean square error of the proposed estimator is obtained and

compared with that of the usual regression estimator given in Sukhatme P. V.

(1984). The proposed estimator comes out to be more efficient. The class of

estimators based on the estimated optimum value of the characterizing scalar

retains the efficiency of the proposed estimator. A numerical example is also

given to support the theoretical conclusions.

Chapter IV, “Estimation of Population Mean Using Known Coefficient of Variation”, presents a regression estimation procedure for the finite population mean under Simple Random Sampling without replacement when the coefficient of variation of the study variable y is known. The bias and mean square error of the proposed estimator are obtained and compared with those of the regression estimator of the population mean given in Sukhatme P. V. (1984). The proposed estimator comes out to be more efficient in the sense of having a smaller MSE. The optimum class of estimators is also obtained, and an estimator based on the estimated optimum value of the characterizing scalar, which has greater practical utility, is shown to retain the same efficiency as the optimum estimator. An empirical study is also given to support the derivation.

Chapter V, “An Improvement In Linear Regression Estimator of Finite Population Mean Using Known Coefficient of Variation”, presents a regression estimation procedure for the finite population mean, using a predictive modeling approach under Simple Random Sampling without replacement, when the coefficient of variation of the study variable y is known. The bias and mean square error of the proposed estimator are obtained and compared with those of the regression estimator of the population mean given in Sukhatme P. V. (1984). The proposed estimator comes out to be more efficient in the sense of having a smaller MSE. The optimum class of estimators is also obtained, and an estimator based on the estimated optimum value of the characterizing scalar, which has greater practical utility, is shown to retain the same efficiency as the optimum estimator. Relevant numerical examples are also given to support the theoretical part.

Chapter VI, “An Improved Regression Type Estimator of Population Mean Using Auxiliary Information”, deals with the regression approach to the estimation of the finite population mean, using auxiliary information in the form of variance, under simple random sampling without replacement. The bias and mean square error of the proposed regression type estimator are obtained and compared with those of the usual regression estimator of the population mean. The proposed regression type estimator comes out to be more efficient in the sense of having a smaller mean square error. The optimum class of estimators is also obtained, and an estimator based on the estimated optimum value of the characterizing scalar, which has greater practical utility, is shown to retain the same efficiency as the optimum estimator. Numerical examples are also given to support the theoretical conclusions.



Chapter VII, “An Improved Separate Regression-Type Estimator of Population Mean”, uses auxiliary information in a regression estimation procedure for the finite population mean under Stratified Random Sampling. The bias and mean square error of the proposed separate regression-type estimator are obtained and compared with those of the regression estimator of the population mean under simple random sampling and of the separate ratio-type estimator under stratified random sampling given by Sukhatme P. V. (1984). The proposed separate regression-type estimator comes out to be more efficient in the sense of having a smaller MSE. The optimum class of estimators is also obtained, and an estimator based on the estimated optimum value of the characterizing scalar, which has greater practical utility, is shown to have the same efficiency as the optimum estimator. The theoretical conclusions are also supported by numerical examples.

In Chapter VIII, “A Generalized Class of Separate Regression-Type Estimators for The Estimation of Finite Population Mean”, we propose a generalized class of separate regression-type estimators of the finite population mean using auxiliary information under Stratified Random Sampling. The expressions for the bias and mean square error of the proposed estimator, up to the first order of approximation, are derived and compared with those of the regression estimator of the population mean considered by Sukhatme P. V. (1984). The proposed estimator comes out to be more efficient. The optimum class of estimators is also obtained, and an estimator based on the estimated optimum value of the characterizing scalar, which has greater practical utility, is shown to retain the same efficiency as the optimum estimator. A numerical example is also given to support the theoretical findings.

Chapter IX, “On Estimation of Variance of Mean for the Regression Estimator under Stratified Random Sampling”, deals with the estimation of the variance of the separate regression type estimator of the population mean in stratified random sampling. Its bias and mean square error are obtained, and an optimum class of estimators having minimum mean square error is derived. To enhance the practical utility of the optimum estimator, a class of estimators depending upon the estimated optimum value based on sample observations is also found. Further, a comparative study has been made with some earlier estimators.

Chapter - II

“An Improvement in the Mean per

Unit Estimator of Population Mean

Utilizing Known Coefficient Of

Variation”



AN IMPROVEMENT IN THE MEAN PER UNIT ESTIMATOR

OF POPULATION MEAN UTILIZING KNOWN COEFFICIENT

OF VARIATION

SUMMARY

In this chapter, an improvement over the mean per unit estimator of the population mean is proposed for the case of a known coefficient of variation. Its bias and mean square error are found, and it is compared with the usual mean per unit estimator, with a numerical illustration.

Outline of this chapter is given as follows:

2.1 Introduction

2.2 Bias and Mean Square Error of Proposed Estimator

2.3 Estimator with Estimated Optimum Characterizing Scalar

2.4 Concluding Remarks

2.5 An Illustration



2.1 Introduction

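Let the variable of interest be $y$, taking the value $Y_i$ for the $i$th ($i = 1, 2, \dots, N$) unit of the population of size $N$.

Further let

$$\bar Y = \frac{1}{N}\sum_{i=1}^{N} Y_i, \qquad \mu_r = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \bar Y\right)^r,$$

$$C_y = \frac{\sigma_y}{\bar Y} = \frac{\sqrt{\mu_2}}{\bar Y}, \qquad \sigma_y^2 = \mu_2 = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \bar Y\right)^2,$$

$$\beta_2 = \frac{\mu_4}{\mu_2^2} \quad\text{and}\quad \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}}.$$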

For $y_1, y_2, \dots, y_n$ being the sample observations on $y$ in a simple random sample of size $n$ without replacement, let

$$\bar y = \frac{1}{n}\sum_{i=1}^{n} y_i \quad\text{and}\quad s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(y_i - \bar y\right)^2.$$

Using the known coefficient of variation $C_y$ for the estimation of the population mean $\bar Y$, the proposed estimator is

$$\bar y_k = \bar y\left(\frac{\bar y^2 C_y^2}{s_y^2}\right)^{k} \tag{2.1.1}$$

where k is the characterizing scalar to be chosen suitably.
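As an illustration only, the estimator (2.1.1) can be computed from a sample as in the following minimal sketch. The function name `mean_with_known_cv` and the toy data are hypothetical and are not taken from the thesis; the sample variance uses the divisor n - 1, as in the definition of s_y^2 above.

```python
from statistics import mean, variance


def mean_with_known_cv(sample, cv, k):
    """Estimator (2.1.1): ybar * (ybar^2 * Cy^2 / s_y^2) ** k.

    `cv` is the coefficient of variation Cy, assumed known in advance.
    """
    ybar = mean(sample)
    s2 = variance(sample)  # s_y^2 with divisor n - 1
    return ybar * (ybar ** 2 * cv ** 2 / s2) ** k


# Hypothetical toy sample, used only to show the call.
toy = [12.0, 15.0, 9.0, 14.0, 10.0]
print(mean_with_known_cv(toy, cv=0.2, k=0.0))   # k = 0 gives the sample mean
print(mean_with_known_cv(toy, cv=0.2, k=-0.1))
```

Setting k = 0 recovers the mean per unit estimator, so the scalar k controls how strongly the prior knowledge of C_y adjusts the sample mean.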



2.2 Bias and Mean Square Error of $\bar y_k$

For simplicity, it is assumed that the population size N is large enough

as compared to the sample size n so that finite population correction

(f.p.c.) term may be ignored.

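Let

$$\bar y = \bar Y(1 + e_0), \qquad s_y^2 = \sigma_y^2(1 + e_1)$$

so that

$$E(e_0) = E(e_1) = 0,$$

$$E(e_1^2) = \frac{\beta_2 - 1}{n}, \qquad E(e_0^2) = \frac{\sigma_y^2}{n\bar Y^2}, \qquad E(e_0 e_1) = \frac{\mu_3}{n\bar Y\sigma_y^2}.$$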

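From (2.1.1), we have

$$\bar y_k = \bar Y(1+e_0)\left[\frac{\bar Y^2(1+e_0)^2\,\sigma_y^2/\bar Y^2}{\sigma_y^2(1+e_1)}\right]^{k} = \bar Y(1+e_0)(1+e_0)^{2k}(1+e_1)^{-k}$$

or

$$\bar y_k - \bar Y = \bar Y e_0 + 2k\bar Y e_0 + k(2k+1)\bar Y e_0^2 - k(2k+1)\bar Y e_0 e_1 - k\bar Y e_1 + \frac{k(k+1)}{2}\bar Y e_1^2 + \dots \tag{2.2.1}$$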



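Taking expectation on both sides, we have the bias up to terms of order $O(1/n)$ to be

$$\mathrm{Bias}(\bar y_k) = E(\bar y_k) - \bar Y = \frac{k(2k+1)}{n\bar Y}\left(\sigma_y^2 - \gamma_1\sigma_y\bar Y\right) + \frac{k(k+1)}{2n}\,\bar Y(\beta_2 - 1) \tag{2.2.2}$$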

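Again, squaring both sides of (2.2.1) and taking expectation, we have the mean square error of $\bar y_k$ up to terms of order $O(1/n)$ to be

$$\begin{aligned}
MSE(\bar y_k) = E(\bar y_k - \bar Y)^2 &= E\left(\bar Y e_0 + 2k\bar Y e_0 - k\bar Y e_1\right)^2 \\
&= \frac{\sigma_y^2}{n} + \frac{\bar Y^2}{n}\left[k^2\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right\} + 2k\left\{2C_y^2 - \gamma_1 C_y\right\}\right]
\end{aligned} \tag{2.2.3}$$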

The optimum value of $k$ minimizing the mean square error of $\bar y_k$ in (2.2.3) is given by

$$k_o = -\frac{\left(2C_y^2 - \gamma_1 C_y\right)}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right\}} \tag{2.2.4}$$

and the minimum mean square error of $\bar y_k$ is given by

$$MSE(\bar y_{k_o}) = \frac{\sigma_y^2}{n} - \frac{\bar Y^2\left(2C_y^2 - \gamma_1 C_y\right)^2}{n\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right\}} \tag{2.2.5}$$
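The minimisation that leads from (2.2.3) to (2.2.4) and (2.2.5) can be sketched as follows. The shorthand $A$ and $D$ is introduced here only for compactness and is not part of the original notation.

```latex
% A and D are shorthand assumed for this sketch only.
\[
A = 2C_y^2 - \gamma_1 C_y, \qquad
D = (\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y ,
\]
\[
MSE(\bar y_k) = \frac{\sigma_y^2}{n} + \frac{\bar Y^2}{n}\left(k^2 D + 2kA\right),
\qquad
\frac{\partial\, MSE(\bar y_k)}{\partial k} = \frac{2\bar Y^2}{n}\,(kD + A) = 0
\;\Longrightarrow\; k_o = -\frac{A}{D},
\]
\[
MSE(\bar y_{k_o})
= \frac{\sigma_y^2}{n} + \frac{\bar Y^2}{n}\left(\frac{A^2}{D} - \frac{2A^2}{D}\right)
= \frac{\sigma_y^2}{n} - \frac{\bar Y^2 A^2}{nD}.
\]
```

Since $D = n\,E(2e_0 - e_1)^2 \ge 0$ to the order considered, the stationary point is a minimum whenever $D > 0$.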

2.3 Estimator with Estimated Optimum $\hat k$

For situations in which the values of $\beta_2$ and $\gamma_1$, or good guessed values of them, are not available, the alternative is to replace $\beta_2$ and $\gamma_1$ in the optimum $k_o$ by their estimates $\hat\beta_2$ and $\hat\gamma_1$ based on sample values, and so obtain the estimated optimum value of $k$, denoted by $\hat k$, as

$$\hat k = -\frac{\left(2C_y^2 - \hat\gamma_1 C_y\right)}{\left\{(\hat\beta_2 - 1) + 4C_y^2 - 4\hat\gamma_1 C_y\right\}} \tag{2.3.1}$$

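where

$$\hat\beta_2 = \frac{\hat\mu_4}{\hat\mu_2^2} \quad\text{with}\quad \hat\mu_4 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar y)^4, \qquad \hat\mu_2 = s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar y)^2,$$

$$\text{and}\quad \hat\gamma_1 = \frac{\hat\mu_3}{\hat\mu_2^{3/2}} \quad\text{with}\quad \hat\mu_3 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar y)^3 \quad\text{and}\quad \hat\mu_2^{3/2} = s_y^3.$$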

Thus, replacing $k$ by the estimated optimum $\hat k$ in the estimator $\bar y_k$ in (2.1.1), we get, for wider practical utility, the estimator based on the estimated optimum $\hat k$, given by

$$\bar y_{ke} = \bar y\left(\frac{\bar y^2 C_y^2}{s_y^2}\right)^{\hat k} \tag{2.3.2}$$
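The following sketch shows one way the estimated optimum scalar of (2.3.1) and the estimator (2.3.2) might be computed from a sample. The function names are hypothetical, and the moment estimators follow the definitions given above (divisor n for the third and fourth central moments, divisor n - 1 for s_y^2). No guard is included for a zero denominator, which would need handling in practice.

```python
from statistics import mean, variance


def estimated_k(sample, cv):
    """Estimated optimum scalar k-hat of (2.3.1), for known Cy = cv."""
    n = len(sample)
    ybar = mean(sample)
    s2 = variance(sample)                           # s_y^2, divisor n - 1
    m3 = sum((y - ybar) ** 3 for y in sample) / n   # mu_3-hat
    m4 = sum((y - ybar) ** 4 for y in sample) / n   # mu_4-hat
    g1 = m3 / s2 ** 1.5                             # gamma_1-hat
    b2 = m4 / s2 ** 2                               # beta_2-hat
    numerator = 2 * cv ** 2 - g1 * cv
    denominator = (b2 - 1) + 4 * cv ** 2 - 4 * g1 * cv
    return -numerator / denominator


def mean_with_estimated_k(sample, cv):
    """Estimator (2.3.2): ybar * (ybar^2 * Cy^2 / s_y^2) ** k-hat."""
    ybar = mean(sample)
    s2 = variance(sample)
    return ybar * (ybar ** 2 * cv ** 2 / s2) ** estimated_k(sample, cv)


# Hypothetical symmetric toy sample: gamma_1-hat is zero here.
toy = [1.0, 2.0, 3.0, 4.0, 5.0]
print(estimated_k(toy, cv=0.5), mean_with_estimated_k(toy, cv=0.5))
```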

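To find the bias and mean square error of $\bar y_{ke}$, let

$$\hat\mu_3 = \mu_3(1 + e_2), \qquad \hat\mu_4 = \mu_4(1 + e_3)$$

along with $\bar y = \bar Y(1+e_0)$ and $s_y^2 = \sigma_y^2(1+e_1)$, so that

$$\begin{aligned}
\hat k &= -\frac{2C_y^2 - \dfrac{\mu_3(1+e_2)}{\sigma_y^3(1+e_1)^{3/2}}\,C_y}{\dfrac{\mu_4(1+e_3)}{\sigma_y^4(1+e_1)^{2}} - 1 + 4C_y^2 - 4\dfrac{\mu_3(1+e_2)}{\sigma_y^3(1+e_1)^{3/2}}\,C_y} \\[6pt]
&= -\frac{2C_y^2 - \gamma_1\left(1 + e_2 - \tfrac{3}{2}e_1 - \tfrac{3}{2}e_1e_2 + \tfrac{15}{8}e_1^2 - \dots\right)C_y}{\beta_2\left(1 + e_3 - 2e_1 - 2e_1e_3 + 3e_1^2 - \dots\right) - 1 + 4C_y^2 - 4\gamma_1\left(1 + e_2 - \tfrac{3}{2}e_1 - \tfrac{3}{2}e_1e_2 + \tfrac{15}{8}e_1^2 - \dots\right)C_y} \\[6pt]
&= -\frac{\left(2C_y^2 - \gamma_1 C_y\right)}{\left[(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right]}\left[1 - \frac{\gamma_1 C_y\left(e_2 - \tfrac{3}{2}e_1 - \tfrac{3}{2}e_1e_2 + \tfrac{15}{8}e_1^2 - \dots\right)}{\left(2C_y^2 - \gamma_1 C_y\right)}\right] \\
&\qquad\times\left[1 + \frac{\beta_2\left(e_3 - 2e_1 - 2e_1e_3 + 3e_1^2 - \dots\right) - 4\gamma_1 C_y\left(e_2 - \tfrac{3}{2}e_1 - \tfrac{3}{2}e_1e_2 + \tfrac{15}{8}e_1^2 - \dots\right)}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right\}}\right]^{-1} \\[6pt]
&= -\frac{\left(2C_y^2 - \gamma_1 C_y\right)}{\left[(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right]}\Bigg[1 - \frac{\beta_2\left(e_3 - 2e_1 - 2e_1e_3 + 3e_1^2 - \dots\right) - 4\gamma_1 C_y\left(e_2 - \tfrac{3}{2}e_1 - \tfrac{3}{2}e_1e_2 + \tfrac{15}{8}e_1^2 - \dots\right)}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right\}} \\
&\qquad\qquad - \frac{\gamma_1 C_y\left(e_2 - \tfrac{3}{2}e_1 - \tfrac{3}{2}e_1e_2 + \tfrac{15}{8}e_1^2 - \dots\right)}{\left(2C_y^2 - \gamma_1 C_y\right)} + \dots\Bigg]
\end{aligned} \tag{2.3.3}$$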

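Substituting $\bar y = \bar Y(1+e_0)$, $s_y^2 = \sigma_y^2(1+e_1)$ and $\hat k$ from (2.3.3) in (2.3.2), we have

$$\bar y_{ke} - \bar Y = \bar Y e_0 - \bar Y\,\frac{\left(2C_y^2 - \gamma_1 C_y\right)}{\left[(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right]}\left(2e_0 - e_1 + e_0^2 - e_0e_1 + \frac{e_1^2}{2} + \dots\right) \tag{2.3.4}$$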

Taking expectation of (2.3.4) and ignoring terms in the $e$'s of degree greater than two, we can easily check that the bias of $\bar y_{ke}$ is of order $O(1/n)$; hence $\mathrm{Bias}(\bar y_{ke})$ is negligible for sufficiently large $n$, that is, $\bar y_{ke}$ is an approximately unbiased estimator of the population mean $\bar Y$. Further, squaring and taking expectation up to terms of order $O(1/n)$,

$$\begin{aligned}
MSE(\bar y_{ke}) &= E\left[\bar Y e_0 - \bar Y\,\frac{\left(2C_y^2 - \gamma_1 C_y\right)}{\left[(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right]}\,(2e_0 - e_1)\right]^2 \\
&= \frac{\sigma_y^2}{n} - \frac{\bar Y^2\left(2C_y^2 - \gamma_1 C_y\right)^2}{n\left[(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right]}
\end{aligned} \tag{2.3.5}$$



which is the same as the mean square error for the optimum $k_o$; that is, the estimator $\bar y_{ke}$ based on the estimated optimum $\hat k$ attains the same mean square error as the estimator $\bar y_{k_o}$ based on the optimum $k_o$.


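2.4 Concluding Remarks

a) For the optimum value $k_o$ of $k$, it is clear from (2.2.5) that the estimator $\bar y_{k_o}$ attains the minimum mean square error

$$MSE(\bar y_{k_o}) = \frac{\sigma_y^2}{n} - \frac{\bar Y^2\left(2C_y^2 - \gamma_1 C_y\right)^2}{n\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right\}} \tag{2.4.1}$$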

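b) The estimator $\bar y_{k_o}$ with optimum value $k_o$ and the estimator $\bar y_{ke}$ based on the estimated optimum $\hat k$ have the same mean square error, given by

$$\begin{aligned}
MSE(\bar y_{ke}) = MSE(\bar y_{k_o}) &= \frac{\sigma_y^2}{n} - \frac{\bar Y^2\left(2C_y^2 - \gamma_1 C_y\right)^2}{n\left[(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right]} \\
&= MSE(\bar y) - \frac{\bar Y^2\left(2C_y^2 - \gamma_1 C_y\right)^2}{n\left[(\beta_2 - 1) + 4C_y^2 - 4\gamma_1 C_y\right]}
\end{aligned} \tag{2.4.2}$$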

which shows that the estimators $\bar y_{ke}$ and $\bar y_{k_o}$, based on the estimated optimum and the optimum value respectively, are more efficient than the mean per unit estimator $\bar y$ in the sense of having lesser mean square error.

c) For a normal parent population (that is, for $\gamma_1 = 0$ and $\beta_2 = 3$), the optimum value $k_o$ from (2.2.4) reduces to

$$k_o = -\frac{C_y^2}{2C_y^2 + 1}$$

for which $MSE(\bar y_{k_o})$ becomes

$$MSE(\bar y_{k_o}) = \frac{\sigma_y^2}{n} - \frac{2\bar Y^2 C_y^4}{n\left(2C_y^2 + 1\right)} \tag{2.4.3}$$

showing that the proposed estimator $\bar y_k$ is more efficient than $\bar y$ in a normal parent population also.

2.5 An Illustration

Considering the data given in Cochran (1977, page 34) dealing with the weekly expenditure of family on food (y) group, computation of the required values has been done and we have the following:

$n = 33$, $\bar Y = 27.49$, $\sigma_y^2 = 99.613033$, $C_y = 0.36306$, $\gamma_1 = 1.4651$, $\beta_2 = 5.7146$

Using the required values, we have

$$MSE(\bar y) = 3.018576 \tag{2.5.1}$$

$$MSE(\bar y_{ke}) = MSE(\bar y_{k_o}) = 2.4892803 \tag{2.5.2}$$

From the above, the percent relative efficiency (PRE) of the

proposed estimator over the usual mean per unit estimator is 121%.
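The figures above can be checked directly from the formulas of Section 2.2. The short script below is a sketch that plugs the quoted population values into (2.2.4) and (2.2.5); the variable names are chosen here for illustration. Under these inputs it closely reproduces the values in (2.5.1) and (2.5.2) and the stated PRE of about 121%.

```python
# Population values quoted in Section 2.5 (Cochran 1977, page 34).
n = 33
Y_bar = 27.49
var_y = 99.613033      # sigma_y^2
C_y = 0.36306
gamma_1 = 1.4651
beta_2 = 5.7146

# Numerator and denominator terms of the optimum scalar in (2.2.4).
A = 2 * C_y**2 - gamma_1 * C_y
D = (beta_2 - 1) + 4 * C_y**2 - 4 * gamma_1 * C_y

mse_mean = var_y / n                              # MSE of the mean per unit estimator
k_opt = -A / D                                    # optimum k, equation (2.2.4)
mse_opt = mse_mean - Y_bar**2 * A**2 / (n * D)    # minimum MSE, equation (2.2.5)
pre = 100 * mse_mean / mse_opt                    # percent relative efficiency

print(f"k_o = {k_opt:.4f}")
print(f"MSE(mean) = {mse_mean:.4f}, MSE(proposed) = {mse_opt:.4f}, PRE = {pre:.0f}%")
```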

Chapter – III

“On Estimation of Population Mean

Using Regression Approach with Known

Coefficient of Variation”



ON ESTIMATION OF POPULATION MEAN USING

REGRESSION APPROACH WITH KNOWN COEFFICIENT OF

VARIATION

SUMMARY

In this chapter regression estimation approach has been used for the estimation

of the population mean using known coefficient of variation. The bias and

mean square error of the proposed estimator are found. A comparative study

with the usual regression estimator of the population mean has been made to

show the enhanced efficiency of the proposed estimator along with a numerical

illustration.


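Outline of this chapter is given as follows:

3.1 Introduction

3.2 Bias and Mean Square Error of Proposed Estimator

3.3 Estimator with Estimated Optimum Characterizing Scalar

3.4 Concluding Remarks

3.5 An Illustration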



3.1 Introduction

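Let the variable of interest be $Y$ and the auxiliary variable be $X$, which take the values $Y_i$ and $X_i$ for the $i$th ($i = 1, 2, \dots, N$) unit of the population of size $N$.

Further let

$$\bar Y = \frac{1}{N}\sum_{i=1}^{N} Y_i, \qquad \bar X = \frac{1}{N}\sum_{i=1}^{N} X_i, \qquad \mu_{rs} = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar X\right)^r\left(Y_i - \bar Y\right)^s,$$

$$C_y = \frac{\sigma_y}{\bar Y}, \qquad C_x = \frac{\sigma_x}{\bar X}, \qquad \beta_2 = \frac{\mu_{04}}{\mu_{02}^2}, \qquad \gamma_1 = \frac{\mu_{03}}{\mu_{02}^{3/2}},$$

$$\sigma_y^2 = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \bar Y\right)^2, \qquad \sigma_x^2 = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar X\right)^2,$$

$$\sigma_{xy} = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar X\right)\left(Y_i - \bar Y\right), \qquad \rho = \frac{\sigma_{xy}}{\sigma_x\sigma_y}, \qquad B = \frac{\sigma_{xy}}{\sigma_x^2}.$$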

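Also, let

$$\bar y = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad \bar x = \frac{1}{n}\sum_{i=1}^{n} x_i,$$

$$s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(y_i - \bar y\right)^2, \qquad s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar x\right)^2,$$

$$s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar x\right)\left(y_i - \bar y\right), \qquad b = \frac{s_{xy}}{s_x^2},$$

where $s_x^2$, $s_y^2$ and $s_{xy}$ are unbiased or consistent estimators of $\sigma_x^2$, $\sigma_y^2$ and $\sigma_{xy}$, and where $y_1, y_2, \dots, y_n$ are the observations on $y$ and $x_1, x_2, \dots, x_n$ are the observations on the auxiliary variable $x$ for a simple random sample of size $n$.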

For estimating the population mean using the regression method of estimation, the proposed estimator is

$$\bar y_{lrh} = \left[\bar y + b\left(\bar X - \bar x\right)\right] + h\left(\frac{\bar y^3 C_y^2}{s_y^2} - \bar y\right) \tag{3.1.1}$$

where h is the characterizing scalar chosen suitably.
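A minimal computational sketch of the estimator (3.1.1) is given below. The function name `regression_mean_known_cv` and the toy data are hypothetical; the population mean of x (`X_bar`) and the coefficient of variation of y (`cv_y`) are assumed known, as the estimator requires.

```python
from statistics import mean, variance


def regression_mean_known_cv(y, x, X_bar, cv_y, h):
    """Estimator (3.1.1): [ybar + b(X_bar - xbar)] + h(ybar^3 Cy^2 / s_y^2 - ybar)."""
    n = len(y)
    ybar, xbar = mean(y), mean(x)
    s2_y, s2_x = variance(y), variance(x)          # divisor n - 1
    s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
    b = s_xy / s2_x                                # sample regression coefficient
    usual = ybar + b * (X_bar - xbar)              # usual regression estimator
    return usual + h * (ybar ** 3 * cv_y ** 2 / s2_y - ybar)


# Hypothetical toy data, used only to show the call.
y_toy = [2.0, 4.0, 6.0, 8.0]
x_toy = [1.0, 2.0, 3.0, 4.0]
print(regression_mean_known_cv(y_toy, x_toy, X_bar=3.0, cv_y=0.5, h=0.0))
print(regression_mean_known_cv(y_toy, x_toy, X_bar=3.0, cv_y=0.5, h=-0.1))
```

With h = 0 the function returns the usual linear regression estimator, which makes the role of the additional term easy to inspect.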

3.2 Bias and Mean Square Error of $\bar y_{lrh}$

For simplicity, it is assumed that the population size N is large enough

as compared to the sample size n so that finite population correction

term may be ignored.

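Let

$$\bar y = \bar Y(1+e_0), \quad \bar x = \bar X(1+e_1), \quad s_y^2 = \sigma_y^2(1+e_2), \quad s_x^2 = \sigma_x^2(1+e_3), \quad s_{xy} = \sigma_{xy}(1+e_4)$$

so that

$$E(e_0) = E(e_1) = E(e_2) = E(e_3) = E(e_4) = 0$$

and

$$E(e_0^2) = \frac{\sigma_y^2}{n\bar Y^2} = \frac{C_y^2}{n}, \qquad E(e_1^2) = \frac{\sigma_x^2}{n\bar X^2} = \frac{C_x^2}{n}, \qquad E(e_2^2) = \frac{\beta_2 - 1}{n},$$

$$E(e_0e_1) = \frac{\sigma_{xy}}{n\bar X\bar Y} = \frac{\rho C_xC_y}{n}, \qquad E(e_0e_2) = \frac{\mu_{03}}{n\bar Y\sigma_y^2}, \qquad E(e_1e_2) = \frac{\mu_{12}}{n\bar X\sigma_y^2},$$

$$E(e_1e_3) = \frac{\mu_{30}}{n\bar X\sigma_x^2}, \qquad E(e_1e_4) = \frac{\mu_{21}}{n\bar X\sigma_{xy}}.$$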

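Also, we have

$$\begin{aligned}
b = \frac{s_{xy}}{s_x^2} = \frac{\sigma_{xy}(1+e_4)}{\sigma_x^2(1+e_3)} &= B(1+e_4)(1+e_3)^{-1} \\
&= B(1+e_4)\left(1 - e_3 + e_3^2 - \dots\right) \\
&= B\left(1 - e_3 + e_4 + e_3^2 - e_3e_4 + \dots\right)
\end{aligned}$$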



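From (3.1.1), writing $\bar y_{lrh}$ in terms of the $e_i$'s ($i = 0, 1, 2, 3, 4$), we see that

$$\begin{aligned}
\bar y_{lrh} &= \left[\bar Y(1+e_0) + B\left(1 - e_3 + e_4 + e_3^2 - e_3e_4 + \dots\right)\left\{\bar X - \bar X(1+e_1)\right\}\right] + h\left[\frac{\bar Y^3(1+e_0)^3\,\sigma_y^2/\bar Y^2}{\sigma_y^2(1+e_2)} - \bar Y(1+e_0)\right] \\
&= \bar Y(1+e_0) + B\left(1 - e_3 + e_4 + e_3^2 - e_3e_4 + \dots\right)\left(-\bar Xe_1\right) + h\bar Y\left\{(1+e_0)^3(1+e_2)^{-1} - (1+e_0)\right\}
\end{aligned}$$

$$\bar y_{lrh} - \bar Y = \left[\bar Ye_0 - B\bar Xe_1 - B\bar Xe_1e_4 + B\bar Xe_1e_3 + \dots\right] + h\bar Y\left\{2e_0 - e_2 + e_2^2 + 3e_0^2 - 3e_0e_2 + \dots\right\} \tag{3.2.1}$$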

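Taking expectation on both sides, we have the bias up to terms of order $O(1/n)$ as follows:

$$\begin{aligned}
\mathrm{Bias}(\bar y_{lrh}) &= E(\bar y_{lrh}) - \bar Y \\
&= \bar YE(e_0) - B\bar XE(e_1) - B\bar XE(e_1e_4) + B\bar XE(e_1e_3) + h\bar Y\left\{2E(e_0) - E(e_2) + E(e_2^2) + 3E(e_0^2) - 3E(e_0e_2)\right\} \\
&= \frac{1}{n}\left[\frac{\rho\sigma_y\mu_{30}}{\sigma_x^3} - \frac{\mu_{21}}{\sigma_x^2}\right] + \frac{h\bar Y}{n}\left\{(\beta_2 - 1) + 3C_y^2 - 3\gamma_1C_y\right\}
\end{aligned} \tag{3.2.2}$$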

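Further, squaring both sides of (3.2.1) and taking expectation, we have the mean square error of the estimator, up to terms of order $O(1/n)$, as follows:

$$\begin{aligned}
MSE(\bar y_{lrh}) &= E\left[\left\{\bar Ye_0 - B\bar Xe_1\right\} + h\bar Y\left(2e_0 - e_2\right)\right]^2 \\
&= \bar Y^2E(e_0^2) + B^2\bar X^2E(e_1^2) - 2B\bar X\bar YE(e_0e_1) + h^2\bar Y^2\left\{4E(e_0^2) + E(e_2^2) - 4E(e_0e_2)\right\} \\
&\quad + 2h\bar Y\left\{2\bar YE(e_0^2) - 2B\bar XE(e_0e_1) - \bar YE(e_0e_2) + B\bar XE(e_1e_2)\right\} \\
&= \frac{\sigma_y^2}{n} + \frac{\rho^2\sigma_y^2}{n} - \frac{2\rho^2\sigma_y^2}{n} + \frac{h^2\bar Y^2}{n}\left[(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right] + \frac{2h\bar Y^2}{n}\left\{2C_y^2 - \gamma_1C_y - 2\rho^2C_y^2 + \frac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\right\} \\
&= \left(1 - \rho^2\right)\frac{\sigma_y^2}{n} + \frac{h^2\bar Y^2}{n}\left[(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right] + \frac{2h\bar Y^2}{n}\left\{2C_y^2 - \gamma_1C_y - 2\rho^2C_y^2 + \frac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\right\}
\end{aligned} \tag{3.2.3}$$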

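The optimum value of $h$ minimizing the mean square error of $\bar y_{lrh}$ in (3.2.3) is given by

$$h = -\frac{\left\{2C_y^2 - \gamma_1C_y - 2\rho^2C_y^2 + \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\right\}}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}} \tag{3.2.4}$$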

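and the minimum mean square error is given by

$$MSE(\bar y_{lrh}) = \left(1 - \rho^2\right)\frac{\sigma_y^2}{n} - \frac{\bar Y^2\left\{2C_y^2 - \gamma_1C_y - 2\rho^2C_y^2 + \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\right\}^2}{n\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}} \tag{3.2.5}$$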

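3.3 Estimator based on estimated optimum $\hat c$

If exact values or good guesses of $\beta_2$, $\gamma_1$, $\rho$ and $\mu_{12}$ are not available, we can replace these quantities by their consistent sample estimates $\hat\beta_2$, $\hat\gamma_1$, $\hat\rho$, $\hat\mu_{12}$ respectively, and take $\hat{\bar Y} = \bar y$, in (3.2.4) and get the estimated optimum value of $h$, denoted by $\hat c$, as

$$\begin{aligned}
\hat c &= -\frac{\left\{2C_y^2 - \hat\gamma_1C_y - 2\hat\rho^2C_y^2 + \dfrac{\hat\rho\,\hat\mu_{12}}{\bar y\,s_xs_y}\right\}}{\left\{(\hat\beta_2 - 1) + 4C_y^2 - 4\hat\gamma_1C_y\right\}} \\[6pt]
&= -\frac{\left\{2C_y^2 - \dfrac{\hat\mu_{03}}{s_y^3}C_y - 2\hat\rho^2C_y^2 + \dfrac{\hat\rho\,\hat\mu_{12}}{\bar y\,s_xs_y}\right\}}{\left\{\dfrac{\hat\mu_{04}}{s_y^4} - 1 + 4C_y^2 - 4\dfrac{\hat\mu_{03}}{s_y^3}C_y\right\}}
\end{aligned} \tag{3.3.1}$$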



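where

$$\hat\beta_2 = \frac{\hat\mu_{04}}{s_y^4} \quad\text{with}\quad \hat\mu_{04} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar y\right)^4,$$

$$\hat\gamma_1 = \frac{\hat\mu_{03}}{s_y^3} \quad\text{with}\quad \hat\mu_{03} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar y\right)^3,$$

$$\hat\rho = \frac{s_{xy}}{s_xs_y}, \qquad \hat\mu_{12} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar x\right)\left(y_i - \bar y\right)^2.$$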

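Thus, incorporating $\hat c$ in place of $h$ in (3.1.1), we get the estimator based on the estimated optimum $\hat c$ as

$$\bar y_{lrc} = \left[\bar y + b\left(\bar X - \bar x\right)\right] + \hat c\left(\frac{\bar y^3C_y^2}{s_y^2} - \bar y\right) \tag{3.3.2}$$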

Let

$$\hat\mu_{03} = \mu_{03}(1+e_5), \qquad \hat\mu_{04} = \mu_{04}(1+e_6), \qquad \hat\mu_{12} = \mu_{12}(1+e_7).$$

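Also, we have

$$\begin{aligned}
\hat c &= -\frac{2C_y^2 - \dfrac{\mu_{03}(1+e_5)}{\sigma_y^3(1+e_2)^{3/2}}C_y - \dfrac{2\sigma_{xy}^2(1+e_4)^2}{\sigma_x^2\sigma_y^2(1+e_3)(1+e_2)}C_y^2 + \dfrac{\sigma_{xy}(1+e_4)\,\mu_{12}(1+e_7)}{\sigma_x^2\sigma_y^2(1+e_3)(1+e_2)\,\bar Y(1+e_0)}}{\dfrac{\mu_{04}(1+e_6)}{\sigma_y^4(1+e_2)^2} - 1 + 4C_y^2 - 4\dfrac{\mu_{03}(1+e_5)}{\sigma_y^3(1+e_2)^{3/2}}C_y} \\[6pt]
&= -\frac{2C_y^2 - \gamma_1\left(1 + e_5 - \tfrac{3}{2}e_2 + \dots\right)C_y - 2\rho^2\left(1 + 2e_4 - e_3 - e_2 + \dots\right)C_y^2 + \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\left(1 - e_0 - e_2 - e_3 + e_4 + e_7 + \dots\right)}{\beta_2\left(1 + e_6 - 2e_2 + \dots\right) - 1 + 4C_y^2 - 4\gamma_1\left(1 + e_5 - \tfrac{3}{2}e_2 + \dots\right)C_y}
\end{aligned} \tag{3.3.3}$$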

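Writing $\bar y_{lrc}$ in (3.3.2) in terms of the $e_i$'s ($i = 0, 1, 2, \dots, 7$), and after some simplification, we have

$$\bar y_{lrc} - \bar Y = \bar Ye_0 - B\bar Xe_1 - \bar Y\,\frac{\left\{2C_y^2 - \gamma_1C_y - 2\rho^2C_y^2 + \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\right\}}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}}\left(2e_0 - e_2\right) + \dots \tag{3.3.4}$$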

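Squaring both sides of (3.3.4), ignoring terms in the $e_i$'s of degree greater than two and taking expectation, we have the mean square error of $\bar y_{lrc}$ to the first degree of approximation, that is, up to terms of order $O(1/n)$, to be

$$\begin{aligned}
MSE(\bar y_{lrc}) &= E\left[\bar Ye_0 - B\bar Xe_1 - \bar Y\,\frac{\left\{2C_y^2 - \gamma_1C_y - 2\rho^2C_y^2 + \frac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\right\}}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}}\left(2e_0 - e_2\right)\right]^2 \\[6pt]
&= \bar Y^2E(e_0^2) + B^2\bar X^2E(e_1^2) - 2B\bar X\bar YE(e_0e_1) \\
&\quad + \bar Y^2\,\frac{\left\{2C_y^2 - \gamma_1C_y - 2\rho^2C_y^2 + \frac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\right\}^2}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}^2}\left[4E(e_0^2) + E(e_2^2) - 4E(e_0e_2)\right] \\
&\quad - 2\bar Y\,\frac{\left\{2C_y^2 - \gamma_1C_y - 2\rho^2C_y^2 + \frac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\right\}}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}}\left[2\bar YE(e_0^2) - 2B\bar XE(e_0e_1) - \bar YE(e_0e_2) + B\bar XE(e_1e_2)\right] \\[6pt]
&= \left(1 - \rho^2\right)\frac{\sigma_y^2}{n} - \frac{\bar Y^2\left\{2C_y^2 - \gamma_1C_y - 2\rho^2C_y^2 + \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\right\}^2}{n\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}}
\end{aligned} \tag{3.3.5}$$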

which shows that the estimator $\bar y_{lrc}$ in (3.3.2), based on the estimated optimum $\hat c$, attains the same minimum mean square error as $\bar y_{lrh}$ in (3.2.5), which depends on the optimum value of $h$ in (3.2.4).


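3.4 Concluding Remarks

a). From (3.2.5), for the optimum value of $h$, the estimator $\bar y_{lrh}$ attains the minimum mean square error given by

$$MSE(\bar y_{lrh}) = \left(1 - \rho^2\right)\frac{\sigma_y^2}{n} - \frac{\bar Y^2\left\{2C_y^2 - \gamma_1C_y - 2\rho^2C_y^2 + \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\right\}^2}{n\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}} \tag{3.4.1}$$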

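b). From (3.3.5), the estimator $\bar y_{lrc}$ depending upon the estimated optimum $\hat c$ has the mean square error

$$MSE(\bar y_{lrc}) = \left(1 - \rho^2\right)\frac{\sigma_y^2}{n} - \frac{\bar Y^2\left\{2C_y^2 - \gamma_1C_y - 2\rho^2C_y^2 + \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\right\}^2}{n\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}} \tag{3.4.2}$$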

c). From (3.4.2), we see that the estimator $\bar y_{lrc}$ depending on the estimated optimum value is always more efficient than the usual linear regression estimator $\bar y_{lr} = \bar y + b\left(\bar X - \bar x\right)$ in the sense of having lesser mean square error.



3.5 An Illustration

Considering the data given in Cochran (1977, page 181) dealing with paralytic polio cases in the ‘Placebo’ (y) group, computation of the required values has been done and we have the following:

$\bar Y = 2.58$, $\bar X = 4.92$, $\sigma_y^2 = 9.8894$, $\sigma_x^2 = 24.644$, $\beta_2 = 4.3145181$, $\mu_{03} = 47.015235$, $C_y = 1.218892$, $\rho = 0.7125$, $\gamma_1 = 1.5117629$, $\mu_{04} = 421.96088$, $\mu_{12} = 55.0176$, $n = 34$

Using the required values, we have

$$MSE(\bar y_{lr}) = 0.14127714 \tag{3.5.1}$$

$$MSE(\bar y_{lrc}) = MSE(\bar y_{lrh}) = 0.11478414 \tag{3.5.2}$$

From (3.5.1) and (3.5.2), the percent relative efficiency (PRE) of the proposed estimator over the usual regression estimator is 123%.

Chapter – IV

“Estimation of Population Mean using

known Coefficient of Variation”


Ph.D. Thesis / Statistics / 2013 / Archana Shukla

ESTIMATION OF POPULATION MEAN USING KNOWN

COEFFICIENT OF VARIATION

SUMMARY

In this chapter, a regression estimation procedure based on known

coefficient of variation is proposed for the estimation of the population

mean. The bias and mean square error of the proposed estimator are

found. A comparative study with the usual regression estimator of the

population mean has been made.


Outline of this chapter is given as follows:

4.1 Introduction

4.2 Bias and Mean Square Error of Proposed Estimator

4.3 Estimator with Estimated Optimum Characterizing Scalar

4.4 Concluding Remarks

4.5 An Illustration



4.1 Introduction

A regression type estimator using known coefficient of variation is

considered and its properties are studied. There are several instances in

physical, biological and agricultural sciences where the mean is proportional to

standard deviation and consequently the coefficient of variation is known

although the mean and standard deviation may not be known. Some such

situations may be seen in Snedecor (1946), Hald (1952), Davies and

Goldsmith (1976) and Gleser and Healy (1976). The well known Weber’s law

of Psychophysics (see Guilford (1975), chapter 2) provides instances where

coefficient of variation is known and one such example is given in Singh

(1998) also.

Sometimes, simple a priori information in the form of the coefficient of variation is available to experimenters in the fields of biology, agriculture, psychometrics, etc. Through long association with the experimental material, experimenters may have at their disposal quite accurate information concerning the coefficient of variation. This information is frequently used to plan experiments and to estimate sample size, average, total, etc. (see also Searles (1964)). Further supporting explanation regarding stable and consistent information about the coefficient of variation may be seen in Cochran (1977, 3rd edition), pages 77 and 79 of chapter 4. A good description of the knowledge of the coefficient of variation is also given in Sukhatme et al. (1984), page 42.

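Let the variable of interest be $y$ and the auxiliary variable be $x$, taking the values $Y_i$ and $X_i$ respectively for the $i$th ($i = 1, 2, \dots, N$) unit of the population of size $N$.

Further, let

$$\bar Y = \frac{1}{N}\sum_{i=1}^{N} Y_i, \qquad \bar X = \frac{1}{N}\sum_{i=1}^{N} X_i, \qquad \mu_{rs} = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar X\right)^r\left(Y_i - \bar Y\right)^s,$$

$$C_y = \frac{\sigma_y}{\bar Y}, \qquad C_x = \frac{\sigma_x}{\bar X}, \qquad \beta_2 = \frac{\mu_{04}}{\mu_{02}^2}, \qquad \gamma_1 = \frac{\mu_{03}}{\mu_{02}^{3/2}},$$

$$\sigma_y^2 = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \bar Y\right)^2, \qquad \sigma_x^2 = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar X\right)^2,$$

$$\sigma_{xy} = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar X\right)\left(Y_i - \bar Y\right), \qquad \rho = \frac{\sigma_{xy}}{\sigma_x\sigma_y}, \qquad B = \frac{\sigma_{xy}}{\sigma_x^2}.$$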

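Also, let

$$\bar y = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad \bar x = \frac{1}{n}\sum_{i=1}^{n} x_i,$$

$$s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(y_i - \bar y\right)^2, \qquad s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar x\right)^2,$$

$$s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar x\right)\left(y_i - \bar y\right), \qquad b = \frac{s_{xy}}{s_x^2},$$

where $y_1, y_2, \dots, y_n$ are the observations on $y$ and $x_1, x_2, \dots, x_n$ are the observations on the auxiliary variable $x$ for a simple random sample of size $n$.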

For estimating the population mean using the regression method of estimation, the proposed estimator is

$$\bar y_{lr\omega} = \left[\bar y + b\left(\bar X - \bar x\right)\right] + \omega\left(\frac{s_y^2}{C_y^2} - \bar y^2\right) \tag{4.1.1}$$

where ω is the characterizing scalar chosen suitably.
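For illustration, the estimator (4.1.1) could be evaluated as in the sketch below. The function name `regression_mean_omega` and the toy data are hypothetical, and the population mean of x and the coefficient of variation of y are taken as known inputs.

```python
from statistics import mean, variance


def regression_mean_omega(y, x, X_bar, cv_y, omega):
    """Estimator (4.1.1): [ybar + b(X_bar - xbar)] + omega * (s_y^2 / Cy^2 - ybar^2)."""
    n = len(y)
    ybar, xbar = mean(y), mean(x)
    s2_y, s2_x = variance(y), variance(x)          # divisor n - 1
    s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
    b = s_xy / s2_x                                # sample regression coefficient
    usual = ybar + b * (X_bar - xbar)              # usual regression estimator
    return usual + omega * (s2_y / cv_y ** 2 - ybar ** 2)


# Hypothetical toy data, used only to show the call.
y_demo = [2.0, 4.0, 6.0, 8.0]
x_demo = [1.0, 2.0, 3.0, 4.0]
print(regression_mean_omega(y_demo, x_demo, X_bar=3.0, cv_y=0.5, omega=0.0))
print(regression_mean_omega(y_demo, x_demo, X_bar=3.0, cv_y=0.5, omega=0.01))
```

The added term compares two estimates of the squared mean, one implied by the known C_y and the sample variance and one from the sample mean, so it vanishes on average when the two agree.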



4.2 Bias and Mean Square Error of $\bar y_{lr\omega}$

For simplicity, it is assumed that the population size N is large

enough as compared to the sample size n so that finite population

correction term may be ignored.

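Let

$$\bar y = \bar Y(1+e_0), \quad \bar x = \bar X(1+e_1), \quad s_y^2 = \sigma_y^2(1+e_2), \quad s_x^2 = \sigma_x^2(1+e_3), \quad s_{xy} = \sigma_{xy}(1+e_4)$$

so that

$$E(e_0) = E(e_1) = E(e_2) = E(e_3) = E(e_4) = 0$$

and

$$E(e_0^2) = \frac{\sigma_y^2}{n\bar Y^2} = \frac{C_y^2}{n}, \qquad E(e_1^2) = \frac{\sigma_x^2}{n\bar X^2} = \frac{C_x^2}{n}, \qquad E(e_2^2) = \frac{\beta_2 - 1}{n},$$

$$E(e_0e_1) = \frac{\sigma_{xy}}{n\bar X\bar Y} = \frac{\rho C_xC_y}{n}, \qquad E(e_0e_2) = \frac{\mu_{03}}{n\bar Y\sigma_y^2}, \qquad E(e_1e_2) = \frac{\mu_{12}}{n\bar X\sigma_y^2},$$

$$E(e_1e_3) = \frac{\mu_{30}}{n\bar X\sigma_x^2}, \qquad E(e_1e_4) = \frac{\mu_{21}}{n\bar X\sigma_{xy}}.$$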

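Also, we have

$$\begin{aligned}
b = \frac{s_{xy}}{s_x^2} = \frac{\sigma_{xy}(1+e_4)}{\sigma_x^2(1+e_3)} &= B(1+e_4)(1+e_3)^{-1} \\
&= B(1+e_4)\left(1 - e_3 + e_3^2 - \dots\right) \\
&= B\left(1 - e_3 + e_4 + e_3^2 - e_3e_4 + \dots\right)
\end{aligned}$$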

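From (4.1.1), writing $\bar y_{lr\omega}$ in terms of the $e_i$'s ($i = 0, 1, 2, 3, 4$), we see that

$$\begin{aligned}
\bar y_{lr\omega} &= \left[\bar Y(1+e_0) + B\left(1 - e_3 + e_4 + e_3^2 - e_3e_4 + \dots\right)\left\{\bar X - \bar X(1+e_1)\right\}\right] + \omega\left[\frac{\sigma_y^2(1+e_2)}{\sigma_y^2/\bar Y^2} - \bar Y^2(1+e_0)^2\right] \\
&= \bar Y(1+e_0) + B\left(1 - e_3 + e_4 + e_3^2 - e_3e_4 + \dots\right)\left(-\bar Xe_1\right) + \omega\bar Y^2\left\{(1+e_2) - (1+e_0)^2\right\}
\end{aligned}$$

or

$$\bar y_{lr\omega} - \bar Y = \left[\bar Ye_0 - B\bar Xe_1 - B\bar Xe_1e_4 + B\bar Xe_1e_3 + \dots\right] + \omega\bar Y^2\left\{e_2 - e_0^2 - 2e_0\right\} \tag{4.2.1}$$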

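Taking expectation on both sides, we have the bias up to terms of order $O(1/n)$ to be

$$\begin{aligned}
\mathrm{Bias}(\bar y_{lr\omega}) &= E(\bar y_{lr\omega}) - \bar Y \\
&= \bar YE(e_0) - B\bar XE(e_1) - B\bar XE(e_1e_4) + B\bar XE(e_1e_3) + \omega\bar Y^2\left\{E(e_2) - E(e_0^2) - 2E(e_0)\right\} \\
&= \frac{1}{n}\,\frac{\rho\sigma_y\mu_{30}}{\sigma_x^3} - \frac{1}{n}\,\frac{\mu_{21}}{\sigma_x^2} - \frac{\omega\sigma_y^2}{n}
\end{aligned} \tag{4.2.2}$$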

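Squaring both sides of (4.2.1) and taking expectation, we have the mean square error of $\bar y_{lr\omega}$ up to terms of order $O(1/n)$ to be

$$\begin{aligned}
MSE(\bar y_{lr\omega}) &= E\left[\left\{\bar Ye_0 - B\bar Xe_1\right\} + \omega\bar Y^2\left(e_2 - 2e_0\right)\right]^2 \\
&= \bar Y^2E(e_0^2) + B^2\bar X^2E(e_1^2) - 2B\bar X\bar YE(e_0e_1) + \omega^2\bar Y^4\left\{E(e_2^2) + 4E(e_0^2) - 4E(e_0e_2)\right\} \\
&\quad + 2\omega\bar Y^2\left\{\bar YE(e_0e_2) - 2\bar YE(e_0^2) - B\bar XE(e_1e_2) + 2B\bar XE(e_0e_1)\right\} \\
&= \frac{\sigma_y^2}{n} + \frac{\rho^2\sigma_y^2}{n} - \frac{2\rho^2\sigma_y^2}{n} + \frac{\omega^2\bar Y^4}{n}\left[(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right] + \frac{2\omega\bar Y^3}{n}\left\{\gamma_1C_y - 2C_y^2 - \frac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y} + 2\rho^2C_y^2\right\} \\
&= \left(1 - \rho^2\right)\frac{\sigma_y^2}{n} + \frac{\omega^2\bar Y^4}{n}\left[(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right] + \frac{2\omega\bar Y^3}{n}\left\{\gamma_1C_y - 2C_y^2 - \frac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y} + 2\rho^2C_y^2\right\}
\end{aligned} \tag{4.2.3}$$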

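The optimum value of $\omega$ minimizing the mean square error of $\bar y_{lr\omega}$ in (4.2.3) is given by

$$\omega = -\frac{\left\{\gamma_1C_y - 2C_y^2 - \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y} + 2\rho^2C_y^2\right\}}{\bar Y\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}} \tag{4.2.4}$$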


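and the minimum mean square error is given by

$$MSE(\bar y_{lr\omega}) = \left(1 - \rho^2\right)\frac{\sigma_y^2}{n} - \frac{\bar Y^2\left\{\gamma_1C_y - 2C_y^2 - \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y} + 2\rho^2C_y^2\right\}^2}{n\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}} \tag{4.2.5}$$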

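4.3 Estimator based on estimated optimum $\hat c$

If exact values or good guesses of $\beta_2$, $\gamma_1$, $\rho$ and $\mu_{12}$ are not available, we can replace these quantities by their consistent sample estimates $\hat\beta_2$, $\hat\gamma_1$, $\hat\rho$, $\hat\mu_{12}$ respectively, and take $\hat{\bar Y} = \bar y$, in (4.2.4) and get the estimated optimum value of $\omega$, denoted by $\hat c$, as

$$\begin{aligned}
\hat c &= -\frac{\left\{\hat\gamma_1C_y - 2C_y^2 - \dfrac{\hat\rho\,\hat\mu_{12}}{\bar y\,s_xs_y} + 2\hat\rho^2C_y^2\right\}}{\bar y\left\{(\hat\beta_2 - 1) + 4C_y^2 - 4\hat\gamma_1C_y\right\}} \\[6pt]
&= -\frac{\left\{\dfrac{\hat\mu_{03}}{s_y^3}C_y - 2C_y^2 - \dfrac{\hat\rho\,\hat\mu_{12}}{\bar y\,s_xs_y} + 2\hat\rho^2C_y^2\right\}}{\bar y\left\{\dfrac{\hat\mu_{04}}{s_y^4} - 1 + 4C_y^2 - 4\dfrac{\hat\mu_{03}}{s_y^3}C_y\right\}}
\end{aligned} \tag{4.3.1}$$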

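where

$$\hat\beta_2 = \frac{\hat\mu_{04}}{s_y^4} \quad\text{with}\quad \hat\mu_{04} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar y\right)^4,$$

$$\hat\gamma_1 = \frac{\hat\mu_{03}}{s_y^3} \quad\text{with}\quad \hat\mu_{03} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar y\right)^3,$$

$$\hat\rho = \frac{s_{xy}}{s_xs_y}, \qquad \hat\mu_{12} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar x\right)\left(y_i - \bar y\right)^2.$$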

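Thus, incorporating $\hat c$ in place of $\omega$ in (4.1.1), we get the estimator based on the estimated optimum $\hat c$ as

$$\bar y_{lrc} = \left[\bar y + b\left(\bar X - \bar x\right)\right] + \hat c\left(\frac{s_y^2}{C_y^2} - \bar y^2\right) \tag{4.3.2}$$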

Let

$$\hat\mu_{03} = \mu_{03}(1+e_5), \qquad \hat\mu_{04} = \mu_{04}(1+e_6), \qquad \hat\mu_{12} = \mu_{12}(1+e_7).$$

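Also, we have

$$\begin{aligned}
\hat c &= -\frac{\dfrac{\mu_{03}(1+e_5)}{\sigma_y^3(1+e_2)^{3/2}}C_y - 2C_y^2 - \dfrac{\sigma_{xy}(1+e_4)\,\mu_{12}(1+e_7)}{\sigma_x^2\sigma_y^2(1+e_3)(1+e_2)\,\bar Y(1+e_0)} + \dfrac{2\sigma_{xy}^2(1+e_4)^2}{\sigma_x^2\sigma_y^2(1+e_3)(1+e_2)}C_y^2}{\bar Y(1+e_0)\left\{\dfrac{\mu_{04}(1+e_6)}{\sigma_y^4(1+e_2)^2} - 1 + 4C_y^2 - 4\dfrac{\mu_{03}(1+e_5)}{\sigma_y^3(1+e_2)^{3/2}}C_y\right\}} \\[6pt]
&= -\frac{\gamma_1\left(1 + e_5 - \tfrac{3}{2}e_2 + \dots\right)C_y - 2C_y^2 - \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y}\left(1 - e_0 - e_2 - e_3 + e_4 + e_7 + \dots\right) + 2\rho^2\left(1 + 2e_4 - e_3 - e_2 + \dots\right)C_y^2}{\bar Y(1+e_0)\left\{\beta_2\left(1 + e_6 - 2e_2 + \dots\right) - 1 + 4C_y^2 - 4\gamma_1\left(1 + e_5 - \tfrac{3}{2}e_2 + \dots\right)C_y\right\}}
\end{aligned} \tag{4.3.3}$$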



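Writing $\bar y_{lrc}$ in (4.3.2) in terms of the $e_i$'s ($i = 0, 1, 2, \dots, 7$), and after some simplification, we have

$$\bar y_{lrc} - \bar Y = \bar Ye_0 - B\bar Xe_1 - \bar Y\,\frac{\left\{\gamma_1C_y - 2C_y^2 - \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y} + 2\rho^2C_y^2\right\}}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}}\left(e_2 - 2e_0\right) + \dots \tag{4.3.4}$$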


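Squaring both sides of (4.3.4), ignoring terms in the $e_i$'s of degree greater than two and taking expectation, we have the mean square error of $\bar y_{lrc}$ to the first degree of approximation, that is, up to terms of order $O(1/n)$, to be

$$\begin{aligned}
MSE(\bar y_{lrc}) &= E\left[\bar Ye_0 - B\bar Xe_1 - \bar Y\,\frac{\left\{\gamma_1C_y - 2C_y^2 - \frac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y} + 2\rho^2C_y^2\right\}}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}}\left(e_2 - 2e_0\right)\right]^2 \\[6pt]
&= \bar Y^2E(e_0^2) + B^2\bar X^2E(e_1^2) - 2B\bar X\bar YE(e_0e_1) \\
&\quad + \bar Y^2\,\frac{\left\{\gamma_1C_y - 2C_y^2 - \frac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y} + 2\rho^2C_y^2\right\}^2}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}^2}\left[E(e_2^2) + 4E(e_0^2) - 4E(e_0e_2)\right] \\
&\quad - 2\bar Y\,\frac{\left\{\gamma_1C_y - 2C_y^2 - \frac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y} + 2\rho^2C_y^2\right\}}{\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}}\left[\bar YE(e_0e_2) - 2\bar YE(e_0^2) - B\bar XE(e_1e_2) + 2B\bar XE(e_0e_1)\right] \\[6pt]
&= \left(1 - \rho^2\right)\frac{\sigma_y^2}{n} - \frac{\bar Y^2\left\{\gamma_1C_y - 2C_y^2 - \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y} + 2\rho^2C_y^2\right\}^2}{n\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}}
\end{aligned} \tag{4.3.5}$$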

which shows that the estimator $\bar y_{lrc}$ in (4.3.2), based on the estimated optimum $\hat c$, attains the same minimum mean square error as $\bar y_{lr\omega}$ in (4.2.5), which depends on the optimum value of $\omega$ in (4.2.4).




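4.4 Concluding Remarks

a). From (4.2.5), for the optimum value of $\omega$, the estimator $\bar y_{lr\omega}$ attains the minimum mean square error given by

$$MSE(\bar y_{lr\omega}) = \left(1 - \rho^2\right)\frac{\sigma_y^2}{n} - \frac{\bar Y^2\left\{\gamma_1C_y - 2C_y^2 - \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y} + 2\rho^2C_y^2\right\}^2}{n\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}} \tag{4.4.2}$$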

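b). From (4.3.5), the estimator $\bar y_{lrc}$ depending upon the estimated optimum $\hat c$ has the mean square error

$$MSE(\bar y_{lrc}) = \left(1 - \rho^2\right)\frac{\sigma_y^2}{n} - \frac{\bar Y^2\left\{\gamma_1C_y - 2C_y^2 - \dfrac{\rho\mu_{12}}{\bar Y\sigma_x\sigma_y} + 2\rho^2C_y^2\right\}^2}{n\left\{(\beta_2 - 1) + 4C_y^2 - 4\gamma_1C_y\right\}} \tag{4.4.3}$$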

c). From (4.4.3), we see that the estimator $\bar y_{lrc}$ depending on the estimated optimum value is always more efficient than the usual linear regression estimator $\bar y_{lr} = \bar y + b\left(\bar X - \bar x\right)$ in the sense of having lesser mean square error.

d). The use of the proposed estimator is limited to situations in which the coefficient of variation is known. However, when the coefficient of variation is unknown, a guessed value may be used after studying the performance (robustness) of the estimator against different values of the CV, that is, when the guess is in error by, say, 5%, 10%, 15%, 20%, 25% or 50%. Further work is being done in this direction.

4.5 An Illustration

We observe that the conditions discussed in the introduction for a known coefficient of variation are satisfied by the data given in Walpole, R.E., Myers, R.H., Myers, S.L. and Ye, K. (2005, page 473), where the measure of aerobic fitness is the oxygen consumption in volume per unit body weight per unit time. Thirty-one individuals were used in an experiment in order to model oxygen consumption (y) against the time to run one and a half miles (x). The required values have been computed, and we have the following:

$\bar{Y} = 47.37581$, $\bar{X} = 10.58613$, $\sigma_y^2 = 27.46392$, $\sigma_x^2 = 1.86282$, $\beta_2 = 3.34559$,

$\mu_{03} = 59.71969$, $C_y = 0.11062$, $\rho = -0.86219$, $\mu_{12} = -2.35772$, $n = 31$.

$MSE(\bar{y}) = 0.88593$

$MSE(\bar{y}_{lr}) = 0.22735$

$MSE(\bar{y}_{lrc}) = MSE(\bar{y}_{lr\omega}) = 0.192428$

From the above, the percent relative efficiency (PRE) of the proposed estimator $\bar{y}_{lrc}$ over the mean per unit estimator $\bar{y}$ and over the usual linear regression estimator $\bar{y}_{lr}$ is 460% and 118% respectively, showing the enhanced efficiency of the proposed estimator.

Percent Relative Efficiency (PRE) of the proposed estimator over the estimators $\bar{y}$ and $\bar{y}_{lr}$:

Estimators                      $\bar{y}$    $\bar{y}_{lr}$
Percent Relative Efficiency     460%         118%
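The percent relative efficiencies quoted above follow directly from the three reported mean square errors. As a quick arithmetic check, a minimal Python sketch is given here; the helper name `pre` is ours and is not part of the thesis.

```python
def pre(mse_reference, mse_proposed):
    """Percent relative efficiency of the proposed estimator over a reference estimator."""
    return 100.0 * mse_reference / mse_proposed

# Mean square errors as reported in Section 4.5
mse_mean = 0.88593     # mean per unit estimator
mse_lr = 0.22735       # usual linear regression estimator
mse_lrc = 0.192428     # proposed estimator

pre_over_mean = pre(mse_mean, mse_lrc)
pre_over_lr = pre(mse_lr, mse_lrc)
print(round(pre_over_mean), round(pre_over_lr))
```

Rounded to the nearest percent, the two ratios agree with the figures in the table.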

Chapter – V

“An Improvement in Linear Regression

Estimator of Finite Population Mean

Using Known Coefficient of Variation”


SUMMARY

In this chapter, for the estimation of the population mean, an improved regression-type estimator using the known coefficient of variation of the study variable is proposed. Its bias and mean square error are found, and a comparative study with the usual linear regression estimator is made theoretically as well as numerically, based on an empirical illustration.


5.1 Introduction


5.3 Estimator with Estimated Optimum characterizing scalar


5.5 An Illustration



5.1 Introduction

Let the variable of interest be y and the auxiliary variable be x, taking the values $Y_i$ and $X_i$ respectively for the $i$th ($i = 1, 2, \ldots, N$) unit of the population of size N.

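Further, let

$$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i, \quad \bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i, \quad \mu_{rs} = \frac{1}{N}\sum_{i=1}^{N} (X_i - \bar{X})^r (Y_i - \bar{Y})^s,$$

$$C_y = \frac{\sqrt{\mu_{02}}}{\bar{Y}} = \frac{\sigma_y}{\bar{Y}}, \quad C_x = \frac{\sqrt{\mu_{20}}}{\bar{X}} = \frac{\sigma_x}{\bar{X}}, \quad \beta_2 = \frac{\mu_{04}}{\mu_{02}^2}, \quad \gamma_1 = \frac{\mu_{03}}{\mu_{02}^{3/2}},$$

$$\sigma_y^2 = \frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2, \quad \sigma_x^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \bar{X})^2, \quad \sigma_{xy} = \frac{1}{N}\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y}),$$

$$\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}, \quad B = \frac{\sigma_{xy}}{\sigma_x^2} = \rho\,\frac{\sigma_y}{\sigma_x}.$$

For $y_1, y_2, \ldots, y_n$ being the sample observations on the study variable y and $x_1, x_2, \ldots, x_n$ being the sample observations on the auxiliary variable x in a simple random sample of size n drawn without replacement, let

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,$$

$$s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2, \quad s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2,$$

$$s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}), \quad b = \frac{s_{xy}}{s_x^2}.$$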

The proposed improved regression-type estimator for estimating the population mean using the known coefficient of variation $C_y$ is

$$\bar{y}_{r\alpha} = \{\bar{y} + b(\bar{X} - \bar{x})\}\left\{1 + \alpha\left(\bar{y}^2 - \frac{s_y^2}{C_y^2}\right)\right\} \tag{5.1.1}$$

where $\alpha$ is the characterizing scalar to be chosen suitably.
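To make the definition concrete, the following Python sketch evaluates an estimator of this type from sample data. It assumes the multiplicative reading of (5.1.1), namely $\{\bar{y} + b(\bar{X} - \bar{x})\}\{1 + \alpha(\bar{y}^2 - s_y^2/C_y^2)\}$; the function name `regression_cv_estimator` and the toy data are ours and purely illustrative.

```python
def regression_cv_estimator(y, x, X_bar, C_y, alpha):
    """Sketch: {ybar + b(Xbar - xbar)} * {1 + alpha * (ybar^2 - s_y^2 / C_y^2)}.

    X_bar (population mean of x) and C_y (coefficient of variation of y)
    are assumed known; alpha is the characterizing scalar.
    """
    n = len(y)
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    s_y2 = sum((v - y_bar) ** 2 for v in y) / (n - 1)
    s_x2 = sum((u - x_bar) ** 2 for u in x) / (n - 1)
    s_xy = sum((u - x_bar) * (v - y_bar) for u, v in zip(x, y)) / (n - 1)
    b = s_xy / s_x2                       # sample regression coefficient
    y_lr = y_bar + b * (X_bar - x_bar)    # usual linear regression estimator
    return y_lr * (1.0 + alpha * (y_bar ** 2 - s_y2 / C_y ** 2))

# Hypothetical toy sample, for illustration only
y = [2.0, 4.0, 6.0, 8.0]
x = [1.0, 2.0, 3.0, 4.0]
est_alpha0 = regression_cv_estimator(y, x, X_bar=3.0, C_y=0.5, alpha=0.0)
est_alpha = regression_cv_estimator(y, x, X_bar=3.0, C_y=0.5, alpha=0.01)
```

With `alpha=0.0` the sketch reduces to the usual linear regression estimator, which is a convenient sanity check.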

5.2 Bias and Mean Square Error of $\bar{y}_{r\alpha}$

For simplicity, it is assumed that the population size N is large

enough as compared to the sample size n so that finite population

correction term may be ignored.

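Let

$$\bar{y} = \bar{Y}(1 + e_0), \quad \bar{x} = \bar{X}(1 + e_1), \quad s_y^2 = \sigma_y^2(1 + e_2), \quad s_x^2 = \sigma_x^2(1 + e_3), \quad s_{xy} = \sigma_{xy}(1 + e_4),$$

so that $E(e_0) = E(e_1) = E(e_2) = E(e_3) = E(e_4) = 0$ and

$$E(e_0^2) = \frac{\sigma_y^2}{n\bar{Y}^2} = \frac{C_y^2}{n}, \quad E(e_1^2) = \frac{\sigma_x^2}{n\bar{X}^2} = \frac{C_x^2}{n}, \quad E(e_2^2) = \frac{\beta_2 - 1}{n},$$

$$E(e_0 e_1) = \frac{\sigma_{xy}}{n\bar{X}\bar{Y}} = \frac{\rho\, C_x C_y}{n}, \quad E(e_0 e_2) = \frac{\mu_{03}}{n\bar{Y}\sigma_y^2}, \quad E(e_1 e_2) = \frac{\mu_{12}}{n\bar{X}\sigma_y^2},$$

$$E(e_1 e_3) = \frac{\mu_{30}}{n\bar{X}\sigma_x^2}, \quad E(e_1 e_4) = \frac{\mu_{21}}{n\bar{X}\sigma_{xy}}.$$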

Also, we have

$$b = \frac{s_{xy}}{s_x^2} = \frac{\sigma_{xy}(1 + e_4)}{\sigma_x^2(1 + e_3)} = B(1 + e_4)(1 + e_3)^{-1}$$

$$= B(1 + e_4)(1 - e_3 + e_3^2 - \dots)$$

$$= B(1 - e_3 + e_4 + e_3^2 - e_3 e_4 + \dots).$$

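From (5.1.1), writing $\bar{y}_{r\alpha}$ in terms of the $e_i$'s ($i = 0, 1, 2, 3, 4$), we see that

$$\bar{y}_{r\alpha} = \left[\bar{Y}(1 + e_0) + B(1 - e_3 + e_4 + e_3^2 - e_3 e_4 + \dots)\{\bar{X} - \bar{X}(1 + e_1)\}\right] \left[1 + \alpha\left\{\bar{Y}^2(1 + e_0)^2 - \frac{\bar{Y}^2}{\sigma_y^2}\,\sigma_y^2(1 + e_2)\right\}\right]$$

$$= \{\bar{Y} + \bar{Y}e_0 - B\bar{X}(e_1 - e_1 e_3 + e_1 e_4 + e_1 e_3^2 - \dots)\}\{1 + \alpha\bar{Y}^2(e_0^2 + 2e_0 - e_2)\},$$

so that

$$\bar{y}_{r\alpha} - \bar{Y} = \bar{Y}e_0 + \alpha\bar{Y}^3(3e_0^2 + 2e_0 - e_2 - e_0 e_2) - B\bar{X}(e_1 + e_1 e_4 - e_1 e_3 + e_1 e_3^2 - \dots) - \alpha B\bar{X}\bar{Y}^2(2e_0 e_1 - e_1 e_2 + \dots). \tag{5.2.1}$$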

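Taking expectation on both sides, we have the bias up to terms of order $O(1/n)$ to be

$$Bias(\bar{y}_{r\alpha}) = E(\bar{y}_{r\alpha} - \bar{Y})$$

$$= \bar{Y}E(e_0) + \alpha\bar{Y}^3\{3E(e_0^2) + 2E(e_0) - E(e_2) - E(e_0 e_2)\} - B\bar{X}\{E(e_1) + E(e_1 e_4) - E(e_1 e_3) + \dots\} - \alpha B\bar{X}\bar{Y}^2\{2E(e_0 e_1) - E(e_1 e_2)\}$$

$$= \frac{B}{n}\left(\gamma_{1x} - \frac{\mu_{21}}{\sigma_{xy}}\right) + \frac{\alpha\bar{Y}^2}{n}\left\{(3 - 2\rho^2)\bar{Y}C_y^2 - \frac{\mu_{03}}{\sigma_y^2} + \frac{\rho\,\mu_{12}}{\sigma_x \sigma_y}\right\} \tag{5.2.2}$$

where $\gamma_{1x} = \mu_{30}/\sigma_x^2$.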

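Again, squaring both sides of (5.2.1) and taking expectation, we have the mean square error of $\bar{y}_{r\alpha}$ up to terms of order $O(1/n)$ to be

$$MSE(\bar{y}_{r\alpha}) = E(\bar{y}_{r\alpha} - \bar{Y})^2$$

$$= E\{\bar{Y}e_0 - B\bar{X}e_1 + \alpha\bar{Y}^3(2e_0 - e_2)\}^2$$

$$= E\left[\bar{Y}^2 e_0^2 + B^2\bar{X}^2 e_1^2 - 2B\bar{X}\bar{Y}e_0 e_1 + \alpha^2\bar{Y}^6(4e_0^2 + e_2^2 - 4e_0 e_2) + 2\alpha\bar{Y}^3\{\bar{Y}(2e_0^2 - e_0 e_2) - B\bar{X}(2e_0 e_1 - e_1 e_2)\}\right]$$

$$= \frac{\sigma_y^2}{n}(1 - \rho^2) + \frac{\alpha^2\bar{Y}^6}{n}\left(\beta_2 - 1 + 4C_y^2 - 4\gamma_1 C_y\right) + \frac{2\alpha\bar{Y}^4}{n}\left\{2(1 - \rho^2)C_y^2 - \gamma_1 C_y + \frac{\rho\,\mu_{12}}{\bar{Y}\sigma_x \sigma_y}\right\}. \tag{5.2.3}$$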



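The optimum value of $\alpha$ minimizing the mean square error of $\bar{y}_{r\alpha}$ in (5.2.3) is given by

$$\alpha = -\frac{2(1 - \rho^2)C_y^2 - \gamma_1 C_y + \dfrac{\rho\,\mu_{12}}{\bar{Y}\sigma_x \sigma_y}}{\bar{Y}^2\left\{\beta_2 - 1 + 4C_y^2 - 4\gamma_1 C_y\right\}} \tag{5.2.4}$$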

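and the minimum mean square error of $\bar{y}_{r\alpha}$ is

$$MSE(\bar{y}_{r\alpha})_{min} = \frac{\sigma_y^2}{n}(1 - \rho^2) - \frac{\bar{Y}^2\left\{2(1 - \rho^2)C_y^2 - \gamma_1 C_y + \dfrac{\rho\,\mu_{12}}{\bar{Y}\sigma_x \sigma_y}\right\}^2}{n\left\{\beta_2 - 1 + 4C_y^2 - 4\gamma_1 C_y\right\}}. \tag{5.2.5}$$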
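The passage from the mean square error to its optimum is the minimisation of a quadratic in $\alpha$. As a short worked sketch, write $A$, $P$ and $Q$ as shorthand for the three coefficients (these symbols are ours, introduced only for this note, and the coefficients are stated under our reading of the $O(1/n)$ mean square error of $\bar{y}_{r\alpha}$):

```latex
\begin{aligned}
MSE(\bar{y}_{r\alpha}) &= A + \alpha^{2} P + 2\alpha Q, \\
A &= \frac{\sigma_y^{2}(1-\rho^{2})}{n}, \qquad
P = \frac{\bar{Y}^{6}}{n}\left(\beta_2 - 1 + 4C_y^{2} - 4\gamma_1 C_y\right), \\
Q &= \frac{\bar{Y}^{4}}{n}\left\{2(1-\rho^{2})C_y^{2} - \gamma_1 C_y
      + \frac{\rho\,\mu_{12}}{\bar{Y}\sigma_x\sigma_y}\right\}, \\
\frac{\partial\, MSE}{\partial \alpha} &= 2\alpha P + 2Q = 0
   \;\Longrightarrow\; \alpha_{opt} = -\frac{Q}{P}, \\
MSE_{min} &= A - \frac{Q^{2}}{P}.
\end{aligned}
```

Since $Q^2/P \ge 0$ whenever $P > 0$, the minimum never exceeds $A$, the first-order mean square error of the usual linear regression estimator.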

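5.3 Estimator Based on Estimated Optimum $\hat{\alpha}$

For situations in which the values of $\beta_2$, $\gamma_1$ and $\mu_{12}$, or good guessed values of them, are not available, the alternative is to replace $\beta_2$, $\gamma_1$, $\mu_{12}$ and $\bar{Y}$ involved in the optimum $\alpha$ by their estimates $\hat{\beta}_2$, $\hat{\gamma}_1$, $\hat{\mu}_{12}$ and $\bar{y}$ based on sample values, and to get the estimated optimum value of $\alpha$, denoted by $\hat{\alpha}$, to be

$$\hat{\alpha} = -\frac{2(1 - \hat{\rho}^2)C_y^2 - \hat{\gamma}_1 C_y + \dfrac{\hat{\rho}\,\hat{\mu}_{12}}{\bar{y}\,\hat{\sigma}_x \hat{\sigma}_y}}{\bar{y}^2\left\{\hat{\beta}_2 - 1 + 4C_y^2 - 4\hat{\gamma}_1 C_y\right\}} \tag{5.3.1}$$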

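where, from Cramer (1946),

$$\hat{\beta}_2 = \frac{\hat{\mu}_{04}}{\hat{\mu}_{02}^2} \quad \text{with} \quad \hat{\mu}_{04} = \frac{n^2}{(n - 1)(n^2 - 3n + 3)}\sum_{i=1}^{n} (y_i - \bar{y})^4, \quad \hat{\mu}_{02} = s_y^2 = \frac{1}{n - 1}\sum_{i=1}^{n} (y_i - \bar{y})^2,$$

$$\hat{\gamma}_1 = \frac{\hat{\mu}_{03}}{\hat{\mu}_{02}^{3/2}} \quad \text{with} \quad \hat{\mu}_{03} = \frac{n}{(n - 1)(n - 2)}\sum_{i=1}^{n} (y_i - \bar{y})^3 \quad \text{and} \quad \hat{\mu}_{02}^{3/2} = s_y^3,$$

$$\hat{\rho} = \frac{\hat{\sigma}_{xy}}{\hat{\sigma}_x \hat{\sigma}_y} \quad \text{with} \quad \hat{\sigma}_{xy} = s_{xy}, \quad \hat{\sigma}_x = s_x \quad \text{and} \quad \hat{\sigma}_y = s_y,$$

and

$$\hat{\mu}_{12} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})^2.$$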
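The sample quantities listed above can be assembled into an estimated optimum in a few lines. The Python sketch below is illustrative only: the function name `estimated_alpha` is ours, and the final expression for $\hat{\alpha}$ assumes the form $-\{2(1-\hat{\rho}^2)C_y^2 - \hat{\gamma}_1 C_y + \hat{\rho}\hat{\mu}_{12}/(\bar{y}\hat{\sigma}_x\hat{\sigma}_y)\} / [\bar{y}^2\{\hat{\beta}_2 - 1 + 4C_y^2 - 4\hat{\gamma}_1 C_y\}]$, which is our reading of (5.3.1).

```python
import math

def estimated_alpha(y, x, C_y):
    """Sketch of the estimated optimum alpha-hat from sample data (C_y assumed known)."""
    n = len(y)
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    dy = [v - y_bar for v in y]
    dx = [u - x_bar for u in x]
    s_y2 = sum(d * d for d in dy) / (n - 1)
    s_x2 = sum(d * d for d in dx) / (n - 1)
    s_xy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)
    # Moment estimates as listed in the text
    mu03 = n / ((n - 1) * (n - 2)) * sum(d ** 3 for d in dy)
    mu04 = n ** 2 / ((n - 1) * (n * n - 3 * n + 3)) * sum(d ** 4 for d in dy)
    mu12 = sum(a * b * b for a, b in zip(dx, dy)) / n
    beta2 = mu04 / s_y2 ** 2
    gamma1 = mu03 / s_y2 ** 1.5
    rho = s_xy / math.sqrt(s_x2 * s_y2)
    num = (2 * (1 - rho ** 2) * C_y ** 2 - gamma1 * C_y
           + rho * mu12 / (y_bar * math.sqrt(s_x2 * s_y2)))
    den = y_bar ** 2 * (beta2 - 1 + 4 * C_y ** 2 - 4 * gamma1 * C_y)
    return -num / den

# Hypothetical symmetric, perfectly correlated sample: the numerator vanishes
alpha_hat = estimated_alpha([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 2.0, 3.0, 4.0, 5.0], C_y=0.5)
```

For this degenerate toy sample ($\hat{\rho} = 1$, zero third moments) the estimated optimum is zero, so the proposed estimator coincides with the usual regression estimator.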

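Thus, replacing $\alpha$ by the estimated optimum $\hat{\alpha}$ in the estimator $\bar{y}_{r\alpha}$ in (5.1.1), we get, for wider practical utility, the estimator based on the estimated optimum $\hat{\alpha}$, given by

$$\bar{y}_{r\alpha_e} = \{\bar{y} + b(\bar{X} - \bar{x})\}\left\{1 + \hat{\alpha}\left(\bar{y}^2 - \frac{s_y^2}{C_y^2}\right)\right\}. \tag{5.3.2}$$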

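To find the bias and mean square error of $\bar{y}_{r\alpha_e}$, let

$$\hat{\mu}_{03} = \mu_{03}(1 + e_5), \quad \hat{\mu}_{04} = \mu_{04}(1 + e_6), \quad \hat{\mu}_{12} = \mu_{12}(1 + e_7),$$

so that

$$\hat{\alpha} = -\frac{2C_y^2\left\{1 - \dfrac{\sigma_{xy}^2(1 + e_4)^2}{\sigma_x^2(1 + e_3)\,\sigma_y^2(1 + e_2)}\right\} - \dfrac{\mu_{03}(1 + e_5)}{\sigma_y^3(1 + e_2)^{3/2}}\,C_y + \dfrac{\sigma_{xy}(1 + e_4)\,\mu_{12}(1 + e_7)}{\bar{Y}(1 + e_0)\,\sigma_x^2(1 + e_3)\,\sigma_y^2(1 + e_2)}}{\bar{Y}^2(1 + e_0)^2\left\{\dfrac{\mu_{04}(1 + e_6)}{\sigma_y^4(1 + e_2)^2} - 1 + 4C_y^2 - \dfrac{4\mu_{03}(1 + e_5)}{\sigma_y^3(1 + e_2)^{3/2}}\,C_y\right\}}$$

$$= -\frac{2C_y^2\{1 - \rho^2(1 + 2e_4 - e_3 - e_2 + \dots)\} - \gamma_1 C_y\left(1 + e_5 - \tfrac{3}{2}e_2 + \dots\right) + \dfrac{\rho\,\mu_{12}}{\bar{Y}\sigma_x \sigma_y}(1 + e_7 + e_4 - e_3 - e_2 - e_0 + \dots)}{\bar{Y}^2(1 + 2e_0 + e_0^2)\left\{\beta_2(1 + e_6 - 2e_2 + \dots) - 1 + 4C_y^2 - 4\gamma_1 C_y\left(1 + e_5 - \tfrac{3}{2}e_2 + \dots\right)\right\}}. \tag{5.3.3}$$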

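Now, putting this value of $\hat{\alpha}$ in equation (5.3.2), we have

$$\bar{y}_{r\alpha_e} - \bar{Y} = \bar{Y}e_0 - B\bar{X}(e_1 - e_1 e_3 + e_1 e_4 + \dots) - \bar{Y}\,\frac{2(1 - \rho^2)C_y^2 - \gamma_1 C_y + \dfrac{\rho\,\mu_{12}}{\bar{Y}\sigma_x \sigma_y}}{\beta_2 - 1 + 4C_y^2 - 4\gamma_1 C_y}\,(2e_0 - e_2 + \dots). \tag{5.3.4}$$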

Taking expectation of (5.3.4) and ignoring terms in the $e_i$'s of degree greater than two, we can easily check that the bias of $\bar{y}_{r\alpha_e}$ is of order $O(1/n)$; hence the bias of $\bar{y}_{r\alpha_e}$ is negligible for sufficiently large n, that is, $\bar{y}_{r\alpha_e}$ is an approximately unbiased estimator of the population mean $\bar{Y}$. Further, squaring both sides of (5.3.4) and taking expectation up to terms of order $O(1/n)$, we have

$$MSE(\bar{y}_{r\alpha_e}) = \frac{\sigma_y^2}{n}(1 - \rho^2) - \frac{\bar{Y}^2\left\{2(1 - \rho^2)C_y^2 - \gamma_1 C_y + \dfrac{\rho\,\mu_{12}}{\bar{Y}\sigma_x \sigma_y}\right\}^2}{n\left\{\beta_2 - 1 + 4C_y^2 - 4\gamma_1 C_y\right\}} \tag{5.3.5}$$

which is the same as the mean square error for the optimum $\alpha$; that is, the estimator $\bar{y}_{r\alpha_e}$ based on the estimated optimum $\hat{\alpha}$ attains the same minimum mean square error as the estimator $\bar{y}_{r\alpha}$ based on the optimum $\alpha$.




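a) For the optimum value of $\alpha$, it is clear from (5.2.5) that the estimator $\bar{y}_{r\alpha}$ attains the minimum mean square error

$$MSE(\bar{y}_{r\alpha}) = \frac{\sigma_y^2}{n}(1 - \rho^2) - \frac{\bar{Y}^2\left\{2(1 - \rho^2)C_y^2 - \gamma_1 C_y + \dfrac{\rho\,\mu_{12}}{\bar{Y}\sigma_x \sigma_y}\right\}^2}{n\left\{\beta_2 - 1 + 4C_y^2 - 4\gamma_1 C_y\right\}}. \tag{5.4.1}$$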

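b) The estimator $\bar{y}_{r\alpha}$ with the optimum value of $\alpha$ and the estimator $\bar{y}_{r\alpha_e}$ based on the estimated optimum $\hat{\alpha}$ have the same mean square error, given by

$$MSE(\bar{y}_{r\alpha_e}) = MSE(\bar{y}_{r\alpha}) = \frac{\sigma_y^2}{n}(1 - \rho^2) - \frac{\bar{Y}^2\left\{2(1 - \rho^2)C_y^2 - \gamma_1 C_y + \dfrac{\rho\,\mu_{12}}{\bar{Y}\sigma_x \sigma_y}\right\}^2}{n\left\{\beta_2 - 1 + 4C_y^2 - 4\gamma_1 C_y\right\}}$$

$$= MSE(\bar{y}_{lr}) - \frac{\bar{Y}^2\left\{2(1 - \rho^2)C_y^2 - \gamma_1 C_y + \dfrac{\rho\,\mu_{12}}{\bar{Y}\sigma_x \sigma_y}\right\}^2}{n\left\{\beta_2 - 1 + 4C_y^2 - 4\gamma_1 C_y\right\}} \tag{5.4.2}$$

which shows that the estimators $\bar{y}_{r\alpha_e}$ and $\bar{y}_{r\alpha}$, based on the estimated optimum and the optimum value respectively, are more efficient than the linear regression estimator $\bar{y}_{lr}$ in the sense of having a smaller mean square error.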

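c) For a normal parent population (i.e. $\gamma_1 = 0$, $\beta_2 = 3$ and $\mu_{12} = 0$), the optimum value $\alpha$ from (5.2.4) and the estimated optimum value $\hat{\alpha}$ from (5.3.1) reduce, respectively, to

$$\alpha = -\frac{(1 - \rho^2)C_y^2}{\bar{Y}^2(2C_y^2 + 1)} \quad \text{and} \quad \hat{\alpha} = -\frac{(1 - \hat{\rho}^2)C_y^2}{\bar{y}^2(2C_y^2 + 1)},$$

for which $MSE(\bar{y}_{r\alpha})$ and $MSE(\bar{y}_{r\alpha_e})$ become

$$MSE(\bar{y}_{r\alpha_e}) = MSE(\bar{y}_{r\alpha}) = (1 - \rho^2)\frac{\sigma_y^2}{n} - \frac{2\bar{Y}^2 C_y^4 (1 - \rho^2)^2}{n\{2C_y^2 + 1\}},$$

showing that the proposed estimator $\bar{y}_{r\alpha_e}$ is more efficient than $\bar{y}_{lr}$ in a normal parent population also.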

5.5 An Illustration

Consider the data given in Walpole, R.E., Myers, R.H., Myers, S.L. and Ye, K. (2005, page 473), where the measure of aerobic fitness is the oxygen consumption in volume per unit body weight per unit time. Thirty-one individuals were used in an experiment in order to model oxygen consumption (y) against the time to run one and a half miles (x). The required values have been computed, and we have the following:

$\bar{Y} = 47.37581$, $\bar{X} = 10.58613$, $\sigma_y^2 = 27.46392$, $\sigma_x^2 = 1.86282$, $\beta_2 = 3.34559$,

$\mu_{03} = 59.71969$, $C_y = 0.11062$, $\gamma_1 = \mu_{03}/\sigma_y^3 = 0.41493$, $\mu_{04} = 2523.46629$,

$\rho = -0.86219$, $\mu_{12} = -2.35772$, $n = 31$.

$MSE(\bar{y}_{lr}) = 0.22735$

$MSE(\bar{y}_{r\alpha_e}) = MSE(\bar{y}_{r\alpha}) = 0.19033$

The percent relative efficiency (PRE) of the proposed estimator over the usual linear regression estimator is 119%.
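The reported figures can be checked numerically from the population values listed in this section. The Python sketch below is a rough verification under stated assumptions: $\gamma_1$ is computed as $\mu_{03}/\sigma_y^3$, and the minimum mean square error is taken as $\sigma_y^2(1-\rho^2)/n - \bar{Y}^2\{2(1-\rho^2)C_y^2 - \gamma_1 C_y + \rho\mu_{12}/(\bar{Y}\sigma_x\sigma_y)\}^2 / [n\{\beta_2 - 1 + 4C_y^2 - 4\gamma_1 C_y\}]$, which is our reading of the chapter's result. Variable names are ours.

```python
import math

# Population values reported for the illustration
Y_bar, n = 47.37581, 31
var_y, var_x = 27.46392, 1.86282
beta2, mu03, mu12, rho = 3.34559, 59.71969, -2.35772, -0.86219

sd_y, sd_x = math.sqrt(var_y), math.sqrt(var_x)
C_y = sd_y / Y_bar
gamma1 = mu03 / var_y ** 1.5      # assumption: gamma_1 = mu_03 / sigma_y^3

num = 2 * (1 - rho ** 2) * C_y ** 2 - gamma1 * C_y + rho * mu12 / (Y_bar * sd_x * sd_y)
den = beta2 - 1 + 4 * C_y ** 2 - 4 * gamma1 * C_y

mse_mean = var_y / n                                   # mean per unit estimator
mse_lr = (1 - rho ** 2) * var_y / n                    # usual regression estimator
mse_min = mse_lr - Y_bar ** 2 * num ** 2 / (n * den)   # proposed estimator (first order)
alpha_opt = -num / (Y_bar ** 2 * den)
pre_over_lr = 100 * mse_lr / mse_min

print(round(mse_lr, 5), round(mse_min, 5), round(pre_over_lr))
```

Under these assumptions the sketch agrees with the reported mean square errors to about four decimal places; small differences can arise from rounding of the inputs.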

Chapter – VI

“An Improved Regression Type Estimator

of Population Mean Using Auxiliary

Information”


SUMMARY

In this chapter, using auxiliary information, an improved regression-type estimator over the usual linear regression estimator is proposed, and its bias and mean square error are found. A comparative study with the usual linear regression estimator is made theoretically as well as numerically.


6.1 Introduction




6.5 An Illustration



6.1 Introduction

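Let y be the study variable and x be the auxiliary variable, taking the values $Y_i$ and $X_i$ respectively for the $i$th ($i = 1, 2, \ldots, N$) unit of the population of size N.

Further, let

$$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i, \quad \bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i, \quad \mu_{rs} = \frac{1}{N}\sum_{i=1}^{N} (X_i - \bar{X})^r (Y_i - \bar{Y})^s,$$

$$S_y^2 = \frac{1}{N - 1}\sum_{i=1}^{N} (Y_i - \bar{Y})^2 = \frac{N}{N - 1}\,\sigma_y^2, \quad S_x^2 = \frac{1}{N - 1}\sum_{i=1}^{N} (X_i - \bar{X})^2 = \frac{N}{N - 1}\,\sigma_x^2,$$

$$S_{xy} = \frac{1}{N - 1}\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y}), \quad \rho = \frac{S_{xy}}{S_x S_y}, \quad B = \frac{S_{xy}}{S_x^2} = \rho\,\frac{S_y}{S_x}, \quad \beta_{2x} = \frac{\mu_{40}}{\mu_{20}^2}.$$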

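For $y_1, y_2, \ldots, y_n$ being the sample observations on the study variable y and $x_1, x_2, \ldots, x_n$ being the sample observations on the auxiliary variable x in a simple random sample of size n drawn without replacement, let

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,$$

$$s_y^2 = \frac{1}{n - 1}\sum_{i=1}^{n} (y_i - \bar{y})^2, \quad s_x^2 = \frac{1}{n - 1}\sum_{i=1}^{n} (x_i - \bar{x})^2,$$

$$s_{xy} = \frac{1}{n - 1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \quad \text{and} \quad b = \frac{s_{xy}}{s_x^2}.$$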

Using the auxiliary information $(\bar{X}, S_x^2)$ on the auxiliary variable x, the proposed regression-type estimator for estimating the population mean $\bar{Y}$ is

$$\bar{y}_{l\theta} = \bar{y} + b(\bar{X} - \bar{x}) + \theta(s_x^2 - S_x^2) \tag{6.1.1}$$

where $\theta$ is the characterizing scalar to be determined suitably.

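6.2 Bias and Mean Square Error of $\bar{y}_{l\theta}$

For simplicity, it is assumed that the population size N is large enough as compared to the sample size n so that the finite population correction term may be ignored.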

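Now, let

$$\bar{y} = \bar{Y}(1 + e_0), \quad \bar{x} = \bar{X}(1 + e_1), \quad s_x^2 = S_x^2(1 + e_2), \quad s_{xy} = S_{xy}(1 + e_3),$$

so that $E(e_0) = E(e_1) = E(e_2) = E(e_3) = 0$ and

$$E(e_0^2) = \frac{S_y^2}{n\bar{Y}^2}, \quad E(e_1^2) = \frac{S_x^2}{n\bar{X}^2}, \quad E(e_2^2) = \frac{1}{n}(\beta_{2x} - 1), \quad E(e_0 e_1) = \frac{S_{xy}}{n\bar{X}\bar{Y}},$$

$$E(e_0 e_2) = \frac{\mu_{21}}{n\bar{Y}S_x^2}, \quad E(e_1 e_2) = \frac{\mu_{30}}{n\bar{X}S_x^2}, \quad E(e_1 e_3) = \frac{\mu_{21}}{n\bar{X}S_{xy}}.$$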

Also, we have

$$b = \frac{s_{xy}}{s_x^2} = \frac{S_{xy}(1 + e_3)}{S_x^2(1 + e_2)} = B(1 + e_3)(1 + e_2)^{-1} = B(1 - e_2 + e_3 + e_2^2 - e_2 e_3 + \dots). \tag{6.2.1}$$

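Using (6.2.1) in (6.1.1) and writing $\bar{y}_{l\theta}$ in terms of the $e_i$'s, we have

$$\bar{y}_{l\theta} = \bar{Y}(1 + e_0) + B(1 - e_2 + e_3 + e_2^2 - e_2 e_3 + \dots)\{\bar{X} - \bar{X}(1 + e_1)\} + \theta\{S_x^2(1 + e_2) - S_x^2\}$$

$$= \bar{Y} + \bar{Y}e_0 - B\bar{X}(e_1 + e_1 e_3 - e_1 e_2 + e_1 e_2^2 - \dots) + \theta S_x^2 e_2,$$

$$\bar{y}_{l\theta} - \bar{Y} = \bar{Y}e_0 - B\bar{X}(e_1 + e_1 e_3 - e_1 e_2 + e_1 e_2^2 - \dots) + \theta S_x^2 e_2. \tag{6.2.2}$$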



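Taking expectation on both sides of (6.2.2), we have the bias up to terms of order $O(1/n)$ to be

$$Bias(\bar{y}_{l\theta}) = E(\bar{y}_{l\theta} - \bar{Y})$$

$$= \bar{Y}E(e_0) - B\bar{X}\left[E(e_1) + E(e_1 e_3) - E(e_1 e_2)\right] + \theta S_x^2 E(e_2)$$

$$= \frac{B}{n}\left(\frac{\mu_{30}}{\mu_{20}} - \frac{\mu_{21}}{\mu_{11}}\right) \tag{6.2.3}$$

which is exactly equal to the bias of the usual linear regression estimator $\bar{y}_l = \bar{y} + b(\bar{X} - \bar{x})$.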

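Again, squaring both sides of (6.2.2) and taking expectation, we have the mean square error of $\bar{y}_{l\theta}$ up to terms of order $O(1/n)$ to be

$$MSE(\bar{y}_{l\theta}) = E\left[\bar{Y}e_0 - B\bar{X}e_1 + \theta S_x^2 e_2\right]^2$$

$$= \bar{Y}^2 E(e_0^2) + B^2\bar{X}^2 E(e_1^2) - 2B\bar{X}\bar{Y}E(e_0 e_1) + \theta^2 S_x^4 E(e_2^2) + 2\theta S_x^2\left[\bar{Y}E(e_0 e_2) - B\bar{X}E(e_1 e_2)\right]$$

$$= \frac{1}{n}\left[\bar{Y}^2\frac{S_y^2}{\bar{Y}^2} + \rho^2\frac{S_y^2}{S_x^2}\bar{X}^2\frac{S_x^2}{\bar{X}^2} - 2\rho\frac{S_y}{S_x}\bar{X}\bar{Y}\frac{\rho S_x S_y}{\bar{X}\bar{Y}} + \theta^2 S_x^4(\beta_{2x} - 1) + 2\theta S_x^2\left(\bar{Y}\frac{\mu_{21}}{\bar{Y}S_x^2} - B\bar{X}\frac{\mu_{30}}{\bar{X}S_x^2}\right)\right]$$

$$= (1 - \rho^2)\frac{S_y^2}{n} + \frac{\theta^2 S_x^4}{n}\{\beta_{2x} - 1\} + \frac{2\theta}{n}(\mu_{21} - B\mu_{30}). \tag{6.2.4}$$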

The optimum value of $\theta$ minimizing the mean square error of $\bar{y}_{l\theta}$ in (6.2.4) is given by

$$\theta_o = -\frac{\mu_{21} - B\mu_{30}}{S_x^4(\beta_{2x} - 1)} \tag{6.2.5}$$

and the minimum mean square error of $\bar{y}_{l\theta}$ is

$$MSE(\bar{y}_{l\theta_o}) = (1 - \rho^2)\frac{S_y^2}{n} - \frac{(\mu_{21} - B\mu_{30})^2}{n S_x^4(\beta_{2x} - 1)}. \tag{6.2.6}$$
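The optimum scalar and the resulting minimum can be checked against the quadratic form of the mean square error. The Python sketch below assumes the first-order expression $MSE(\theta) = (1-\rho^2)S_y^2/n + \theta^2 S_x^4(\beta_{2x}-1)/n + 2\theta(\mu_{21} - B\mu_{30})/n$ (our reading of the chapter's result); the function names and the numerical inputs are hypothetical and chosen only for illustration.

```python
def mse_theta(theta, S_y2, S_x2, rho, mu21, mu30, B, beta2x, n):
    """First-order MSE of the estimator as a quadratic in theta (assumed form)."""
    return ((1 - rho ** 2) * S_y2 / n
            + theta ** 2 * S_x2 ** 2 * (beta2x - 1) / n
            + 2 * theta * (mu21 - B * mu30) / n)

def theta_opt_and_min_mse(S_y2, S_x2, rho, mu21, mu30, B, beta2x, n):
    """Optimum theta and the corresponding minimum MSE."""
    d = mu21 - B * mu30
    theta_o = -d / (S_x2 ** 2 * (beta2x - 1))
    mse_min = (1 - rho ** 2) * S_y2 / n - d ** 2 / (n * S_x2 ** 2 * (beta2x - 1))
    return theta_o, mse_min

# Hypothetical population values (B = rho * S_y / S_x = 0.8 * 10 / 2 = 4)
params = dict(S_y2=100.0, S_x2=4.0, rho=0.8, mu21=30.0, mu30=5.0, B=4.0, beta2x=3.0, n=25)
theta_o, mse_min = theta_opt_and_min_mse(**params)
mse_at_opt = mse_theta(theta_o, **params)
mse_at_zero = mse_theta(0.0, **params)   # usual regression estimator (theta = 0)
```

Evaluating the quadratic at `theta_o` reproduces `mse_min`, and setting `theta = 0` recovers the first-order mean square error of the usual linear regression estimator.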

6.3 Estimator Based on Estimated Optimum $\hat{c}$

For situations in which the values of $\mu_{21}$, $\mu_{30}$, $B$ and $\beta_{2x}$, or good guessed values of them, are not available, the alternative is to replace $\mu_{21}$, $\mu_{30}$, $B$ and $\beta_{2x}$ involved in the optimum $\theta_o$ by their estimates $\hat{\mu}_{21}$, $\hat{\mu}_{30}$, $\hat{B}$ and $\hat{\beta}_{2x}$ based on sample values, and to get the estimated optimum value of $\theta_o$ to be

$$\hat{c} = -\frac{\hat{\mu}_{21} - b\,\hat{\mu}_{30}}{s_x^4(\hat{\beta}_{2x} - 1)} \tag{6.3.1}$$

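where

$$\hat{\beta}_{2x} = \frac{\hat{\mu}_{40}}{\hat{\mu}_{20}^2} \quad \text{with} \quad \hat{\mu}_{40} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{X})^4, \quad \hat{\mu}_{20} = s_x^2,$$

$$\hat{\mu}_{30} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{X})^3, \quad \hat{\mu}_{21} = \frac{1}{n - 1}\sum_{i=1}^{n} (x_i - \bar{X})^2 (y_i - \bar{y}) \quad \text{and} \quad \hat{B} = b.$$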

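Thus, replacing $\theta$ by the estimated optimum $\hat{c}$ in the estimator $\bar{y}_{l\theta}$ in (6.1.1), we get the estimator $\bar{y}_{lc}$ based on the estimated optimum $\hat{c}$, given by

$$\bar{y}_{lc} = \bar{y} + b(\bar{X} - \bar{x}) + \hat{c}\,(s_x^2 - S_x^2). \tag{6.3.2}$$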
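A minimal Python sketch of this estimator is given below. It assumes $\hat{c} = -(\hat{\mu}_{21} - b\hat{\mu}_{30}) / \{s_x^4(\hat{\beta}_{2x} - 1)\}$ with sample moments of x taken about the known population mean $\bar{X}$, which is our reading of the text; the function name `estimator_lc` and the toy data are ours.

```python
def estimator_lc(y, x, X_bar, S_x2):
    """Sketch: ybar + b(Xbar - xbar) + c_hat * (s_x^2 - S_x^2); returns (estimate, c_hat).

    X_bar and S_x2 (population mean and variance of x) are assumed known.
    """
    n = len(y)
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    s_x2 = sum((u - x_bar) ** 2 for u in x) / (n - 1)
    s_xy = sum((u - x_bar) * (v - y_bar) for u, v in zip(x, y)) / (n - 1)
    b = s_xy / s_x2
    # Sample moments of x about the known population mean (assumption)
    mu20 = sum((u - X_bar) ** 2 for u in x) / n
    mu30 = sum((u - X_bar) ** 3 for u in x) / n
    mu40 = sum((u - X_bar) ** 4 for u in x) / n
    mu21 = sum((u - X_bar) ** 2 * (v - y_bar) for u, v in zip(x, y)) / (n - 1)
    beta2x = mu40 / mu20 ** 2
    c_hat = -(mu21 - b * mu30) / (s_x2 ** 2 * (beta2x - 1))
    return y_bar + b * (X_bar - x_bar) + c_hat * (s_x2 - S_x2), c_hat

# Hypothetical symmetric sample: mu21 and mu30 vanish, so c_hat is zero
est, c_hat = estimator_lc([2.0, 4.0, 6.0, 8.0, 10.0], [1.0, 2.0, 3.0, 4.0, 5.0],
                          X_bar=3.0, S_x2=2.0)
```

For a sample that is symmetric in x the correction term drops out and the sketch returns the usual linear regression estimate, in line with the remark that the two estimators coincide for symmetrical populations.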

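This estimator is of wider practical utility. To study its mean square error, let

$$\hat{\mu}_{21} = \mu_{21}(1 + e_4), \quad \hat{\mu}_{30} = \mu_{30}(1 + e_5), \quad \hat{\mu}_{40} = \mu_{40}(1 + e_6),$$

so that

$$\hat{c} = -\frac{1}{S_x^4(1 + e_2)^2}\cdot\frac{\mu_{21}(1 + e_4) - B(1 - e_2 + e_3 + e_2^2 - e_2 e_3 + \dots)\,\mu_{30}(1 + e_5)}{\dfrac{\mu_{40}(1 + e_6)}{\mu_{20}^2(1 + e_2)^2} - 1}$$

$$= -\frac{(\mu_{21} - B\mu_{30}) + \mu_{21}e_4 - B\mu_{30}(e_3 - e_2 + e_5 + e_2^2 - e_2 e_3 + e_3 e_5 - e_2 e_5 + \dots)}{S_x^4(\beta_{2x} - 1)\left\{1 + \dfrac{\beta_{2x}e_6 - 2e_2 - e_2^2}{\beta_{2x} - 1}\right\}}$$

$$= -\frac{\mu_{21} - B\mu_{30}}{S_x^4(\beta_{2x} - 1)}\left\{1 + \frac{\mu_{21}e_4 - B\mu_{30}(e_3 - e_2 + e_5 + e_2^2 - e_2 e_3 + e_3 e_5 - e_2 e_5 + \dots)}{\mu_{21} - B\mu_{30}}\right\}\left\{1 - \frac{\beta_{2x}e_6 - 2e_2 - e_2^2}{\beta_{2x} - 1} + \dots\right\}. \tag{6.3.3}$$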

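Now, putting the value of $\hat{c}$ in equation (6.3.2), we have

$$\bar{y}_{lc} - \bar{Y} = \bar{Y}e_0 - B\bar{X}(e_1 + e_1 e_3 - e_1 e_2 + e_1 e_2^2 - \dots) - \frac{\mu_{21} - B\mu_{30}}{S_x^2(\beta_{2x} - 1)}\left\{e_2 - \frac{\beta_{2x}e_2 e_6 - 2e_2^2 - \dots}{\beta_{2x} - 1} + \frac{\mu_{21}e_2 e_4 - B\mu_{30}(e_2 e_3 - e_2^2 + e_2 e_5 + \dots)}{\mu_{21} - B\mu_{30}} + \dots\right\}. \tag{6.3.4}$$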


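Squaring both sides of (6.3.4) and taking expectation, we have the mean square error of $\bar{y}_{lc}$ to the first degree of approximation, that is, up to terms of order $O(1/n)$, to be

$$MSE(\bar{y}_{lc}) = E\left[\bar{Y}e_0 - B\bar{X}e_1 - \frac{\mu_{21} - B\mu_{30}}{S_x^2(\beta_{2x} - 1)}\,e_2\right]^2$$

$$= \bar{Y}^2 E(e_0^2) + B^2\bar{X}^2 E(e_1^2) - 2B\bar{X}\bar{Y}E(e_0 e_1) + \frac{(\mu_{21} - B\mu_{30})^2}{S_x^4(\beta_{2x} - 1)^2}\,E(e_2^2) - \frac{2(\mu_{21} - B\mu_{30})}{S_x^2(\beta_{2x} - 1)}\left\{\bar{Y}E(e_0 e_2) - B\bar{X}E(e_1 e_2)\right\}$$

$$= (1 - \rho^2)\frac{S_y^2}{n} - \frac{(\mu_{21} - B\mu_{30})^2}{n S_x^4(\beta_{2x} - 1)} \tag{6.3.5}$$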

which shows that the estimator $\bar{y}_{lc}$ based on the estimated optimum $\hat{c}$ attains the same minimum mean square error as $\bar{y}_{l\theta}$ in (6.2.6), which depends on the optimum value $\theta_o$ in (6.2.5).


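a). From (6.2.6), for the optimum value $\theta_o$, the estimator $\bar{y}_{l\theta}$ attains the minimum mean square error

$$MSE(\bar{y}_{l\theta_o}) = (1 - \rho^2)\frac{S_y^2}{n} - \frac{(\mu_{21} - B\mu_{30})^2}{n S_x^4(\beta_{2x} - 1)}. \tag{6.4.1}$$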

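b). From (6.3.5), the estimator $\bar{y}_{lc}$ based on the estimated optimum $\hat{c}$ attains the same mean square error

$$MSE(\bar{y}_{lc}) = (1 - \rho^2)\frac{S_y^2}{n} - \frac{(\mu_{21} - B\mu_{30})^2}{n S_x^4(\beta_{2x} - 1)}. \tag{6.4.2}$$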

c). From (6.4.1) or (6.4.2), we see that the estimator $\bar{y}_{lc}$ depending on the estimated optimum value is always more efficient than the usual linear regression estimator $\bar{y}_l = \bar{y} + b(\bar{X} - \bar{x})$ for a non-symmetrical population, in the sense of having a smaller mean square error, whereas for a symmetrical population or distribution both $\bar{y}_{lc}$ and $\bar{y}_l$ are equally efficient.

6.5 An Illustration

Consider the data given in Cochran [1] (1977, page 152) on the number of inhabitants of 49 cities, drawn from a population of 196 large cities, in two different years, 1920 (x) and 1930 (y). The required values have been computed, and we have the following:

$n = 49$, $S_y^2 = 15158.8325$, $\rho = 0.9817$, $\mu_{21} = 2998666.303$, $\mu_{30} = 2411632.933$,

$B = 1.1577$, $\beta_{2x} = 7.208$.

Using the required values, we have

$MSE(\bar{y}) = 309.364$

$MSE(\bar{y}_l) = 11.1936$

$MSE(\bar{y}_{lc}) = 10.0123$

The percent relative efficiencies (PRE) of the usual linear regression estimator $\bar{y}_l$ and of the proposed estimator $\bar{y}_{lc}$ over the mean per unit estimator $\bar{y}$ are 2764% and 3090% respectively, showing the enhanced efficiency of the proposed estimator.

Estimators    $\bar{y}$    $\bar{y}_l$    $\bar{y}_{lc}$
PRE           100%         2764%          3090%
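The figures of this illustration can be approximately reproduced from the listed values. The Python sketch below is a rough check under two assumptions that are ours: $S_x^2$ is not listed, so it is recovered from $B = \rho S_y / S_x$ as $S_x^2 = \rho^2 S_y^2 / B^2$, and the reduction in mean square error is taken as $(\mu_{21} - B\mu_{30})^2 / \{n S_x^4 (\beta_{2x} - 1)\}$.

```python
# Values reported in Section 6.5
n = 49
S_y2 = 15158.8325
rho = 0.9817
mu21 = 2998666.303
mu30 = 2411632.933
B = 1.1577
beta2x = 7.208

# Assumption: recover S_x^2 from B = rho * S_y / S_x, since it is not listed
S_x2 = rho ** 2 * S_y2 / B ** 2

mse_mean = S_y2 / n                                   # mean per unit estimator
mse_l = (1 - rho ** 2) * S_y2 / n                     # usual regression estimator
reduction = (mu21 - B * mu30) ** 2 / (n * S_x2 ** 2 * (beta2x - 1))
mse_lc = mse_l - reduction                            # proposed estimator

print(round(mse_mean, 3), round(mse_l, 3), round(mse_lc, 3))
```

Because $\rho$ and $B$ are given only to four decimals, the regression-type mean square errors from this sketch differ from the reported ones in the second decimal place, while the size of the reduction is close to the reported difference $11.1936 - 10.0123$.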

Chapter-VII

“An Improved Separate Regression-Type

Estimator of Population Mean”


SUMMARY

In this chapter, for the estimation of the finite population mean, an improved separate regression-type estimator under stratified random sampling is proposed, and its bias and mean square error are found. To enhance the practical utility of the optimum estimator, an estimator depending upon the estimated optimum value based on sample observations is also found. Further, a comparative study has been made with some earlier estimators, theoretically as well as numerically.


7.1 Introduction




7.5 An Illustration



7.1 Introduction

Stratified random sampling is used to improve the precision of an estimator when the population is heterogeneous. Stratification is the process of dividing the members of the population into homogeneous subgroups before sampling. The strata are mutually exclusive, so that every element of the population is assigned to only one stratum, and exhaustive, so that no population element is excluded. A sample is then drawn from each stratum by simple random sampling without replacement according to a definite allocation plan. This improves the representativeness of the sample by reducing sampling error, and it produces a weighted mean that has less variability than the arithmetic mean of a simple random sample of the population. The use of an auxiliary variable x, when it is correlated with the study variable y, further increases the precision of the estimator.

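We assume that the population consists of N units, which can be partitioned into L strata of sizes $N_1, N_2, \ldots, N_L$ such that $\sum_{h=1}^{L} N_h = N$. Let $(Y_{hi}, X_{hi})$, $i = 1, 2, \ldots, N_h$, denote the values of the variates (y, x) respectively for the $i$th unit in the $h$th stratum, and let $\bar{Y}_h$ and $\bar{X}_h$ denote the stratum means. The stratum weights are $W_h = N_h / N$, $h = 1, 2, \ldots, L$.

Further, let

$$\bar{Y} = \sum_{h=1}^{L} W_h \bar{Y}_h \quad \text{(population mean of the study variable y)},$$

$$\bar{X} = \sum_{h=1}^{L} W_h \bar{X}_h \quad \text{(population mean of the auxiliary variable x)},$$

$$S_{hy}^2 = \frac{1}{N_h - 1}\sum_{i=1}^{N_h} (Y_{hi} - \bar{Y}_h)^2, \quad S_{hx}^2 = \frac{1}{N_h - 1}\sum_{i=1}^{N_h} (X_{hi} - \bar{X}_h)^2,$$

$$S_{hxy} = \frac{1}{N_h - 1}\sum_{i=1}^{N_h} (X_{hi} - \bar{X}_h)(Y_{hi} - \bar{Y}_h), \quad \mu_{rs(hx)} = \frac{1}{N_h}\sum_{i=1}^{N_h} (X_{hi} - \bar{X}_h)^r (Y_{hi} - \bar{Y}_h)^s,$$

$$\rho_h = \frac{S_{hxy}}{S_{hx} S_{hy}}, \quad \beta_{2(hx)} = \frac{\mu_{40(hx)}}{\mu_{20(hx)}^2}, \quad B_h = \frac{S_{hxy}}{S_{hx}^2} = \rho_h\,\frac{S_{hy}}{S_{hx}}.$$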

A simple random sample of total size n is drawn without replacement under proportional allocation from the L strata, i.e. $n = (n_1, n_2, \ldots, n_L)$, with $n_h$ denoting the number of units in the sample drawn from the $h$th stratum, such that $n_h = \dfrac{n}{N} N_h$ and $\sum_{h=1}^{L} n_h = n$. Let the means of the study variable y and the auxiliary variable x for the $n_h$ sample units drawn from the $h$th stratum, whose size $N_h$ is assumed to be known, be

$$\bar{y}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} y_{hi} \quad \text{and} \quad \bar{x}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} x_{hi}$$

respectively. Also let

$$s_{hy}^2 = \frac{1}{n_h - 1}\sum_{i=1}^{n_h} (y_{hi} - \bar{y}_h)^2, \quad s_{hx}^2 = \frac{1}{n_h - 1}\sum_{i=1}^{n_h} (x_{hi} - \bar{x}_h)^2,$$

$$s_{hxy} = \frac{1}{n_h - 1}\sum_{i=1}^{n_h} (x_{hi} - \bar{x}_h)(y_{hi} - \bar{y}_h), \quad b_h = \frac{s_{hxy}}{s_{hx}^2}.$$
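The stratum weights and the proportional allocation described above are simple to compute. The Python sketch below is illustrative: the function names are ours, and the stratum sizes are those listed in Table 7.5.1 of this chapter. Proportional allocation generally gives fractional $n_h$, and how to round them is left open here.

```python
def stratum_weights(sizes):
    """W_h = N_h / N for each stratum."""
    N = sum(sizes)
    return [N_h / N for N_h in sizes]

def proportional_allocation(sizes, n):
    """n_h = n * N_h / N (may be fractional; rounding is a separate choice)."""
    N = sum(sizes)
    return [n * N_h / N for N_h in sizes]

sizes = [985, 2196, 1020]          # stratum sizes N_h from Table 7.5.1
weights = stratum_weights(sizes)
alloc = proportional_allocation(sizes, 25)
print([round(w, 9) for w in weights])
```

The printed weights can be compared with the $W_h$ column of Table 7.5.1.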



The proposed separate regression-type estimator under stratified random sampling, using the auxiliary variate x for the estimation of the population mean of the study variate y, is

$$\bar{y}_{ls\varphi} = \sum_{h=1}^{L} W_h\,\bar{y}_{hl\varphi} \tag{7.1.1}$$

where

$$\bar{y}_{hl\varphi} = \{\bar{y}_h + b_h(\bar{X}_h - \bar{x}_h)\} + \varphi\,(s_{hx}^2 - S_{hx}^2) \tag{7.1.2}$$

and $\varphi$ is a characterizing scalar to be chosen suitably.

7.2 Bias and Mean Square Error of $\bar{y}_{ls\varphi}$

For simplicity, it is assumed that the population size is large enough as compared to the sample size so that the finite population correction (f.p.c.) term may be ignored.

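Let

$$\bar{y}_h = \bar{Y}_h(1 + e_0), \quad \bar{x}_h = \bar{X}_h(1 + e_1), \quad s_{hx}^2 = S_{hx}^2(1 + e_2), \quad s_{hxy} = S_{hxy}(1 + e_3),$$

so that $E(e_0) = E(e_1) = E(e_2) = E(e_3) = 0$ and

$$E(e_0^2) = \frac{S_{hy}^2}{n_h\bar{Y}_h^2}, \quad E(e_1^2) = \frac{S_{hx}^2}{n_h\bar{X}_h^2}, \quad E(e_2^2) = \frac{1}{n_h}\left(\beta_{2(hx)} - 1\right), \quad E(e_0 e_1) = \frac{S_{hxy}}{n_h\bar{X}_h\bar{Y}_h},$$

$$E(e_0 e_2) = \frac{\mu_{21(hx)}}{n_h\bar{Y}_h S_{hx}^2}, \quad E(e_1 e_2) = \frac{\mu_{30(hx)}}{n_h\bar{X}_h S_{hx}^2}, \quad E(e_1 e_3) = \frac{\mu_{21(hx)}}{n_h\bar{X}_h S_{hxy}}.$$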

Also, we have

$$b_h = \frac{s_{hxy}}{s_{hx}^2} = \frac{S_{hxy}(1 + e_3)}{S_{hx}^2(1 + e_2)} = B_h(1 - e_2 + e_3 + e_2^2 - e_2 e_3 + \dots).$$



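From (7.1.2), writing $\bar{y}_{hl\varphi}$ in terms of the $e_i$'s, we have

$$\bar{y}_{hl\varphi} = \left[\bar{Y}_h(1 + e_0) + B_h(1 - e_2 + e_3 + \dots)\{\bar{X}_h - \bar{X}_h(1 + e_1)\}\right] + \varphi\{S_{hx}^2(1 + e_2) - S_{hx}^2\}$$

$$= \bar{Y}_h(1 + e_0) - B_h\bar{X}_h\left[e_1 - e_1 e_2 + e_1 e_3 + \dots\right] + \varphi S_{hx}^2(1 + e_2 - 1)$$

$$= \bar{Y}_h(1 + e_0) - B_h\bar{X}_h\left[e_1 - e_1 e_2 + e_1 e_3 + \dots\right] + \varphi S_{hx}^2 e_2. \tag{7.2.1}$$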

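Using (7.2.1) in (7.1.1), we get

$$\bar{y}_{ls\varphi} = \sum_{h=1}^{L} W_h\left\{\bar{Y}_h + \bar{Y}_h e_0 - B_h\bar{X}_h(e_1 - e_1 e_2 + e_1 e_3 + \dots) + \varphi S_{hx}^2 e_2\right\}$$

$$= \sum_{h=1}^{L} W_h\bar{Y}_h + \sum_{h=1}^{L} W_h\left[\bar{Y}_h e_0 - B_h\bar{X}_h(e_1 - e_1 e_2 + e_1 e_3 + \dots) + \varphi S_{hx}^2 e_2\right]$$

$$= \bar{Y} + \sum_{h=1}^{L} W_h\left[\bar{Y}_h e_0 - B_h\bar{X}_h(e_1 - e_1 e_2 + e_1 e_3 + \dots) + \varphi S_{hx}^2 e_2\right],$$

$$\bar{y}_{ls\varphi} - \bar{Y} = \sum_{h=1}^{L} W_h\left[\bar{Y}_h e_0 - B_h\bar{X}_h(e_1 - e_1 e_2 + e_1 e_3 + \dots) + \varphi S_{hx}^2 e_2\right]. \tag{7.2.2}$$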

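Taking expectation on both sides, we have the bias up to terms of order $O(1/n)$ to be

$$Bias(\bar{y}_{ls\varphi}) = E(\bar{y}_{ls\varphi} - \bar{Y})$$

$$= \sum_{h=1}^{L} W_h\left[\bar{Y}_h E(e_0) - B_h\bar{X}_h\{E(e_1) - E(e_1 e_2) + E(e_1 e_3)\} + \varphi S_{hx}^2 E(e_2)\right]$$

$$= \sum_{h=1}^{L} W_h\,\frac{B_h}{n_h}\left(\frac{\mu_{30(hx)}}{\mu_{20(hx)}} - \frac{\mu_{21(hx)}}{\mu_{11(hx)}}\right). \tag{7.2.3}$$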

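Again, squaring both sides of (7.2.2) and taking expectation, we have the mean square error of $\bar{y}_{ls\varphi}$ up to terms of order $O(1/n)$ to be

$$MSE(\bar{y}_{ls\varphi}) = E(\bar{y}_{ls\varphi} - \bar{Y})^2$$

$$= E\left[\sum_{h=1}^{L} W_h\left\{\bar{Y}_h e_0 - B_h\bar{X}_h e_1 + \varphi S_{hx}^2 e_2\right\}\right]^2$$

$$= \sum_{h=1}^{L} W_h^2\,E\left\{(\bar{Y}_h e_0 - B_h\bar{X}_h e_1)^2 + \varphi^2 S_{hx}^4 e_2^2 + 2\varphi S_{hx}^2 e_2(\bar{Y}_h e_0 - B_h\bar{X}_h e_1)\right\}$$

$$= \sum_{h=1}^{L} W_h^2\left[\bar{Y}_h^2 E(e_0^2) + B_h^2\bar{X}_h^2 E(e_1^2) - 2B_h\bar{X}_h\bar{Y}_h E(e_0 e_1) + \varphi^2 S_{hx}^4 E(e_2^2) + 2\varphi S_{hx}^2\{\bar{Y}_h E(e_0 e_2) - B_h\bar{X}_h E(e_1 e_2)\}\right]$$

$$= \sum_{h=1}^{L} \frac{W_h^2}{n_h}(1 - \rho_h^2)S_{hy}^2 + \varphi^2\sum_{h=1}^{L} \frac{W_h^2}{n_h}S_{hx}^4\left(\beta_{2(hx)} - 1\right) + 2\varphi\sum_{h=1}^{L} \frac{W_h^2}{n_h}\left(\mu_{21(hx)} - B_h\mu_{30(hx)}\right).$$

Under proportional allocation, the MSE becomes

$$MSE(\bar{y}_{ls\varphi}) = \frac{1}{n}\sum_{h=1}^{L} W_h(1 - \rho_h^2)S_{hy}^2 + \frac{\varphi^2}{n}\sum_{h=1}^{L} W_h S_{hx}^4\left(\beta_{2(hx)} - 1\right) + \frac{2\varphi}{n}\sum_{h=1}^{L} W_h\left(\mu_{21(hx)} - B_h\mu_{30(hx)}\right). \tag{7.2.4}$$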

The optimum value of $\varphi$ minimizing the mean square error of $\bar{y}_{ls\varphi}$ in (7.2.4), obtained stratum by stratum, is given by

$$\varphi_o = -\frac{\mu_{21(hx)} - B_h\mu_{30(hx)}}{S_{hx}^4\left(\beta_{2(hx)} - 1\right)} \tag{7.2.5}$$

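and the minimum mean square error of $\bar{y}_{ls\varphi}$ is given by

$$MSE(\bar{y}_{ls\varphi_o}) = \frac{1}{n}\sum_{h=1}^{L} W_h(1 - \rho_h^2)S_{hy}^2 - \frac{1}{n}\sum_{h=1}^{L} W_h\,\frac{\left(\mu_{21(hx)} - B_h\mu_{30(hx)}\right)^2}{S_{hx}^4\left(\beta_{2(hx)} - 1\right)}. \tag{7.2.6}$$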
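The stratified minimum is a weighted sum of per-stratum terms. The Python sketch below evaluates it under proportional allocation, assuming the form $\frac{1}{n}\sum_h W_h\{(1-\rho_h^2)S_{hy}^2 - (\mu_{21(hx)} - B_h\mu_{30(hx)})^2 / (S_{hx}^4(\beta_{2(hx)} - 1))\}$ (our reading of the result above). The function name, dictionary keys and numerical inputs are hypothetical.

```python
def stratified_min_mse(strata, n):
    """Minimum first-order MSE under proportional allocation (assumed form).

    strata: list of dicts with keys W, S_y2, S_x2, rho, mu21, mu30, B, beta2x.
    """
    total = 0.0
    for s in strata:
        d = s["mu21"] - s["B"] * s["mu30"]
        regression_part = (1 - s["rho"] ** 2) * s["S_y2"]
        correction = d ** 2 / (s["S_x2"] ** 2 * (s["beta2x"] - 1))
        total += s["W"] * (regression_part - correction)
    return total / n

# Two identical hypothetical strata of equal weight
stratum = dict(W=0.5, S_y2=100.0, S_x2=4.0, rho=0.8, mu21=30.0, mu30=5.0, B=4.0, beta2x=3.0)
mse_min = stratified_min_mse([stratum, dict(stratum)], n=25)
```

With two identical strata of equal weight the result equals the single-population minimum for the same values, which gives a simple consistency check.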



7.3 Estimator Based on Estimated Optimum Value of $\varphi$

For situations in which the values of $\mu_{21(hx)}$, $\mu_{30(hx)}$, $B_h$ and $\beta_{2(hx)}$, or good guessed values of them, are not available, the alternative is to replace $\mu_{21(hx)}$, $\mu_{30(hx)}$, $B_h$ and $\beta_{2(hx)}$ involved in the optimum $\varphi$ by their estimates $\hat{\mu}_{21(hx)}$, $\hat{\mu}_{30(hx)}$, $\hat{B}_h$ and $\hat{\beta}_{2(hx)}$ based on sample values, and to get the estimated optimum value of $\varphi$ to be

$$\hat{c} = -\frac{\hat{\mu}_{21(hx)} - b_h\,\hat{\mu}_{30(hx)}}{s_{hx}^4\left(\hat{\beta}_{2(hx)} - 1\right)} \tag{7.3.1}$$

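where

$$\hat{\beta}_{2(hx)} = \frac{\hat{\mu}_{40(hx)}}{\hat{\mu}_{20(hx)}^2} \quad \text{with} \quad \hat{\mu}_{40(hx)} = \frac{1}{n_h}\sum_{i=1}^{n_h} (x_{hi} - \bar{x}_h)^4, \quad \hat{\mu}_{20(hx)} = s_{hx}^2,$$

$$\hat{\mu}_{30(hx)} = \frac{1}{n_h}\sum_{i=1}^{n_h} (x_{hi} - \bar{x}_h)^3, \quad \hat{\mu}_{21(hx)} = \frac{1}{n_h}\sum_{i=1}^{n_h} (x_{hi} - \bar{x}_h)^2 (y_{hi} - \bar{y}_h) \quad \text{and} \quad \hat{B}_h = b_h.$$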

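Thus, replacing $\varphi$ by the estimated optimum $\hat{c}$ in the estimator $\bar{y}_{ls\varphi}$ in (7.1.1), we get the estimator $\bar{y}_{lsc}$ based on the estimated optimum $\hat{c}$, given by

$$\bar{y}_{lsc} = \sum_{h=1}^{L} W_h\,\bar{y}_{hlc} \tag{7.3.2}$$

where

$$\bar{y}_{hlc} = \{\bar{y}_h + b_h(\bar{X}_h - \bar{x}_h)\} + \hat{c}\,(s_{hx}^2 - S_{hx}^2). \tag{7.3.3}$$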

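This estimator is of wider practical utility. To study its mean square error, let

$$\hat{\mu}_{21(hx)} = \mu_{21(hx)}(1 + e_4), \quad \hat{\mu}_{30(hx)} = \mu_{30(hx)}(1 + e_5), \quad \hat{\mu}_{40(hx)} = \mu_{40(hx)}(1 + e_6),$$

so that

$$\hat{c} = -\frac{1}{S_{hx}^4(1 + e_2)^2}\cdot\frac{\mu_{21(hx)}(1 + e_4) - B_h(1 - e_2 + e_3 + e_2^2 - e_2 e_3 + \dots)\,\mu_{30(hx)}(1 + e_5)}{\dfrac{\mu_{40(hx)}(1 + e_6)}{\mu_{20(hx)}^2(1 + e_2)^2} - 1}$$

$$= -\frac{(\mu_{21(hx)} - B_h\mu_{30(hx)}) + \mu_{21(hx)}e_4 - B_h\mu_{30(hx)}(e_3 - e_2 + e_5 + e_2^2 - e_2 e_3 + e_3 e_5 - e_2 e_5 + \dots)}{S_{hx}^4\left(\beta_{2(hx)} - 1\right)\left\{1 + \dfrac{\beta_{2(hx)}e_6 - 2e_2 - e_2^2}{\beta_{2(hx)} - 1}\right\}}$$

$$= -\frac{\mu_{21(hx)} - B_h\mu_{30(hx)}}{S_{hx}^4\left(\beta_{2(hx)} - 1\right)}\left\{1 + \frac{\mu_{21(hx)}e_4 - B_h\mu_{30(hx)}(e_3 - e_2 + e_5 + e_2^2 - e_2 e_3 + e_3 e_5 - e_2 e_5 + \dots)}{\mu_{21(hx)} - B_h\mu_{30(hx)}}\right\}\left\{1 - \frac{\beta_{2(hx)}e_6 - 2e_2 - e_2^2}{\beta_{2(hx)} - 1} + \dots\right\}. \tag{7.3.4}$$

Now, putting the value of $\hat{c}$ in equation (7.3.3), we have

$$\bar{y}_{lsc} - \bar{Y} = \sum_{h=1}^{L} W_h\left\{\bar{Y}_h e_0 - B_h\bar{X}_h(e_1 + e_1 e_3 - e_1 e_2 + e_1 e_2^2 - \dots)\right\} - \sum_{h=1}^{L} W_h\,\frac{\mu_{21(hx)} - B_h\mu_{30(hx)}}{S_{hx}^2\left(\beta_{2(hx)} - 1\right)}\left\{e_2 - \frac{\beta_{2(hx)}e_2 e_6 - 2e_2^2 - \dots}{\beta_{2(hx)} - 1} + \frac{\mu_{21(hx)}e_2 e_4 - B_h\mu_{30(hx)}(e_2 e_3 - e_2^2 + e_2 e_5 + \dots)}{\mu_{21(hx)} - B_h\mu_{30(hx)}} + \dots\right\}. \tag{7.3.5}$$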




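Squaring both sides of (7.3.5) and taking expectation, we have the mean square error of $\bar{y}_{lsc}$ to the first degree of approximation, that is, up to terms of order $O(1/n)$, to be

$$MSE(\bar{y}_{lsc}) = E\left[\sum_{h=1}^{L} W_h\left\{\bar{Y}_h e_0 - B_h\bar{X}_h e_1 - \frac{\mu_{21(hx)} - B_h\mu_{30(hx)}}{S_{hx}^2\left(\beta_{2(hx)} - 1\right)}\,e_2\right\}\right]^2$$

$$= \sum_{h=1}^{L} W_h^2\left[\bar{Y}_h^2 E(e_0^2) + B_h^2\bar{X}_h^2 E(e_1^2) - 2B_h\bar{X}_h\bar{Y}_h E(e_0 e_1) + \frac{\left(\mu_{21(hx)} - B_h\mu_{30(hx)}\right)^2}{S_{hx}^4\left(\beta_{2(hx)} - 1\right)^2}\,E(e_2^2) - \frac{2\left(\mu_{21(hx)} - B_h\mu_{30(hx)}\right)}{S_{hx}^2\left(\beta_{2(hx)} - 1\right)}\left\{\bar{Y}_h E(e_0 e_2) - B_h\bar{X}_h E(e_1 e_2)\right\}\right]$$

$$= \sum_{h=1}^{L} \frac{W_h^2}{n_h}(1 - \rho_h^2)S_{hy}^2 - \sum_{h=1}^{L} \frac{W_h^2}{n_h}\,\frac{\left(\mu_{21(hx)} - B_h\mu_{30(hx)}\right)^2}{S_{hx}^4\left(\beta_{2(hx)} - 1\right)}.$$

Under proportional allocation, the MSE is

$$MSE(\bar{y}_{lsc}) = \frac{1}{n}\sum_{h=1}^{L} W_h(1 - \rho_h^2)S_{hy}^2 - \frac{1}{n}\sum_{h=1}^{L} W_h\,\frac{\left(\mu_{21(hx)} - B_h\mu_{30(hx)}\right)^2}{S_{hx}^4\left(\beta_{2(hx)} - 1\right)} \tag{7.3.6}$$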

which shows that the estimator $\bar{y}_{lsc}$ based on the estimated optimum $\hat{c}$ attains the same minimum mean square error as $\bar{y}_{ls\varphi}$ in (7.2.6), which depends on the optimum value of $\varphi$ in (7.2.5).


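a). From (7.2.6), for the optimum value of $\varphi$, the estimator $\bar{y}_{ls\varphi}$ attains the minimum mean square error

$$MSE(\bar{y}_{ls\varphi_o}) = \frac{1}{n}\sum_{h=1}^{L} W_h(1 - \rho_h^2)S_{hy}^2 - \frac{1}{n}\sum_{h=1}^{L} W_h\,\frac{\left(\mu_{21(hx)} - B_h\mu_{30(hx)}\right)^2}{S_{hx}^4\left(\beta_{2(hx)} - 1\right)}. \tag{7.4.1}$$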



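b). From (7.3.6), the estimator $\bar{y}_{lsc}$ based on the estimated optimum $\hat{c}$ attains the same mean square error

$$MSE(\bar{y}_{lsc}) = \frac{1}{n}\sum_{h=1}^{L} W_h(1 - \rho_h^2)S_{hy}^2 - \frac{1}{n}\sum_{h=1}^{L} W_h\,\frac{\left(\mu_{21(hx)} - B_h\mu_{30(hx)}\right)^2}{S_{hx}^4\left(\beta_{2(hx)} - 1\right)}. \tag{7.4.2}$$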

c). From (7.4.1) or (7.4.2), we see that the estimator $\bar{y}_{lsc}$ depending on the estimated optimum value is always more efficient than the usual separate regression-type estimator $\bar{y}_{ls} = \sum_{h=1}^{L} W_h\,\bar{y}_{hl}$, where $\bar{y}_{hl} = \bar{y}_h + b_h(\bar{X}_h - \bar{x}_h)$, for a non-symmetrical population, in the sense of having a smaller mean square error, whereas for a symmetrical population or distribution both $\bar{y}_{lsc}$ and $\bar{y}_{ls}$ are equally efficient.

7.5 An Illustration

We consider the data in Singh and Chaudhary (1989, page 162), collected in a pilot survey for estimating the extent of cultivation and production of fresh fruits in three districts of Uttar Pradesh in the year 1976-1977. Each district is considered as one stratum; $N_h$ denotes the total number of villages in each stratum, $X_h$ the total area (in hectares) under orchard, $y_h$ the total number of trees in the sample, $x_h$ the area under orchard in the sample, and $n_h$ the number of villages in the sample. The required values have been computed in Table 7.5.1, and we have the following:

$MSE(\bar{y}) = 9888.56$

$MSE(\bar{y}_{rs}) = 3523.08$

$MSE(\bar{y}_{lsc}) = 614.08$

The percent relative efficiencies (PRE) of the usual separate ratio-type estimator $\bar{y}_{rs}$ and of the proposed estimator $\bar{y}_{lsc}$ over the mean per unit estimator $\bar{y}$ are 281% and 1612% respectively, showing the enhanced efficiency of the proposed estimator.

Estimators    $\bar{y}$    $\bar{y}_{rs}$    $\bar{y}_{lsc}$
PRE           100%         281%              1612%

Table 7.5.1

Stratum No.   $N_h$   $n_h$   $W_h$         $S_{hy}^2$    $S_{hx}^2$    $\rho_h$      $\mu_{21(hx)}$   $\mu_{30(hx)}$   $B_h$
1.            985     6       0.234467984   74775.46667   15.97122667   0.921519105   -756.389970      -11.6575513      63.05430933
2.            2196    8       0.522732683   259113.6964   132.6601143   0.973771508   93851.02815      2206.830751      43.03601642
3.            1020    11      0.242799333   65885.6       38.43842182   0.802446269   8799.26554       254.9886637      33.222204
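The tabulated values are enough to recompute the mean square error of the stratified mean. The Python sketch below assumes it was obtained as $\sum_h W_h^2 S_{hy}^2 / n_h$ with the finite population correction ignored; it also evaluates the first-order mean square error of the usual separate regression estimator, $\sum_h W_h^2 (1 - \rho_h^2) S_{hy}^2 / n_h$, for which no value is reported in the text. Variable names are ours.

```python
# Rows of Table 7.5.1: (N_h, n_h, W_h, S_hy^2, S_hx^2, rho_h, mu21, mu30, B_h)
table = [
    (985, 6, 0.234467984, 74775.46667, 15.97122667, 0.921519105,
     -756.389970, -11.6575513, 63.05430933),
    (2196, 8, 0.522732683, 259113.6964, 132.6601143, 0.973771508,
     93851.02815, 2206.830751, 43.03601642),
    (1020, 11, 0.242799333, 65885.6, 38.43842182, 0.802446269,
     8799.26554, 254.9886637, 33.222204),
]

# Stratified mean: sum of W_h^2 * S_hy^2 / n_h (f.p.c. ignored, an assumption)
mse_mean = sum(W ** 2 * S_y2 / n_h for (_, n_h, W, S_y2, *_rest) in table)

# Usual separate regression estimator, first order (not reported in the text)
mse_sep_reg = sum(W ** 2 * (1 - rho ** 2) * S_y2 / n_h
                  for (_, n_h, W, S_y2, _, rho, *_rest) in table)

print(round(mse_mean, 2))
```

Under this assumption the first quantity matches the reported $MSE(\bar{y})$ to two decimal places.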

Chapter-VIII

“A Generalized Class of Separate

Regression-Type Estimators for the

Estimation of Finite Population Mean”


SUMMARY

For the estimation of Finite Population mean, a generalized class of

separate regression-type estimators under stratified random sampling is

proposed, its bias and the mean square error are found, and further an optimum

class of estimators is also obtained having minimum mean square error.

Enhancing the practical utility of the optimum estimator, a class of estimators

depending upon estimated optimum value based on sample observations is also

found. Further comparative study has been done with some earlier estimators.


8.1 Introduction

8.2 Estimated Optimum Class of Estimators

8.3 Concluding Remarks

8.4 An Illustration



8.1 Introduction

Stratified random sampling is used to improve the precision of an estimator when the population is heterogeneous. Stratification is the process of dividing the members of the population into homogeneous subgroups before sampling. The strata are mutually exclusive, so that every element of the population is assigned to exactly one stratum, and collectively exhaustive, so that no population element is excluded. A sample is then drawn from each stratum by simple random sampling without replacement according to a definite allocation plan. This improves the representativeness of the sample by reducing sampling error, and it produces a weighted mean that has less variability than the arithmetic mean of a simple random sample of the population. The use of an auxiliary variable x, when it is correlated with the study variable y, further increases the precision of the estimator.
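The sampling scheme just described can be sketched in a few lines. This is a minimal illustration rather than the procedure used for the thesis data: the stratum lists are hypothetical, the rounding rule in `proportional_allocation` (at least two units per stratum) is an assumption made so that sample variances exist, and rounding means the allocated sizes need not add up exactly to n.

```python
import random


def proportional_allocation(stratum_sizes, n):
    """n_h = n * N_h / N, rounded; at least 2 units per stratum (assumption)."""
    total = sum(stratum_sizes)
    return [max(2, round(n * size / total)) for size in stratum_sizes]


def stratified_mean(strata, n, seed=0):
    """Weighted mean sum_h W_h * ybar_h from SRSWOR samples within strata."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata)
    sizes = proportional_allocation([len(units) for units in strata], n)
    estimate = 0.0
    for units, n_h in zip(strata, sizes):
        sample = rng.sample(units, n_h)  # simple random sampling without replacement
        estimate += (len(units) / total) * (sum(sample) / n_h)
    return estimate


if __name__ == "__main__":
    # Two hypothetical strata that are internally homogeneous.
    population = [[1.0] * 10, [5.0] * 30]
    print(stratified_mean(population, n=8))
```

When each stratum is perfectly homogeneous, as in this toy population, the stratified mean reproduces the population mean whatever sample is drawn, which is the extreme case of the precision gain described above.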

We assume that the population consists of N units, which can be partitioned into L strata of sizes $N_1, N_2, \ldots, N_L$ such that $\sum_{h=1}^{L} N_h = N$. Let $(Y_{hi}, X_{hi})$, $(i = 1, 2, \ldots, N_h)$, denote the values of the variates (y, x) respectively for the ith unit in the hth stratum, and let $\bar{Y}_h$ and $\bar{X}_h$ denote the strata means. The strata weights are $W_h = \dfrac{N_h}{N}$, $(h = 1, 2, \ldots, L)$.

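Further let,

$$\bar{Y} = \sum_{h=1}^{L} W_h \bar{Y}_h \quad \text{(population mean of the study variable } y\text{)}$$

$$\bar{X} = \sum_{h=1}^{L} W_h \bar{X}_h \quad \text{(population mean of the auxiliary variable } x\text{)}$$

$$S_{hy}^2 = \frac{1}{N_h - 1}\sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}_h\right)^2$$

$$S_{hx}^2 = \frac{1}{N_h - 1}\sum_{i=1}^{N_h}\left(X_{hi} - \bar{X}_h\right)^2$$

$$S_{hxy} = \frac{1}{N_h - 1}\sum_{i=1}^{N_h}\left(X_{hi} - \bar{X}_h\right)\left(Y_{hi} - \bar{Y}_h\right)$$

$$\mu_{rs}(hx) = \frac{1}{N_h}\sum_{i=1}^{N_h}\left(X_{hi} - \bar{X}_h\right)^r\left(Y_{hi} - \bar{Y}_h\right)^s$$

$$\rho_h = \frac{S_{hxy}}{S_{hx}S_{hy}}, \qquad \beta_2(hx) = \frac{\mu_{40}(hx)}{\mu_{20}^2(hx)}, \qquad B_h = \frac{S_{hxy}}{S_{hx}^2} = \rho_h\frac{S_{hy}}{S_{hx}}$$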

A simple random sample is drawn without replacement from each of the L strata under proportional allocation, with sample sizes $(n_1, n_2, \ldots, n_L)$, where $n_h$ denotes the number of units in the sample drawn from the hth stratum, such that $n_h = \dfrac{n}{N}N_h$ and $\sum_{h=1}^{L} n_h = n$. Let the means of the study variable y and the auxiliary variable x of the $n_h$ sample units drawn from the hth stratum, whose size $N_h$ is assumed to be known, be

$$\bar{y}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} y_{hi} \quad \text{and} \quad \bar{x}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} x_{hi}$$

respectively. Also let

$$s_{hy}^2 = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(y_{hi} - \bar{y}_h\right)^2$$

$$s_{hx}^2 = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(x_{hi} - \bar{x}_h\right)^2$$

$$s_{hxy} = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(x_{hi} - \bar{x}_h\right)\left(y_{hi} - \bar{y}_h\right)$$

$$b_h = \frac{s_{hxy}}{s_{hx}^2}$$



The proposed generalized class of separate regression-type estimators under stratified random sampling, using the auxiliary variate x for the estimation of the population mean of the study variate y, is

$$\bar{y}_{sg} = \sum_{h=1}^{L} W_h \bar{y}_{hg} \tag{8.1.1}$$

where

$$\bar{y}_{hg} = \left\{\bar{y}_h + b_h\left(\bar{X}_h - \bar{x}_h\right)\right\} g(u) \tag{8.1.2}$$

where $u = \dfrac{s_{hx}^2}{S_{hx}^2}$ and $g(u)$ is a bounded function of u whose first three derivatives with respect to u are bounded and continuous, such that the validity conditions of Taylor's series expansion are satisfied, and $g(1) = 1$.
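To make the definition in (8.1.1) and (8.1.2) concrete, the sketch below evaluates the estimator for any user-supplied function g with g(1) = 1. It is an illustration under assumptions: the dictionary keys (`W`, `X_bar`, `S2_x`, `x`, `y`) and the function names are hypothetical, and the known population quantities $\bar{X}_h$ and $S_{hx}^2$ are simply passed in.

```python
from statistics import mean, variance


def sample_cov(x, y):
    """Sample covariance with divisor n_h - 1."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def y_bar_sg(strata, g):
    """Sum over strata of W_h * {ybar_h + b_h (Xbar_h - xbar_h)} * g(u)."""
    total = 0.0
    for s in strata:
        x, y = s["x"], s["y"]
        b_h = sample_cov(x, y) / variance(x)  # sample regression coefficient
        u = variance(x) / s["S2_x"]           # u = s_hx^2 / S_hx^2
        y_hg = (mean(y) + b_h * (s["X_bar"] - mean(x))) * g(u)
        total += s["W"] * y_hg
    return total


if __name__ == "__main__":
    strata = [
        {"W": 0.4, "X_bar": 5.0, "S2_x": 2.0, "x": [1, 2, 3, 4], "y": [3, 5, 7, 9]},
        {"W": 0.6, "X_bar": 10.0, "S2_x": 3.0, "x": [8, 9, 11, 12], "y": [17, 19, 23, 25]},
    ]
    # g(u) = 1 gives the usual separate regression estimator.
    print(y_bar_sg(strata, lambda u: 1.0))
    # g(u) = u ** 0.5 is one admissible choice with g(1) = 1.
    print(y_bar_sg(strata, lambda u: u ** 0.5))
```

Passing the constant function g(u) = 1 recovers the usual separate regression estimator, so the same routine covers both the conventional estimator and members of the proposed class.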

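Theorem 8.1.1: The bias of the proposed estimator $\bar{y}_{sg}$ is given as follows:

$$Bias(\bar{y}_{sg}) = \sum_{h=1}^{L} W_h\frac{B_h\,\mu_{30}(hx)}{n_h S_{hx}^2}\left\{1 - g'(1)\right\} - \sum_{h=1}^{L} W_h\frac{\mu_{21}(hx)}{n_h}\left\{\frac{B_h}{S_{hxy}} - \frac{g'(1)}{S_{hx}^2}\right\} + \frac{g''(1)}{2!}\sum_{h=1}^{L} W_h\frac{\bar{Y}_h}{n_h}\left\{\beta_2(hx) - 1\right\}$$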

Proof: Using equation (8.1.2) and expanding $g(u)$ about the point $u = 1$ in a third-order Taylor's series expansion,

$$\bar{y}_{hg} = \left\{\bar{y}_h + b_h\left(\bar{X}_h - \bar{x}_h\right)\right\}\left[g(1) + (u-1)g'(1) + \frac{(u-1)^2}{2!}g''(1) + \frac{(u-1)^3}{3!}g'''(u^*)\right] \tag{8.1.3}$$

where $u^* = 1 + \theta(u - 1)$, $0 < \theta < 1$, and $\theta$ may depend on u. Here $g'(1)$, $g''(1)$ and $g'''(u^*)$ denote the first, second and third derivatives of $g(u)$ at the points $u = 1$, $1$ and $u^*$, respectively.

Further let,

$$\bar{y}_h = \bar{Y}_h(1 + e_0), \quad \bar{x}_h = \bar{X}_h(1 + e_1), \quad s_{hx}^2 = S_{hx}^2(1 + e_2), \quad s_{hxy} = S_{hxy}(1 + e_3)$$

so that $E(e_0) = E(e_1) = E(e_2) = E(e_3) = 0$.

Also we have,

$$b_h = \frac{s_{hxy}}{s_{hx}^2} = \frac{S_{hxy}(1 + e_3)}{S_{hx}^2(1 + e_2)} = B_h\left(1 + e_3 - e_2 + e_2^2 - e_2 e_3 + \ldots\right) \tag{8.1.4}$$

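Using equation (8.1.4) in (8.1.3) and writing $\bar{y}_{hg}$ in terms of the $e_i$'s, we have

$$\bar{y}_{hg} = \left[\bar{Y}_h(1 + e_0) + B_h\left\{1 + (e_3 - e_2) + \ldots\right\}\left(\bar{X}_h - \bar{X}_h(1 + e_1)\right)\right]\left[1 + e_2\,g'(1) + \frac{e_2^2}{2!}g''(1) + \frac{e_2^3}{3!}g'''(u^*)\right]$$

$$= \left[\bar{Y}_h(1 + e_0) - B_h\bar{X}_h\left(e_1 - e_1 e_2 + e_1 e_3 + \ldots\right)\right]\left[1 + e_2\,g'(1) + \frac{e_2^2}{2!}g''(1) + \ldots\right]$$

$$= \bar{Y}_h(1 + e_0) - B_h\bar{X}_h\left(e_1 - e_1 e_2 + e_1 e_3 + \ldots\right) + \bar{Y}_h(1 + e_0)\,e_2\,g'(1) + \bar{Y}_h(1 + e_0)\frac{e_2^2}{2!}g''(1) - B_h\bar{X}_h\left(e_1 - e_1 e_2 + e_1 e_3 + \ldots\right)e_2\,g'(1) - B_h\bar{X}_h\left(e_1 - e_1 e_2 + e_1 e_3 + \ldots\right)\frac{e_2^2}{2!}g''(1) + \ldots$$

$$= \bar{Y}_h(1 + e_0) - B_h\bar{X}_h\left(e_1 - e_1 e_2 + e_1 e_3 + \ldots\right) + \bar{Y}_h\left(e_2 + e_0 e_2 + \ldots\right)g'(1) + \bar{Y}_h\left(\frac{e_2^2}{2!} + \ldots\right)g''(1) - B_h\bar{X}_h\left(e_1 e_2 + \ldots\right)g'(1) + \ldots \tag{8.1.5}$$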

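Using (8.1.5) in (8.1.1), we have

$$\bar{y}_{sg} = \sum_{h=1}^{L} W_h\left[\bar{Y}_h(1 + e_0) - B_h\bar{X}_h\left(e_1 - e_1 e_2 + e_1 e_3 + \ldots\right) + \bar{Y}_h\left(e_2 + e_0 e_2 + \ldots\right)g'(1) + \bar{Y}_h\left(\frac{e_2^2}{2!} + \ldots\right)g''(1) - B_h\bar{X}_h\left(e_1 e_2 + \ldots\right)g'(1) + \ldots\right] \tag{8.1.6}$$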

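Considering the terms up to $O\!\left(\frac{1}{n}\right)$ and taking expectation on both sides, we get

$$Bias(\bar{y}_{sg}) = E\left(\bar{y}_{sg} - \bar{Y}\right)$$

$$= \sum_{h=1}^{L} W_h\left[\bar{Y}_h E(e_0) - B_h\bar{X}_h\left\{E(e_1) - E(e_1 e_2) + E(e_1 e_3)\right\} + \bar{Y}_h\left\{E(e_2) + E(e_0 e_2)\right\}g'(1) + \bar{Y}_h\frac{E(e_2^2)}{2!}g''(1) - B_h\bar{X}_h E(e_1 e_2)\,g'(1)\right] \tag{8.1.7}$$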

We now use the following expressions for a simple random sample of size $n_h$ drawn under proportional allocation, i.e. $n_h = \dfrac{n}{N}N_h$, from each stratum of size $N_h$. Here we have assumed that the size $N_h$ of the hth stratum is very large compared to the sample size $n_h$ of the stratum, so we ignore the finite population correction term $f = \dfrac{n_h}{N_h}$.

$$E(e_0^2) = \frac{S_{hy}^2}{n_h\bar{Y}_h^2}, \quad E(e_1^2) = \frac{S_{hx}^2}{n_h\bar{X}_h^2}, \quad E(e_2^2) = \frac{1}{n_h}\left\{\beta_2(hx) - 1\right\}, \quad E(e_0 e_1) = \frac{S_{hxy}}{n_h\bar{X}_h\bar{Y}_h}$$

$$E(e_0 e_2) = \frac{\mu_{21}(hx)}{n_h S_{hx}^2\bar{Y}_h}, \quad E(e_1 e_2) = \frac{\mu_{30}(hx)}{n_h S_{hx}^2\bar{X}_h}, \quad E(e_1 e_3) = \frac{\mu_{21}(hx)}{n_h S_{hxy}\bar{X}_h}$$

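Now from equation (8.1.7), we have

$$Bias(\bar{y}_{sg}) = \sum_{h=1}^{L} W_h\frac{B_h\,\mu_{30}(hx)}{n_h S_{hx}^2}\left\{1 - g'(1)\right\} - \sum_{h=1}^{L} W_h\frac{\mu_{21}(hx)}{n_h}\left\{\frac{B_h}{S_{hxy}} - \frac{g'(1)}{S_{hx}^2}\right\} + \frac{g''(1)}{2!}\sum_{h=1}^{L} W_h\frac{\bar{Y}_h}{n_h}\left\{\beta_2(hx) - 1\right\} \tag{8.1.8}$$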

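Theorem 8.1.2: The mean square error (MSE) of the estimator $\bar{y}_{sg}$, to the first order of approximation, is given by

$$MSE(\bar{y}_{sg}) = \sum_{h=1}^{L}\frac{W_h^2}{n_h}S_{hy}^2\left(1 - \rho_h^2\right) + \sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h^2}{n_h}\left\{\beta_2(hx) - 1\right\}\left\{g'(1)\right\}^2 - 2\sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h}{n_h S_{hx}^2}\left\{B_h\,\mu_{30}(hx) - \mu_{21}(hx)\right\}g'(1)$$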



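Proof: The mean square error of the estimator $\bar{y}_{sg}$ to the first order of approximation is given by

$$MSE(\bar{y}_{sg}) = E\left(\bar{y}_{sg} - \bar{Y}\right)^2 = E\left[\sum_{h=1}^{L} W_h\left\{\bar{Y}_h e_0 - B_h\bar{X}_h e_1 + \bar{Y}_h e_2\,g'(1)\right\}\right]^2 \quad \text{(using equation (8.1.6))}$$

Substituting the values of the expectations involved, given in the proof of Theorem 8.1.1, we get

$$MSE(\bar{y}_{sg}) = \sum_{h=1}^{L}\frac{W_h^2}{n_h}S_{hy}^2\left(1 - \rho_h^2\right) + \sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h^2}{n_h}\left\{\beta_2(hx) - 1\right\}\left\{g'(1)\right\}^2 - 2\sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h}{n_h S_{hx}^2}\left\{B_h\,\mu_{30}(hx) - \mu_{21}(hx)\right\}g'(1) \tag{8.1.9}$$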

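Theorem 8.1.3: The optimum class of estimators having minimum mean square error, given by

$$MSE(\bar{y}_{sg})_{min} = \frac{1}{n}\sum_{h=1}^{L} W_h S_{hy}^2\left(1 - \rho_h^2\right) - \frac{1}{n}\sum_{h=1}^{L} W_h\frac{\left\{B_h\,\mu_{30}(hx) - \mu_{21}(hx)\right\}^2}{S_{hx}^4\left\{\beta_2(hx) - 1\right\}},$$

satisfies the condition

$$g'(1) = \frac{B_h\,\mu_{30}(hx) - \mu_{21}(hx)}{\bar{Y}_h S_{hx}^2\left\{\beta_2(hx) - 1\right\}}$$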

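Proof: To obtain the optimum class of estimators minimizing $MSE(\bar{y}_{sg})$, we proceed as follows. From equation (8.1.9), we have

$$MSE(\bar{y}_{sg}) = \sum_{h=1}^{L}\frac{W_h^2}{n_h}S_{hy}^2\left(1 - \rho_h^2\right) + \sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h^2}{n_h}\left\{\beta_2(hx) - 1\right\}\left\{g'(1)\right\}^2 - 2\sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h}{n_h S_{hx}^2}\left\{B_h\,\mu_{30}(hx) - \mu_{21}(hx)\right\}g'(1)$$

By the principle of maxima and minima, partially differentiating $MSE(\bar{y}_{sg})$ with respect to $g'(1)$, the optimum value of $g'(1)$ for which $MSE(\bar{y}_{sg})$ is minimum is obtained as

$$g'(1) = \frac{B_h\,\mu_{30}(hx) - \mu_{21}(hx)}{\bar{Y}_h S_{hx}^2\left\{\beta_2(hx) - 1\right\}} = \alpha \ \text{(say)} \tag{8.1.10}$$

For this value $g'(1) = \alpha$, the minimum mean square error of $\bar{y}_{sg}$ is

$$MSE(\bar{y}_{sg})_{min} = \sum_{h=1}^{L}\frac{W_h^2}{n_h}S_{hy}^2\left(1 - \rho_h^2\right) - \sum_{h=1}^{L}\frac{W_h^2}{n_h}\frac{\left\{B_h\,\mu_{30}(hx) - \mu_{21}(hx)\right\}^2}{S_{hx}^4\left\{\beta_2(hx) - 1\right\}}$$

Under proportional allocation,

$$MSE(\bar{y}_{sg})_{min} = \frac{1}{n}\sum_{h=1}^{L} W_h S_{hy}^2\left(1 - \rho_h^2\right) - \frac{1}{n}\sum_{h=1}^{L} W_h\frac{\left\{B_h\,\mu_{30}(hx) - \mu_{21}(hx)\right\}^2}{S_{hx}^4\left\{\beta_2(hx) - 1\right\}} \tag{8.1.11}$$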
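The minimization step can be written out for a single stratum. The following is a sketch under the assumption that the contribution of stratum h to (8.1.9) is minimized separately, treating $t = g'(1)$ as the free quantity; the symbol $M_h(t)$ is introduced here only for this illustration.

```latex
\begin{align*}
M_h(t) &= \frac{W_h^2}{n_h}\left[ S_{hy}^2\left(1-\rho_h^2\right)
          + \bar{Y}_h^2\left\{\beta_2(hx)-1\right\} t^2
          - \frac{2\bar{Y}_h}{S_{hx}^2}\left\{B_h\mu_{30}(hx)-\mu_{21}(hx)\right\} t \right] \\
\frac{dM_h}{dt} = 0
  &\;\Longrightarrow\;
  t = \frac{B_h\mu_{30}(hx)-\mu_{21}(hx)}{\bar{Y}_h S_{hx}^2\left\{\beta_2(hx)-1\right\}} = \alpha \\
\frac{d^2M_h}{dt^2} &= \frac{2W_h^2\bar{Y}_h^2}{n_h}\left\{\beta_2(hx)-1\right\} > 0
  \quad \text{whenever } \beta_2(hx) > 1 \\
M_h(\alpha) &= \frac{W_h^2}{n_h}\left[ S_{hy}^2\left(1-\rho_h^2\right)
          - \frac{\left\{B_h\mu_{30}(hx)-\mu_{21}(hx)\right\}^2}{S_{hx}^4\left\{\beta_2(hx)-1\right\}} \right]
\end{align*}
```

Summing $M_h(\alpha)$ over the strata and using $W_h^2/n_h = W_h/n$ under proportional allocation gives the form stated in (8.1.11).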

Theorem 8.1.4: $\bar{y}_{sg}$ is more efficient than the conventional estimator $\bar{y}_{ls}$, in the sense of having lesser mean square error, under the optimum condition

$$g'(1) = \frac{B_h\,\mu_{30}(hx) - \mu_{21}(hx)}{\bar{Y}_h S_{hx}^2\left\{\beta_2(hx) - 1\right\}}$$

Proof: We know that

$$MSE(\bar{y}_{ls}) = \frac{1}{n}\sum_{h=1}^{L} W_h\left(1 - \rho_h^2\right)S_{hy}^2$$

Using equation (8.1.11), we see that

$$MSE(\bar{y}_{ls}) - MSE(\bar{y}_{sg})_{min} = \frac{1}{n}\sum_{h=1}^{L} W_h\frac{\left\{B_h\,\mu_{30}(hx) - \mu_{21}(hx)\right\}^2}{S_{hx}^4\left\{\beta_2(hx) - 1\right\}}$$

which is always greater than or equal to zero, showing that the proposed estimator $\bar{y}_{sg}$ has lesser mean square error than $\bar{y}_{ls}$ under the optimum condition given by (8.1.10). Therefore, $\bar{y}_{sg}$ is more efficient than $\bar{y}_{ls}$ in the sense of having lesser mean square error under the optimum condition.

8.2 Estimated Optimum Class of Estimators

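The optimum value of $\alpha$ in (8.1.10), or a good guessed value of it, is rarely known in practice; hence it is replaced by its estimate from sample values. Thus, replacing $\mu_{21}(hx)$, $\mu_{30}(hx)$, $\mu_{40}(hx)$, $S_{hx}^2$ and $\bar{Y}_h$ by the following estimators,

$$\hat{\beta}_2(hx) = \frac{\hat{\mu}_{40}(hx)}{\hat{\mu}_{20}^2(hx)} \quad \text{with} \quad \hat{\mu}_{40}(hx) = \frac{1}{n_h}\sum_{i=1}^{n_h}\left(x_{hi} - \bar{x}_h\right)^4, \quad \hat{\mu}_{20}(hx) = s_{hx}^2 = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(x_{hi} - \bar{x}_h\right)^2,$$

$$\hat{\mu}_{30}(hx) = \frac{1}{n_h}\sum_{i=1}^{n_h}\left(x_{hi} - \bar{x}_h\right)^3, \quad \hat{\mu}_{21}(hx) = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(x_{hi} - \bar{x}_h\right)^2\left(y_{hi} - \bar{y}_h\right) \quad \text{and} \quad \hat{B}_h = b_h,$$

we get the estimated optimum value $\hat{\alpha}$ to be

$$\hat{\alpha} = \frac{b_h\,\hat{\mu}_{30}(hx) - \hat{\mu}_{21}(hx)}{\bar{y}_h\,s_{hx}^2\left\{\hat{\beta}_2(hx) - 1\right\}} \tag{8.2.1}$$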
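A per-stratum computation of the estimated optimum value in (8.2.1) might look as follows. This is a sketch under assumptions: the function name `alpha_hat` is illustrative, and the divisors used for the sample moments ($n_h$ for the third and fourth moments of x, $n_h - 1$ for $s_{hx}^2$, $s_{hxy}$ and the mixed moment) follow the estimators listed above as they are read here.

```python
from statistics import mean


def alpha_hat(x, y):
    """Estimated optimum value of g'(1) for one stratum, following (8.2.1)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    s2_x = sum(d * d for d in dx) / (n - 1)                 # s_hx^2, also mu_20 hat
    s_xy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)     # s_hxy
    mu30 = sum(d ** 3 for d in dx) / n                      # mu_30 hat
    mu40 = sum(d ** 4 for d in dx) / n                      # mu_40 hat
    mu21 = sum(a * a * b for a, b in zip(dx, dy)) / (n - 1)  # mu_21 hat
    b_h = s_xy / s2_x                                       # regression coefficient
    beta2 = mu40 / s2_x ** 2                                # beta_2 hat
    return (b_h * mu30 - mu21) / (my * s2_x * (beta2 - 1.0))


if __name__ == "__main__":
    # Hypothetical stratum sample with a skewed auxiliary variable.
    print(alpha_hat([1, 2, 3, 6], [2, 4, 6, 12]))
```

The value is undefined when $\hat{\beta}_2(hx) = 1$ or $\bar{y}_h = 0$, so a production routine would need to guard against those cases.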

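The mean square error in the case of the estimated optimum $\hat{\alpha}$ is obtained as follows. From (8.1.10), we need a function $g(u)$ involved in $\bar{y}_{sg}$ such that

$$g(1) = 1, \quad g'(1) = \alpha$$

which means that $g(\cdot)$ should involve not only u but $\alpha$ also, and thus we need a function $g^*(u, \alpha)$ such that

$$g^*(1, \alpha) = 1, \quad \left.\frac{\partial g^*}{\partial u}\right|_{(1,\alpha)} = \alpha, \quad \left.\frac{\partial g^*}{\partial \alpha}\right|_{(1,\alpha)} = 0$$

As the function $g^*(u, \alpha)$ so found involves the unknown $\alpha$, we replace $\alpha$ by its estimate $\hat{\alpha}$ from (8.2.1) and get the function $g^{**}(u, \hat{\alpha})$ such that

$$g^{**}(1, \alpha) = 1, \quad \left.\frac{\partial g^{**}}{\partial u}\right|_{(1,\alpha)} = \alpha, \quad \left.\frac{\partial g^{**}}{\partial \hat{\alpha}}\right|_{(1,\alpha)} = 0 \tag{8.2.2}$$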
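One function that meets the three conditions in (8.2.2) is $g(u, \alpha) = u^{\alpha}$; this choice is an illustrative assumption and is not singled out in the text. The sketch below checks the conditions numerically with central finite differences; the helper `partial` is likewise hypothetical.

```python
def g(u, alpha):
    """Hypothetical member of the class: g(u, alpha) = u ** alpha."""
    return u ** alpha


def partial(f, args, index, h=1e-6):
    """Central finite-difference estimate of the partial derivative of f."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


if __name__ == "__main__":
    alpha = 0.7
    point = (1.0, alpha)
    print("g(1, alpha)      =", g(*point))              # should equal 1
    print("dg/du at P       =", partial(g, point, 0))   # should be close to alpha
    print("dg/dalpha at P   =", partial(g, point, 1))   # should be 0
```

The linear choice $g(u, \alpha) = 1 + \alpha(u - 1)$ passes the same checks, so the conditions do not pin down a unique function.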

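Using such a function $g^{**}(u, \hat{\alpha})$ satisfying (8.2.2), we may take

$$\bar{y}_{sg}^{**} = \sum_{h=1}^{L} W_h\,\bar{y}_{hg}^{**} \tag{8.2.3}$$

where

$$\bar{y}_{hg}^{**} = \left\{\bar{y}_h + b_h\left(\bar{X}_h - \bar{x}_h\right)\right\} g^{**}(u, \hat{\alpha}) \tag{8.2.4}$$

as the modified estimated optimum class of estimators of the population mean $\bar{Y}$. Now expanding $g^{**}(u, \hat{\alpha})$ in $\bar{y}_{hg}^{**}$ about the point $P = (1, \alpha)$ in Taylor's series, we have

$$\bar{y}_{hg}^{**} = \left\{\bar{y}_h + b_h\left(\bar{X}_h - \bar{x}_h\right)\right\}\left[g^{**}(1, \alpha) + (u - 1)\left.\frac{\partial g^{**}}{\partial u}\right|_{(1,\alpha)} + (\hat{\alpha} - \alpha)\left.\frac{\partial g^{**}}{\partial \hat{\alpha}}\right|_{(1,\alpha)} + \ldots\right]$$

$$= \left[\bar{Y}_h(1 + e_0) + B_h\left\{1 + (e_3 - e_2) + \ldots\right\}\left(\bar{X}_h - \bar{X}_h(1 + e_1)\right)\right]\left[1 + e_2\left.\frac{\partial g^{**}}{\partial u}\right|_{(1,\alpha)} + (\hat{\alpha} - \alpha)\left.\frac{\partial g^{**}}{\partial \hat{\alpha}}\right|_{(1,\alpha)} + \ldots\right]$$

$$= \bar{Y}_h(1 + e_0) - B_h\bar{X}_h\left(e_1 - e_1 e_2 + e_1 e_3 + \ldots\right) + \bar{Y}_h\left(e_2 + e_0 e_2 + \ldots\right)\left.\frac{\partial g^{**}}{\partial u}\right|_{(1,\alpha)} - B_h\bar{X}_h\left(e_1 e_2 + \ldots\right)\left.\frac{\partial g^{**}}{\partial u}\right|_{(1,\alpha)} + \ldots \tag{8.2.5}$$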

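Using (8.2.5) in (8.2.3), we have

$$\bar{y}_{sg}^{**} - \bar{Y} = \sum_{h=1}^{L} W_h\left[\bar{Y}_h e_0 - B_h\bar{X}_h\left(e_1 - e_1 e_2 + e_1 e_3 + \ldots\right) + \bar{Y}_h\left(e_2 + e_0 e_2 + \ldots\right)\left.\frac{\partial g^{**}}{\partial u}\right|_{(1,\alpha)} - B_h\bar{X}_h\left(e_1 e_2 + \ldots\right)\left.\frac{\partial g^{**}}{\partial u}\right|_{(1,\alpha)} + \ldots\right] \tag{8.2.6}$$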

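Squaring both sides of (8.2.6), retaining terms up to $O\!\left(\frac{1}{n}\right)$ and taking expectation, $MSE(\bar{y}_{sg}^{**})$ will be

$$MSE(\bar{y}_{sg}^{**}) = \sum_{h=1}^{L}\frac{W_h^2}{n_h}S_{hy}^2\left(1 - \rho_h^2\right) - \sum_{h=1}^{L}\frac{W_h^2}{n_h}\frac{\left\{B_h\,\mu_{30}(hx) - \mu_{21}(hx)\right\}^2}{S_{hx}^4\left\{\beta_2(hx) - 1\right\}}$$

$$= \frac{1}{n}\sum_{h=1}^{L} W_h S_{hy}^2\left(1 - \rho_h^2\right) - \frac{1}{n}\sum_{h=1}^{L} W_h\frac{\left\{B_h\,\mu_{30}(hx) - \mu_{21}(hx)\right\}^2}{S_{hx}^4\left\{\beta_2(hx) - 1\right\}} \tag{8.2.7}$$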

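which is equal to the $MSE(\bar{y}_{sg})$ given in (8.1.11) if

$$\left.\frac{\partial g^{**}}{\partial \hat{\alpha}}\right|_{(1,\alpha)} = 0 \tag{8.2.8}$$

Thus, considering the function $g^{**}(u, \hat{\alpha})$ such that

$$g^{**}(1, \alpha) = 1, \quad \left.\frac{\partial g^{**}}{\partial u}\right|_{(1,\alpha)} = \alpha \quad \text{and} \quad \left.\frac{\partial g^{**}}{\partial \hat{\alpha}}\right|_{(1,\alpha)} = 0 \tag{8.2.9}$$

we get the estimator $\bar{y}_{hg}^{**}$ depending on the estimated optimum values as

$$\bar{y}_{hg}^{**} = \left\{\bar{y}_h + b_h\left(\bar{X}_h - \bar{x}_h\right)\right\} g^{**}(u, \hat{\alpha}) \tag{8.2.10}$$

which attains the same minimum MSE as given in equation (8.1.11).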


8.3 Concluding Remarks

(1) It may easily be seen that the following estimators are special cases of the proposed class of estimators $\bar{y}_{sg}$:

I. ( ){ }

−+= ∑

=

2

2

2

1

k

hx

hx

hhhh

L

h

hsg

s

SxXbyWy where ( ) k

uug2−

= , 2

2

hx

hx

S

su =

II. ( ){ }

−

−−+=∑

=

2

2

2

1

11

k

hx

hx

hhhh

L

h

hsg

S

sxXbyWy θ , where

( )

2

2

2

11

−

−=

k

hx

hx

S

sug θ ;

2

2

hx

hx

S

su =

III. ( ){ }

−+=∑=

2

2

2

2

1 hx

hx

hyhh

hh

L

h

hsg

s

SsxXbyWy , where ( )

2

2

2−

=

hx

hx

S

sug ;

2

2

hx

hx

S

su =



IV. ( ){ }

−+=∑=

2

2

2

2

1 hx

hx

hyhh

hh

L

h

hsg

S

ssxXbyWy , where ( ) 2

uug = ; 2

2

hx

hx

S

su =

V. ( ){ }

−+=∑

=

2

2

2

2

1

k

hx

hx

hyhhhh

L

h

hsg

S

ssxXbyWy , where ( ) k

uug2

= ; 2

2

hx

hx

S

su =

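(2) The bias of the proposed estimator $\bar{y}_{sg}$ is given as follows:

$$Bias(\bar{y}_{sg}) = \sum_{h=1}^{L} W_h\frac{B_h\,\mu_{30}(hx)}{n_h S_{hx}^2}\left\{1 - g'(1)\right\} - \sum_{h=1}^{L} W_h\frac{\mu_{21}(hx)}{n_h}\left\{\frac{B_h}{S_{hxy}} - \frac{g'(1)}{S_{hx}^2}\right\} + \frac{g''(1)}{2!}\sum_{h=1}^{L} W_h\frac{\bar{Y}_h}{n_h}\left\{\beta_2(hx) - 1\right\}$$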

(3) The mean square error of the proposed generalized class of separate regression-type estimators under the optimum condition is

$$MSE(\bar{y}_{sg}^{**}) = \frac{1}{n}\sum_{h=1}^{L} W_h S_{hy}^2\left(1 - \rho_h^2\right) - \frac{1}{n}\sum_{h=1}^{L} W_h\frac{\left\{B_h\,\mu_{30}(hx) - \mu_{21}(hx)\right\}^2}{S_{hx}^4\left\{\beta_2(hx) - 1\right\}}$$

(4) It has been shown that the generalized class $\bar{y}_{sg}^{**}$, depending upon the estimated optimum value of $g'(1)$, retains the same minimum mean square error given by (8.1.11). Also, $g^{**}(u, \hat{\alpha})$ depends solely upon sample information and may therefore be preferred to other estimators for greater practical utility.

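(5) From equation (8.1.9), we have

$$MSE(\bar{y}_{sg}) = \sum_{h=1}^{L}\frac{W_h^2}{n_h}S_{hy}^2\left(1 - \rho_h^2\right) + \sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h^2}{n_h}\left\{\beta_2(hx) - 1\right\}\left\{g'(1)\right\}^2 - 2\sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h}{n_h S_{hx}^2}\left\{B_h\,\mu_{30}(hx) - \mu_{21}(hx)\right\}g'(1) \tag{8.3.1}$$

and

$$MSE(\bar{y}_{ls}) = \sum_{h=1}^{L}\frac{W_h^2}{n_h}\left(1 - \rho_h^2\right)S_{hy}^2 \tag{8.3.2}$$

The optimum value of $g'(1)$ for which the MSE of $\bar{y}_{sg}$ is minimum is

$$g'(1) = \frac{B_h\,\mu_{30}(hx) - \mu_{21}(hx)}{\bar{Y}_h S_{hx}^2\left\{\beta_2(hx) - 1\right\}} = \alpha$$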

It is clear from (8.3.1) and (8.3.2) that $MSE(\bar{y}_{sg}) < MSE(\bar{y}_{ls})$ if

$$1 - \frac{2\alpha}{g'(1)} < 0 \tag{8.3.3}$$

Now, if $g'(1) > 0$, the efficiency condition (8.3.3) for $\bar{y}_{sg}$ to be better than $\bar{y}_{ls}$, in the sense of having lesser mean square error, reduces to

$$\alpha > \frac{g'(1)}{2} \tag{8.3.4}$$

Further, if $g'(1) < 0$, the efficiency condition (8.3.3) reduces to

$$\alpha < \frac{g'(1)}{2} \tag{8.3.5}$$

These are the situations where, having prior information about the upper or lower bounds or the range of $\alpha$ on the basis of past data, a pilot study or experience, we can find better estimators than $\bar{y}_{ls}$ from the class of estimators represented by $\bar{y}_{sg}$ by choosing the function $g(u)$ suitably. If we know that $\alpha > \lambda_0 > 0$, we may choose the function $g(u)$ in $\bar{y}_{sg}$ such that $g'(1) = 2\lambda_0$, satisfying the efficiency condition (8.3.4); and if we know that $\alpha < \lambda_0 < 0$, we may choose the function $g(u)$ in $\bar{y}_{sg}$ such that $g'(1) = 2\lambda_0$, satisfying the efficiency condition (8.3.5). In both cases we obtain estimators with lesser mean square error than $\bar{y}_{ls}$.
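The step from the two mean square errors to the efficiency condition can be spelled out as follows. This is a sketch that uses only the general MSE in (8.1.9), the MSE of $\bar{y}_{ls}$ and the definition of $\alpha$ in (8.1.10), with $\alpha$ read stratum by stratum as in the text.

```latex
\begin{align*}
MSE(\bar{y}_{sg}) - MSE(\bar{y}_{ls})
 &= \sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h^2}{n_h}\left\{\beta_2(hx)-1\right\}\left\{g'(1)\right\}^2
    - 2\sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h}{n_h S_{hx}^2}\left\{B_h\mu_{30}(hx)-\mu_{21}(hx)\right\} g'(1) \\
 &= \sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h^2}{n_h}\left\{\beta_2(hx)-1\right\}
    \left[\left\{g'(1)\right\}^2 - 2\alpha\, g'(1)\right] \\
 &= \sum_{h=1}^{L}\frac{W_h^2\bar{Y}_h^2}{n_h}\left\{\beta_2(hx)-1\right\}\left\{g'(1)\right\}^2
    \left[1 - \frac{2\alpha}{g'(1)}\right]
\end{align*}
```

Since $\beta_2(hx) > 1$ for a non-degenerate auxiliary variable, every factor outside the last bracket is positive, so the difference is negative exactly when $1 - 2\alpha/g'(1) < 0$ in each stratum.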

(6) $\bar{y}_{sg}$ is more efficient than $\bar{y}_{ls}$, in the sense of having lesser mean square error, under the optimum condition

$$g'(1) = \frac{B_h\,\mu_{30}(hx) - \mu_{21}(hx)}{\bar{Y}_h S_{hx}^2\left\{\beta_2(hx) - 1\right\}}$$

8.4 An Illustration

We consider the data in Singh and Chaudhary (1989, p. 162), which were collected in a pilot survey for estimating the extent of cultivation and production of fresh fruits in three districts of Uttar Pradesh in the year 1976-1977. Each district is treated as one stratum; $N_h$ denotes the total number of villages in the stratum, $X_h$ the total area (in hectares) under orchard, $y_h$ the total number of trees in the sample, $x_h$ the area under orchard in the sample, and $n_h$ the number of villages in the sample.

The required values are computed in Table 8.4.1, and we have the following:

$$MSE(\bar{y}) = 9888.56$$

$$MSE(\bar{y}_{rs}) = 3523.08$$

$$MSE(\bar{y}_{sg}^{**}) = 614.08$$

The percent relative efficiencies (PRE) of the usual separate ratio-type estimator $\bar{y}_{rs}$ and the proposed estimator $\bar{y}_{sg}^{**}$ over the mean per unit estimator $\bar{y}$ are 281% and 1612% respectively, showing the enhanced efficiency of the proposed estimator.


| Estimators | $\bar{y}$ | $\bar{y}_{rs}$ | $\bar{y}_{sg}^{**}$ |
|---|---|---|---|
| PRE | 100% | 281% | 1612% |

Table 8.4.1

| Stratum No. | $N_h$ | $n_h$ | $W_h$ | $S_{hy}^2$ | $S_{hx}^2$ | $\rho_h$ | $\mu_{21}(hx)$ | $\mu_{30}(hx)$ | $B_h$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 985 | 6 | 0.234467984 | 74775.46667 | 15.97122667 | 0.921519105 | -756.389970 | -11.6575513 | 63.05430933 |
| 2 | 2196 | 8 | 0.522732683 | 259113.6964 | 132.6601143 | 0.973771508 | 93851.02815 | 2206.830751 | 43.03601642 |
| 3 | 1020 | 11 | 0.242799333 | 65885.6 | 38.43842182 | 0.802446269 | 8799.26554 | 254.9886637 | 33.222204 |
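Part of (8.1.11) can be evaluated directly from the tabulated values. The sketch below computes only its first term, $\frac{1}{n}\sum_h W_h S_{hy}^2(1 - \rho_h^2)$, with $n = \sum_h n_h = 25$ taken as an assumption from the $n_h$ column; the second term also needs $\beta_2(hx)$, which the table does not list, so it is left out. The function name `leading_term` is illustrative.

```python
# N_h, S_hy^2 and rho_h transcribed from the table above.
strata = [
    {"N": 985, "S2_y": 74775.46667, "rho": 0.921519105},
    {"N": 2196, "S2_y": 259113.6964, "rho": 0.973771508},
    {"N": 1020, "S2_y": 65885.6, "rho": 0.802446269},
]
n_total = 6 + 8 + 11  # sum of the n_h column


def leading_term(strata, n):
    """First term of (8.1.11): (1/n) * sum of W_h * S_hy^2 * (1 - rho_h^2)."""
    total = sum(s["N"] for s in strata)
    return sum(
        (s["N"] / total) * s["S2_y"] * (1.0 - s["rho"] ** 2) for s in strata
    ) / n


if __name__ == "__main__":
    print(round(leading_term(strata, n_total), 2))
```

The printed value can be set beside the mean square errors reported above; any difference from the full expression in (8.1.11) is the omitted correction term.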

Chapter-IX

“On Estimation of Variance of Mean for

the Regression Estimator in Stratified

Random Sampling”

ON ESTIMATION OF VARIANCE OF MEAN FOR THE REGRESSION ESTIMATOR UNDER STRATIFIED RANDOM SAMPLING

SUMMARY

This chapter deals with the estimation of variance of separate regression type

estimator of the population mean in stratified random sampling, its bias and

mean square error are obtained and further an optimum class of estimators is

obtained having minimum mean square error. Enhancing the practical utility of

the optimum estimator, a class of estimators depending upon estimated

optimum value based on sample observations is also found. Further

comparative study has been done with some earlier estimators.


9.1 Introduction

9.2 Proposed Estimator

9.3 Estimator Based on Estimated Optimum class of estimators




9.1 Introduction

Let U be a finite population of size N. The study variable and the auxiliary variable are denoted by y and x respectively, and the population is partitioned into L non-overlapping strata according to some characteristic. The size of the hth stratum is $N_h$ $(h = 1, 2, \ldots, L)$ such that $\sum_{h=1}^{L} N_h = N$. A stratified sample of size n is drawn from this population, and $n_h$ is the sample size from the hth stratum, such that $\sum_{h=1}^{L} n_h = n$. The observations on y and x corresponding to the ith unit of the hth stratum $(h = 1, 2, \ldots, L)$ are $y_{hi}$ and $x_{hi}$ respectively. Let $\bar{y}_h$ and $\bar{x}_h$ be the sample means and $\bar{Y}_h$ and $\bar{X}_h$ the population means of y and x respectively in the hth stratum. Suppose $\bar{y}_{st} = \sum_{h=1}^{L} W_h\bar{y}_h$ and $\bar{x}_{st} = \sum_{h=1}^{L} W_h\bar{x}_h$ are the stratified sample means and $\bar{Y} = \sum_{h=1}^{L} W_h\bar{Y}_h$ and $\bar{X} = \sum_{h=1}^{L} W_h\bar{X}_h$ are the population means of y and x respectively, where $W_h = N_h/N$ is the known stratum weight. Let

$$s_{yh}^2 = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(y_{hi} - \bar{y}_h\right)^2 \quad \text{and} \quad s_{xh}^2 = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(x_{hi} - \bar{x}_h\right)^2$$

be the sample variances and

$$S_{yh}^2 = \frac{1}{N_h - 1}\sum_{i=1}^{N_h}\left(y_{hi} - \bar{Y}_h\right)^2 \quad \text{and} \quad S_{xh}^2 = \frac{1}{N_h - 1}\sum_{i=1}^{N_h}\left(x_{hi} - \bar{X}_h\right)^2$$

be the population variances of y and x respectively in the hth stratum. Finally, let

$$s_{yxh} = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(y_{hi} - \bar{y}_h\right)\left(x_{hi} - \bar{x}_h\right) \quad \text{and} \quad S_{yxh} = \frac{1}{N_h - 1}\sum_{i=1}^{N_h}\left(y_{hi} - \bar{Y}_h\right)\left(x_{hi} - \bar{X}_h\right)$$

be the sample and population covariances respectively in the hth stratum. We assume that all parameters corresponding to the auxiliary variable x are known, and we ignore the finite population correction term $f_h = 1 - \dfrac{n_h}{N_h}$ for simplification.



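A separate regression-type estimator of the population mean $\bar{Y}$ is $\bar{y}_s = \sum_{h=1}^{L} W_h\left\{\bar{y}_h + b_h\left(\bar{X}_h - \bar{x}_h\right)\right\}$, where $b_h$ is the sample regression coefficient. The variance of $\bar{y}_s$ is given by

$$V(\bar{y}_s) = \sum_{h=1}^{L}\frac{W_h^2 S_{yh}^2\left(1 - \rho_h^2\right)}{n_h} \tag{9.1.1}$$

where $\rho_h = \dfrac{S_{yxh}}{S_{yh}S_{xh}}$ is the population correlation coefficient between y and x in the hth stratum.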

An estimator of $V(\bar{y}_s)$ given by Gupta and Shabbir (2010) is as follows:

$$v_s = \sum_{h=1}^{L}\frac{W_h^2 s_{yh}^2\left(1 - r_h^2\right)}{n_h} \tag{9.1.2}$$

where $r_h = \dfrac{s_{yxh}}{s_{yh}s_{xh}}$ is the sample correlation coefficient between y and x in the hth stratum.
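The estimator in (9.1.2) is straightforward to compute from stratified sample data. The following is a minimal sketch in which the input layout (a list of dictionaries with keys `W`, `x`, `y`) and the function name `v_s` are assumptions made for illustration.

```python
from statistics import mean, variance


def v_s(strata):
    """Sum over strata of W_h^2 * s_yh^2 * (1 - r_h^2) / n_h, as in (9.1.2)."""
    total = 0.0
    for s in strata:
        x, y = s["x"], s["y"]
        n_h = len(y)
        mx, my = mean(x), mean(y)
        s_yx = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n_h - 1)
        r_sq = s_yx ** 2 / (variance(x) * variance(y))  # squared sample correlation
        total += s["W"] ** 2 * variance(y) * (1.0 - r_sq) / n_h
    return total


if __name__ == "__main__":
    # Hypothetical single stratum in which x and y are uncorrelated in the sample.
    print(v_s([{"W": 1.0, "x": [1, 2, 3, 4], "y": [1, -1, -1, 1]}]))
    # Hypothetical stratum with an exact linear relation: the estimate is (nearly) zero.
    print(v_s([{"W": 1.0, "x": [1, 2, 3, 6], "y": [2, 4, 6, 12]}]))
```

The two calls show the extremes: with zero sample correlation the estimate reduces to $W_h^2 s_{yh}^2 / n_h$, and with an exact linear relation it vanishes.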

The mean square error of $v_s$ is

$$MSE(v_s) = \sum_{h=1}^{L}\frac{W_h^4 S_{yh}^4}{n_h^3}\left[\left(\lambda_{40h} - 1\right) + \rho_h^4 B_h + 2\rho_h^2 C_h\right] \tag{9.1.3}$$

where $B_h = \left(\lambda_{04h} - 1\right) + 4\left(\lambda_{22h}/\rho_h^2 - 1\right) - 4\left(\lambda_{13h}/\rho_h - 1\right)$ and $C_h = \left(\lambda_{22h} - 1\right) - 2\left(\lambda_{31h}/\rho_h - 1\right)$.

Different authors have presented estimators that utilize auxiliary information and enhance the efficiency of existing estimators. These include Das and Tripathi (1981), Srivastava and Jhajj (1980, 1983), Wu (1985), and Prasad and Singh (1990, 1992).



9.2 Proposed Estimator

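Our proposed estimator of $V(\bar{y}_s)$ uses auxiliary information on $\left(\bar{X}_h, S_{xh}^2\right)$. The proposed estimator is given by

$$v_a = \sum_{h=1}^{L} W_h^2\left[\frac{s_{yh}^2\left(1 - r_h^2\right)}{n_h} + k\left(s_{xh}^2 - S_{xh}^2\right)\right] \tag{9.2.1}$$

where k is a characterizing scalar chosen suitably.

We define the following terms:

$$s_{yh}^2 = S_{yh}^2(1 + e_0), \quad s_{xh}^2 = S_{xh}^2(1 + e_1), \quad s_{yxh} = S_{yxh}(1 + e_2)$$

so that

$$E(e_0) = E(e_1) = E(e_2) = 0$$

and also, up to the first order of approximation, we have the following expectations, which can be derived easily on the lines of Sukhatme et al. (1997):

$$E(e_0^2) = \frac{1}{n_h}\left(\lambda_{40h} - 1\right), \quad E(e_1^2) = \frac{1}{n_h}\left(\lambda_{04h} - 1\right), \quad E(e_2^2) = \frac{1}{n_h}\left(\frac{\lambda_{22h}}{\rho_h^2} - 1\right)$$

$$E(e_0 e_1) = \frac{1}{n_h}\left(\lambda_{22h} - 1\right), \quad E(e_0 e_2) = \frac{1}{n_h}\left(\frac{\lambda_{31h}}{\rho_h} - 1\right), \quad E(e_1 e_2) = \frac{1}{n_h}\left(\frac{\lambda_{13h}}{\rho_h} - 1\right)$$

where

$$\lambda_{pqh} = \frac{\mu_{pqh}}{\mu_{20h}^{p/2}\,\mu_{02h}^{q/2}} \quad \text{and} \quad \mu_{pqh} = \frac{1}{N_h - 1}\sum_{i=1}^{N_h}\left(y_{hi} - \bar{Y}_h\right)^p\left(x_{hi} - \bar{X}_h\right)^q$$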
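The standardized moments $\lambda_{pqh}$ used in these expectations can be computed with a small helper. The sketch below applies the same formula, with divisor one less than the number of units, to whatever list of paired values is supplied (a full stratum for the population quantity, or a sample for its estimate); the function name `lambda_pq` is illustrative.

```python
def lambda_pq(y, x, p, q):
    """mu_pq / (mu_20 ** (p / 2) * mu_02 ** (q / 2)) for paired values (y, x)."""
    m = len(y)
    mean_y = sum(y) / m
    mean_x = sum(x) / m

    def mu(a, b):
        # Bivariate central moment with divisor m - 1, as in the definition above.
        return sum(
            (yi - mean_y) ** a * (xi - mean_x) ** b for yi, xi in zip(y, x)
        ) / (m - 1)

    return mu(p, q) / (mu(2, 0) ** (p / 2) * mu(0, 2) ** (q / 2))


if __name__ == "__main__":
    y = [2, 4, 6, 12]
    x = [1, 2, 3, 6]
    print("lambda_11 (correlation):", lambda_pq(y, x, 1, 1))
    print("lambda_04:", lambda_pq(y, x, 0, 4))
    print("lambda_22:", lambda_pq(y, x, 2, 2))
```

With $p = q = 1$ the helper returns the correlation coefficient, which gives a quick sanity check on the indexing convention (first index for y, second for x).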

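Now writing $v_a$ in terms of the $e_i$'s, we have

$$v_a = \sum_{h=1}^{L}\frac{W_h^2}{n_h}\left\{S_{yh}^2(1 + e_0) - \frac{S_{yxh}^2(1 + e_2)^2}{S_{xh}^2(1 + e_1)}\right\} + k\sum_{h=1}^{L} W_h^2 S_{xh}^2\left(1 + e_1 - 1\right)$$

$$= \sum_{h=1}^{L}\frac{W_h^2 S_{yh}^2}{n_h}\left[(1 + e_0) - \rho_h^2\left(1 + 2e_2 - e_1 + e_2^2 + e_1^2 - 2e_1 e_2 + \ldots\right)\right] + k\sum_{h=1}^{L} W_h^2 S_{xh}^2 e_1$$

$$= \sum_{h=1}^{L}\frac{W_h^2 S_{yh}^2}{n_h}\left(1 - \rho_h^2\right) + \sum_{h=1}^{L}\frac{W_h^2 S_{yh}^2}{n_h}\left\{e_0 - \rho_h^2\left(2e_2 - e_1 + e_2^2 + e_1^2 - 2e_1 e_2 + \ldots\right)\right\} + k\sum_{h=1}^{L} W_h^2 S_{xh}^2 e_1$$

so that

$$v_a - V(\bar{y}_s) = \sum_{h=1}^{L}\frac{W_h^2 S_{yh}^2}{n_h}\left\{e_0 - \rho_h^2\left(2e_2 - e_1 + e_2^2 + e_1^2 - 2e_1 e_2 + \ldots\right)\right\} + k\sum_{h=1}^{L} W_h^2 S_{xh}^2 e_1 \tag{9.2.2}$$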

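Taking expectation on both sides, we have the bias up to terms of order O(1/n) to be

$$Bias(v_a) = E\left[v_a - V(\bar{y}_s)\right]$$

$$= \sum_{h=1}^{L}\frac{W_h^2 S_{yh}^2}{n_h}\left[E(e_0) - \rho_h^2\left\{2E(e_2) - E(e_1) + E(e_2^2) + E(e_1^2) - 2E(e_1 e_2)\right\}\right] + k\sum_{h=1}^{L} W_h^2 S_{xh}^2 E(e_1)$$

$$= -\sum_{h=1}^{L}\frac{W_h^2 S_{yh}^2\rho_h^2}{n_h^2}\left[\left(\lambda_{04h} - 1\right) + \left(\frac{\lambda_{22h}}{\rho_h^2} - 1\right) - 2\left(\frac{\lambda_{13h}}{\rho_h} - 1\right)\right] \tag{9.2.3}$$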

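Squaring both sides of (9.2.2) and taking expectation, we have the mean square error of $v_a$ up to terms of order O(1/n) to be

$$MSE(v_a) = E\left[v_a - V(\bar{y}_s)\right]^2 = E\left[\sum_{h=1}^{L}\frac{W_h^2 S_{yh}^2}{n_h}\left\{e_0 - \rho_h^2\left(2e_2 - e_1\right)\right\} + k\sum_{h=1}^{L} W_h^2 S_{xh}^2 e_1\right]^2$$

$$= \sum_{h=1}^{L}\frac{W_h^4 S_{yh}^4}{n_h^2}\left[E(e_0^2) + \rho_h^4\left\{4E(e_2^2) + E(e_1^2) - 4E(e_1 e_2)\right\} + 2\rho_h^2\left\{E(e_0 e_1) - 2E(e_0 e_2)\right\}\right]$$

$$\quad + k^2\sum_{h=1}^{L} W_h^4 S_{xh}^4 E(e_1^2) + 2k\sum_{h=1}^{L}\frac{W_h^4 S_{yh}^2 S_{xh}^2}{n_h}\left[E(e_0 e_1) - \rho_h^2\left\{2E(e_1 e_2) - E(e_1^2)\right\}\right]$$

$$= \sum_{h=1}^{L}\frac{W_h^4 S_{yh}^4}{n_h^3}\left[\left(\lambda_{40h} - 1\right) + \rho_h^4 B_h + 2\rho_h^2 C_h\right] + k^2\sum_{h=1}^{L}\frac{W_h^4 S_{xh}^4}{n_h}\left(\lambda_{04h} - 1\right)$$

$$\quad + 2k\sum_{h=1}^{L}\frac{W_h^4 S_{yh}^2 S_{xh}^2}{n_h^2}\left[\left(\lambda_{22h} - 1\right) - \rho_h^2\left\{2\left(\frac{\lambda_{13h}}{\rho_h} - 1\right) - \left(\lambda_{04h} - 1\right)\right\}\right] \tag{9.2.4}$$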
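Because (9.2.4) is quadratic in k, the minimizing value follows from a one-line calculation. The sketch below works with the contribution of a single stratum; the shorthand $A_h$, $q_h$, $c_h$ and $D_h$ is introduced here for the illustration, with $D_h$ chosen to agree with the quantity of the same name in the text that follows.

```latex
\begin{align*}
M_h(k) &= A_h + k^2 q_h + 2k\,c_h, \qquad
 q_h = \frac{W_h^4 S_{xh}^4}{n_h}\left(\lambda_{04h}-1\right), \qquad
 c_h = -\frac{W_h^4 S_{yh}^2 S_{xh}^2}{n_h^2}\,D_h, \\
D_h &= \rho_h^2\left\{2\left(\frac{\lambda_{13h}}{\rho_h}-1\right)-\left(\lambda_{04h}-1\right)\right\}
       -\left(\lambda_{22h}-1\right), \\
\frac{dM_h}{dk} = 0 &\;\Longrightarrow\;
 k = -\frac{c_h}{q_h} = \frac{S_{yh}^2 D_h}{n_h S_{xh}^2\left(\lambda_{04h}-1\right)}, \\
M_h\!\left(-\frac{c_h}{q_h}\right) &= A_h - \frac{c_h^2}{q_h}
 = A_h - \frac{W_h^4 S_{yh}^4 D_h^2}{n_h^3\left(\lambda_{04h}-1\right)}
\end{align*}
```

Here $A_h$ stands for the k-free term $\frac{W_h^4 S_{yh}^4}{n_h^3}\left[(\lambda_{40h}-1) + \rho_h^4 B_h + 2\rho_h^2 C_h\right]$, so the reduction in mean square error relative to $v_s$ is the non-negative quantity $c_h^2/q_h$ whenever $\lambda_{04h} > 1$.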

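The optimum value of k minimizing the mean square error of $v_a$ in (9.2.4) is given by

$$k_o = \frac{S_{yh}^2\left[\rho_h^2\left\{2\left(\dfrac{\lambda_{13h}}{\rho_h} - 1\right) - \left(\lambda_{04h} - 1\right)\right\} - \left(\lambda_{22h} - 1\right)\right]}{n_h S_{xh}^2\left(\lambda_{04h} - 1\right)} = \frac{S_{yh}^2 D_h}{n_h S_{xh}^2\left(\lambda_{04h} - 1\right)} \tag{9.2.5}$$

where $D_h = \rho_h^2\left\{2\left(\dfrac{\lambda_{13h}}{\rho_h} - 1\right) - \left(\lambda_{04h} - 1\right)\right\} - \left(\lambda_{22h} - 1\right)$, and the corresponding minimum mean square error of $v_a$ is

$$MSE(v_a)_o = \sum_{h=1}^{L}\frac{W_h^4 S_{yh}^4}{n_h^3}\left[\left(\lambda_{40h} - 1\right) + \rho_h^4 B_h + 2\rho_h^2 C_h\right] - \sum_{h=1}^{L}\frac{W_h^4 S_{yh}^4 D_h^2}{n_h^3\left(\lambda_{04h} - 1\right)} \tag{9.2.6}$$

9.3 Estimator Based on Estimated Optimum $\hat{k}$

For situations where the values of $\lambda_{22h}$, $\lambda_{13h}$, $\lambda_{04h}$ and $\rho_h$, or good guessed values of them, are not available, the alternative is to replace them by their estimates $\hat{\lambda}_{22h}$, $\hat{\lambda}_{13h}$, $\hat{\lambda}_{04h}$ and $r_h$ based on sample values and get the estimated optimum value of $k_o$, denoted by $\hat{k}$, as

$$\hat{k} = \frac{s_{yh}^2\left[r_h^2\left\{2\left(\dfrac{\hat{\lambda}_{13h}}{r_h} - 1\right) - \left(\hat{\lambda}_{04h} - 1\right)\right\} - \left(\hat{\lambda}_{22h} - 1\right)\right]}{n_h s_{xh}^2\left(\hat{\lambda}_{04h} - 1\right)}$$

$$= \frac{s_{yh}^2\left[\dfrac{s_{yxh}^2}{s_{yh}^2 s_{xh}^2}\left\{2\left(\dfrac{\hat{\mu}_{13h}}{\hat{\mu}_{20h}^{1/2}\,\hat{\mu}_{02h}^{3/2}\,r_h} - 1\right) - \left(\dfrac{\hat{\mu}_{04h}}{\hat{\mu}_{02h}^2} - 1\right)\right\} - \left(\dfrac{\hat{\mu}_{22h}}{\hat{\mu}_{20h}\,\hat{\mu}_{02h}} - 1\right)\right]}{n_h s_{xh}^2\left(\dfrac{\hat{\mu}_{04h}}{\hat{\mu}_{02h}^2} - 1\right)} \tag{9.3.1}$$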

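where

$$\hat{\lambda}_{22h} = \frac{\hat{\mu}_{22h}}{\hat{\mu}_{20h}\,\hat{\mu}_{02h}} \quad \text{with} \quad \hat{\mu}_{22h} = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(y_{hi} - \bar{y}_h\right)^2\left(x_{hi} - \bar{x}_h\right)^2,$$

$$\hat{\mu}_{20h} = s_{yh}^2 = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(y_{hi} - \bar{y}_h\right)^2 \quad \text{and} \quad \hat{\mu}_{02h} = s_{xh}^2 = \frac{1}{n_h - 1}\sum_{i=1}^{n_h}\left(x_{hi} - \bar{x}_h\right)^2,$$

$$\hat{\lambda}_{13h} = \frac{\hat{\mu}_{13h}}{\hat{\mu}_{20h}^{1/2}\,\hat{\mu}_{02h}^{3/2}}, \quad \hat{\lambda}_{04h} = \frac{\hat{\mu}_{04h}}{\hat{\mu}_{02h}^2} \quad \text{and} \quad \hat{\rho}_h = r_h = \frac{s_{yxh}}{s_{yh}s_{xh}}$$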

Thus, replacing $k_o$ by the estimated optimum $\hat{k}$ in the estimator $v_a$ in (9.2.1), we get, for wider practical utility, the estimator $v_{ae}$ based on the estimated optimum $\hat{k}$, given by

$$v_{ae} = \sum_{h=1}^{L} W_h^2\left[\frac{s_{yh}^2\left(1 - r_h^2\right)}{n_h} + \hat{k}\left(s_{xh}^2 - S_{xh}^2\right)\right] \tag{9.3.2}$$
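The estimated optimum value can be computed stratum by stratum from the sample moments. The sketch below is illustrative only: the function name `k_hat` is hypothetical, all sample moments use the divisor $n_h - 1$ (an assumption for the third and fourth order moments), and the bracket in the numerator is expanded algebraically as $2 r_h\hat{\lambda}_{13h} - r_h^2(\hat{\lambda}_{04h} + 1) - \hat{\lambda}_{22h} + 1$ so that no division by $r_h$ is needed.

```python
def k_hat(y, x):
    """Estimated optimum k for one stratum from paired sample values (y, x)."""
    n = len(y)
    mean_y = sum(y) / n
    mean_x = sum(x) / n

    def mu(p, q):
        # Sample bivariate central moment with divisor n - 1 (assumption).
        return sum(
            (a - mean_y) ** p * (b - mean_x) ** q for a, b in zip(y, x)
        ) / (n - 1)

    s2_y, s2_x, s_yx = mu(2, 0), mu(0, 2), mu(1, 1)
    r = s_yx / (s2_y * s2_x) ** 0.5
    lam22 = mu(2, 2) / (s2_y * s2_x)
    lam13 = mu(1, 3) / (s2_y ** 0.5 * s2_x ** 1.5)
    lam04 = mu(0, 4) / s2_x ** 2
    # Same quantity as r^2 {2 (lam13 / r - 1) - (lam04 - 1)} - (lam22 - 1).
    d_hat = 2.0 * r * lam13 - r * r * (lam04 + 1.0) - lam22 + 1.0
    return s2_y * d_hat / (n * s2_x * (lam04 - 1.0))


if __name__ == "__main__":
    # Hypothetical stratum samples.
    print(k_hat([1, -1, -1, 1], [1, 2, 3, 4]))
    print(k_hat([2, 4, 6, 12], [1, 2, 3, 6]))
```

Substituting the returned value for $\hat{k}$ in (9.3.2), together with the known $S_{xh}^2$, gives the contribution of that stratum to $v_{ae}$.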



To find the mean square error of $v_{ae}$, let

$$\hat{\mu}_{22h} = \mu_{22h}(1 + e_3), \quad \hat{\mu}_{13h} = \mu_{13h}(1 + e_4), \quad \hat{\mu}_{04h} = \mu_{04h}(1 + e_5)$$

Substituting these, together with $s_{yh}^2 = S_{yh}^2(1 + e_0)$, $s_{xh}^2 = S_{xh}^2(1 + e_1)$ and $s_{yxh} = S_{yxh}(1 + e_2)$, in (9.3.1), we have

$$\hat{k} = \frac{S_{yh}^2(1 + e_0)\left[\dfrac{\rho_h^2(1 + e_2)^2}{(1 + e_0)(1 + e_1)}\left\{2\left(\dfrac{\mu_{13h}(1 + e_4)}{\mu_{20h}^{1/2}(1 + e_0)^{1/2}\,\mu_{02h}^{3/2}(1 + e_1)^{3/2}\,\rho_h(1 + e_2)(1 + e_0)^{-1/2}(1 + e_1)^{-1/2}} - 1\right) - \left(\dfrac{\mu_{04h}(1 + e_5)}{\mu_{02h}^2(1 + e_1)^2} - 1\right)\right\} - \left(\dfrac{\mu_{22h}(1 + e_3)}{\mu_{20h}\,\mu_{02h}(1 + e_0)(1 + e_1)} - 1\right)\right]}{n_h S_{xh}^2(1 + e_1)\left(\dfrac{\mu_{04h}(1 + e_5)}{\mu_{02h}^2(1 + e_1)^2} - 1\right)}$$

$$= \frac{S_{yh}^2}{n_h S_{xh}^2\left(\lambda_{04h} - 1\right)}\left[D_h + \text{terms of first and higher order in } e_0, e_1, \ldots, e_5\right] \tag{9.3.3}$$

so that the leading term of $\hat{k}$ is the optimum value $k_o$ in (9.2.5).

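Substituting $\hat{k}$ from (9.3.3) in (9.3.2), squaring both sides, ignoring terms in the $e_i$'s of degree greater than two and taking expectation, we have the mean square error of $v_{ae}$ to the first degree of approximation, that is up to terms of order O(1/n), to be

$$MSE(v_{ae}) = E\left[\sum_{h=1}^{L}\frac{W_h^2 S_{yh}^2}{n_h}\left\{e_0 - \rho_h^2\left(2e_2 - e_1\right)\right\} + \sum_{h=1}^{L}\frac{W_h^2 S_{yh}^2 D_h}{n_h\left(\lambda_{04h} - 1\right)}e_1\right]^2$$

$$= \sum_{h=1}^{L}\frac{W_h^4 S_{yh}^4}{n_h^3}\left[\left(\lambda_{40h} - 1\right) + \rho_h^4 B_h + 2\rho_h^2 C_h\right] - \sum_{h=1}^{L}\frac{W_h^4 S_{yh}^4 D_h^2}{n_h^3\left(\lambda_{04h} - 1\right)} \tag{9.3.4}$$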



which shows that the estimator $v_{ae}$ in (9.3.2), based on the estimated optimum $\hat{k}$, attains the same minimum mean square error as $v_a$ in (9.2.6), which depends on the optimum value $k_o$ in (9.2.5).


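a). From (9.2.6), for the optimum value $k_o$, the estimator $v_a$ attains the minimum mean square error

$$MSE(v_a)_o = \sum_{h=1}^{L}\frac{W_h^4 S_{yh}^4}{n_h^3}\left[\left(\lambda_{40h} - 1\right) + \rho_h^4 B_h + 2\rho_h^2 C_h\right] - \sum_{h=1}^{L}\frac{W_h^4 S_{yh}^4 D_h^2}{n_h^3\left(\lambda_{04h} - 1\right)} \tag{9.4.1}$$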

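b). From (9.3.4), the estimator $v_{ae}$ depending upon the estimated optimum $\hat{k}$ attains the same minimum mean square error

$$MSE(v_{ae}) = \sum_{h=1}^{L}\frac{W_h^4 S_{yh}^4}{n_h^3}\left[\left(\lambda_{40h} - 1\right) + \rho_h^4 B_h + 2\rho_h^2 C_h\right] - \sum_{h=1}^{L}\frac{W_h^4 S_{yh}^4 D_h^2}{n_h^3\left(\lambda_{04h} - 1\right)} \tag{9.4.2}$$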

c). From (9.4.1) or (9.4.2), we see that the estimator $v_{ae}$ depending on the estimated optimum value is always more efficient than the usual estimator $v_s$ of the variance of the separate regression-type estimator $\bar{y}_s = \sum_{h=1}^{L} W_h\left\{\bar{y}_h + b_h\left(\bar{X}_h - \bar{x}_h\right)\right\}$ for a non-symmetrical population, in the sense of having lesser mean square error.

BIBLIOGRAPHY

[1] BAHL, S. AND TUTEJA, R.K. (1991). “Ratio and product type

exponential estimator”, Information and optimization sciences, 12 (1), p-

159-163.

[2] BANDYOPADHYAY S. (1980). “Improved ratio and product

estimators”, Sankhya, 42, p-45-56.

[3] BOWLEY, A.L. (1926). “Measurement of the precision attained in

sampling”, Bulletin of the International Statistical Institute, 22.

[4] BREWER, K.R.W. (1963), "Ratio Estimation and Finite Populations:

Some Results Deducible from the Assumption of an Underlying

Stochastic Process," Australian Journal of Statistics, 5, p-93-105.

[5] BREWER, K.R.W. (1979). “A class of robust sampling designs for

large scale surveys”, J. Amer. Statist. Assoc., 74, p-911-915.

[6] CASSEL, C., SARNDAL, C. AND WRETMAN, J.H. (1977).

“Foundations of Inference in Survey Sampling”, Wiley, New York.

[7] CASSEL, C.M., SARNDAL, C.E., and WRETMAN, J.H. (1976),

"Some Results on Generalized Difference Estimation and Generalized

Regression Estimation for Finite Populations", Biometrika, 63, p-615-

620.

[8] CASSEL, C.M., SARNDAL, C.E., And WRETMAN, J.H. (1983).

“Some uses of statistical models in connection with the non response



problem. In Incomplete Data in Sample Surveys”, vol. 3, pp. 143-160.

New York: Academic Press.

[9] COCHRAN, W. G. (1977). “Sampling techniques”, Third edition,

Wiley, New York, USA.

[10] COCHRAN, W.G. (1961). "Comparison of Methods for Determining

Stratum Boundaries," Bulletin of the International Statistical Institute,

38, p-345-358.

[11] CHAUDHARY, M.K., SINGH, V.K., SHUKLA, R.K. (2012).

“Combined-Type Family Of Estimators Of Population Mean In

Stratified Random Sampling Under Non-Response”, Journal of

Reliability and Statistical Studies, Vol. 5, Issue 2 (2012): p-133-142

[12] DAVIES, O.L., GOLDSMITH, P.L. (1976). “Statistical methods in

research and production”, London: Longman Group Ltd.

[13] DEMING, W. E., STEPHAN, F. F. (1941). “On the interpretation of

censuses as samples”, J.Amer. Statist. Assoc. 36, p-45-49.

[14] DEMING, W. E. (1953). “On the distinction between enumerative and

analytic surveys”, J. Amer Statist. Assoc. 48, p-244-255.

[15] DES RAJ (1972). “The Design of Sampling Surveys”, McGraw-Hill,

New York.

[16] DAS, A.K., TRIPATHI, T.P. (1981). “A class of sampling strategies for

population means using information on mean and variance of an

auxiliary character”, Technical report No. 23/81, Stat and Math.

Division, ISI, Calcutta.



[17] DIANA, G. (1993). “A class of estimators of the population mean in

stratified random sampling”, Statistica, 53, 1, p-59-66.

[18] FELLER, W. (1966). “An Introduction to Probability Theory and its

Applications”, Volume II, New York: John Wiley.

[19] GLESER, L.J., HEALY, J.D. (1976). “Estimating the mean of a

normal distribution with known coefficient of variation”, Journal of

the American Statistical Association, 71, p-977-981.

[20] GUILFORD, J.P. (1975). “Psychometric methods”, TATA McGraw

Hill Publishing Company Ltd., Bombay.

[21] GHOSH, J.K. (1988). “Statistical Information and Likelihood”, A

Collection of Critical Essays by Dr. D. Basu, Lecture Notes in Statistics,

45, Springer-Verlag, New York.

[22] GODAMBE V.P. (1982). “Estimation in survey sampling: robustness

and optimality”, Amer. Statist. Assoc., 77, p-393-406.

[23] GODAMBE, V.P. (1955). "A Unified Theory of Sampling From Finite

Populations", Journal of the Royal Statistical Society, Ser. B, 17, p-269-

278.

[24] GODAMBE, V.P. and THOMPSON, M.E. (1976). "Philosophy of

Survey Sampling Practice," in Foundations of Probability Theory,

Statistical Inference, and Statistical Theories of Science, eds. W. Harper

and C.A. Hooker, Boston: D. Reidel.



[25] GUPTA, S. and SHABBIR, J. (2010). “Variance estimation for the

regression estimator of the mean in stratified random sampling”, Journal

of Indian Society of Agricultural Statistics, 64(2), p-255-260.

[26] GUPTA, R. K., MISRA, S. (2006). “Estimation of Population Variance

Using Ratio Type Estimator”, Indian Journal of Mathematics And

Mathematical Sciences Vol.2,No.2, p-169-176.

[27] HALD, A. (1952). “Statistical theory with engineering applications”,

John Wiley and Sons, Inc, New York.

[28] HENDERSON, C. R. (1949). “Estimates of changes in herd

environment”, J. Dairy Sci. 32, 706.

[29] HANSEN, M.H., HURWITZ, W.N., and MADOW, W.G. (1953a).

“Sample Survey Methods and Theory”, Vol. 1, New York: John Wiley.

[30] HANSEN, M.H., HURWITZ, W.N., and MADOW, W.G. (1953b).

“Sample Survey Methods and Theory”, Vol. II, New York: John Wiley.

[31] HARTLEY, H.O., and RAO, J.N.K. (1968). “A New Estimation Theory

for Sample Surveys”, Biometrika, 55, p-547-558.

[32] ISAKI, C.T., and FULLER, W.A. (1982), "Survey Design under a

Regression Superpopulation Model" ,Journal of the American Statistical

Association, 77, p-89-96.

[33] JHAJJ, H.S., SHARMA, M.K. and GROVER, L.K. (2005). “An

efficient class of chain estimators of population variance under sub-

sampling scheme”, J. Jap. Stat. Soc., 35, p-273-286.



[34] JOSHI, V. M. (1968). “Admissibility of the sample mean as estimate

of the mean of a finite population”, Ann. Math. Statist., 39, p-606-

620.

[35] KADILAR, C., CINGI, H. (2004). “Ratio estimators in simple random

sampling”, Appl. Math. Comp.,151, 3, p- 893-902.

[36] KADILAR, C., CINGI, H. (2003). “Ratio estimators in stratified

random sampling”, Biometrical Journal, 45 (2), p-218–225.

[37] KADILAR, C., CINGI, H. (2005). “A new ratio estimator in stratified

sampling”, Communications in Statistics—Theory and Methods, 34, p-

597–602.

[38] KALBFLEISCH, J.D., and SPROTT, D.A. (1969). "Application of

Likelihood and Fiducial Probability to Sampling Finite Populations,"

in New Developments in Survey Sampling, eds. N.L. Johnson and H.

Smith, Jr., New York: John Wiley, p-358-389.

[39] KHOSHNEVISAN, M., SINGH, R., CHAUHAN, P., SAWAN, N. and

SMARANDACHE, F. (2007). “A general family of estimators for

estimating population mean using known value of some population

parameter(s)”, Far East J. Theor. Statist., 22, p- 181- 191.

[40] KOYUNCU, N. and KADILAR, C. (2008). “Ratio and product

estimators in stratified random sampling”, Journal of statistical planning

and inference, 3820, p-2-7.

[41] KOYUNCU, N., KADILAR, C. (2009). “Family of estimators of

population mean using two auxiliary variables in stratified random



sampling”, Communications in Statistics—Theory and Methods, 38, p-

2398–2417.

[42] KISH, L. (1965). “Survey Sampling”, New York: John Wiley and Sons.

[43] MURTHY, M. N. (1967). “Sampling Theory and Methods”, Calcutta,

India: Statistical Publishing Society.

[44] MISRA S., YADAV S.K. (2011). “Ratio Type Estimator of Population

Variance Using Qualitative Auxiliary Information”, International

Transactions in Mathematical Sciences and Computer (ISSN 0974-

5068) V3 N2 .

[45] MISRA, S., GUPTA R.K., SHUKLA A.K., (2012). “Generalized Class

of Estimators for Estimation of Finite Population Variance”, Int. J.

Agricult. Stats. Sci., Vol.8, No.2, p-447-458.

[46] MISRA, S., SINGH, R. Karan (2005). “A Class of Estimators for

Estimating Population Mean Using Auxiliary Information on

Moments about Zero”, Lucknow Journal of Science 2005, Vol 2, No. 1,

p-29-39

[47] MISRA, S., MANEESHA and SINGH, R. Karan (2004). “Double

Sampling Ratio Estimator in the presence of Measurement Errors”,

Lucknow Journal of Science, Vol. 1, No. 2, p-21-27 (India).

[48] MISRA, S., SINGH, R. K. (2003). “Estimation of population variance

using difference type estimator”, Journal of International Academy of

Physical Sciences Vol 7, p- 49-54.



[49] MANEESHA, SINGH R. K. (2001). “An estimator of population mean

in the presence of measurement errors”, Jour. Ind. Soc. Ag. Statistics,

Vol. 54, p-13-18.

[50] MISRA S., YADAV S. K. & PANDEY A. (2008). “Ratio Type

Estimator of Square of Coefficient of Variation Using Qualitative

Auxiliary Information”, JRSS, Vol. 1, Issue 1, p- 44-52.

[51] NIGAM, A.K., SINGH, R. Karan (1994). “A method of sampling with

replacement”, Sankhya Vol. 56, series B, pt. 3, p-369-373.

[52] PRASAD, B. and SINGH, H.P. (1990). “Some improved ratio type

estimators of finite population variance in sample survey”, Comm.

Statist.-Theory Methods, 19, p-1127-1139.

[53] PRASAD, B. and SINGH, H.P. (1992). “Unbiased estimators of finite

population variance using auxiliary information in sample survey”,

Comm. Statist.-Theory Methods, 21, p-1367-1376.

[54] PATHAK, P. K. (1964a). “Sufficiency in sampling theory”, Annals of

Mathematical Statistics, 35, p-795-808.

[55] PATHAK, P. K. (1964b). “On inverse sampling with unequal

probabilities”, Biometrika, 51, p-185-193.

[56] PATHAK, P.K. eds., Lecture Notes- Monograph Series, Institute of

Mathematical Statistics, 17, Hayward, CA, 178-186.

[57] PFEFFERMANN, D., and SVERCHKOV, M. (1999). “Parametric

and semi-parametric estimation of regression models fitted to

survey data”, Sankhya, Series B, 61, p-166-186.



[58] RAO, J. N. K. (1984). “Conditional inference in survey sampling”,

Survey Methodology, 11, p-15-31.

[59] RAO, J.N.K. (2003). “Small Area Estimation”, New York: Wiley.

[60] ROBSON, D.S. (1957). “Application of multivariate polykays to the

theory of unbiased ratio type estimators”, J. Amer. Statist. Assoc., 52, p-

511-522.

[61] ROYALL, R. M. and CUMBERLAND, W. G. (1981). “An Empirical Study of

the Ratio Estimator and Estimators of its Variance”, Journal of the

American Statistical Association, Vol. 76, No. 373, p-66-77.

[62] ROYALL R.M. (1970). “On finite population sampling theory under

certain linear regression models”, Biometrika, 57, p-377-387.

[63] ROYALL R.M.(1971). “Linear regression models in finite populations

sampling theory”, In Foundations of Statistical Inference, V.P. Godambe

and D. A. Sprott, Eds. Holt, Rinehart and Winston, Toronto.

[64] ROYALL, R.M. (1976). “The Model Based (Prediction) Approach to

Finite Population Sampling Theory”, IMS Lecture Notes Monograph

Series, Volume 17, p- 225-240.

[65] ROYALL, R. M. (1968). "An Old Approach to Finite Population

Sampling Theory", Journal of the American Statistical Association, 63,

p-1269-1279.

[66] ROYALL, R. M. (1976). "Current Advances in Sampling Theory

Implications for Human Observational Studies", American Journal of

Epidemiology, 104, p-463-473.



[67] ROYALL, R. M., and CUMBERLAND, W.G. (1978a). "Variance

Estimation in Finite Population Sampling", Journal of the American

Statistical Association, 73, p-351-358.

[68] ROYALL, R. M., and EBERHARDT, K.R. (1975). "Variance Estimates

for the Ratio Estimator", Sankhya, Ser. C, 37, p-43-52.

[69] ROYALL, R. M., and HERSON, J.H. (1973a). "Robust Estimation in

Finite Populations I", Journal of the American Statistical Association,

68, p-880-889.

[70] ROYALL, R. M., and HERSON, J.H. (1973b). "Robust Estimation in

Finite Populations II: Stratification on a Size Variable", Journal of the

American Statistical Association, 68, p-890-893.

[71] RAO, T.J. (1991). “On certain methods of improving ratio and

regression estimators”, Commun. Statist. Theo. Meth., 20, 10, p-3325-

3340.

[72] SEARLS, D.T. (1964). “The Utilization of Known Coefficient of

Variation in the Estimation Procedure”, J. Amer. Statist. Assoc., 59,

p-1225-26.

[73] SINGH, R. and MANGAT, N.S. (1996). “Elements of survey

sampling”, Kluwer Academic Publisher.

[74] SINGH, H.P., TAILOR, R. (2005). “Estimation of finite population

mean using known correlation coefficient between auxiliary characters”,

Statistica, 65, 4, p-407-418.



[75] SINGH, H.P., SOLANKI, R.S. (2012). “Improved estimation of

population mean in simple random sampling using information on

auxiliary attribute”,Appl. Math. Comp., 218, p-7798-7812.

[76] SINGH, H.P., KARPE, N. (2009). “On the estimation of ratio and

product of two population means using supplementary information in

presence of measurement errors”, Statistica, 69, 1, p-27-47.

[77] SHALABH (1999). “Ratio method of estimation in the presence of

measurement errors I”, Ind. Soc. Ag. Statistics, Vol. 52, p-150-155.

[78] SINGH, R.K. and SINGH, G. (1984). “A class of estimators with

estimated optimum values in sample survey”, Statistics & Probability

Letters, 2, p-319-321, North Holland.

[79] SINGH, H. P. and VISHWAKARMA, G. K. (2008). “A Family of

Estimators of Population Mean Using Auxiliary Information in

Stratified Sampling”, Communications in Statistics—Theory and

Methods, 37, p-1038–1050.

[80] SINGH, R.K. (1998). “Sequential estimation of the mean of normal

population with known coefficient of variation”, METRON, Vol. LVI n.

3-4, p-73-90.

[81] SHABBIR, J., GUPTA, S. (2005). “Improved ratio estimators in

stratified sampling”, American Journal of Mathematical and

Management Sciences, 25 (3-4), p- 293-311.



[82] SHABBIR, J., GUPTA, S. (2006). “A new estimator of population

mean in stratified Sampling”, Communications in Statistics—Theory

and Methods, 35, p-1201–1209.

[83] SRIVASTAVA, S.K. and JHAJJ, H.S. (1981). “A class of estimators of

population mean in survey sampling using auxiliary information”,

Biometrika 68 (1), p- 341-343.

[84] SNEDECOR, G.W. (1946). “Statistical methods”, Ames, Iowa: The Iowa

State College Press.

[85] SARNDAL, C.E. (1982). "Implications of Survey Design for

Generalized Regression Estimation of Linear Functions", Journal of

Statistical Planning and Inference, p-155-170.

[86] SCOTT, A., and SMITH, T.M.F. (1969). "Estimation in Multi-Stage

Surveys", Journal of the American Statistical Association, 64, p-830-

840.

[87] SHUKLA, N D and PANDEY, S K (1982). “A note on product

estimator”, Pure Appl. Math. Soc. 15, 12, p-97-101.

[88] SINGH, D. and CHAUDHARY, F.S. (1989). “Theory and Analysis of

Sample Survey Designs”, Wiley Eastern Limited, New Delhi-110012

(INDIA).

[89] SINGH, M P. (1967). "Ratio Cum Product Method of Estimation",

Metrika, Vol 12, No 1, p-34-42.



[90] SMITH, T. M. F. (1981). “Regression analysis for complex surveys”, In:

D. Krewski, R. Platek and J.N.K. Rao, Eds. Current Topics in Survey

Sampling. Academic Press, New York, p-267-292.

[91] SRIVASTAVA V.K. and BHATNAGAR S, (1981). “Ratio and product

methods of estimation when X is not known” Journal of Statistical

Research 15, p-29-39.

[92] SRIVASTAVA, S.K. (1971). “A generalized estimator for the mean of a

finite population using multiauxiliary information”, J. Amer. Statist.

Assoc. 66, p-404-407.

[93] SRIVASTAVA, S.K. and JHAJJ, H.S. (1980). “A class of estimators

using auxiliary information for estimating the finite population variance”,

Sankhya, C42, p-87-96.

[94] SRIVASTAVA, S.K. and JHAJJ, H.S. (1981). “A class of estimators of

the population mean in survey sampling using auxiliary information”,

Biometrika, 68, p-341-343.

[95] SRIVASTAVA, S.K. and JHAJJ, H.S. (1983). “A class of estimators of

mean and variance using auxiliary information when correlation

coefficient is known”, Biom. J., 25,p- 401-407.

[96] SRIVASTAVA, S.K., B.B. KHARE and S.R. SRIVASTAVA, (1990).

“A generalized chain ratio estimator for mean of finite population”, J.

Ind. Soc. Agri. Statist., 42, p-108-117.



[97] STANEK E J, O'HEARN J R (1998). “Estimating realized random

effects”, Communications in Statistics Theory and Methods ,Volume:

27, p-1021-1048 ISSN: 03610926.

[98] STUART, A. (1976). “Basic Ideas of Scientific Sampling (2nd ed.)”,

New York: Hafner.

[99] SUKHATME, P.V., SUKHATME, B. V., SUKHATME, S. and

ASHOK, C. (1984). “Sampling Theory of Surveys with Applications”.

[100] THOMPSON, M. E. (1997). “Theory of Sample Surveys”, London:

Chapman & Hall. ISBN 0-412-31780-X

[101] TAILOR, R., TAILOR, R., PARMAR, R. and KUMAR, M. (2012).

“Dual to Ratio-cum-Product estimator using known parameters of

auxiliary variables”, Journal of Reliability and Statistical Studies 5(1), p-

65-71.

[102] UPADHYAYA, L. N., SINGH, H. P. (1999). “Use of transformed

auxiliary variable in estimating the finite population mean”, Biometrical

Journal, 41 (5), p-627-636.

[103] VISWAKARMA, G.K., SINGH, H.P. (2011). “Separate Ratio-Product

Estimator for Estimating Population Mean using Auxiliary

Information”, Journal of Statistical Theory and Applications, Volume

10, Number 4, 2011, p-653-664.

[104] WATSON, D.J., (1937). “The estimation of leaf areas”, Jour. Agr. Sci.,

27: 474.



[105] WU, C.F.J. (1985). “Variance estimation for the combined ratio and

combined regression estimators”, J. Roy. Statist. Soc., 47,p-147-154.

[106] WALPOLE, R.E., MYERS, R.H., MYERS, S.L. and YE, K. (2005).

“Probability & Statistics for Engineers & Scientists”, Pearson Education

(Singapore) Pte. Ltd., India Branch, 482 F.I.E. Patparganj, Delhi-

110092, (INDIA)
