CHAPTER 1
INTRODUCTION
Statistical methods are increasingly applied to socio-economic parameters and related problems. Socio-economic statistics thus form an important factor in the development of a country. They include a vast array of information on health, literacy status, the labour force, economic status (such as poverty and unemployment), and population parameters such as fertility, migration and gender effects.
Socio-economic parameters and related problems:
The socio-economic parameters we have chosen to study using the official statistics are mainly rural poverty, literacy, migration, child labour, unemployment and related problems.
1. Literacy: Literacy is fundamental to many state-sponsored interventions in less developed countries because of its pervasive influence on economically relevant variables such as productivity, health and earnings, quite apart from its intrinsic value as a vitally important goal of development. Traditionally, a society's literacy is measured by the "literacy rate", that is, the percent (or, equivalently, the fraction) of the adult population that is literate. The present work argues that the distribution of literates across households may also matter, due to the external effects of literacy: the benefits that illiterate members of a household derive from having a literate person in the family. This argument was introduced by Basu and Foster (1998), who suggested a new measure of literacy, the "effective literacy" rate, to which further amendments were proposed by Subramanian (1999). The measure of literacy most widely employed is the familiar literacy rate r, the proportion of the adult population that is literate: r = L/N, where L is the number of literates and N the number of persons.
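For concreteness, the household-based amendment can be sketched as follows (our paraphrase of the Basu-Foster construction; ε denotes their weighting parameter): counting each illiterate who lives in a household with at least one literate member as equivalent to a fraction ε of a literate person gives the effective literacy rate

r* = r + εp,  0 < ε < 1,

where p is the proportion of such "proximate illiterates" in the adult population.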
2. Religion composition: The term religion is used to mean any shared set of beliefs, activities and institutions premised upon faith in supernatural forces. Abundant evidence affirms that religious belief affects a wide range of behavioral outcomes, and religious activity can also affect economic performance at the level of the individual, group, state or nation. Hindus, Muslims, Christians, Sikhs, Buddhists, Jains etc. are some of the main religious communities in Gujarat.
3. Rural poverty and Total Factor Productivity: In India, only 28% of the population was living in urban areas as per the 2001 census (Pranati Datta, 2006). There is still a large pressure of population in rural India, so the best way to study poverty in India is to study rural poverty. Poverty in rural India declined substantially during the period 1970-1993. With few exceptions (Bardhan, 1973; Griffin and Ghose, 1979), most studies have found an inverse relationship between growth in agricultural income and the incidence of rural poverty. A purpose of this research is to investigate the contribution of agricultural growth, as well as other socio-economic parameters, to rural poverty reduction in India. The poverty measure we have used is the poverty head-count ratio (HCR): the proportion of the national population whose incomes are below the official threshold (or thresholds) set by the national government. The main factors chosen for our study are rural development expenditure by government (DEV), percentage of cropped area sown with high-yielding varieties (HYV), percentage of cropped area irrigated (IRR), percentage of villages electrified (ELE), percentage of rural population that is literate (LITE), road density in rural India (ROAD), total factor productivity growth in Indian agriculture (TFP), and changes in rural wages (WAGES).
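Schematically, this sets up a regression of the form (an illustrative specification only; the exact models are developed in Chapter 2):

HCR_it = β0 + β1 DEV_it + β2 HYV_it + β3 IRR_it + β4 ELE_it + β5 LITE_it + β6 ROAD_it + β7 TFP_it + β8 WAGES_it + u_it,

for state i in year t.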
As a result of the rapid adoption of new technology and improved rural infrastructure, agricultural production and factor productivity have both grown rapidly in India. Five major crops (rice, wheat, sorghum, pearl millet and maize), fourteen minor crops (barley, cotton, groundnut, other grains, other pulses, potato, rapeseed, mustard, sesame, sugar, tobacco, soybeans, jute and sunflower) and three major livestock products (milk, meat and chicken) are included in this measure of total production. Total factor productivity (TFP) is defined as aggregate output minus aggregate input. Five inputs (labor, land, fertilizer, tractors and buffaloes) are included.
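In growth-rate form this definition reads (a standard formulation, stated here for clarity)

Δ ln TFP_t = Δ ln Y_t − Δ ln X_t,

where Y_t is the aggregate index of the crop and livestock outputs listed above and X_t is the aggregate index of the five inputs.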
4. In-migration: We have applied cluster analysis techniques to examine the degree of migration integration in Gujarat state, focusing in particular on the in-migration process. Dalkhat Ediev (2003) and David Coleman (2006) have studied migration as a factor of population reproduction. Migration remains a major player in the population structure and hence plays a key role in the transmission of population-related policies for the particular area (Gujarat). We have focused on a data set consisting mainly of reason variables for in-migration into Gujarat from distinct places of origin.
5. Child labour: Our study also deals with the incidence of child labour in India for the period 1990-1991. According to an article by D. P. Chaudhri (1997), the incidence of total child labour in India is around 10 million during the decade of 2000, and he also noted a declining trend. Here we are interested in checking this assumption. We have also tried to focus on the impact of socio-economic parameters on the decline in the incidence of child labour.
6. Unemployment: The provision of increasing employment opportunities, in both urban and rural areas, to solve the problem of unemployment has been recognized as an important objective of economic planning in India. Unemployment is measured through labour force surveys, which elicit the "activity" status of the respondent for a given reference period. Our focus was to study the impact of socio-economic variables on rural unemployment.
MULTIVARIATE TECHNIQUES:
In this research work, we have discussed some advanced multivariate techniques, which require a high level of statistical knowledge. Multivariate methods deal with the simultaneous treatment of several variables. They have a useful role in analyzing data from developing country surveys. We first discuss the effective use of such methods: (a) as an exploratory tool to investigate patterns in the data; (b) to identify natural groupings of the population for further analysis; and (c) to reduce dimensionality in the number of variables involved.
1. MULTIPLE REGRESSION ANALYSIS:
Regression analysis is an important statistical technique for analyzing data in the social, economic and engineering sciences. Regression models involve unknown parameters, denoted β, which may be a scalar or a vector of length k; the independent variables x; and the dependent variable y. The regression methods used in this research work are the following:
Step-wise regression method: Step-wise regression is a widely used regression method. It gives a better idea of the independent contribution of each explanatory variable. Under this technique, explanatory variables are added to the prediction equation one at a time, computing the coefficients and R² at each step.
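As an illustration, the following is a minimal sketch of this forward step-wise scheme in Python (statsmodels is assumed for the OLS fits; the data frame, response and candidate column names are hypothetical, not the thesis's data):

    import statsmodels.api as sm

    def forward_stepwise(df, response, candidates, enter_p=0.05):
        """Add, one at a time, the candidate regressor whose entry is most
        significant, reporting R^2 at each step."""
        selected = []
        while True:
            remaining = [c for c in candidates if c not in selected]
            pvals = {}
            for c in remaining:
                X = sm.add_constant(df[selected + [c]])
                pvals[c] = sm.OLS(df[response], X).fit().pvalues[c]
            if not pvals:
                break
            best = min(pvals, key=pvals.get)
            if pvals[best] > enter_p:
                break
            selected.append(best)
            fit = sm.OLS(df[response], sm.add_constant(df[selected])).fit()
            print(f"step {len(selected)}: +{best}, R^2 = {fit.rsquared:.3f}")
        return selected

For the poverty application, a call might look like forward_stepwise(data, "HCR", ["DEV", "HYV", "IRR", "ELE", "LITE", "ROAD", "TFP", "WAGES"]), with the variables as defined above.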
Pooled Panel data analysis: Pooled Panel data analysis is a method of
studying a particular subject within multiple sites, periodically observed over a
defined time frame. The combination of time series with cross-sections can enhance
the quality and quantity of data in ways that would be impossible using only one of
these two dimensions (Gujarati, 2003). This facilitates the construction and testing of more realistic behavioral models that could not be identified using only a cross-section or a single time-series data set.
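A minimal sketch of pooled OLS on a stacked state-by-year panel (the data here are synthetic and the variable names simply reuse the abbreviations introduced above; statsmodels is assumed):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Hypothetical stacked panel: 15 states x 20 years, one row per (state, year).
    rng = np.random.default_rng(0)
    n = 15 * 20
    panel = pd.DataFrame({
        "DEV": rng.random(n),
        "IRR": rng.random(n),
        "LITE": rng.random(n),
    })
    panel["HCR"] = 0.6 - 0.3 * panel["LITE"] + rng.normal(0, 0.05, n)

    # Pooled OLS: a single regression over all cross-sections and years,
    # ignoring state- and year-specific effects.
    X = sm.add_constant(panel[["DEV", "IRR", "LITE"]])
    print(sm.OLS(panel["HCR"], X).fit().summary())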
Robust regression: Robust regression is an important tool for analyzing data that are contaminated with outliers. It can be used to detect outliers and to provide resistant (stable) results in their presence. In order to achieve this stability, robust regression limits the influence of outliers. In theory, when the assumption of normality is not met, the standard least squares estimates of the regression coefficients will be biased and/or inefficient; see for example Hampel et al. (1986). For linear regression problems in which the normality assumption fails, several alternatives to standard Least Squares (LS) regression have been proposed (Draper & Smith, 1998; Kutner et al., 2004; Ortiz et al., 2006). Among these, the estimation methods we have applied are robust M-estimation (Huber, 1964); the Least Absolute Deviations (LAD) method (Dielman & Pfaffenberger, 1982; Bloomfield & Steiger, 1983); nonparametric (rank-based) methods (Adichie, 1967; Jureckova, 1971; Jaeckel, 1972); LTS estimation (Rousseeuw, 1984); S estimation (Rousseeuw and Yohai, 1984); and LMS estimation (Rousseeuw and Leroy, 1987).
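A minimal sketch contrasting ordinary least squares with two of the robust alternatives named above, Huber M-estimation and LAD, on deliberately contaminated synthetic data (statsmodels is assumed; LAD is fitted here as median regression):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, 50)
    y = 2.0 + 0.5 * x + rng.normal(0, 1, 50)
    y[:3] += 15                 # contaminate with a few large outliers

    X = sm.add_constant(x)
    ols = sm.OLS(y, X).fit()                                 # distorted by the outliers
    huber = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()   # M-estimation (Huber, 1964)
    lad = sm.QuantReg(y, X).fit(q=0.5)                       # LAD = median regression

    print(ols.params, huber.params, lad.params)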
Regression analysis with change-point: Quite often in practice the regression coefficients are assumed constant. In many real-life problems, however, theoretical or empirical deliberations suggest models in which one or more parameters change occasionally. The main parameter of interest in such regression analyses is the shift-point parameter, which indexes when or where the unknown change occurred.
A variety of problems, such as switching straight lines (Ferreira,
1975), shifts of level or slope in linear time-series models (Smith, 1980), detection of
ovulation time in women (Carter and Blight, 1983), and many others, have been
studied during the last two decades. Holbert (1982), while reviewing Bayesian
developments in structural change from 1968 onward, gives a variety of interesting
examples from economics and biology.
Two-phase linear regression model: The two-phase linear regression (TPLR) model is one of the many models which exhibit structural change. Holbert (1982) used a Bayesian approach, based on the TPLR model, to reexamine the McGee and Carleton (1970) data for stock market sales volume and reached the same conclusion, that the abolition of split-ups did hurt the regional exchanges. Let the regression model be Yi = β1 Xi + εi, where the εi are i.i.d. N(0, σ²) random errors with variance σ² > 0, and suppose there is a change in the system at some point of time m, reflected in the sequence after Xm by a change in the slope (regression) parameter. The problem of study is when and where this change started occurring; this is called the change-point inference problem.
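For intuition, a minimal least-squares sketch of shift-point location in a two-phase model: fit each phase at every admissible split and keep the split with the smallest total residual sum of squares. This is an illustrative frequentist device, not the Bayesian treatment developed in Chapter 6:

    import numpy as np

    def estimate_shift_point(x, y, min_seg=3):
        """Grid-search estimate of the change point m in a two-phase linear
        regression: fit a separate line to each segment and return the split
        minimising the total residual sum of squares."""
        n = len(y)
        best_m, best_rss = None, np.inf
        for m in range(min_seg, n - min_seg + 1):
            rss = 0.0
            for seg_x, seg_y in ((x[:m], y[:m]), (x[m:], y[m:])):
                A = np.column_stack([np.ones(len(seg_x)), seg_x])
                coef, res, _, _ = np.linalg.lstsq(A, seg_y, rcond=None)
                rss += res[0] if res.size else np.sum((seg_y - A @ coef) ** 2)
            if rss < best_rss:
                best_m, best_rss = m, rss
        return best_m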
2. MULTIVARIATE SPATIAL ANALYSIS:
According to Anselin (1988), spatial econometrics addresses issues caused by space in the statistical analysis of regional science regressions. In other words, spatial analysis is the combination of statistical and econometric methods that deal with problems concerning spatial effects. We have focused on two issues that are often overlooked in technical treatments of the methods of spatial statistics and spatial econometrics.
There are two opposite approaches towards dealing with spatially referenced data (Anselin 1986b; Haining 1986). In one, which we will call the data-driven approach, information is derived from the data without a prior notion of what the theoretical framework should be. In other words, one lets the "data speak for themselves" (Gould, 1981) and attempts to derive information on spatial pattern, spatial structure and spatial interaction without the constraints of a preconceived theoretical notion.
In the data-driven approach, errors are considered to provide information. The
focus of attention is on how the spatial pattern of the errors relates to data generation
processes. For example, in attempts to provide measures of uncertainty for spatial
information in a GIS, the spatial pattern of errors is related to data collection and
manipulation procedures. The spatial pattern of errors can often provide insight into
the form of the underlying substantive spatial process, e.g., as exploited in the model
identification stage of a spatial time series analysis. Important and still unresolved research questions deal with the formulation of useful spatial error distributions, in which error is related to location, distance to reference locations, area, etc.
The data-driven approach in spatial analysis is reflected in a wide range of
different techniques, such as point pattern analysis (Getis and Boots, 1978; Diggle, 1983), indices of spatial association (Hubert, 1985; Wartenberg, 1985), kriging (Clark, 1979), spatial adaptive filtering (Foster and Gorr, 1986), and spatial time series analysis (Bennett, 1979). All these techniques have two aspects in common. First,
they compare the observed pattern in the data (e.g., locations in point pattern analysis,
values at locations in spatial autocorrelation) to one in which space is irrelevant. In
point pattern analysis this is the familiar Poisson pattern, or "randomness," while in
many of the indices of spatial association it is the assumption that an observed data
value could occur equally likely at each location (i.e., the basis for the randomization
approach). The second common aspect is that the spatial pattern, spatial structure, or
forms for the spatial dependence are derived from the data only. For example, in
spatial time series analysis, the specification of the autoregressive and moving
average lag lengths is derived from autocorrelation indices or spatial spectra.
The data-driven approach is attractive in many respects, but its application is
not always straightforward. Indeed, the characteristics of spatial data (dependence and
heterogeneity) often void the attractive properties of standard statistical techniques.
Since most EDA techniques are based on an assumption of independence, they cannot
be implemented uncritically for spatial data. In this respect, it is also important to note
that dependence in space is qualitatively more complex than dependence in the time
dimension, due to its two-directional nature. As a consequence, many results from the
analysis of time series data will not apply to spatial data. As discussed in detail in
Hooper and Hewings (1981), the extension of time series analysis into the spatial
domain is limited, and only applies to highly regular processes. It goes without saying
that most data in empirical spatial analysis for irregular areal units do not fit within
this restrictive framework.
The second approach, which is called model-driven, starts from a theoretical
specification, which is subsequently confronted with the data. The main conceptual
problem associated with this approach is how to formalize the role of "space." This is
reflected in three major methodological problems, which are still largely unresolved
to date: the choice of the spatial weights matrix (Anselin 1984, 1986a); the modifiable
areal unit problem (Openshaw and Taylor 1979, 1981); and the boundary value
problem (Griffith, 1983, 1985). We have chosen the model-driven approach.
The analysis of spatial data has always played a central role in the quantitative
scientific tradition in geography. Recently, there have appeared a considerable
number of publications devoted to presenting research results and to assessing the
state of the art. For example, at an elementary level, Goodchild (1987a), Griffith
(1987a), and Odland (1988) introduce the concept of spatial autocorrelation, and
Boots and Getis (1988) review the analysis of point patterns. At more advanced
levels, Anselin (1988a) and Griffith (1988) deal with a wider range of methodological
issues in spatial econometrics and spatial statistics. Extensive reviews of the current
state of the art for different aspects of spatial data analysis are presented in Anselin
(1988b), Anselin and Griffith (1988), Getis (1988), Griffith (1987b), and Odland,
Golledge and Rogerson (1989). In addition, spatial data analysis has received
considerable attention as an essential element in the development of Geographic
Information Systems (GIS), e.g., in Goodchild (1987b) and Openshaw (1987), and as
an important factor in regional modeling, in Anselin (1989a).
The main problem is that the geometric and graphical representation of the
location of points, lines or areal boundaries (i.e., a map) gives an imperfect
impression of the uncertainty associated with errors in their measurement. Since these
locational features are important elements in the evaluation of distance and relative
position, and in the operations of areal aggregation and interpolation, the associated
measurement error will affect many of the "values" generated in a spatial information
system as well. Although similar errors occur in the time dimension, they are much
simpler to take into account since they only propagate in one direction.
SPATIAL ERRORS:
Basic to both the data-driven and the model-driven analysis of spatial data is an
understanding of the stochastic properties of the data. The use of "space" as the
organizing framework leads to a number of features that merit special attention, since
they are different from what holds for a-spatial or time series data. The most
important concept in this respect is that of error, or, more precisely for data observed
in space, spatial error. The distinguishing characteristics of spatial error have
important implications for description, explanation and prediction in spatial analysis.
In this section, we present a simple taxonomy of the nature of spatial errors and
outline some alternative perspectives on how error can be taken into account.
THE NATURE OF SPATIAL ERRORS:
Spatial errors can be due to measurement error or specification error, or to a
combination of both. Measurement error occurs when the location or the value of a
variable are observed with imperfect accuracy. The former is an old cartographic
problem and is still very relevant in modern geographic information systems (e.g.,
errors due to a lack of precision in digitizing).
Other spatial errors of measurement have to do with the imperfect way in
which data on socio-economic phenomena are recorded and grouped in spatial units
of observation (e.g., various types of administrative units). This interdependence of
location and value in spatial data leads to distinctively spatial characteristics of the
errors. These are the familiar spatial dependence and spatial heterogeneity.
Dependence is mostly due to the existence of spatial spillovers, as a result of a mismatch between the scale of the spatial unit of observation and the phenomenon of
interest (e.g., continuous processes represented as points, or processes extending
beyond the boundaries of administrative regions). Heterogeneity is due to structural
differences between locations and leads to different error distributions (e.g.,
differences in accuracy of census counts between low-income and high-income
neighborhoods).
Specification error is particular to the model-driven approach in spatial data analysis. It pertains to the use of a wrong model (e.g., recursive vs. simultaneous), an inappropriate functional form (e.g., linear vs. nonlinear), or a wrong set of variables. In essence it is no different from misspecification in general, but it generates spatial patterns of error due to the use of spatial data. These spatial aspects can occur as a result of ignoring location-specific phenomena, spatial drift, regional effects or spatial interaction. When a false assumption of homogeneity is forced onto a model in those instances, spatially heterogeneous errors will result. Similarly, when the spatial scale or extent of a process does not correspond to the scale of observation, or when the nature of a process changes with different scales of observation, spatially dependent errors will be generated.
The treatment of spatial errors in data analysis is fundamentally different
between the data-driven and the model-driven approaches. In the model-driven
approach to spatial data analysis, the errors are considered to be a nuisance. The main focus is on how to identify the spatial distribution of the error process and to eliminate the effect of errors on the statistical inference. In other words, once errors are identified, they are eliminated by means of transformations, corrections or filters.
Alternatively, robust estimation and test procedures can be applied that are no longer
sensitive to the effect of errors. A major research question in this respect is how
diagnostics can be developed that are powerful in detecting various types of errors,
and are able to distinguish between them (e.g., to distinguish between spatial
dependence and spatial heterogeneity, or “real" vs. "apparent" contagion).
3. CLUSTER ANALYSIS:
Clusters are not unique to economics; they also, and more frequently, appear in, e.g., statistics, music or computer science. In its literal and most general meaning, a "cluster" is simply defined as a "close group of things" (The Concise Oxford Dictionary, 1982). Thus the synonymous notions of "density", "relative nearness" and "similarity" lie at the very heart of the cluster idea. A priori, "closeness" or "similarity" is not restricted to any particular dimension (geography, technology, social characteristics, etc.) or limited by any specific scale. Consequently, both have to be chosen exogenously to fit the question under investigation.
In cluster analysis, we seek to identify the "natural" structure of groups based on a multivariate profile, if it exists, which both minimizes the within-group variation and maximizes the between-group variation. It is exploratory, descriptive and non-inferential. The solutions in this approach are not unique, since they depend on the variables used and on how cluster membership is defined. There are no essential assumptions required for its use, except that there must be some theoretical or conceptual rationale upon which the variables are selected. The homogeneous clusters derived from the cluster analysis can then become a unit for analysis. For example, based on scores on psychological inventories, patients can be clustered into subgroups that have similar response patterns; this may help in targeting appropriate treatment and in studying typologies of diseases.
An important step in any clustering is to select a distance measure, which will
determine how the similarity of two elements is calculated. This will influence the
shape of the clusters, as some elements may be close to one another according to one
distance and farther away according to another.
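A minimal sketch of this sensitivity to the distance measure, using hierarchical clustering on a hypothetical district-by-reason migration matrix (scipy is assumed; the data are synthetic, not the Gujarat Census figures):

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import pdist

    # Hypothetical matrix: one row per district, one column per stated
    # reason for in-migration (work, marriage, education, ...), as shares.
    profiles = np.random.default_rng(1).random((25, 5))

    # The distance measure shapes the clusters: the same districts can
    # group differently under Euclidean and city-block distances.
    for metric in ("euclidean", "cityblock"):
        tree = linkage(pdist(profiles, metric=metric), method="average")
        labels = fcluster(tree, t=4, criterion="maxclust")
        print(metric, labels)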
4. FACTOR ANALYSIS:
Factor analysis is a statistical method used to describe variability among observed variables in terms of fewer unobserved variables called factors. The observed variables are modeled as linear combinations of the factors, plus error terms. The information gained about the interdependencies can be used later to reduce the set of variables in a dataset. Factor analysis originated in psychometrics and is used in the behavioral sciences, social sciences, marketing, product management, operations research and other applied sciences that deal with large quantities of data. Factor analysis estimates how much of the variability is due to common factors (communality). A high value of communality means that not much of the variable is left over after whatever the factors represent is taken into consideration. Since the factors are linear combinations of the data, the correlation of each observation or variable with each factor is measured to obtain what are called factor loadings. Such factor loadings represent the correlation between the particular variable and the factor, and are usually placed in a matrix of correlations between the variables and the factors.
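A minimal sketch of extracting loadings and communalities (scikit-learn is assumed; the indicators are synthetic, with a planted three-factor structure, purely for illustration):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    # Hypothetical data: 9 indicators driven by 3 common factors plus noise.
    rng = np.random.default_rng(2)
    F = rng.normal(size=(300, 3))                  # unobserved common factors
    W = rng.normal(size=(3, 9))                    # true loadings
    X = F @ W + rng.normal(scale=0.5, size=(300, 9))

    fa = FactorAnalysis(n_components=3).fit(X)
    loadings = fa.components_.T                    # variables x factors
    communality = (loadings ** 2).sum(axis=1)      # variance shared with the factors
    print(communality.round(2))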
M. C. Garg and Anju Verma carried out an empirical factor analysis of the marketing mix in the life insurance industry in India. They created nine dimensions and converted them into four factors after applying factor analysis through principal component analysis. D. P. Chaudhri also applied factor analysis to child labour in India, with its pervasive gender and urban bias in school education.
5. PRINCIPAL COMPONENT ANALYSIS:
The principal-components method (or simply P.C. method), developed by H. Hotelling, seeks to maximize the sum of squared loadings of each factor extracted in turn. Principal component analysis has a special advantage over all other methods of data aggregation in the sense that it redefines a large set of variables in terms of a smaller set of orthogonal variables called principal components, and thereby reduces the dimensionality problem. Principal components are linear combinations of the variables, with weights given by their eigenvectors. These vectors are derived from the correlation matrix of the variables.
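A minimal sketch of this construction, computing principal components directly from the eigenvectors of the correlation matrix (numpy only; the data matrix is synthetic):

    import numpy as np

    X = np.random.default_rng(3).random((200, 6))    # hypothetical data matrix
    Z = (X - X.mean(0)) / X.std(0)                   # standardise the variables
    R = np.corrcoef(Z, rowvar=False)                 # correlation matrix

    eigval, eigvec = np.linalg.eigh(R)               # eigenvectors of R
    order = np.argsort(eigval)[::-1]                 # sort by decreasing variance
    eigval, eigvec = eigval[order], eigvec[:, order]

    scores = Z @ eigvec                              # the principal components
    explained = eigval / eigval.sum()                # share of total variance
    print(explained.round(3))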
6. DISCRIMINANT ANALYSIS:
In the discriminant analysis technique, individuals or objects are classified into one of two or more mutually exclusive and exhaustive groups on the basis of a set of independent variables. Discriminant analysis is considered an appropriate technique when the single dependent variable happens to be non-metric and is to be classified into two or more groups, depending upon its relationship with several independent variables, which all happen to be metric.
Discriminant analysis provides a predictive equation, measures the relative importance of each variable, and provides a measure of the ability of the equation to predict the actual class-groups of the dependent variable. While estimating the discriminant function coefficients, we need to choose the appropriate discriminant analysis method. The direct method estimates the discriminant function so that all the predictors are assessed simultaneously. The step-wise method enters the predictors sequentially. The two-group method should be used when the dependent variable has two categories or states.
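A minimal sketch of the direct method on a hypothetical two-group problem (scikit-learn is assumed; the data are synthetic stand-ins for metric predictors and a non-metric grouping variable):

    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in: 5 metric predictors, two-group categorical response.
    X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                               n_classes=2, random_state=4)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=4)

    lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)   # direct method: all predictors at once
    print(lda.coef_)                                     # discriminant function coefficients
    print(lda.score(X_te, y_te))                         # classification accuracy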
CHANGE-POINT PROBLEM:
A sequence of random variables X1, …, Xn is said to have a change-point at r (1 ≤ r ≤ n) if Xi ~ F1(x|θ1) for i = 1, …, r and Xi ~ F2(x|θ2) for i = r+1, …, n, where F1(x|θ1) ≠ F2(x|θ2). We shall consider the situation in which F1 and F2 have known functional forms, but the change-point, r, is unknown. Given a sequence of observations x1, …, xn, the problem we shall be concerned with is that of making inferences about r. The parameters θ1 and θ2, which could be vector-valued, may be known or unknown; in the latter case, it might be of interest to make inferences about these also.
BAYESIAN INFERENCE
In classical inference, we consider a parameter θ in a model f(x, θ) as a fixed unknown quantity and make inferences about θ based only on the information contained in the observed sample. In Bayesian inference, it is believed that we always have available, and make use of, subjective probabilities measuring degrees of belief about the value or values of the unknown parameter θ. These subjective probabilities are used to define what is called the prior distribution for the parameter θ, prior to sampling. In other words, the parameter θ may be treated as a random variable with a known prior distribution, say g(θ). In this case, the probability density (mass) function f(x, θ) is denoted by the conditional probability notation f(x|θ) to stress the fact that f can only be used to compute probabilities for a given value of θ, i.e. f is conditionally dependent on θ. After observing a random sample x = (x1, x2, …, xn), the joint (unconditional) density for the sample x and the parameter θ is f(x|θ) g(θ). Using the well-known Bayes theorem, the conditional density of θ given the sample x is

g(θ|x) = f(x|θ) g(θ) / ∫ f(x|θ) g(θ) dθ,

where ∫ f(x|θ) g(θ) dθ is the marginal density of x. Here g(θ|x) is called the posterior density of θ. The foundation of Bayesian inference is the posterior density g(θ|x) of θ. All phases of the inference problem depend on this distribution. For example, the posterior mean, the posterior median and the posterior mode are Bayesian point estimates of θ, and may be used depending upon whether the posterior distribution is symmetric or asymmetric.
The ML method, as well as other classical approaches, is based only on the empirical information provided by the data. However, when some technical knowledge about the parameters of the distribution is available, Bayesian inference is an attractive inferential method. The Bayes procedure is based on a posterior density, say g(θ1, θ2, m | x), which is proportional to the product of the likelihood function L(θ1, θ2, m | x) with a prior joint density, say g(θ1, θ2, m), representing the uncertainty about the parameter values:

g(θ1, θ2, m | x) = L(θ1, θ2, m | x) g(θ1, θ2, m) / Σ_{m=1}^{n} ∫∫ L(θ1, θ2, m | x) g(θ1, θ2, m) dθ1 dθ2.
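As a minimal numerical illustration of this machinery (a sketch, not the thesis's own model): when θ1 and θ2 are known and the prior on the change point is uniform, the posterior of r reduces to a discrete normalization of likelihood times prior, which can be computed directly (scipy is assumed; the data are simulated with a change in the mean at r = 30):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    x = np.concatenate([rng.normal(0, 1, 30), rng.normal(2, 1, 20)])
    n = len(x)

    # log-likelihood of each candidate change point r: N(0,1) then N(2,1)
    log_like = np.array([
        stats.norm.logpdf(x[:r], 0, 1).sum() + stats.norm.logpdf(x[r:], 2, 1).sum()
        for r in range(1, n)
    ])
    log_post = log_like + np.log(1.0 / (n - 1))      # uniform prior on r
    post = np.exp(log_post - log_post.max())
    post /= post.sum()                               # discrete posterior of r
    print(np.argmax(post) + 1)                       # posterior mode of r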
In Bayesian inference, the choice of the prior density is the most controversial step. Generally, priors are classified as follows:
(1) Proper and Improper Priors: A proper prior satisfies the definition of a probability mass function or a probability density function (whichever the case may be). It is a weight function that allocates positive weights, totalling one, to the possible values of the parameter. An improper prior is any weight function that sums (or integrates) over the possible values of the parameter to a value other than one, say k. So if f(x) is uniform over the entire line from −∞ to ∞,

f(x) = k,  −∞ < x < ∞,  k > 0,

then it is not a proper density, since the integral ∫ f(x) dx = ∫ k dx over the whole line does not exist, no matter how small k is. These kinds of density functions are called improper prior distributions. An improper prior can induce a proper prior by normalizing the function, if k is finite; otherwise it remains improper and simply plays the role of a weighting function, or some ad hoc method is used to determine a proper prior. These types of density functions are used to represent the local behavior of the prior distribution in the region where the likelihood is appreciable, but not over its entire admissible range.
(2) Non-Informative Priors: Because of the compelling reasons to perform a conditional analysis and the attractiveness of using Bayesian machinery to do so, there have been attempts to use the Bayesian approach even when no (or minimal) prior information is available. A non-informative prior refers to the case when relatively little information (or limited information) is available a priori. Such priors are often called priors of ignorance; but, rather than expressing a state of complete ignorance, they express that the prior information about the parameter is not considered substantial relative to the information expected to be provided by the sample of empirical data. A non-informative prior also results when the experimenter is willing to accept that the parameter values are equally likely choices for the parameter. Here the experimenter is indifferent to the choice of parameter values, and this state of indifference may be expressed by taking the prior to be locally uniform. In general, an approximate prior is taken proportional to the square root of Fisher's information. This is known as Jeffreys' rule, and the prior as Jeffreys' prior.
(3) Conjugate Priors: Conjugate prior distributions are essentially families of distributions that ease the computational burden when used as prior distributions. A conjugate family is a set of distributions each of which can be combined with the given likelihood without too much difficulty. Three properties that have been put forth as desirable for conjugate families of distributions are mathematical tractability, richness and ease of interpretation. By richness we mean that the prior distribution should be able to reflect the statistician's prior information. By ease of interpretation we mean that it should be possible to specify the conjugate family in such a way that it is readily interpreted by the person whose prior information is of interest.
SYMMETRIC LOSS FUNCTIONS

The Bayes estimator of a generic parameter α (or a function thereof) under the "continuous" squared error loss (SEL) function

L1(α, d) = (α − d)²,  −∞ < α, d < ∞,   (1.1)

is just the posterior mean when α is a real-valued parameter and d is a decision rule to estimate α.

As a consequence, the SEL function relative to an integer parameter m is

L1(m, v) = (m − v)²,  m, v = 0, 1, 2, …   (1.2)

Hence, the Bayesian estimate of an integer-valued parameter under the SEL function (1.2) is no longer the posterior mean; it can be obtained by numerically minimizing the corresponding posterior loss, and is generally equal to the nearest integer value to the posterior mean.

The Bayes estimators of α based on the loss functions

L2(α, d) = |α − d|,

L3(α, d) = 0 if |α − d| ≤ ε, and 1 otherwise,

are the posterior median and the posterior mode, respectively.
ASYMMETRIC LOSS FUNCTIONS

The loss function L(α, d) provides a measure of the financial consequences arising from a wrong decision rule d to estimate an unknown quantity α (a generic parameter or a function thereof). The choice of the appropriate loss function depends on financial considerations only, and is independent of the estimation procedure used. The Bayes approach allows the economic considerations formulated through loss functions to be used in a rational manner. In fact, combining the selected loss function with the posterior density, the posterior expectation of the loss function is obtained, and the value of d that minimizes this expected loss is defined as the Bayes estimate. This point estimate is, of course, optimal relative to the loss function chosen. The use of symmetric loss functions was found to be generally inappropriate since, for example, an overestimation of the reliability function is usually much more serious than an underestimation. Considerable attention has therefore recently been devoted to asymmetric loss functions.

A useful asymmetric loss, known as the LINEX loss function, was introduced by Varian (1975). This function rises approximately exponentially on one side of zero and approximately linearly on the other side. Under the assumption that the minimal loss occurs at d = α, the LINEX loss function can be expressed as
L4(α, d) = exp[q1(d − α)] − q1(d − α) − 1,  q1 ≠ 0.   (1.3)

The sign of the shape parameter q1 reflects the direction of the asymmetry (q1 > 0 if overestimation is more serious than underestimation, and vice versa), and the magnitude of q1 reflects the degree of asymmetry.

The posterior expectation of the LINEX loss function (1.3), introduced by Varian (1975), is

E_i[L4(α, d)] = exp(q1 d) E_i[exp(−q1 α)] − q1 (d − E_i[α]) − 1,  i = 1, 2.

The Bayes estimate α*_L is the value of d that minimizes E_i[L4(α, d)], viz. the solution of the equation

∂E_i[L4(α, d)] / ∂d = q1 exp(q1 d) E_i[exp(−q1 α)] − q1 = 0.

This gives

exp(−q1 α*_L) = E_i[exp(−q1 α)],

and hence

α*_L = −(1/q1) ln E_i[exp(−q1 α)],  i = 1, 2,   (1.4)

provided that E_i[exp(−q1 α)] exists and is finite.
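Given a sample of posterior draws, (1.4) is immediate to evaluate; a minimal sketch (the gamma draws below are a hypothetical stand-in posterior, not taken from the thesis):

    import numpy as np

    draws = np.random.default_rng(6).gamma(5.0, 1.0, 10_000)  # stand-in posterior sample
    q1 = 1.0                                                  # overestimation penalised

    # Bayes estimate under LINEX loss, eq. (1.4):
    alpha_linex = -np.log(np.mean(np.exp(-q1 * draws))) / q1
    print(np.mean(draws), alpha_linex)   # for q1 > 0 the LINEX estimate sits below the mean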
Despite the flexibility and popularity of the LINEX loss function for location parameter estimation, it appears not to be suitable for the estimation of the scale parameter and other quantities (Basu and Ebrahimi, 1991). For these reasons, Basu and Ebrahimi (1991) defined a modified LINEX loss function:

L5(α, d) = exp[q2(d/α − 1)] − q2(d/α − 1) − 1,  q2 ≠ 0,   (1.5)

where the estimation error is expressed by (d/α − 1). Such a modification does not change the characteristics of Varian's LINEX loss.

The posterior expectation of the modified LINEX loss function L5(α, d) is

E_i[L5(α, d)] = exp(−q2) E_i[exp(q2 d/α)] − q2 d E_i[1/α] + q2 − 1,  i = 1, 2,

and the value of d that minimizes E_i[L5(α, d)], say α*_M, is the solution of the equation

exp(−q2) E_i[(1/α) exp(q2 d/α)] − E_i[1/α] = 0,

provided that all the expectations are finite. Thus α*_M cannot be given in closed form, and its evaluation involves iterative procedures.
A suitable alternative to the modified LINEX loss is the General Entropy (GE) loss function proposed by Calabria and Pulcini (1994), given by

L6(α, d) = (d/α)^q3 − q3 ln(d/α) − 1,   (1.6)

whose minimum occurs at d = α. This loss is a generalization of the entropy loss function used by several authors (see, for example, Dey et al. (1987) and Dey and Liu (1992)), where the shape parameter q3 is equal to 1. The more general version (1.6) allows different shapes of the loss function to be considered. When q3 > 0, a positive error (d > α) causes more serious consequences than a negative error. The posterior expectation of the General Entropy loss function L6(α, d) is

E_i[L6(α, d)] = d^q3 E_i[α^(−q3)] − q3 (ln d − E_i[ln α]) − 1,  i = 1, 2,

and the value of d that minimizes E_i[L6(α, d)], say α*_E, is the solution of the equation

∂E_i[L6(α, d)] / ∂d = q3 d^(q3−1) E_i[α^(−q3)] − q3/d = 0,

which gives

α*_E = {E_i[α^(−q3)]}^(−1/q3),   (1.7)

provided that E_i[α^(−q3)] exists and is finite.

The General Entropy loss (GEL) function for an integer parameter is

L6(m, v) = (v/m)^q3 − q3 ln(v/m) − 1,  m, v = 1, 2, …   (1.8)

Minimizing the posterior expectation E_m[L6(m, v)] under the posterior distribution, we get the estimate of m as the nearest integer value, say m*_GE.
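A minimal sketch of the integer case (1.8): given a discrete posterior over m (the numbers below are hypothetical), the GEL estimate is found by direct minimization over the support:

    import numpy as np

    post_m = np.array([0.05, 0.10, 0.40, 0.30, 0.15])   # hypothetical posterior over m = 1..5
    m_vals = np.arange(1, 6)
    q3 = 1.0

    def expected_gel(v):
        # posterior expectation of L6(m, v) = (v/m)^q3 - q3*ln(v/m) - 1
        return np.sum(post_m * ((v / m_vals) ** q3 - q3 * np.log(v / m_vals) - 1))

    m_ge = min(m_vals, key=expected_gel)
    print(m_ge)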
The chapters of the thesis are organized in the following way. Chapter 1 gives a detailed introduction to the study and its genesis.

Chapter 2 deals with the application of different multiple regression models to the rural poverty parameter for selected states of India. The official data source for the study is World Bank data. Sections 2.1 and 2.2 give a detailed introduction and the data application for this chapter, respectively. The different regression models discussed are as under:

2.3 Step-wise regression model
2.4 Pooled panel regression model
2.5 Robust regression model

Some contents of section 2.4 have been published in the International Journal of Agricultural and Statistical Sciences (IJASS) (Vol. 6, No. 2, Dec. 2010).
Chapter 3

Here we have studied the application of multivariate spatial analysis techniques to the rural population under the poverty line, using socio-economic parameters such as rural literacy, irrigation, wages, etc. The official statistics used here are also World Bank data across selected states in India. We have developed two spatial error models; our focus was to apply the model which minimizes the spatial error. Spatial econometrics originated as an identifiable field in Europe in the early 1970s with Jean H. P. Paelinck, because sub-country data were needed in regional econometric models, and it developed and grew rapidly during the 1990s (Anselin, 1999).
Chapter 4 discusses the application of a first-order autoregressive process with exponential errors and one change point to the total population of India, from the official statistics source, the Census. Bayesian inferences, particularly estimation of the change point and the autoregressive coefficient, are derived under two prior considerations and using the LINEX and General Entropy loss functions. In applications we are frequently faced with time series data which, for a variety of reasons, have characteristics not compatible with the usual assumptions of linearity and/or Gaussian errors; exponential autoregressive models, studied by Bell and Smith (1986), are, for instance, very powerful in the analysis of such time series. Bell and Smith (1986) were concerned with the autoregressive scheme Xi = ρ Xi−1 + εi, where 0 < ρ < 1 and the εi are i.i.d. and non-negative.
Chapter 5

In this chapter some advanced and popular multivariate techniques are studied. They are as under:

5.1 Application of the multivariate cluster analysis technique to the Gujarat in-migration problem, with respect to the socio-economic reasons for migration, from the Gujarat Census.

5.2 Application of multivariate factor analysis to a burning economic problem for India and the world: child labour. We have studied child labour in rural India and the socio-economic reasons for the situation, from the Census of India across selected states.

5.3 A study of the principal component estimation method through an application to the official statistics of the Gujarat Census, examining religion as well as gender effects on the different working populations in Gujarat.

5.4 Application of multivariate discriminant analysis to official statistics from the National Sample Survey Organisation (NSSO) on unemployment in rural labour households, with the associated socio-economic factors, across selected states in India.

The results of section 5.1 have been published in the International Journal of Agricultural and Statistical Sciences (IJASS) (Vol. 6, No. 1, pp. 281-291, 2010).
Chapter 6

An application of the two-phase regression model is carried out for rural poverty and irrigation data from the official statistics of the World Bank. The TPLR model is defined as

yt = α1 + β1 xt + ut,  t = 1, 2, …, m
yt = α2 + β2 xt + ut,  t = m+1, …, n;  n ≥ 3,

where the ut are i.i.d. N(0, r) random errors with precision r > 0, xt is a nonstochastic explanatory variable, and the regression parameters (α1, β1) ≠ (α2, β2). The shift point m is such that if m = n there is no shift, while for m = 1, 2, …, n − 1 exactly one shift has occurred. The estimator of m is derived under symmetric and asymmetric loss functions, and a simulated numerical study is presented. Bansal and Chakravarty (1996) had proposed to study the effect of an ESD prior for the changed slope parameter of the two-phase linear regression (TPLR) model on the Bayes estimates of the shift point in the simple regression model.
Chapter 7

This chapter presents the estimation of the change point in a proposed first-order autoregressive model with a single change point:

Xt = β1 Xt−1 + εt,  t = 1, 2, …, m
Xt = β2 Xt−1 + εt,  t = m+1, …, n,

where β1 and β2 are unknown autocorrelation coefficients, Xt is the t-th observation of the dependent variable, and the error terms εt are independent random variables following N(0, σ1²) for t = 1, 2, …, m and N(0, σ2²) for t = m+1, …, n; m is the unknown change point and X0 is the initial quantity. Bayesian inferences, particularly estimation of the change point and the autoregressive coefficients, are derived under two prior considerations and using the LINEX and General Entropy loss functions. A simulated numerical study and the sensitivity of the estimators are also shown in the chapter. This model is also applied to the job-seekers registered with employment exchanges in India, from the official statistics of the Directorate General of Employment and Training, Ministry of Labour.

At the end we provide an extensive bibliography. Some of the outputs of this thesis have been communicated for publication.