A copula model to analyze minimum admission scores
Mariela Fernández (1) and Verónica A. González-López (2)
Institute of Mathematics, Statistics and Computing Science, University of Campinas
11th ICNAAM, 21-27 September 2013, Rhodes, Greece
(1) FAPESP Post-doctoral Grant 2011/18285-6.
(2) (a) USP project "Mathematics, computation, language and the brain"; (b) FAPESP project "Portuguese in time and space: linguistic contact, grammars in competition and parametric change", 2012/06078-9; (c) FAPESP project "Research, Innovation and Dissemination Center for Neuromathematics - NeuroMat", 2013/07699-0.
Motivation Copula theory Application Conclusions References
Topics:
1 Why: Motivation.
2 How: Cumulative conditional expectation in a copula framework.
3 Results: Admission score decisions.
4 Next: Final remarks.
mariela, [email protected] Copula model; Minimum admission scores 11th ICNAAM 1 / 15
Motivation
Problem
How to set minimum admission scores in an efficient way.
Solution
Use the statistical measures
\[ E[\text{Language} \mid \text{Mathematics} \ge m_0] \quad \text{and} \quad E[\text{Mathematics} \mid \text{Language} \ge l_0], \]
where $m_0$ is a minimum Mathematics score and $l_0$ is a minimum Language score.
E[Mathematics|Language ≥ l0]
We need to know:
The marginal distributions $F(x)$ and $G(y)$ and the joint distribution $H(x, y)$, where $X :=$ Language score and $Y :=$ Mathematics score. Recall that
\[ Y \mid X \ge x_0 \sim G_{X \ge x_0}(y) = P(Y \le y \mid X \ge x_0) = \frac{G(y) - H(x_0, y)}{1 - F(x_0)}. \]
It is more useful to work with the marginal quantiles than with the raw scores, because the quantile is the "manager's control variable": e.g. $F(x_0) = 0.25$ means that we will admit 75% of the candidates.
Taking $U$ = scaled ranks of $X$ and $V$ = scaled ranks of $Y$, we search for the joint density of $(U, V)$.
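As a rough illustration of the conditional distribution formula, the sketch below compares a direct empirical count of P(Y ≤ y | X > x0) with the expression (G(y) − H(x0, y)) / (1 − F(x0)) built from empirical F, G and H. The data are synthetic and the function names are hypothetical. For continuous scores the events X ≥ x0 and X > x0 have the same probability, so the strict inequality is used to match the empirical distribution functions.

```python
import random

def conditional_cdf_direct(xs, ys, x0, y):
    """Empirical P(Y <= y | X > x0), obtained by counting."""
    kept = [yi for xi, yi in zip(xs, ys) if xi > x0]
    return sum(yi <= y for yi in kept) / len(kept)

def conditional_cdf_from_joint(xs, ys, x0, y):
    """Same quantity via (G(y) - H(x0, y)) / (1 - F(x0)) with empirical F, G, H."""
    n = len(xs)
    F_x0 = sum(xi <= x0 for xi in xs) / n
    G_y = sum(yi <= y for yi in ys) / n
    H_x0_y = sum(xi <= x0 and yi <= y for xi, yi in zip(xs, ys)) / n
    return (G_y - H_x0_y) / (1.0 - F_x0)

# Synthetic, positively dependent "scores" (not the admission data).
random.seed(0)
xs = [random.random() for _ in range(500)]
ys = [0.5 * xi + 0.5 * random.random() for xi in xs]

x0, y = 0.25, 0.5
direct = conditional_cdf_direct(xs, ys, x0, y)
via_joint = conditional_cdf_from_joint(xs, ys, x0, y)
print(direct, via_joint)  # the two values agree up to floating-point rounding
```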
Cumulative conditional expectation in a copula framework
Definition
A bivariate copula is a bivariate joint distribution with uniform marginals, denoted by $C(u, v)$ for $(u, v) \in [0, 1] \times [0, 1]$.
Sklar's Theorem
Let $H$ be a joint distribution function with margins $F$ and $G$. Then there exists a copula $C$ such that
\[ H(x, y) = C(F(x), G(y)). \tag{1} \]
If $F$ and $G$ are continuous, then $C$ is unique; otherwise, $C$ is uniquely determined on $\mathrm{Ran}\,F \times \mathrm{Ran}\,G$. Conversely, if $C$ is a copula and $F$ and $G$ are distribution functions, then the function $H$ defined by (1) is a joint distribution with margins $F$ and $G$.
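The converse part of Sklar's theorem can be sketched numerically: plug two margins into a copula and check that the resulting joint distribution has those margins. The choice of a Farlie-Gumbel-Morgenstern copula with α = 0.5 and exponential margins is purely illustrative, not taken from the slides.

```python
import math

def fgm_copula(u, v, alpha=0.5):
    """Farlie-Gumbel-Morgenstern copula (alpha chosen arbitrarily)."""
    return u * v + alpha * u * v * (1 - u) * (1 - v)

def exp_cdf(x, rate):
    """Exponential distribution function, used here as an example margin."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def joint_cdf(x, y):
    """H(x, y) = C(F(x), G(y)) with F ~ Exp(1) and G ~ Exp(2)."""
    return fgm_copula(exp_cdf(x, 1.0), exp_cdf(y, 2.0))

# Sending one argument to a very large value recovers the other margin.
big = 1e6
margin_x = joint_cdf(0.7, big)  # should match F(0.7)
margin_y = joint_cdf(big, 0.3)  # should match G(0.3)
print(margin_x, exp_cdf(0.7, 1.0))
print(margin_y, exp_cdf(0.3, 2.0))
```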
Some common bivariate Copulas
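Product: $C(u, v) = uv$.
Farlie-Gumbel-Morgenstern: $C(u, v) = uv + \alpha uv(1-u)(1-v)$, for $\alpha \in [-1, 1]$.
Clayton: $C(u, v) = \max\{0, (u^{-\alpha} + v^{-\alpha} - 1)^{-1/\alpha}\}$, for $\alpha \in (0, \infty)$.
Gumbel: $C(u, v) = \exp\bigl(-\bigl((-\ln u)^{\alpha} + (-\ln v)^{\alpha}\bigr)^{1/\alpha}\bigr)$, for $\alpha \in [1, \infty)$.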
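The four families listed above translate directly into code. The following is a minimal sketch with hypothetical function names; the parameter values in the checks are arbitrary. It also verifies the uniform-margin property C(u, 1) = u at one point.

```python
import math

def product(u, v):
    return u * v

def fgm(u, v, alpha):
    """Farlie-Gumbel-Morgenstern, alpha in [-1, 1]."""
    return u * v + alpha * u * v * (1 - u) * (1 - v)

def clayton(u, v, alpha):
    """Clayton, alpha > 0 (the copula is 0 on the lower boundary)."""
    if u == 0 or v == 0:
        return 0.0
    return max(0.0, (u ** -alpha + v ** -alpha - 1) ** (-1 / alpha))

def gumbel(u, v, alpha):
    """Gumbel, alpha >= 1 (alpha = 1 gives the product copula)."""
    if u == 0 or v == 0:
        return 0.0
    s = (-math.log(u)) ** alpha + (-math.log(v)) ** alpha
    return math.exp(-s ** (1 / alpha))

# Uniform margins: C(u, 1) should return u for every family.
u = 0.3
margins = [product(u, 1.0), fgm(u, 1.0, 0.5), clayton(u, 1.0, 2.0), gumbel(u, 1.0, 2.0)]
print(margins)
```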
Some applications
Actuarial science, e.g. Frees et al. (1996) and Frees and Wang (2005).
Finance and risk management, e.g. Cherubini et al. (2004) and Embrechts et al. (2003).
Hydrology, e.g. Genest and Favre (2007).
Deforestation (spatio-temporal dependence), e.g. Gräler et al. (2010).
Linguistics, e.g. García et al. (2012).
Copula model selection according to the problem's characteristics: simple cross sections (i.e. a simple expression for the intersection of the copula with the plane $u = u_0$, $C(u_0, v)$), since we need
\[ P[V \le v \mid U \ge u_0] = \frac{v - C(u_0, v)}{1 - u_0} \]
to compute $E[V \mid U \ge u_0]$.
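To see how the cross section C(u0, v) feeds into the expectation, here is a small numerical sketch. It uses the identity E[V | U ≥ u0] = ∫ (1 − P[V ≤ v | U ≥ u0]) dv over [0, 1], which holds because V takes values in [0, 1], and a midpoint rule. The function names, the step count and the FGM parameter 0.8 are illustrative assumptions.

```python
def cond_cdf(copula, u0, v):
    """P[V <= v | U >= u0] = (v - C(u0, v)) / (1 - u0)."""
    return (v - copula(u0, v)) / (1.0 - u0)

def cond_expectation(copula, u0, steps=2000):
    """E[V | U >= u0] as the integral of the conditional survival function."""
    h = 1.0 / steps
    return sum(1.0 - cond_cdf(copula, u0, (k + 0.5) * h) for k in range(steps)) * h

def independence(u, v):
    return u * v

def fgm(u, v, alpha=0.8):
    return u * v + alpha * u * v * (1 - u) * (1 - v)

e_indep = cond_expectation(independence, 0.25)  # no dependence: stays at 1/2
e_fgm = cond_expectation(fgm, 0.25)             # closed form: 1/2 + alpha * u0 / 6
print(e_indep, e_fgm)
```

With a polynomial cross section the integral has a closed form, which is the practical reason for preferring copulas with simple sections.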
Farlie-Gumbel-Morgenstern: $C(u, v) = uv + \alpha uv(1-u)(1-v)$, for $\alpha \in [-1, 1]$. Quadratic cross sections in both variables, weak dependence, and exchangeable copula, i.e. $C(u, v) = C(v, u)$.
Asymmetric Cubic Section (ACS) copula, introduced by Nelsen et al. (1997):
\[ C(u, v) = uv + uv(1-u)(1-v)\,[(a-b)\,v(1-u) + b], \]
where $|b| \le 1$, $\frac{b - 3 - \sqrt{9 + 6b - 3b^2}}{2} \le a \le 1$ and $a \ne b$. Cubic cross sections in both variables, weak dependence, and non-exchangeable copula.
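A minimal sketch of the ACS family follows, with a helper that checks the parameter constraints and a quick look at the asymmetry C(u, v) ≠ C(v, u). The function names and the parameter values a = −0.5, b = 0.5 are illustrative assumptions.

```python
import math

def lower_bound_a(b):
    """R(b) = (b - 3 - sqrt(9 + 6b - 3b^2)) / 2, the lower limit for a."""
    return (b - 3.0 - math.sqrt(9.0 + 6.0 * b - 3.0 * b * b)) / 2.0

def is_admissible(a, b):
    """Constraints |b| <= 1, R(b) <= a <= 1 and a != b."""
    return abs(b) <= 1.0 and lower_bound_a(b) <= a <= 1.0 and a != b

def acs_copula(u, v, a, b):
    """Asymmetric cubic section copula."""
    return u * v + u * v * (1 - u) * (1 - v) * ((a - b) * v * (1 - u) + b)

a, b = -0.5, 0.5
ok = is_admissible(a, b)
c_uv = acs_copula(0.3, 0.7, a, b)
c_vu = acs_copula(0.7, 0.3, a, b)
print(ok, c_uv, c_vu)  # the two copula values differ: the family is not exchangeable
```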
Copula parameters estimation
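Bayesian approach through a uniform conjugate prior:
\[ \hat{a} = K^{-1} \int_{-1}^{1} \int_{R(b)}^{1} a\,\pi(a, b \mid u)\,da\,db, \qquad \hat{b} = K^{-1} \int_{-1}^{1} \int_{R(b)}^{1} b\,\pi(a, b \mid u)\,da\,db, \]
where $\pi(a, b \mid u)$ is the posterior distribution of $(a, b)$, $u$ is the sample data,
\[ K = \int_{-1}^{1} \int_{R(b)}^{1} \pi(a, b \mid u)\,da\,db \quad \text{and} \quad R(b) = \frac{b - 3 - \sqrt{9 + 6b - 3b^2}}{2}. \]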
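The slides do not say how these integrals are evaluated. One possible approach, sketched below under stated assumptions, is a midpoint grid over the admissible region: the prior is taken flat on {(a, b): |b| ≤ 1, R(b) ≤ a ≤ 1}, and the likelihood is taken as the product of ACS copula densities at the pseudo-observations. Here the density is obtained by differentiating C(u, v) once in each argument. The grid size, the function names and the synthetic data are all illustrative and are not the authors' implementation.

```python
import math
import random

def lower_bound_a(b):
    return (b - 3.0 - math.sqrt(9.0 + 6.0 * b - 3.0 * b * b)) / 2.0

def acs_density(u, v, a, b):
    """Mixed partial derivative of the ACS copula C(u, v)."""
    return (1.0 + b * (1 - 2 * u) * (1 - 2 * v)
            + (a - b) * (1 - u) * (1 - 3 * u) * v * (2 - 3 * v))

def posterior_mean(us, vs, grid=40):
    """Grid approximation of the posterior means of (a, b) under a flat prior."""
    cells = []  # (a, b, log-likelihood, width of the a-interval [R(b), 1])
    for i in range(grid):
        b = -1.0 + 2.0 * (i + 0.5) / grid
        lo = lower_bound_a(b)
        width = 1.0 - lo
        for j in range(grid):
            a = lo + width * (j + 0.5) / grid
            loglik = sum(math.log(max(acs_density(u, v, a, b), 1e-300))
                         for u, v in zip(us, vs))
            cells.append((a, b, loglik, width))
    top = max(c[2] for c in cells)  # subtract the maximum to avoid underflow
    # Cell area is proportional to width, since the a-grid spans [R(b), 1].
    weights = [math.exp(c[2] - top) * c[3] for c in cells]
    K = sum(weights)
    a_hat = sum(c[0] * w for c, w in zip(cells, weights)) / K
    b_hat = sum(c[1] * w for c, w in zip(cells, weights)) / K
    return a_hat, b_hat

# Synthetic independent pseudo-observations (not the admission data).
random.seed(1)
us = [random.random() for _ in range(60)]
vs = [random.random() for _ in range(60)]
a_hat, b_hat = posterior_mean(us, vs)
print(a_hat, b_hat)
```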
Cumulative conditional expectation for the ACS copula family
\[ E[V \mid U \ge u] = \int_0^1 v\,dP(v \mid U \ge u) = \frac{1}{12}\bigl(6 + (a+b)u + (b-a)u^2\bigr), \]
\[ E[U \mid V \ge v] = \int_0^1 u\,dP(u \mid V \ge v) = \frac{1}{2} + \frac{b}{6}\,v + \frac{a-b}{12}\,v^2. \]
Property
i) The vertex of the function $E[V \mid U \ge u]$ is $u_0 = \frac{-a-b}{2(b-a)}$. It is a minimum if $b > a$ and a maximum if $b < a$.
ii) The vertex of the function $E[U \mid V \ge v]$ is $v_0 = \frac{-b}{a-b}$. It is a minimum if $a > b$ and a maximum if $a < b$.
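For readers who want to check the first closed form, a short derivation from the ACS copula is sketched here. It uses only the cross-section formula and the fact that, for a variable on $[0, 1]$, the expectation equals the integral of the survival function.

```latex
\begin{aligned}
P[V \le v \mid U \ge u]
  &= \frac{v - C(u, v)}{1 - u}
   = v - u\,v(1-v)\bigl[(a-b)(1-u)\,v + b\bigr], \\
E[V \mid U \ge u]
  &= \int_0^1 \bigl(1 - P[V \le v \mid U \ge u]\bigr)\,dv \\
  &= \frac{1}{2} + u\Bigl[(a-b)(1-u)\int_0^1 v^2(1-v)\,dv + b\int_0^1 v(1-v)\,dv\Bigr] \\
  &= \frac{1}{2} + u\Bigl[\frac{(a-b)(1-u)}{12} + \frac{b}{6}\Bigr]
   = \frac{1}{12}\bigl(6 + (a+b)u + (b-a)u^2\bigr).
\end{aligned}
```

The same steps with the roles of $u$ and $v$ exchanged, using $\int_0^1 u(1-u)^2\,du = 1/12$, give the expression for $E[U \mid V \ge v]$.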
Admission score decisions
Data
Mathematics and Portuguese scores of each student who succeeded at the admission test for the undergraduate course of Electrical Engineering at the University of Campinas in Brazil, from 2010 to 2011.
$X$ = Portuguese score and $Y$ = Mathematics score. An annual standardization was used to avoid the effect of different tests applied each year.
We compute the pseudo-observations
\[ u_i = \hat{F}(x_i)\,\frac{N}{N+1} = \frac{\mathrm{rank}(x_i)}{N+1} \quad \text{and} \quad v_i = \hat{G}(y_i)\,\frac{N}{N+1} = \frac{\mathrm{rank}(y_i)}{N+1}, \]
where $N$ is the sample size and $\hat{F}$ and $\hat{G}$ are the empirical distributions of $X$ and $Y$, respectively.
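The pseudo-observations can be computed from ranks alone. The sketch below is one way to do it; the function name and the example scores are made up, and ties are broken by order of appearance, which is an assumption since the slides do not discuss ties.

```python
def pseudo_observations(values):
    """Return rank(x_i) / (N + 1), with ranks running from 1 to N."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices from smallest to largest
    ranks = [0] * n
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return [r / (n + 1) for r in ranks]

scores = [6.1, 4.3, 8.7, 5.0]  # made-up standardized scores
u = pseudo_observations(scores)
print(u)  # ranks 3, 1, 4, 2 divided by N + 1 = 5
```

Dividing by N + 1 rather than N keeps every pseudo-observation strictly inside (0, 1).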
                                              E[V|U ≥ u]           E[U|V ≥ v]
Year  Students  τ         a         b         Vertex  Type         Vertex  Type
2010  68        -0.0507   -2.2658    0.3253   0.374   min          0.125   max
2011  67        -0.2684   -0.5808   -0.7153   –       decreasing   –       decreasing
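The vertex and type columns can be cross-checked from the reported estimates of a and b with the vertex formulas for the two quadratic expectations. This is a minimal sketch with hypothetical helper names. A vertex outside [0, 1] is reported as a monotone curve on [0, 1], which is how the 2011 row is read here.

```python
def vertex_u(a, b):
    """Vertex of E[V | U >= u] = (6 + (a + b) u + (b - a) u^2) / 12."""
    return -(a + b) / (2.0 * (b - a))

def vertex_v(a, b):
    """Vertex of E[U | V >= v] = 1/2 + (b / 6) v + ((a - b) / 12) v^2."""
    return -b / (a - b)

def describe(vertex, opens_upward):
    """Classify the curve on [0, 1] from its vertex and curvature sign."""
    if 0.0 <= vertex <= 1.0:
        return "min" if opens_upward else "max"
    increasing = (vertex < 0.0) == opens_upward
    return "increasing" if increasing else "decreasing"

estimates = {2010: (-2.2658, 0.3253), 2011: (-0.5808, -0.7153)}  # (a, b) from the table
summary = {}
for year, (a, b) in estimates.items():
    summary[year] = (
        vertex_u(a, b), describe(vertex_u(a, b), b > a),
        vertex_v(a, b), describe(vertex_v(a, b), a > b),
    )
    print(year, summary[year])
```

For 2010 both vertices fall inside [0, 1] and agree with the tabulated values to within about 0.001; for 2011 both vertices fall outside [0, 1], so both expectations decrease over the whole range of thresholds.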
Final remarks
We have explored
Copula theory applied to educational data.
Cumulative conditional expectation as a measure for decision making.
Work in progress
Mathematical and statistical properties of the cumulative conditional expectation.
Relation between the cumulative conditional expectation and the directional dependency given by $E[V \mid U = u_0]$.
Analytical expressions for other copula families, for example the generalized Farlie-Gumbel-Morgenstern $C(u, v) = uv + f(u)g(v)$.
Application to data from other courses.
References
Cherubini, U., Luciano, E. and Vecchiato, W. (2004). Copula Methods in Finance. John Wiley & Sons.
Embrechts, P., Lindskog, F. and McNeil, A. (2003). Modelling Dependence with Copulas and Applications to Risk Management. Handbook of Heavy Tailed Distributions in Finance. Elsevier.
Frees, E., Carriere, J. and Valdez, E. (1996). Annuity valuation with dependent mortality. Journal of Risk and Insurance 63, 229-261.
Frees, E. and Wang, P. (2005). Credibility using copulas. North American Actuarial Journal 9 (2), 31-48.
García, J. E., González-López, V. A. and Viola, M. L. L. (2012). Robust model selection and the statistical classification of languages. AIP Conference Proceedings: 11th Brazilian Bayesian Statistics Meeting v. 1490, p. 160-170.
Genest, C. and Favre, A. C. (2007). Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering 12, 347-368.
Gräler, B., Kazianka, H. and M. de Espindola, G. (2010). Copulas, a novel approach to model spatial and spatio-temporal dependence. GIScience for Environmental Change Symposium Proceedings 40, 49-54.
Nelsen, R. B., Quesada Molina, J. J. and Rodríguez Lallena, J. A. (1997). Bivariate copulas with cubic sections. J Nonparametr Statist 7, 205-220.
Thanks!