26
2453-14 School on Modelling Tools and Capacity Building in Climate and Public Health KAZEMBE Lawrence 15 - 26 April 2013 University of Malawi Chancellor College Faculty of Science Department of Mathematical Sciences 18 Chirunga Road, 0000 Zomba MALAWI Expectile and Quantile Regression and Other Extensions

Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

  • Upload
    others

  • View
    39

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

2453-14

School on Modelling Tools and Capacity Building in Climate and Public Health

KAZEMBE Lawrence

15 - 26 April 2013

University of Malawi Chancellor College

Faculty of Science Department of Mathematical Sciences 18 Chirunga Road, 0000 Zomba

MALAWI

Expectile and Quantile Regression and Other Extensions

Page 2: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Expectile and Quantile Regression

and Other Extensions

Lawrence Kazembe

University of Namibia

Windhoek, Namibia

A presentation atSchool on Modelling Tools and Capacity Building in

Climate and Public Health

ICTP, Trieste, Italy

1

Page 3: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Objectives and Questions

• Objective:Introduce other regression beyond the mean

• Are there median regression models

• What about regression at the mode?

• Can we fit regression model at any other locationstoo?

• What of the variance, skewness and kurtosis?

2

Page 4: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Prelimaries

• Given a r.v. y ∼ f(·) then

-E(Y ) = μ defines the mean;

-V ar(Y ) = σ2 is the variance.

• General assumption: two measures completely de-

fine the distribution (stationarity concept)

• Classical regression is often characterized by relat-

ing to the mean

-E(y|x) = βx if x is the set of covariates.

-y ∼ N(μ, σ2), then μ = βx

3

Page 5: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

-y ∼ Bern(π), then log(

π1−π

)= βx

-y ∼ Pois(λ), then log(λ) = βx.

• Statisticians are mean lovers (Friedman, Fried-man & Amoo)

• It goes like:We are ”mean” lovers. Deviation is considered normal. We are right 95%

of the time. Statisticians do it discretely and continuously. We can legally

comment on someone’s posterior distribution.

• Why the mean?-Mean regression reduces complexity-However, the mean is not sufficient to describe adistribution.

Page 6: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Examples Plot (Rent in Munich, Kneib et al)

4

Page 7: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Examples Quantile Plot (Rent in Munich, Kneib

et al)

5

Page 8: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Example (Malaria vs Altitude , Kazembe et al)

6

Page 9: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Example (Botswana data)

7

Page 10: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Motivating Examples

• Epidemiology and Public Health:

-Investigating height, weight and body mass index

as a function of different covariates e.g. age (Wei

et al. 2006)

-Exploring stunting curves in African and Indian

children (Yue et al. 2012)

-Generating age-specific centile charts (Chitty et al.

1994).

• Economics:

-Study of determinants of wages (Koenker, 2005).

8

Page 11: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

• Education:

-The performance of students in public schools (Koenker

and Hallock, 2001).

• Climate data:

- Spatiotemporal analysis of Boston temperature

(Reich 2012)

Page 12: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Double GLM

• Classical regression assumes a homogeneous vari-

ance

y|x, ε ∼ N(βx, σ2)

ε ∼ N(0, σ2)

• However variance heterogeneity (heteroscedasticity)

is an order in real-life problems

• The variance of the response may depend on the

covariates

9

Page 13: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

• Normal regression example:

-Regression for mean and variance of a normal dis-

tribution

yi = ηi1 + exp(η2i)εi, εi ∼ N(0,1)

such that

E(yi|xi) = η1i

V ar(yi|xi) = exp(η2i)2

Page 14: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Regression for location, scale and shape

• In general: Specify a distribution for the response

on all parameters and relate to the predictors.

• The generalized additive model location, scale and

shape (GAMLSS) is a statistical model developed

by Rigby and Stasinopoulos (2005).

• For a probability (density) function f(yi|μi, σi, νi, τi)conditional on (μi, σi, νi, τi) a vector of four distri-

bution parameters, each of which can be a function

10

Page 15: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

to the explanatory variables (X)

g1(μ) = η1 = β1X1 +J1∑j=1

hj1(xj1) (1)

g2(σ) = η2 = β2X2 +J2∑j=1

hj2(xj2) (2)

g3(ν) = η3 = β3X3 +J3∑j=1

hj3(xj3) (3)

g4(τ) = η4 = β4X4 +J4∑j=1

hj4(xj4) (4)

Page 16: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

• GAMLSS assumes different models for each param-

eter

-Model 1: Mean regression

-Model 2: Dispersion regression

-Model 3: Skewness regression

-Model 4: Kurtosis regression

• Other summary measures are also permissible

Page 17: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Quantile Regression

• Quantile, Centile, and Percentile are terms that canbe used interchangeably-A 0.5 quantile ≡ 50 percentile, which is a median.

• Related terms are quartiles, quintiles and deciles:divides distribution into 4, 5, and 10 parts

• Quantiles are related to the median.

• Suppose Y has a cumulative distribution F (y) =P (Y ≤ τ). Then τth quantile of Y is defined to be

Q(τ) =∫ τ

−∞f(y)dy = inf{y : τ ≤ F (y)}

11

Page 18: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

for 0 < τ < 1.

• The quantile regression drops the parametric as-sumption for the error/ response distribution.

• Fit separate models for different asymmetries τ ∈[0,1]:

yi = ηiτ + εiτ

• Instead of assuming E(yi ≤ ηiτ = 0, one assumes

P (εiτ ≤ 0) = τ

or

Fyi(ηiτ) = P (εiτ ≤ 0) = τ

Page 19: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

• This gives a set of regression function at any as-

sumed quantile value.

• Assumptions:

-it is distribution-free since it does not make any

specific assumption on the type of errors

-it does not even require i.i.d errors

-it allows for heterogeneity.

Page 20: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Expectile Regression

• Expectiles are related to the mean, as are quantiles

related to the median.

∑ni=1 |yi − ηi| → min

∑ni=1wτ(yi, ηiτ)|yi − ηiτ | → min

median regression quantile regression

∑ni=1 |yi − ηi|2 → min

∑ni=1wτ(yi, ηiτ)|yi − ηiτ |2 → min

mean regression expectile regression

where wτ is the check function defined by

12

Page 21: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

wτ =

⎧⎨⎩τ if yi > μ(τ)

1− τ if yi ≤ μ(τ)

for some population expectile μ(τ) for different val-ues of an asymmetric parameter 0 < τ < 1.

• Expectiles are obtained by solving

τ =

∫ eτ−∞ |y − eτ |fy(y)dy∫∞−∞ |y − eτ |fy(y)dy=

Gy(eτ)− eτFy(eτ)

2(Gy(eτ)− eτFy(eτ)) + (eτ − μ)

where-fy(·) and Fy(·) denote the density and cumulativedistribution function of y.-Gy(e) =

∫ e−∞ yfy(y)dy is the partial moment func-tion of y and-Gy(∞) = μ is the expectation of y.

Page 22: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Worked example - binomial data

13

Page 23: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

500 1500

−1

01

2

ALT

s(A

LT, d

f = c

(2, 4

, 2))

:1

500 1500

−0.

40.

00.

20.

40.

60.

8

ALT

s(A

LT, d

f = c

(2, 4

, 2))

:2

500 1500

−0.

20.

00.

10.

2

ALT

s(A

LT, d

f = c

(2, 4

, 2))

:3500 1500

−0.

10−

0.05

0.00

0.05

0.10

ALT

s(A

LT, d

f = c

(2, 4

, 2))

:4

500 1500

−0.

3−

0.2

−0.

10.

00.

1

ALT

s(A

LT, d

f = c

(2, 4

, 2))

:5

500 1500

−0.

3−

0.2

−0.

10.

00.

1ALT

s(A

LT, d

f = c

(2, 4

, 2))

:6

Page 24: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

8 10 14

−1.

0−

0.5

0.0

0.5

1.0

1.5

TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):1

8 10 14

−0.

40.

00.

20.

40.

60.

8

TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):2

8 10 14

−0.

10.

00.

10.

20.

30.

4

TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):3

8 10 14

−0.

2−

0.1

0.0

0.1

0.2

TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):4

8 10 14

−0.

5−

0.3

−0.

10.

1

TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):5

8 10 14

−0.

5−

0.3

−0.

10.

1TMIN

s(T

MIN

, df =

c(2

, 4, 2

)):6

14

Page 25: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

25 30 35

−2

−1

01

TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:1

25 30 35

−0.

8−

0.4

0.0

0.4

TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:2

25 30 35

−0.

20.

00.

20.

4

TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:325 30 35

−0.

100.

000.

100.

20

TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:4

25 30 35

−0.

2−

0.1

0.0

0.1

0.2

TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:5

25 30 35

−0.

050.

000.

050.

100.

15TMAX

s(T

MA

X, d

f = c

(2, 4

, 2))

:615

Page 26: Expectile and Quantile Regression and Other …indico.ictp.it/event/a12175/session/41/contribution/27/...Expectile and Quantile Regression and Other Extensions Lawrence Kazembe University

Reference

• Hudson, I. L., Rea, A., Dalrymple, M. L., Eilers, P. H. C(2008). Climate impacts on sudden infant death syndrome:a GAMLSS approach. Proceedings of the 23rd internationalworkshop on statistical modelling pp. 277280.

• Beyerlein, A., Fahrmeir, L., Mansmann, U., Toschke., A. M(2008). Alternative regression models to assess increase inchildhood BM. BMC Medical Research Methodology, 8(59).

• Smyth, G. K (1989). Generalized linear models with varyingdispersion. J. R. Statist. Soc. B, 51, 4760.

• Sobotka, F., and T. Kneib, 2010. Geoadditive Expectile Re-gression. Computational Statistics and Data Analysis, doi:10.1016/j.csda.2010.11.015.

16