38
Naohito Chino & Shingo Saburi Aichi Gakuin University, Japan March 26 The 6th International Conference on Multiple Comparison Procedures, Tokyo, Japan, March 24-27, 2009

Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

  • Upload
    keely

  • View
    19

  • Download
    1

Embed Size (px)

DESCRIPTION

Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model. Naohito Chino & Shingo Saburi. Aichi Gakuin University, Japan. March 26. The 6th International Conference on Multiple Comparison Procedures, Tokyo, Japan, March 24-27, 2009. Organization of my talk. - PowerPoint PPT Presentation

Citation preview

Page 1: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Naohito Chino & Shingo Saburi

Aichi Gakuin University, Japan

March 26

The 6th International Conference on Multiple Comparison   Procedures, Tokyo, Japan, March 24-27, 2009

Page 2: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Organization of my talk

(1) What is “asymmetric MDS”?(2) extant asymmetric MDS models(3) What is “ASYMMAXSCAL”?(4) advantages of ASYMMAXSCAL(5) shortfalls of ASYMMAXSCAL(6) our proposals to overcome the

defects

Page 3: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

(1) What is “asymmetric MDS”?

The asymmetric MDS is a method which is specifically designed to analyze asymmetric relationships among members and display them graphically by plotting each member in a certain dimensional space, given asymmetric data.

Page 4: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Examples of symmetric and asym-metric relational dataSymmetric data flight mileages among 10 cities => We can recover the map of these cities.Asymmetric data degrees of sentiment relationships among 17 members measured by a 7-point rating scale => We can estimate the configuration of members in a certain dimensional space.

Page 5: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

extant asymmetric MDS models

Chino (1978, 1990), Chino and Shiraiwa (1993), Constantine and Gower (1978), Escoufier and Grorud (1980), Gower (1977), Harshman (1978), Harshman et al. (1982), Kiers and Takane (1994), Krumhansl (1978), Okada and Imaizumi (1987, 1997), Rocci and Bove (2002), Saburi and Chino (2008), Saito (1991), Saito and Takeda(1990), Sato (1988), ten Berge (1997), Tobler (1976-77), Trendafilov (2002),Weeks and Bentler (1982), Young (1975), and Zielman and Heiser (1996).

Page 6: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Examples of the extant modelsO-I model (Okada & Imaizumi, 1987) -- an augmented distance

model

GIPSCAL (Chino, 1990) -- a non-distance model

,*kjjkjk rrdd

,cbas kqtjk

tjjk xLxxx

Page 7: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

What is “ASYMMAXSCAL”Although various asymmetric MDS methods

have been proposed, these methods have remained to be descriptive until recently.

By contrast, Saburi & Chino (2008) have proposed a maximum likelihood method for asymmetric MDS called ASYMMAXSCAL, which extends MAXSCAL by Takane (1981) to asymmetric relatio-nal data.

MAXSCAL is the name for a multidimensional successive categories scaling.

Page 8: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

ASYMMAXSCAL revisited (1)

Parameters in ASYMMAXSCAL As with MAXSCAL by Takane

(1981), it has three kinds of parameters pertaining to

(1) the representation model (2) the error model (3) the response model

Page 9: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

ASYMMAXSCAL revisited (2)As for the representation model, the

proximity model of Oi to Oj, say, gij, can generally be written as

),,,( cxx jiij fg

where f(•) is any asymmetric MDS model,xi and xj, respectively, are coordinate vectors of Oi and Oj, and c is the remain- ing parameter vector.

Page 10: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

ASYMMAXSCAL revisited (3)As regards the error model, the error-

perturbed proximities are written as

),0(~, 2 Neeg ijijijij

Error-perturbedproximities

proximities

Page 11: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

ASYMMAXSCAL revisited (4)As for the response model, we

assume that subjects place error-perturbed pro-ximities in one of the M rating scale categories, C1, …, CM. Thus, these categories are represented by a set of ordered intervals with upper and lower boundaries on a psychological contin-uum.

MMm bbbbb 110 boundaries

Page 12: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

ASYMMAXSCAL revisited (5)Accordingly, the probability that the

error-perturbed proximity of Oi to Oj falls in Cm is given by

).( 1 mijmijm bbprobp

We assume that

Page 13: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

ASYMMAXSCAL revisited (6)

,)(1

ij

b

b ijijm dpm

m

based on Torgerson’s law of categorical judgment. Here, Ф(τij) denotes the density of the standard normal distribution.

(For computational convenience, we app-roximate it by the logistic distribution.)

Page 14: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

ASYMMAXSCAL revisited (7) We estimate all the parameters

pertaining to ASYMMAXSCAL by maximizing the following joint likelihood of the total observations

,111

ijmYijm

M

m

n

j

n

ipL

where Yijm denotes the frequency in category Cm, in which subjects placed the error-perturbed proximity of Oi to Oj.

Page 15: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

As for the details of ASYMMAXSCAL, seeSaburi, S. and Chino, N. (2008). A maximum likelihood method for an asymmetric MDS model.

Computational Statistics and Data Analysis,

52, 4673-4684.

Page 16: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Advantages of ASYMMAXSCAL over the extant descriptive models (1)

(1) Determination of the appropriate scal-

ing level of the data by AIC, that is, ordinal, interval, or ratio level.(2) Determination of the appropriate dimensionality of the model under study by AIC.

Page 17: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Advantages of ASYMMAXSCAL over the extant descriptive models (2)

(3) Examination that the data are sufficiently asymmetric or not, i) prior to the scaling of objects, by applying some tests for symmetry, ii) on the way to the scaling, by selecting a model among several candidates including some symmetry models, using AIC.

Page 18: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Tests for symmetry, prior to the scaling of objects

The data obtained by the above method is as follows. We call it the Type A

design data, or the Type A data.One of them is the test for a special con-

ditional symmetry hypothesis for this design data.

.

Page 19: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

nnMnnnn

M

M

YYY

YYY

YYY

,,,

..........................

,,,

,,,

21

12122121

11112111

C

1

C2 … CM

from O1 to O1

from O1 to O2

from On to On

rating categoriesproximity

judgments

frequencies

n11

Number of samples

n12

nnn

Page 20: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Type B (design) data in ASMMAXSCAL

Type B (design) data is obtained by rearranging the Type A data per rat-ing category.

(It is a bit different from traditional designs for the n×n×M table (Ag- resti, 2002, Bishop et al., 1975)).

Y111 … Y1n1. .Yn11 …

.Ynn1

.

Y11M… Y1nM

Yn1M

. .… YnnM

. .

C1

C2

CM

∶∶

n11

nn1 nnn…. . total

The Type B design data

Page 21: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Tests for symmetry in ASYMMAXSCAL (2)

MmjippH jimijmcs ,,1,,:)(0

According to Saburi and Chino (2008), under the null hypothesis,

the likelihood ratio test statistic,

M

m

n

i

n

j jiij

jimijmijijmijm nn

YYnYYG

1 1 1

2 )(lnln2

asymptotically follows the central χ2-distribution with (M-1)n(n-1)/2 degrees of freedom.

Page 22: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

,2

)(lnln2

1 1 1

2

M

m

n

i

n

j

jimijmijmijm

YYYYG

with Mn(n-1)/2 degrees of freedom under the null hypothesis.

By the way, the traditional conditional symmetry test with the special n×n×M contingency table, of which test statistic is given by

Page 23: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

shortfalls of ASYMMAXSCAL (1)(1) Shortfall of the model selection method At present, ASYMMAXSCAL enables us to

select the most appropriate model among several candidate models which include some variants of symmetry models using AIC. However, such a model selection method by some information criterion does not consider the nature of the data.

It will be necessary to select the representation model which reflects the nature of the data most.

Page 24: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

shortfalls of ASYMMAXSCAL (2) To do this job, it might be necessary to utilize

various smmetry related tests which have been developed in the branch of mathematical statistics.

Chino and Saburi (2006) attempted to administer these tests prior to the scaling step of ASYMMAXS-CAL. Figures 1 and 2 show this. However, relation of inclusion of these tests shown in Figure 1 is very complicated.

=> Chino, N. & Saburi, S. (2006). Tests of symmetry in asymmetric MDS. Paper presented at the 2nd German Japanese Symposium on Classification, Berlin, Germany.

Page 25: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

shortfalls of ASYMMAXSCAL (3)(2) Lack of taking overall statistical errors into account In performing such sequential tests shown in

Figure 2, we did not take overall statistical errors into account.

It will be interesting and useful to examine whether these tests are mutually statistically independent or not. For simplicity, we shall, at

present, exclusively consider a two-dimensional square contingency table.

Page 26: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

We have recently conjectured that, at least, two of these tests are statistically independent.

These are (1) test of the quasi-symmetry hypothesis,

and (2) test of the symmetry hypothesis under the condition that the quasi-symmetry hypo- thesis holds

Our proposal to overcome the shortfalls

Page 27: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Parameter spaces for these tests (1)

(1) total parameter space of the log-linear model

(2) parameter space for the quasi-

symmetry hypothesis

)0()2()1()12()0()2()1()12( ,,,,,,, jiijjiij

)0()2()1(

)12()12()0()2()1()12(

,,

,,,,

ji

jiijjiijQS

Page 28: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Parameter spaces for these tests (2)

(3) parameter space for the symmetry hypothesis

(4) parameter space for the equality of the row and

column effects

)0()2()1(

)12()12()0()2()1()12(

,

,,,,,

ii

jiijjiijS

)0()12(

)2()1()0()2()1()12(

,

,,,,

ij

iijiijERC

Page 29: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Null and nonnull hypotheses of these tests

(1) the quasi-symmetry hypothesis,

(2) the symmetry hypothesis under QSH0

QSQS

QSQS HagainstH θθ :: 10

SQSS

SS HagainstH θθ :: *

1*0

Page 30: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Ω=ω0

ωQS ωS

ωERC

Θij(12)=Θji

(12) Θ i (1) =Θi (2)

.0 SQS It should be noticed that

Page 31: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Dissolution of a likelihood ratio statistic (1)

The likelihood ratio λS for testing the usual sym-metry hypothesis can be written, for example, as

Therefore, we have

or

.)ˆ(

)ˆ(

)ˆ(

)ˆ(

)ˆ(

)ˆ(*

00SQS

QS

SQSSS L

L

L

L

L

L

),ln2(ln2ln2 * QSSS

Page 32: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Dissolution of a likelihood ratio statistic (2)

This statistic is due to Caussinus (1965), and follows asymptotically χ2 distribution with r-1 degrees of freedom.

.222* QSSS GGG

Page 33: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Test statistic for the quasi-symmetry hypothesisIt is well known that under the

hypothesis,

follows asymptotically the χ2 distribution with

(r-1)(r-2)/2 degrees of freedom. Here, satisfies the following equations:

QSH0

,ˆlnln21 1

2ijij

r

i

r

jijQS ffG

ij

jiijjiijjjii ffff ˆˆ,ˆ,ˆ

Page 34: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Joint sufficient statistic for the nuisance parametersThe statistics

corresponding to the nuisance parameters

under the quasi-symmetry hypothesis, give their joint sufficient statistics.

QSH0

)(,),(,,1 1

ijji

r

i

r

jijji ffffnff

)(,,,, )12()12(0

)0()2()1(ijijji

Page 35: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Ancillary statistic for these nuisance parameters

Moreover, under the hypothesis of quasi-symmetry, the statistic, , is free of these nuisance parameters, because the term, , are estimated as functions of the data,

.In other words, the statistic, , is

ancillary for these nuisance parameters.

2QSG

ij

jiijji ffff ,,

2QSG

Page 36: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Independence of the two statistics

As a result, due to the theorem by Hogg & Craig (1956), is statistically independent of

Hogg, R. V. & Craig, A. T. (1956). Sufficient sta-

tistics in elementary distribution theory. Sankhyā, 17, 209-216.

2QSG

).,,,,(2222* jiijjiijQSSS ffnfffGGGG

Page 37: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Control of the error rates of the two kinds

Since the two statistics are mutually stochastically independent, we may set the error rate of each of the test statistics to 1-(1-α)1/2 if we want to control the error rate of the first kind at α.

Furthermore, we can construct a more powerful test than Dunn’s and Holm’s, if we set the error rate of the quasi-symmetry test to α and set that of the symmetry test under the quasi-symmetry hypothesis to α/2, according to Hochberg (1988).

Page 38: Controlling the two kinds of error rate in selecting an appropriate asymmetric MDS model

Quasi-symmetry ?

Multiple CP

Non-QS-familyof MDS models

rejected

Symmetry Under QS?

accepted

Multiple CP

SymmetricMDS

QS-family ofMDS models

accepted

rejected