Unilateral Correspondence Analysis - University of Tokyo (cueda/kenkyu/chiri/correspondence/correspondence)


  • 1

    Sixteenth International Conference on Methods in Dialectology (METHODS XVI)

    National Institute for Japanese Language and Linguistics, Tokyo, 10 August, 2017.

    Unilateral Correspondence Analysis Applied to Spanish linguistic data in time and space

    Hiroto Ueda, University of Tokyo

    (This document = Google: hiroto ueda → Estudios de español)

  • 2

    1. Introduction

    2. Method

    2.1. Bilateral correspondence analysis

    2.2. Unilateral correspondence analysis

    3. Application

    3.1. Spanish preposition «de» in «sospechar de que»

    3.2. Relative chronology of Spanish letters

    3.3. Andalusian phonetic features

    3.4. Spanish words for 'farmer' in Latin America

    4. Conclusion

  • 3

    1. Introduction

    Correspondence analysis, developed by Jean-Paul Benzécri (1973), is used in many scientific disciplines (Clausen 1998, Greenacre 2013).

    Quantification Method Type III of Chikio Hayashi (1956) has also been well known and widely utilized (Akido 1974, Yasuda / Umino 1977, Asano 2000, Kosugi 2007).

    In dialectology, see Inoue (1996, 1997) and Inoue and Fukushima (1997).

    The common aim is to find the arrangement of the frequencies in the two-dimensional matrix that yields the highest correlation coefficient.

  • 4

    Both Benzécri's analysis and Hayashi's method are characterized by the rearrangement of the cases in rows and the attributes in columns so as to obtain the best diagonalization of the distribution of frequencies.

    However, researchers sometimes need to rearrange only the order of the cases while keeping the order of the attributes fixed, and vice versa. For example, in historical data the chronological axis of years, decades or centuries should be kept unchanged in order to study the linguistic forms in accordance with the chronological parameter.

    → Unilateral Correspondence Analysis

  • 5

    2. Method

    Anp: matrix A of n × p elements

    Xn: column vector X of n elements

    Notation used below: Anp^T is the transpose, Snn^{-1} the inverse, and Tpp^{1/2} the square root of a diagonal matrix (taken element-wise on the diagonal).

  • 6

    2.1. Bilateral correspondence analysis

    Dnp        y1: English   y2: Physics   y3: Latin    Sn
    x1: Ana          9            14            18      s1 = 41
    x2: Juan        17             7            11      s2 = 35
    x3: Mary        15            13            14      s3 = 42
    x4: Ken          5            18             8      s4 = 31
    Tp           t1 = 46       t2 = 52       t3 = 51    N = 149

    Cnp, reordered by the weights Xn and Yp:

    Cnp        y2: Physics   y3: Latin   y1: English    Xn
    x4: Ken         18             8             5      x4 = -.473
    x1: Ana         14            18             9      x1 = -.094
    x3: Mary        13            14            15      x3 = .108
    x2: Juan         7            11            17      x2 = .400
    Yp           y2 = -.361    y3 = .028    y1 = .377
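
    To make the formulas below concrete, here is a minimal NumPy sketch of this toy table. All variable names (D, S, T, N, and those in the later snippets) are mine, not the author's program:

        import numpy as np

        # Toy grade table Dnp from the slide above
        # (rows x1..x4: Ana, Juan, Mary, Ken; columns y1..y3: English, Physics, Latin)
        D = np.array([[ 9, 14, 18],
                      [17,  7, 11],
                      [15, 13, 14],
                      [ 5, 18,  8]], dtype=float)

        S = D.sum(axis=1)   # row sums Sn    -> [41, 35, 42, 31]
        T = D.sum(axis=0)   # column sums Tp -> [46, 52, 51]
        N = D.sum()         # total sum N    -> 149.0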

  • 7

    The objective of the method is to find the case weight vector Xn = (x1, x2, …, xn) and the variable weight vector Yp = (y1, y2, …, yp) which render the maximum correlation coefficient.

    * In this section we basically follow the explanations of Mino (2001: 162-164) and Takahashi (2005: 106-124).

  • 8

    The first condition imposed on the two vectors is that the means (Mx, My) of si·xi (i = 1, 2, …, n) and ti·yi (i = 1, 2, …, p) are 0:

    [1a] Mx = [(9x1 + 14x1 + 18x1) + (17x2 + 7x2 + 11x2) + (15x3 + 13x3 + 14x3) + (5x4 + 18x4 + 8x4)] / 149
    = (41x1 + 35x2 + 42x3 + 31x4) / 149
    = Sn^T Xn / N = 0

  • 9

    [1b] My = [(9y1 + 17y1 + 15y1 + 5y1) + (14y2 + 7y2 + 13y2 + 18y2) + (18y3 + 11y3 + 14y3 + 8y3)] / 149
    = (46y1 + 52y2 + 51y3) / 149
    = Tp^T Yp / N = 0

    where Sn is the vector of row sums (41, 35, 42, 31), Tp is the vector of column sums (46, 52, 51), and N is the scalar total sum (149).
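
    Continuing the sketch, conditions [1a] and [1b] can be checked with the weights printed on slide 6; since they are rounded to three decimals, the means come out only approximately 0:

        # Weights printed on slide 6 (rounded to three decimals)
        x = np.array([-0.094, 0.400, 0.108, -0.473])   # x1..x4: Ana, Juan, Mary, Ken
        y = np.array([ 0.377, -0.361, 0.028])          # y1..y3: English, Physics, Latin

        Mx = S @ x / N   # [1a] Sn^T Xn / N -> approx 0
        My = T @ y / N   # [1b] Tp^T Yp / N -> approx 0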

  • 10

    The second condition is that the variances (Vx, Vy) of si·xi (i = 1, 2, …, n) and ti·yi (i = 1, 2, …, p) are 1:

    [2a] Vx = [41(x1 – Mx)^2 + 35(x2 – Mx)^2 + 42(x3 – Mx)^2 + 31(x4 – Mx)^2] / 149
    = (41x1^2 + 35x2^2 + 42x3^2 + 31x4^2) / 149 ← [1a] Mx = 0
    = Xn^T Snn Xn / N = 1

    [2b] Vy = [46(y1 – My)^2 + 52(y2 – My)^2 + 51(y3 – My)^2] / 149
    = (46y1^2 + 52y2^2 + 51y3^2) / 149 ← [1b] My = 0
    = Yp^T Tpp Yp / N = 1

    where Snn = dg(Sn); Tpp = dg(Tp) [dg: diagonal matrix]

  • 11

    Snn   x1   x2   x3   x4        Tpp   y1   y2   y3
    x1    41                       y1    46
    x2         35                  y2         52
    x3              42             y3              51
    x4                   31
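
    In the sketch, the two diagonal matrices and the variance check look as follows. Note that the printed weights give a variance of about 0.09 rather than 1, because they carry the extra factor R introduced in [10] below:

        Snn = np.diag(S)   # dg(Sn): row sums on the diagonal
        Tpp = np.diag(T)   # dg(Tp): column sums on the diagonal

        Vx = x @ Snn @ x / N   # approx 0.09, not 1: the printed weights include R
        Vy = y @ Tpp @ y / N   # dividing x, y by Vx**0.5, Vy**0.5 restores Vx = Vy = 1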

  • 12

    Considering the matrix Dnp as a distribution graph, the correlation (R) between the X axis and the Y axis is:

    [3] R = [9(x1 – Mx)(y1 – My) + 14(x1 – Mx)(y2 – My) + 18(x1 – Mx)(y3 – My) + ... + 8(x4 – Mx)(y3 – My)] / [(Vx·Vy)^{1/2} · 149]
    = (9x1y1 + 14x1y2 + ... + 8x4y3) / [(Vx·Vy)^{1/2} · 149] ← [1a, 1b] Mx = My = 0
    = (9x1y1 + 14x1y2 + ... + 8x4y3) / 149 ← [2a, 2b] Vx = Vy = 1
    = Xn^T Dnp Yp / N
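
    Formula [3] is a single line of NumPy once the weights are standardized to variance 1; for the toy table R comes out at about 0.30 (my computation, continuing the sketch):

        xu = x / np.sqrt(x @ Snn @ x / N)   # standardize so that [2a] Vx = 1
        yu = y / np.sqrt(y @ Tpp @ y / N)   # standardize so that [2b] Vy = 1

        R = xu @ D @ yu / N                 # [3] Xn^T Dnp Yp / N -> approx 0.30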

  • 13

    The objective is to find the two weight vectors Xn and Yp that maximize R.

    To maximize R, we formulate Q with the two variance constraints Vx = 1 and Vy = 1, using two Lagrange multipliers Lx and Ly. We then differentiate Q with respect to Xn and Yp and set the results equal to the zero vectors On and Op:

    Q = R – Lx (Vx – 1) – Ly (Vy – 1)
    = Xn^T Dnp Yp / N – Lx [Xn^T Snn Xn / N – 1] – Ly [Yp^T Tpp Yp / N – 1]

  • 14

    [4a] Df(Q, Xn) = Dnp Yp / N – 2 Lx Snn Xn / N = On
    ← Df(Q, Xn): differentiate Q with respect to Xn

    [4b] Df(Q, Yp) = Dnp^T Xn / N – 2 Ly Tpp Yp / N = Op
    ← Df(Q, Yp): differentiate Q with respect to Yp

  • 15

    [5a] Dnp Yp / N = 2 Lx Snn Xn / N ← [4a]
    Xn^T Dnp Yp / N = 2 Lx Xn^T Snn Xn / N ← multiply both sides by Xn^T
    R = 2 Lx ← [3] R = Xn^T Dnp Yp / N; [2a] Xn^T Snn Xn / N = 1

    [5b] Dnp^T Xn / N = 2 Ly Tpp Yp / N ← [4b]
    Xn^T Dnp / N = 2 Ly Yp^T Tpp / N ← transpose both sides (Tpp is diagonal, hence symmetric)
    Xn^T Dnp Yp / N = 2 Ly Yp^T Tpp Yp / N ← multiply both sides by Yp
    R = 2 Ly ← [3] R = Xn^T Dnp Yp / N; [2b] Yp^T Tpp Yp / N = 1

    By [5a] and [5b],

    [6] R = 2 Lx = 2 Ly

  • 16

    [7a] Dnp Yp = 2 Lx Snn Xn ← [5a], multiply both sides by N
    Dnp Yp = R Snn Xn ← [6] 2 Lx = R
    R Snn Xn = Dnp Yp ← change of sides
    Snn Xn = Dnp Yp / R ← move the scalar R
    Snn^{-1} Snn Xn = Snn^{-1} Dnp Yp / R ← multiply both sides by Snn^{-1}
    Xn = Snn^{-1} Dnp Yp / R ← Snn^{-1} Snn = Inn (identity matrix)

    [7b] Dnp^T Xn = 2 Ly Tpp Yp ← [5b], multiply both sides by N
    Dnp^T Xn = R Tpp Yp ← [6] 2 Ly = R
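
    Continuing the sketch, [7a] can be checked numerically with the slide-6 weights (the relation also holds for them, since x and y carry the same extra scale factor):

        lhs = D @ y         # Dnp Yp
        rhs = Snn @ x       # Snn Xn
        ratio = lhs / rhs   # every component approx 0.30 = R, as [7a] predicts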

  • 17

    Introducing [7a] into Xn of [7b]:

    [8] Dnp^T Snn^{-1} Dnp Yp / R = R Tpp Yp
    Dnp^T Snn^{-1} Dnp Yp = R^2 Tpp Yp ← move the scalar R
    Dnp^T Snn^{-1} Dnp (Tpp^{1/2})^{-1} Tpp^{1/2} Yp = R^2 Tpp^{1/2} Tpp^{1/2} Yp
    ← (Tpp^{1/2})^{-1} Tpp^{1/2} = Ipp; Tpp^{1/2} Tpp^{1/2} = Tpp

    (The details will be described later.)

  • 18

    Abbreviating Tpp^{1/2} Yp as Ap:

    [9] Tpp^{1/2} Yp = Ap

    [8] becomes:

    Dnp^T Snn^{-1} Dnp (Tpp^{1/2})^{-1} Ap = Tpp^{1/2} R^2 Ap
    (Tpp^{1/2})^{-1} Dnp^T Snn^{-1} Dnp (Tpp^{1/2})^{-1} Ap = (Tpp^{1/2})^{-1} Tpp^{1/2} R^2 Ap ← multiply both sides by (Tpp^{1/2})^{-1}
    (Tpp^{1/2})^{-1} Dnp^T Snn^{-1} Dnp (Tpp^{1/2})^{-1} Ap = R^2 Ap ← (Tpp^{1/2})^{-1} Tpp^{1/2} = Ipp

  • 19

    Abbreviating (Tpp^{1/2})^{-1} Dnp^T Snn^{-1} Dnp (Tpp^{1/2})^{-1} as Wpp:

    Wpp Ap = R^2 Ap

    In this way we arrive at an eigenvalue formula, whose eigenvalue (R^2) and eigenvector (Ap) are calculated by the program at the same time.
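
    A sketch of this step (not the author's program): build Wpp and eigendecompose it with NumPy. Wpp is symmetric, so np.linalg.eigh applies; its largest eigenvalue is the trivial solution 1, and R^2 is the second largest:

        Tpp_inv_sqrt = np.diag(1.0 / np.sqrt(T))   # (Tpp^{1/2})^{-1}
        W = Tpp_inv_sqrt @ D.T @ np.diag(1.0 / S) @ D @ Tpp_inv_sqrt   # Wpp

        eigvals, eigvecs = np.linalg.eigh(W)   # eigenvalues in ascending order
        R2 = eigvals[-2]      # second-largest eigenvalue = R^2, approx 0.090
        A = eigvecs[:, -2]    # corresponding unit-norm eigenvector Ap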

  • 20

    The attribute weight vector Yp is obtained from [9]:

    Tpp^{1/2} Yp = Ap ← [9]
    (Tpp^{1/2})^{-1} Tpp^{1/2} Yp = (Tpp^{1/2})^{-1} Ap ← multiply both sides by (Tpp^{1/2})^{-1}
    Yp = (Tpp^{1/2})^{-1} Ap ← Xpp^{-1} Xpp = Ipp (identity matrix)
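
    In the sketch, Yp is recovered from Ap, and then Xn from Yp via [7a]:

        Y = Tpp_inv_sqrt @ A                         # Yp = (Tpp^{1/2})^{-1} Ap
        X = np.diag(1.0 / S) @ D @ Y / np.sqrt(R2)   # [7a] Xn = Snn^{-1} Dnp Yp / R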

  • 21

    The attribute weight vector Yp comes out quite small, because by [1b] its weighted sum with Tp is 0 and by [2b] its weighted sum of squares is fixed:

    [1b] My = Tp^T Yp / N = 0

    [2b] Vy = Yp^T Tpp Yp / N = 1

    According to Takahashi (2005: 127-129), in order to recover the dimension of the data, we multiply Yp by the square root of the total sum, N^{1/2}. It is also advisable to multiply Yp by the correlation coefficient R, to reflect the magnitude of R:

    [10] Yp' = (Tpp^{1/2})^{-1} Ap N^{1/2} R
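
    Assuming [10] as reconstructed above, the rescaling in the sketch reproduces, up to an overall sign, the weights printed on slide 6:

        Yq = Y * np.sqrt(N) * np.sqrt(R2)   # [10] Yp' = Yp * N^{1/2} * R
        Xq = X * np.sqrt(N) * np.sqrt(R2)   # the same rescaling for the case weights

        # Up to an overall sign: Xq approx (-.094, .400, .108, -.473),
        # Yq approx (.377, -.361, .028), matching slide 6.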