Sixteenth International Conference on Methods in Dialectology (METHODS XVI)
National Institute for Japanese Language and Linguistics, Tokyo, 10 August, 2017.
Unilateral Correspondence Analysis Applied to Spanish linguistic data in time and space
Hiroto Ueda, University of Tokyo
(This document = Google: hiroto ueda→Estudios de español )
1. Introduction
2. Method
2.1. Bilateral correspondence analysis
2.2. Unilateral correspondence analysis
3. Application
3.1. Spanish preposition «de» in «sospechar de que»
3.2. Relative chronology of Spanish letters
3.3. Andalusian phonetic features
3.4. Spanish words for 'farmer' in Latin America
4. Conclusion
1. Introduction
Correspondence analysis by Jean-Paul Benzécri (1973)
is used in different scientific disciplines (Clausen 1998, Greenacre 2013).
Quantification Method Type III of Chikio Hayashi (1956)
has also been well known and widely utilized (Akido 1974, Yasuda /
Umino 1977, Asano 2000, Kosugi 2007).
In dialectology, see Inoue (1996, 1997) and Inoue and Fukushima (1997).
The common aim is to find the arrangement of the frequencies in the
two-dimensional matrix that yields the highest correlation coefficient.
The analysis of Benzécri as well as the method of Hayashi are
characterized by the rearrangement of cases in rows and attributes in
columns in order to obtain the best form of diagonalization in the
distribution of frequencies.
However, researchers sometimes need to rearrange only the order of
cases while fixing the order of attributes, or vice versa. For example, in
historical data the chronological axis of years, decades or centuries
should be kept unchanged in order to study the linguistic forms in
accordance with the chronological parameter.
Unilateral Correspondence Analysis
2. Method
Anp: Matrix A of n × p elements
Xn: Column vector X of n elements
2.1. Bilateral correspondence analysis
Dnp        y1: English   y2: Physics   y3: Latin    Sn
x1: Ana          9            14            18      s1 = 41
x2: Juan        17             7            11      s2 = 35
x3: Mary        15            13            14      s3 = 42
x4: Ken          5            18             8      s4 = 31
Tp          t1 = 46       t2 = 52       t3 = 51     N = 149

Cnp        y2: Physics   y3: Latin    y1: English   Xn
x4: Ken         18             8             5      x4 = -.473
x1: Ana         14            18             9      x1 = -.094
x3: Mary        13            14            15      x3 = .108
x2: Juan         7            11            17      x2 = .400
Yp          y2 = -.361    y3 = .028    y1 = .377
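The example table can be entered directly as a matrix; a minimal sketch in NumPy (the variable names D, S, T, N are our own shorthand for Dnp, Sn, Tp, N):

```python
import numpy as np

# Score matrix Dnp from the table above: rows = pupils, columns = subjects
# (y1: English, y2: Physics, y3: Latin)
D = np.array([[ 9, 14, 18],   # x1: Ana
              [17,  7, 11],   # x2: Juan
              [15, 13, 14],   # x3: Mary
              [ 5, 18,  8]])  # x4: Ken

S = D.sum(axis=1)  # row sums Sn
T = D.sum(axis=0)  # column sums Tp
N = D.sum()        # total sum N

print(S, T, N)     # [41 35 42 31] [46 52 51] 149
```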
The objective of the method is to find the case weight vector Xn = (x1,
x2, …, xn) and the attribute weight vector Yp = (y1, y2, …, yp) that yield
the maximum correlation coefficient.
* In this section we basically follow the explanations of Mino (2001:
162-164) and Takahashi (2005: 106-124).
The first condition is imposed on the two vectors: that the means (Mx,
My) of si * xi (i = 1, 2, …, n) and ti * yi (i = 1, 2, …, p) are 0:
[1a] Mx = [(9x1 + 14x1 + 18x1)
+ (17x2 + 7x2 + 11x2)
+ (15x3 + 13x3 + 14x3)
+ (5x4 + 18x4 + 8x4)] / 149
= (41x1 + 35x2 + 42x3 + 31x4) / 149
= Sn^T Xn / N = 0
[1b] My = [(9y1 + 17y1 + 15y1 + 5y1)
         + (14y2 + 7y2 + 13y2 + 18y2)
         + (18y3 + 11y3 + 14y3 + 8y3)] / 149
       = (46y1 + 52y2 + 51y3) / 149
       = Tp^T Yp / N = 0
where Sn is the vector of row sums (41, 35, 42, 31), Tp is the vector of
column sums (46, 52, 51), and N is the total sum (149).
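Conditions [1a] and [1b] can be checked numerically with the rounded weight vectors shown in the Cnp table (a sketch; the tiny residuals are rounding error in the printed weights):

```python
import numpy as np

S = np.array([41, 35, 42, 31])   # row sums Sn
T = np.array([46, 52, 51])       # column sums Tp
N = 149                          # total sum

# Rounded weights from the Cnp table, in the original order x1..x4 / y1..y3
x = np.array([-.094, .400, .108, -.473])
y = np.array([.377, -.361, .028])

Mx = S @ x / N   # [1a] Mx = Sn^T Xn / N
My = T @ y / N   # [1b] My = Tp^T Yp / N
print(Mx, My)    # both are ~0, up to rounding of the printed weights
```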
The second condition is that the variances (Vx, Vy) of si * xi (i = 1, 2, …,
n) and ti * yi (i = 1, 2, …, p) are 1:
[2a] Vx = [41(x1 – Mx)^2 + 35(x2 – Mx)^2 + 42(x3 – Mx)^2 + 31(x4 – Mx)^2] / 149
       = (41x1^2 + 35x2^2 + 42x3^2 + 31x4^2) / 149   ←[1a] Mx = 0
       = Xn^T Snn Xn / N = 1
[2b] Vy = [46(y1 – My)^2 + 52(y2 – My)^2 + 51(y3 – My)^2] / 149
       = (46y1^2 + 52y2^2 + 51y3^2) / 149   ←[1b] My = 0
       = Yp^T Tpp Yp / N = 1
where
Snn = dg(Sn); Tpp = dg(Tp) [dg: diagonal matrix]
Snn   x1  x2  x3  x4        Tpp   y1  y2  y3
x1    41                    y1    46
x2        35                y2        52
x3            42            y3            51
x4                31
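The diagonal matrices Snn = dg(Sn) and Tpp = dg(Tp) are a single call in NumPy (sketch):

```python
import numpy as np

Snn = np.diag([41, 35, 42, 31])  # Snn = dg(Sn)
Tpp = np.diag([46, 52, 51])      # Tpp = dg(Tp)
print(Tpp)
# [[46  0  0]
#  [ 0 52  0]
#  [ 0  0 51]]
```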
Considering the matrix Dnp as a distribution graph, the correlation (R)
between the X axis and the Y axis is:
[3] R = [9(x1–Mx)(y1–My)
      + 14(x1–Mx)(y2–My)
      + 18(x1–Mx)(y3–My)
      + ...
      + 8(x4–Mx)(y3–My)] / [(Vx * Vy)^1/2 * 149]
      = (9x1y1 + 14x1y2 + ... + 8x4y3) / [(Vx * Vy)^1/2 * 149]   ←[1a, 1b] Mx = My = 0
      = (9x1y1 + 14x1y2 + ... + 8x4y3) / 149   ←[2a, 2b] Vx = Vy = 1
      = Xn^T Dnp Yp / N
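Formula [3] can be verified numerically: rescale the rounded table weights so that conditions [2a] and [2b] hold exactly, then compute R = Xn^T Dnp Yp / N (a sketch; the value depends slightly on the rounding of the printed weights):

```python
import numpy as np

D = np.array([[9, 14, 18], [17, 7, 11], [15, 13, 14], [5, 18, 8]])
S, T, N = D.sum(axis=1), D.sum(axis=0), D.sum()

# Rounded weight vectors from the Cnp table (original order x1..x4, y1..y3)
x = np.array([-.094, .400, .108, -.473])
y = np.array([.377, -.361, .028])

# Rescale so that the variance conditions [2a], [2b] hold exactly
x = x / np.sqrt(S @ x**2 / N)   # now Xn^T Snn Xn / N = 1
y = y / np.sqrt(T @ y**2 / N)   # now Yp^T Tpp Yp / N = 1

R = x @ D @ y / N               # [3] R = Xn^T Dnp Yp / N
print(R)                        # about 0.30 for this data
```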
The objective is to find the two weight vectors Xn and Yp when R is
maximized.
To maximize R, we formulate Q with the two variance constraints
Vx = 1 and Vy = 1, using two Lagrange multipliers, Lx and Ly. We then
differentiate Q with respect to Xn and Yp and set the results equal to
the zero vectors On and Op:
Q = R – Lx (Vx – 1) – Ly (Vy – 1)
  = (Xn^T Dnp Yp) / N – Lx [(Xn^T Snn Xn) / N – 1] – Ly [(Yp^T Tpp Yp) / N – 1]
[4a] Df(Q, Xn) = Dnp Yp / N – 2 Lx Snn Xn / N = On
     ←Df(Q, Xn): differentiate Q with respect to Xn
[4b] Df(Q, Yp) = Dnp^T Xn / N – 2 Ly Tpp Yp / N = Op
     ←Df(Q, Yp): differentiate Q with respect to Yp
[5a] Dnp Yp / N = 2 Lx Snn Xn / N   ←[4a]
     Xn^T Dnp Yp / N = 2 Lx Xn^T Snn Xn / N   ←Multiply both sides by Xn^T
     R = 2 Lx   ←[3] R = Xn^T Dnp Yp / N; [2a] Xn^T Snn Xn / N = 1
[5b] Dnp^T Xn / N = 2 Ly Tpp Yp / N   ←[4b]
     Xn^T Dnp / N = 2 Ly Yp^T Tpp / N   ←Transpose both sides; Tpp is diagonal, hence symmetric
     Xn^T Dnp Yp / N = 2 Ly Yp^T Tpp Yp / N   ←Multiply both sides by Yp
     R = 2 Ly   ←[3] R = Xn^T Dnp Yp / N; [2b] Yp^T Tpp Yp / N = 1
By [5a] and [5b],
[6] R = 2 Lx = 2 Ly
[7a] Dnp Yp = 2 Lx Snn Xn   ←[5a], multiply both sides by N
     Dnp Yp = R Snn Xn   ←[6] 2 Lx = R
     R Snn Xn = Dnp Yp   ←Change of sides
     Snn Xn = Dnp Yp / R   ←Move scalar R
     Snn^-1 Snn Xn = Snn^-1 Dnp Yp / R   ←Multiply both sides by Snn^-1
     Xn = Snn^-1 Dnp Yp / R   ←Snn^-1 Snn = Inn (identity matrix)
[7b] Dnp^T Xn = 2 Ly Tpp Yp   ←[5b], multiply both sides by N
     Dnp^T Xn = R Tpp Yp   ←[6] R = 2 Ly
Introducing [7a] into Xn of [7b],
[8] Dnp^T Snn^-1 Dnp Yp / R = R Tpp Yp
    Dnp^T Snn^-1 Dnp Yp = R^2 Tpp Yp   ←Move scalar R
    Dnp^T Snn^-1 Dnp (Tpp^1/2)^-1 Tpp^1/2 Yp = R^2 Tpp^1/2 Tpp^1/2 Yp
    ←(Tpp^1/2)^-1 Tpp^1/2 = Ipp; Tpp^1/2 Tpp^1/2 = Tpp
(The details will be described later.)
Abbreviating Tpp^1/2 Yp by Ap,
[9] Tpp^1/2 Yp = Ap
[8] will be:
    Dnp^T Snn^-1 Dnp (Tpp^1/2)^-1 Ap = Tpp^1/2 R^2 Ap
    (Tpp^1/2)^-1 Dnp^T Snn^-1 Dnp (Tpp^1/2)^-1 Ap = (Tpp^1/2)^-1 Tpp^1/2 R^2 Ap
    ←Multiply both sides by (Tpp^1/2)^-1
    (Tpp^1/2)^-1 Dnp^T Snn^-1 Dnp (Tpp^1/2)^-1 Ap = R^2 Ap   ←(Tpp^1/2)^-1 Tpp^1/2 = Ipp
Abbreviating (Tpp^1/2)^-1 Dnp^T Snn^-1 Dnp (Tpp^1/2)^-1 = Wpp,
    Wpp Ap = R^2 Ap
In this way we arrive at an eigenvalue (R^2) and eigenvector (Ap)
equation, which the program solves for both at the same time.
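That eigenproblem is standard; a minimal sketch of the computation in NumPy for the example scores (our own illustration, not the author's program). Since Wpp is symmetric, np.linalg.eigh applies; it returns eigenvalues in ascending order, the largest is always the trivial solution 1, and the next one is R^2:

```python
import numpy as np

D = np.array([[9, 14, 18], [17, 7, 11], [15, 13, 14], [5, 18, 8]])
S, T = D.sum(axis=1), D.sum(axis=0)

# Wpp = (Tpp^1/2)^-1 Dnp^T Snn^-1 Dnp (Tpp^1/2)^-1, using diagonal shortcuts
Tm = np.diag(1 / np.sqrt(T))            # (Tpp^1/2)^-1
W = Tm @ D.T @ np.diag(1 / S) @ D @ Tm  # Wpp (symmetric)

eigvals, eigvecs = np.linalg.eigh(W)    # eigenvalues in ascending order

# Largest eigenvalue is the trivial 1; the second largest is R^2
R = np.sqrt(eigvals[-2])
print(R)                                # about 0.30 for this data
```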
The attribute weight vector Yp is obtained by [9]:
    Tpp^1/2 Yp = Ap   ←[9]
    (Tpp^1/2)^-1 Tpp^1/2 Yp = (Tpp^1/2)^-1 Ap   ←Multiply both sides by (Tpp^1/2)^-1
    Yp = (Tpp^1/2)^-1 Ap   ←Xpp^-1 Xpp = Ipp (identity matrix)
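The recovery of Yp, and of Xn via [7a], can be sketched end to end for the example data (our own sketch, not the author's program; an eigenvector's sign is arbitrary, so it is fixed here to match the table, whose values are then reproduced up to rounding):

```python
import numpy as np

D = np.array([[9, 14, 18], [17, 7, 11], [15, 13, 14], [5, 18, 8]])
S, T, N = D.sum(axis=1), D.sum(axis=0), D.sum()

# Solve Wpp Ap = R^2 Ap for the first non-trivial axis
Tm = np.diag(1 / np.sqrt(T))             # (Tpp^1/2)^-1
W = Tm @ D.T @ np.diag(1 / S) @ D @ Tm
eigvals, eigvecs = np.linalg.eigh(W)     # ascending order; eigvals[-1] = 1 is trivial
R = np.sqrt(eigvals[-2])
A = eigvecs[:, -2]                       # Ap belonging to R^2

# Yp = (Tpp^1/2)^-1 Ap, rescaled so that [2b] Yp^T Tpp Yp / N = 1 holds
y = np.sqrt(N) * (Tm @ A)
if y[0] < 0:                             # eigenvector sign is arbitrary
    y = -y

x = np.diag(1 / S) @ D @ y / R          # [7a] Xn = Snn^-1 Dnp Yp / R

# Multiplying by R gives the small values printed in the Cnp table
print(np.round(R * y, 3))               # approx. [ 0.377 -0.361  0.028]
print(np.round(R * x, 3))               # approx. [-0.094  0.4    0.108 -0.473]
```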
The attribute weight vector Yp comes out quite small, because the sum
of its products with Tp is 0 and its variance is normalized, by [1b] and [2b]:
[1b] My = Tp^T Yp / N = 0
[2b] Vy = Yp^T Tpp Yp / N = 1
According to Takahashi (2005: 127-129), in order to match the scale of
the data, we multiply Yp by the square root of the total sum, N^1/2. It is
also advisable to multiply Yp by the correlation coefficient (R), to reflect
the magnitude of R:
[10] Yp' = [Tpp^1/2