Outline: Introduction, Statistics, Tensors, The End, Appendix
Tensor decompositions in statistical signal processing
Pierre Comon
I3S, CNRS, University of Nice, Sophia-Antipolis, France
July 2, 2009
Pierre Comon MMDS - July 2009 1
Linear statistical model

y = A s + b    (1)

with
  y : K × 1 random vector, the "sensors"
  s : P × 1 random vector with statistically independent components, the "sources"
  A : K × P deterministic "mixing matrix"
  b : errors, the "noise" (may be removed for P large enough)
Other writing

y = [A, I] [s; b]

which is of the noiseless form

y = A s    (2)

with a larger dimension P (here P + K).
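As a quick numerical illustration of model (1) and its augmented noiseless rewriting (a sketch of our own, not from the slides; the dimensions K = 4, P = 2 and the random draws are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
K, P = 4, 2                       # sensors, sources (arbitrary choice)
A = rng.standard_normal((K, P))   # deterministic mixing matrix
s = rng.standard_normal(P)        # one realization of the independent sources
b = 0.1 * rng.standard_normal(K)  # additive noise

y = A @ s + b                     # noisy linear model  y = A s + b

# "Other writing": absorb the noise into an enlarged mixture,
# y = [A, I] [s; b], noiseless but with dimension P + K
A_aug = np.hstack([A, np.eye(K)])
s_aug = np.concatenate([s, b])
y_noiseless_form = A_aug @ s_aug
```

The two expressions agree exactly; the price of removing b is that the enlarged "source" vector now has P + K entries.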
Taxonomy

Problems addressed in the literature:

K ≥ P: "over-determined" → approximation problem
  can be reduced to a square P × P regular mixture
  A orthogonal or unitary
  A square invertible

K < P: "under-determined" → exact decomposition
  A rectangular with pairwise linearly independent columns
Goals

Solely from realizations of the observation vector y:

In the stochastic framework:
  Estimate the matrix A: blind identification
  Estimate realizations of the "source" vector s: blind separation/extraction/equalization

In the deterministic framework:
  Build a data tensor Y thanks to a diversity k, and decompose it:

  Y_ijk = Σ_p A_ip s_jp B_kp

  A and s are jointly estimated
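The deterministic construction above is exactly a three-way product of factor matrices, which can be sketched with NumPy (sizes and factors below are arbitrary illustrations, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K, P = 4, 5, 3, 2
A = rng.standard_normal((I, P))   # mixing matrix (mode 1)
S = rng.standard_normal((J, P))   # source realizations (mode 2)
B = rng.standard_normal((K, P))   # third diversity (mode 3)

# Y_ijk = sum_p A_ip S_jp B_kp
Y = np.einsum('ip,jp,kp->ijk', A, S, B)

# the same tensor written as an explicit sum of P rank-1 terms
Y_rank1 = sum(np.multiply.outer(np.outer(A[:, p], S[:, p]), B[:, p])
              for p in range(P))
```

The `einsum` form and the explicit rank-1 sum coincide, which is the point: decomposing Y recovers A and S jointly.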
Application areas

1. Telecommunications (cellular, satellite, military)
2. Radar, sonar
3. Biomedical (echography, electroencephalography, electrocardiography)...
4. Speech, audio
5. Machine learning
6. Data mining
7. Control...
Identifiability & Uniqueness

Uniqueness/identifiability up to inherent ambiguities

Finite number of solutions
Equivalent representations

Let y admit two representations

y = A s and y = B z

where s (resp. z) has independent components, and A (resp. B) has pairwise noncollinear columns.

These representations are equivalent if every column of A is proportional to some column of B, and vice versa.

If all representations of y are equivalent, they are said to be essentially unique (permutation & scale ambiguities only).
Identifiability & uniqueness theorems

Let y be a random vector of the form y = A s, where the s_p are independent and A has pairwise noncollinear columns.

Identifiability theorem: y can be represented as y = A1 s1 + A2 s2, where s1 is non-Gaussian, s2 is Gaussian and independent of s1, and A1 is essentially unique.

Uniqueness theorem: if in addition the columns of A1 are linearly independent, then the distribution of s1 is unique up to scale and location indeterminacies.
Example of uniqueness

Let the s_i be independent with no Gaussian component, and the b_i be independent Gaussian. Then the linear model below is identifiable, but A2 is not essentially unique whereas A1 is:

(s1 + s2 + 2 b1; s1 + 2 b2) = A1 s + A2 (b1; b2) = A1 s + A3 (b1 + b2; b1 − b2)

with

A1 = [1 1; 1 0],  A2 = [2 0; 0 2]  and  A3 = [1 1; 1 −1]

Hence the distribution of s is essentially unique, but (A1, A2) is not equivalent to (A1, A3).
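The two noise parametrizations above can be checked numerically (a sketch with arbitrary sample values for the s_i and b_i, chosen by us):

```python
import numpy as np

A1 = np.array([[1., 1.], [1., 0.]])
A2 = np.array([[2., 0.], [0., 2.]])
A3 = np.array([[1., 1.], [1., -1.]])

s = np.array([0.7, -1.3])   # arbitrary source sample values
b = np.array([0.4, 0.25])   # arbitrary Gaussian noise sample values

# left-hand side of the displayed identity
y = np.array([s[0] + s[1] + 2*b[0], s[0] + 2*b[1]])

y_via_A2 = A1 @ s + A2 @ b                                  # (A1, A2) representation
y_via_A3 = A1 @ s + A3 @ np.array([b[0] + b[1], b[0] - b[1]])  # (A1, A3) representation
```

Both representations reproduce y, yet A2 and A3 are genuinely different matrices: the Gaussian part is not essentially unique.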
Example of non-uniqueness

Let the s_i be independent with no Gaussian component, and the b_i be independent Gaussian. Then the linear model below is identifiable, but the distribution of s is not unique, because a 2 × 4 matrix cannot be full column rank:

(s1 + s3 + s4 + 2 b1; s2 + s3 − s4 + 2 b2) = A (s1; s2; s3 + b1 + b2; s4 + b1 − b2) = A (s1 + 2 b1; s2 + 2 b2; s3; s4)

with

A = [1 0 1 1; 0 1 1 −1]
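Numerically (again with arbitrary sample values of our own choosing), the same A maps two different "source" vectors to the same observation, and its rank deficiency is what allows this:

```python
import numpy as np

A = np.array([[1., 0., 1., 1.],
              [0., 1., 1., -1.]])

s = np.array([0.3, -0.8, 1.1, 0.5])  # arbitrary sample values
b = np.array([0.2, -0.4])

y = np.array([s[0] + s[2] + s[3] + 2*b[0],
              s[1] + s[2] - s[3] + 2*b[1]])

# two distinct source vectors giving the same observation y
z1 = np.array([s[0], s[1], s[2] + b[0] + b[1], s[3] + b[0] - b[1]])
z2 = np.array([s[0] + 2*b[0], s[1] + 2*b[1], s[2], s[3]])

rank_A = np.linalg.matrix_rank(A)   # 2 < 4: A cannot be full column rank
```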
Characteristic functions

First c.f.

Real scalar: Φ_x(t) := E{e^{i t x}} = ∫_u e^{i t u} dF_x(u)

Real multivariate: Φ_x(t) := E{e^{i tᵀx}} = ∫_u e^{i tᵀu} dF_x(u)

Second c.f.

Ψ(t) := log Φ(t)

Properties:
  Always exists in a neighborhood of 0
  Uniquely defined as long as Φ(t) ≠ 0
Characteristic functions (cont'd)

Properties of the 2nd characteristic function (cont'd):

If a c.f. Ψ_x(t) is a polynomial, then its degree is at most 2 and x is Gaussian (Marcinkiewicz, 1938).

Proof: complicated...

If (x, y) are statistically independent, then

Ψ_{x,y}(u, v) = Ψ_x(u) + Ψ_y(v)    (3)

Proof: Ψ_{x,y}(u, v) = log[E{exp i(ux + vy)}] = log[E{exp i(ux)} E{exp i(vy)}].
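The additivity property (3) can be verified exactly for a pair of independent discrete variables (taking x and y uniform on {−1, +1} is our own toy choice, not from the slides), since their first c.f. can be computed by direct enumeration over the finite distribution:

```python
import numpy as np

u, v = 0.3, 0.5               # small enough that Phi stays away from 0
vals = np.array([-1.0, 1.0])  # x, y uniform on {-1, +1}, independent

# first c.f. by direct enumeration (expectation = average over atoms)
phi_x = np.mean(np.exp(1j * u * vals))                          # = cos(u)
phi_y = np.mean(np.exp(1j * v * vals))                          # = cos(v)
phi_xy = np.mean([np.exp(1j * (u*x + v*y)) for x in vals for y in vals])

# second c.f.  Psi = log Phi  (real here, since cos(u), cos(v) > 0)
psi_x = np.log(phi_x.real)
psi_y = np.log(phi_y.real)
psi_xy = np.log(phi_xy.real)
```

For this pair, Ψ_{x,y}(u, v) = log(cos u cos v) = Ψ_x(u) + Ψ_y(v), as (3) asserts.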
Problem posed in terms of characteristic functions

If the s_p are independent and y = A s, we have the core equation:

Ψ_y(u) = Σ_p ψ_p(Σ_q u_q A_qp)    (4)

where ψ_p(w) := log E{exp(i s_p w)}.

Proof: plug y = A s into the definition of Ψ_y and get

Φ_y(u) := E{exp i(uᵀA s)} = E{exp i Σ_{p,q} u_q A_qp s_p}

Since the s_p are independent, Φ_y(u) = Π_p E{exp i(Σ_q u_q A_qp) s_p}.
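Equation (4) can be checked in closed form in the Gaussian case (a sanity check of our own, not in the slides): for independent s_p ~ N(0, 1), ψ_p(w) = −w²/2, so the right-hand side of (4) is −½‖Aᵀu‖², while y = A s is zero-mean Gaussian with covariance AAᵀ, whose second c.f. is −½ uᵀAAᵀu. The two agree identically:

```python
import numpy as np

rng = np.random.default_rng(2)
K, P = 3, 4
A = rng.standard_normal((K, P))
u = rng.standard_normal(K)

def psi_gauss(w):
    # second c.f. of a standard Gaussian source: log E{exp(i s w)} = -w^2 / 2
    return -0.5 * w**2

# right-hand side of (4): sum_p psi_p( sum_q u_q A_qp )
rhs = sum(psi_gauss(A[:, p] @ u) for p in range(P))

# left-hand side: Psi_y(u) for the Gaussian vector y = A s, cov(y) = A A^T
lhs = -0.5 * u @ (A @ A.T) @ u
```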
Function decomposition

Problem: Equation (4) shows that we have to decompose a multivariate function into a sum of univariate ones.

Weierstrass (1885), Stone (1948), Hilbert (1900), Kolmogorov (1957), Sprecher (1965), Hornik (1989), Cybenko (1989)...

➽ But here, Ψ_y and the ψ_p's are characteristic functions.
Problem seen as non-symmetric tensor decomposition

Back to the core equation (4):

Ψ_y(u) = Σ_p ψ_p(Σ_q u_q A_qp),  u ∈ G

Assumption: the functions ψ_p, 1 ≤ p ≤ P, admit finite derivatives up to order d in a neighborhood of the origin containing G.

Taking d = 3 as a working example:

∂³Ψ_y / (∂u_i ∂u_j ∂u_k)(u) = Σ_{p=1}^P A_ip A_jp A_kp ψ_p^(3)(Σ_{q=1}^K u_q A_qp)

which is of the form T_ijkℓ = Σ_p A_ip A_jp A_kp F_ℓp, if L points u_ℓ ∈ G are used.
Problem seen as symmetric tensor decomposition (1/2)

If G contains only the origin, i.e. u_ℓ = 0, then

C_ijk = Σ_p A_ip A_jp A_kp f_p

a multilinear relation relating cumulants.
Problem seen as symmetric tensor decomposition (2/2)

Equivalent writing (still with d = 3 as working example):

C = Σ_{p=1}^P f_p h(p) ⊗ h(p) ⊗ h(p)

where h(p) := pth column of A.

If the vectors h(p) are not pairwise collinear, P is hopefully minimal, i.e. equal to the rank of the tensor C → condition
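A minimal NumPy sketch of this symmetric form (the sizes and factors are arbitrary choices of ours): a sum of terms f_p h(p)⊗h(p)⊗h(p) is invariant under any permutation of its three indices, which is what makes C a symmetric tensor.

```python
import numpy as np

rng = np.random.default_rng(3)
K, P = 4, 3
H = rng.standard_normal((K, P))   # columns h(p) of A
f = rng.standard_normal(P)        # weights f_p

# C_ijk = sum_p f_p h_i(p) h_j(p) h_k(p)
C = np.einsum('p,ip,jp,kp->ijk', f, H, H, H)

# symmetry under index permutations (swap i<->j and i<->k generate all of S_3)
sym_ok = (np.allclose(C, C.transpose(1, 0, 2))
          and np.allclose(C, C.transpose(2, 1, 0)))
```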
Expected rank

For an order-d symmetric tensor of dimension K:

Number of equations: C(K+d−1, d)
Number of unknowns: PK
For what P can one have an exact fit?

"Expected rank":

R(K, d) := ⌈ C(K+d−1, d) / K ⌉    (5)
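Formula (5) is a one-liner (a sketch of ours using Python's `math.comb`); by the Alexander-Hirschowitz result quoted later, for d > 2 it matches the generic symmetric rank except at the four exceptional pairs (d, K) ∈ {(3,5), (4,3), (4,4), (4,5)}, where the generic rank is one more:

```python
from math import comb, ceil

def expected_rank(K, d):
    # R(K, d) = ceil( C(K+d-1, d) / K ): equations over unknowns per rank-1 term
    return ceil(comb(K + d - 1, d) / K)

r33 = expected_rank(3, 3)   # 4
r43 = expected_rank(4, 3)   # 5
r53 = expected_rank(5, 3)   # 7  (exceptional case: generic rank is 8)
r34 = expected_rank(3, 4)   # 5  (exceptional case: generic rank is 6)
r84 = expected_rank(8, 4)   # 42
```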
Clebsch's statement

Alfred Clebsch (1833-1872)

For (d, K) = (4, 3), a generic tensor cannot be written as the sum of 5 rank-1 terms, even though #unknowns = 15 = #equations.
Hirschowitz theorem

From the Alexander-Hirschowitz theorem (1995), one deduces [CGLM08]:

Theorem: for d > 2, the generic rank of a dth-order symmetric tensor of dimension K is always equal to the expected rank

R_s = R(K, d)    (6)

except for (d, K) ∈ {(3,5), (4,3), (4,4), (4,5)}.

➽ Only a finite number of exceptions!
Identifiability

Identifiability in a wider sense: finite number of solutions.

Necessary condition for identifiability: P ≤ R(K, d). Not sufficient, cf. Clebsch (1861), Hirschowitz (1995).

Sufficient condition for almost sure identifiability: P < R(K, d).

Necessary and sufficient condition for almost sure identifiability:

P = C(K+d−1, d) / K = R(K, d) = R_s
Generic rank of symmetric tensors

Symmetric tensors of order d and dimension K:

d \ K    2   3   4   5   6   7   8
  2      2   3   4   5   6   7   8
  3      2   4   5   8  10  12  15
  4      3   6  10  15  21  30  42
Deterministic approaches

Model:

Y_ijk = Σ_p A_ip s_jp B_kp

Y = Σ_{p=1}^P a(p) ⊗ s(p) ⊗ b(p)
Expected rank

For an order-d tensor of dimensions K_ℓ, 1 ≤ ℓ ≤ d:

Number of equations: Π_ℓ K_ℓ
Number of unknowns: Σ_ℓ K_ℓ P − (d−1)P

"Expected rank" again given by the ceil rule:

R(K_1, ..., K_d) := ⌈ Π_ℓ K_ℓ / (Σ_ℓ K_ℓ − d + 1) ⌉    (7)
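The ceil rule (7) is again a one-liner (our sketch). Note it is only the "expected" count: as the order-3 table in these slides shows, the true generic rank can exceed it, e.g. a generic 3 × 3 × 3 tensor has rank 5 while the formula gives 4.

```python
from functools import reduce
from math import ceil
from operator import mul

def expected_rank_general(*dims):
    # R(K_1, ..., K_d) = ceil( prod_l K_l / (sum_l K_l - d + 1) )
    d = len(dims)
    return ceil(reduce(mul, dims) / (sum(dims) - d + 1))

r222 = expected_rank_general(2, 2, 2)   # ceil(8 / 4)  = 2
r233 = expected_rank_general(2, 3, 3)   # ceil(18 / 6) = 3
r444 = expected_rank_general(4, 4, 4)   # ceil(64 / 10) = 7
r333 = expected_rank_general(3, 3, 3)   # ceil(27 / 7) = 4, yet generic rank is 5
```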
Identifiability

Necessary condition for identifiability: P ≤ R(K_1, ..., K_d)

Sufficient condition for almost sure identifiability: P < R(K_1, ..., K_d)

For general tensors, the generic rank R is not yet known everywhere.
Generic rank of order 3 free tensors (1)

           K = 2            K = 3        K = 4
I \ J    2   3   4   5    3   4   5    4   5
  2      2   3   4   4    3   4   5    4   5
  3      3   3   4   5    5   5   5    6   6
  4      4   4   4   5    5   6   6    7   8
  5      4   5   5   5    5   6   8    8   9
  6      4   6   6   6    6   7   8    8  10
  7      4   6   7   7    7   7   9    9  10
  8      4   6   8   8    8   8   9   10  11
  9      4   6   8   9    9   9   9   10  12
 10      4   6   8  10    9  10  10   10  12
 11      4   6   8  10    9  11  11   11  13
 12      4   6   8  10    9  12  12   12  13
Perspectives

Efficient algorithms to compute the finite set of solutions

Detection of weak defectivity: rare cases with infinitely many solutions

Construction of additional diversity: increases the order

Thanks for your attention
Darmois-Skitovich theorem (1953)

Theorem: let the s_i be statistically independent random variables, and consider two linear statistics:

y_1 = Σ_i a_i s_i and y_2 = Σ_i b_i s_i

If y_1 and y_2 are statistically independent, then the random variables s_k for which a_k b_k ≠ 0 are Gaussian.

NB: holds in both R and C.
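A small consistency check (an illustration of our own, not in the slides): take non-Gaussian sources s_i uniform on [−1, 1] and coefficients with a_k b_k ≠ 0. By multilinearity of cumulants, the fourth-order cross-cumulant cum(y1, y1, y2, y2) = Σ_i a_i² b_i² κ4(s_i) is then nonzero, so y1 and y2 cannot be independent, exactly as the theorem predicts (for Gaussian sources κ4 = 0 and this obstruction vanishes).

```python
# 4th cumulant (kurtosis) of a uniform source on [-1, 1]:
# E{s^2} = 1/3, E{s^4} = 1/5, kappa4 = E{s^4} - 3 E{s^2}^2 = -2/15
kappa4_uniform = 1/5 - 3 * (1/3)**2

a = [1.0, 1.0]    # y1 = s1 + s2
b = [1.0, -1.0]   # y2 = s1 - s2   (a_k b_k != 0 for both sources)

# multilinearity: cum(y1, y1, y2, y2) = sum_i a_i^2 b_i^2 kappa4(s_i)
cross_cum = sum(ai**2 * bi**2 * kappa4_uniform for ai, bi in zip(a, b))
```

Here y1 and y2 are uncorrelated yet cross_cum = −4/15 ≠ 0, so they are dependent: independence would have forced the sources to be Gaussian.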
Proof of the Darmois-Skitovich theorem

Assume the [a_k, b_k] are not collinear and the ψ_k are differentiable.

1. Independence gives Σ_{k=1}^P ψ_k(u a_k + v b_k) = Σ_{k=1}^P ψ_k(u a_k) + ψ_k(v b_k).
   Trivial for the terms for which a_k b_k = 0. From now on, restrict the sum to the terms with a_k b_k ≠ 0.

2. Write this at u + α/a_P and v − α/b_P:

   Σ_{k=1}^P ψ_k(u a_k + v b_k + α(a_k/a_P − b_k/b_P)) = f(u) + g(v)

3. Subtract to cancel the Pth term, divide by α, and let α → 0:

   Σ_{k=1}^{P−1} (a_k/a_P − b_k/b_P) ψ_k^(1)(u a_k + v b_k) = f^(1)(u) + g^(1)(v)

   for some univariate functions f^(1)(u) and g^(1)(v).

Conclusion: we have one term less.
4. Repeat the procedure (P − 1) times and get:

   Π_{j=2}^P (a_1/a_j − b_1/b_j) ψ_1^(P−1)(u a_1 + v b_1) = f^(P−1)(u) + g^(P−1)(v)

5. Hence ψ_1^(P−1)(u a_1 + v b_1) is linear, as a sum of two univariate functions (ψ_1^(P) is a constant because a_1 b_1 ≠ 0).

6. Eventually ψ_1 is a polynomial.

7. Lastly, invoke the Marcinkiewicz theorem to conclude that s_1 is Gaussian.

8. The same is true for any ψ_p such that a_p b_p ≠ 0: s_p is Gaussian.

NB: also holds if the ψ_p are not differentiable.
Definition of Cumulants

Moments:

   μ_r := E{x^r} = (−ı)^r ∂^r Φ(t)/∂t^r |_{t=0}

Cumulants:

   C_x^(r) := Cum{x, …, x} (r times) = (−ı)^r ∂^r Ψ(t)/∂t^r |_{t=0}

The relationship between moments and cumulants is obtained by expanding both sides of the identity

   log Φ_x(t) = Ψ_x(t)

in Taylor series about t = 0, where Φ_x is the characteristic function of x and Ψ_x its second characteristic function.

Pierre Comon MMDS - July 2009 32
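As a small numerical sketch (not from the slides), the moment-to-cumulant formulas obtained from this Taylor expansion can be checked on distributions whose cumulants are known in closed form: every cumulant of a Poisson(λ) variable equals λ, and Gaussian cumulants vanish beyond order 2.

```python
# Classical moment-to-cumulant formulas up to order 4, obtained by
# Taylor-expanding log Phi_x(t) = Psi_x(t) around t = 0.

def cumulants_from_moments(m):
    """m = [m1, m2, m3, m4]: raw moments; returns [c1, c2, c3, c4]."""
    m1, m2, m3, m4 = m
    c1 = m1
    c2 = m2 - m1**2
    c3 = m3 - 3*m1*m2 + 2*m1**3
    c4 = m4 - 4*m1*m3 - 3*m2**2 + 12*m1**2*m2 - 6*m1**4
    return [c1, c2, c3, c4]

lam = 2.0
# Exact raw moments of a Poisson(lam) variable (Touchard polynomials)
poisson = [lam,
           lam + lam**2,
           lam + 3*lam**2 + lam**3,
           lam + 7*lam**2 + 6*lam**3 + lam**4]
print(cumulants_from_moments(poisson))               # -> [2.0, 2.0, 2.0, 2.0]

# Standard Gaussian: raw moments [0, 1, 0, 3]; cumulants of order > 2 vanish,
# in line with the Marcinkiewicz argument used in the Darmois proof.
print(cumulants_from_moments([0.0, 1.0, 0.0, 3.0]))  # -> [0.0, 1.0, 0.0, 0.0]
```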
Polynomial interpolation

Alexander-Hirschowitz Theorem (1995). Let L(d, m) be the space of hypersurfaces of degree at most d in m variables. This space is of dimension D(m, d) := C(m+d, d) − 1.

Theorem. Let {p_i} denote K given distinct points in the complex projective space P^m. The dimension of the linear subspace of hypersurfaces of L(d, m) having multiplicity at least 2 at every point p_i is

   D(m, d) − K (m + 1)

except in the following cases:
• d = 2 and 2 ≤ K ≤ m
• d ≥ 3 and (m, d, K) ∈ {(2, 4, 5), (3, 4, 9), (4, 4, 14), (4, 3, 7)}

In other words, there is only a finite number of exceptions.

Pierre Comon MMDS - July 2009 33
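The dimension count in the theorem is easy to reproduce (a sketch; only the naive expected dimension is computed, not the true dimension in the exceptional cases):

```python
from math import comb

def expected_dimension(m, d, K):
    """Naive count behind the Alexander-Hirschowitz theorem:
    D(m, d) = C(m+d, d) - 1, and each double point imposes m+1
    linear conditions, hence expected dimension D(m, d) - K*(m+1)."""
    return comb(m + d, d) - 1 - K * (m + 1)

# Generic case: plane cubics (m=2, d=3) with 2 double points.
print(expected_dimension(2, 3, 2))   # -> 3

# Exceptional case (m, d, K) = (2, 4, 5): plane quartics with 5 double
# points.  The count predicts an empty system (-1), yet the doubled conic
# through the 5 points shows the actual dimension is 0.
print(expected_dimension(2, 4, 5))   # -> -1
```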
References

F. L. HITCHCOCK. "Multiple invariants and generalized rank of a p-way matrix or tensor." J. Math. and Phys., 7(1):39–79, 1927.

L. R. TUCKER. "Some mathematical notes for three-mode factor analysis." Psychometrika, 31:279–311, 1966.

J. B. KRUSKAL. "Three-way arrays: Rank and uniqueness of trilinear decompositions." Linear Algebra and Applications, 18:95–138, 1977.

V. STRASSEN. "Rank and optimal computation of generic tensors." Linear Algebra Appl., 52:645–685, July 1983.

P. COMON and B. MOURRAIN. "Decomposition of quantics in sums of powers of linear forms." Signal Processing, Elsevier, 53(2):93–107, September 1996.

N. D. SIDIROPOULOS and R. BRO. "On the uniqueness of multilinear decomposition of N-way arrays." J. Chemometrics, 14:229–239, 2000.

P. COMON. "Tensor decompositions." In J. G. McWhirter and I. K. Proudler, eds, Mathematics in Signal Processing V, pages 1–24. Clarendon Press, Oxford, UK, 2002.

A. SMILDE, R. BRO, and P. GELADI. Multi-Way Analysis. Wiley, 2004.

P. COMON, G. GOLUB, L.-H. LIM, and B. MOURRAIN. "Symmetric tensors and symmetric tensor rank." SIAM Journal on Matrix Analysis Appl., 30(3):1254–1279, 2008.

P. COMON, J. M. F. ten BERGE, L. De LATHAUWER, and J. CASTAING. "Generic and typical ranks of multi-way arrays." Linear Algebra Appl., 2009. See also http://arxiv.org/abs/0802.2371

P. COMON, X. LUCIANI, and A. de ALMEIDA. "Tensor decompositions, alternating least squares and other tales." J. Chemometrics, special issue, 2009.

A. STEGEMAN and P. COMON. "Subtracting a best rank-1 approximation does not necessarily decrease tensor rank." 2009.

Pierre Comon MMDS - July 2009 34
Acknowledgements
Participants in ANR-06-BLAN-0074 “DECOTES” project:
Signal team, Lab. I3S UMR6070, Univ. Nice and CNRS
Galaad team, INRIA, UR Sophia-Antipolis - Mediterranee.
Lab. LTSI U642, INSERM and Univ. Rennes 1
Thales Communications
Pierre Comon MMDS - July 2009 35