The Nonparanormal Skeptic
Han Liu, Fang Han, Ming Yuan, John Lafferty and Larry Wasserman
ICML 2012
Presented by Esther Salazar, Duke University
June 7, 2013
E. Salazar (Reading group) June 7, 2013 1 / 14
Summary
The nonparanormal SKEPTIC method is proposed for estimating high-dimensional undirected graphical models
SKEPTIC: Spearman/Kendall estimates preempt transformations to infer correlations
Nonparametric rank-based correlation coefficients: Spearman’s rho and Kendall’s tau
The authors point out that the nonparanormal graphical model can be a safe replacement for the Gaussian graphical model
Undirected graphical models (UGM)
UGMs provide a powerful framework for exploring interrelationships among a large number of random variables
The joint distribution of a random vector $X = (X_1, \ldots, X_d)$ is associated with a graph $G = (V, E)$, where each vertex $i$ corresponds to $X_i$
The pair $(i, j)$ is not an element of the edge set $E$ if and only if $X_i$ is independent of $X_j$ given $(X_k : k \neq i, j)$
Goal: given $n$ observations of the random vector $X$, estimate the edge set $E$ (i.e., the sparsity pattern of the precision matrix)
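To illustrate the link between the edge set and the precision matrix in the Gaussian case, here is a minimal Python sketch using only the standard library. It assumes a hypothetical three-variable Gaussian chain with correlation $\rho^{|i-j|}$ and $\rho = 0.5$; the helper `invert` and the threshold `1e-9` are illustrative choices, not something taken from the slides. The correlation matrix is dense, yet its inverse has a zero between the first and third variables, so that edge is absent.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment with the identity: [A | I] is reduced to [I | A^{-1}].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Assumed toy model: a chain X1 - X2 - X3 with correlation rho^|i-j|.
rho = 0.5
sigma = [[rho ** abs(i - j) for j in range(3)] for i in range(3)]
omega = invert(sigma)

# Edges are the nonzero off-diagonal entries of the precision matrix (0-based indices).
edges = [
    (i, j)
    for i in range(3)
    for j in range(i + 1, 3)
    if abs(omega[i][j]) > 1e-9
]

print("Sigma[0][2] =", sigma[0][2])  # marginal correlation is nonzero
print("Omega[0][2] =", omega[0][2])  # partial dependence vanishes
print("edges =", edges)
```

In this toy model the first and third variables are marginally correlated but conditionally independent given the second, which is exactly what the missing edge encodes.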
When the dimension $d$ is small → assume that $X$ has a multivariate Gaussian distribution and then test the sparsity pattern of $\Omega = \Sigma^{-1}$ based on the sample covariance matrix $\Sigma_n$
Drawback: $d$ must be strictly smaller than $n$
In the high-dimensional setting ($d > n$), a number of methods have been proposed
  - Meinshausen & Bühlmann (2006): method based on parallel lasso regressions of each $X_i$ on $(X_j : j \neq i)$
  - Friedman et al. (2008): $\Omega$ computed using the glasso algorithm
  - ...
Important issue: the normality assumption is restrictive, and conclusions inferred under it could be misleading
To relax this assumption, the nonparanormal distribution is proposed
The nonparanormal
Let $f = (f_1, \ldots, f_d)$ be a set of monotonic univariate functions and let $\Sigma^0 \in \mathbb{R}^{d \times d}$ be a positive-definite correlation matrix with $\mathrm{diag}(\Sigma^0) = 1$
A $d$-dimensional random variable $X = (X_1, \ldots, X_d)^T$ has a nonparanormal distribution, $X \sim NPN_d(f, \Sigma^0)$, if
$$f(X) := (f_1(X_1), \ldots, f_d(X_d))^T \sim N(0, \Sigma^0)$$
For continuous functions $f$, Liu et al. (2009) show that the NPN family is equivalent to the Gaussian copula family
The authors claim that the NPN family is much richer than the Normal family. Also, the conditional independence graph is still encoded by the sparsity pattern of $\Omega^0 = (\Sigma^0)^{-1}$, i.e.
$$\Omega^0_{jk} = 0 \Leftrightarrow X_j \perp\!\!\!\perp X_k \mid X_{\setminus \{j,k\}}$$
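A rough sketch of what the definition implies in practice: a nonparanormal sample can be drawn by generating Gaussian $Z \sim N(0, \Sigma^0)$ and setting $X_j = f_j^{-1}(Z_j)$. In the Python snippet below, the choices $f_1 = \log$ and $f_2 = $ cube root, the correlation 0.7, the seed, and all function names are assumptions made for illustration. It suggests why rank statistics are a natural fit: strictly increasing marginal transformations change the Pearson correlation but leave Kendall's tau untouched.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau(x, y):
    """Kendall's tau: average sign agreement over all pairs of observations."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += sign(x[i] - x[j]) * sign(y[i] - y[j])
    return 2.0 * total / (n * (n - 1))


def pearson(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)   # fixed seed so the run is reproducible
rho, n = 0.7, 300        # assumed latent correlation and sample size

# Latent Gaussian pair with correlation rho.
z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
z2 = [rho * a + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0) for a in z1]

# Observed nonparanormal pair: X_j = f_j^{-1}(Z_j) with f_1 = log, f_2 = cube root.
x1 = [math.exp(a) for a in z1]
x2 = [b ** 3 for b in z2]

tau_gauss = kendall_tau(z1, z2)
tau_npn = kendall_tau(x1, x2)

print("Pearson, latent Gaussian:", pearson(z1, z2))
print("Pearson, nonparanormal:  ", pearson(x1, x2))
print("Kendall, latent Gaussian:", tau_gauss)
print("Kendall, nonparanormal:  ", tau_npn)
```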
The Normal-score based Nonparanormal Graph Estimator (Liu et al., 2009)
Let $S^{ns} = [S^{ns}_{jk}]$ be the correlation matrix of the normal-score transformed data.
The Nonparanormal SKEPTIC: main idea
The main idea is to exploit Spearman’s rho and Kendall’s tau statistics to directly estimate the unknown correlation matrix $\Sigma^0$, without explicitly calculating the marginal transformation functions $f_j$
Both can be viewed as a form of nonparametric correlation between $X_j$ and $X_k$
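This idea can be sketched in a few lines of Python. The sketch assumes the standard Gaussian-copula identities $\Sigma^0_{jk} = \sin(\frac{\pi}{2} \tau_{jk}) = 2 \sin(\frac{\pi}{6} \rho_{jk})$, applied to the sample versions of Kendall's tau and Spearman's rho. The function names, the simulated data, and the chosen marginal transformations are hypothetical, and ties are assumed absent (reasonable for continuous data).

```python
import math
import random


def sign(v):
    return (v > 0) - (v < 0)


def kendall_tau(x, y):
    """Sample Kendall's tau over all pairs (O(n^2), fine for small n)."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += sign(x[i] - x[j]) * sign(y[i] - y[j])
    return 2.0 * total / (n * (n - 1))


def ranks(values):
    """Ranks 1..n, assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Sample Spearman's rho via the rank-difference formula (no ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def skeptic_matrix(columns, method="kendall"):
    """Rank-based estimate of the latent correlation matrix.

    Assumed identities: sin(pi/2 * tau) for Kendall, 2 * sin(pi/6 * rho) for Spearman.
    """
    d = len(columns)
    s = [[1.0] * d for _ in range(d)]
    for j in range(d):
        for k in range(j + 1, d):
            if method == "kendall":
                value = math.sin(math.pi / 2.0 * kendall_tau(columns[j], columns[k]))
            else:
                value = 2.0 * math.sin(math.pi / 6.0 * spearman_rho(columns[j], columns[k]))
            s[j][k] = s[k][j] = value
    return s


# Simulated nonparanormal data with latent correlation 0.6 (an assumption).
rng = random.Random(1)
true_corr, n = 0.6, 1000
z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
z2 = [true_corr * a + math.sqrt(1.0 - true_corr ** 2) * rng.gauss(0.0, 1.0) for a in z1]
data = [[math.exp(a) for a in z1], [b ** 3 for b in z2]]  # monotone distortions

s_tau = skeptic_matrix(data, method="kendall")
s_rho = skeptic_matrix(data, method="spearman")
print("Kendall-based estimate: ", s_tau[0][1])
print("Spearman-based estimate:", s_rho[0][1])
```

Neither estimate requires knowing or estimating the marginal transformations; both use only the ordering of the observations.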
Population versions of Spearman’s rho and Kendall’s tau are given by
NPN Skeptic with different graph estimators
NPN Skeptic with the graphical Dantzig selector
Main idea: take advantage of the connection between multivariate linear regression and entries of $\Omega$
NPN Skeptic with CLIME (Cai et al., 2011, JASA)
CLIME: constrained $\ell_1$-minimization for inverse matrix estimation
Main idea: the estimated correlation matrix $S$ can also be plugged into the CLIME estimator defined by
NPN Skeptic with the graphical lasso
Main idea: plug the estimated correlation coefficient matrix $S$ into the graphical lasso
Important theoretical property
The authors prove that the NPN Skeptic achieves the optimal parametric rate of convergence for precision matrix estimation
Application: Stock price data from Yahoo! Finance
Data: Daily closing prices for 452 stocks from the S&P 500 (Jan. 1, 2003 to Jan. 1, 2008), which gives 1257 data points
$S_t = (S_{t,1}, \ldots, S_{t,452})$, with $S_{t,j}$ denoting the closing price of stock $j$ on day $t$
They consider the variables $X_{tj} = \log(S_{t,j} / S_{t-1,j})$
Goal: Build graphs over the indices $j$
The 452 stocks are categorized into 10 Global Industry Classification Standard (GICS) sectors. Stocks from the same GICS sector are expected to cluster together.
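The log-return transformation used to build the variables can be sketched as follows. The function name `log_returns` and the three example prices are hypothetical; the real data set is not reproduced here.

```python
import math


def log_returns(prices):
    """Return log(S_t / S_{t-1}) for consecutive closing prices of one stock."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# Made-up closing prices for a single stock over three days.
prices = [100.0, 110.0, 121.0]
returns = log_returns(prices)
print(returns)
```

A series of T prices yields T - 1 log returns, since the first day has no previous price to compare against.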