Proximal Methods for Sparse Hierarchical Dictionary Learning

Rodolphe Jenatton, Julien Mairal, Guillaume Obozinski, Francis Bach

Presented by Bo Chen, 2010, 6.11


Page 1: Proximal Methods for Sparse Hierarchical Dictionary Learning

Proximal Methods for Sparse Hierarchical Dictionary Learning

Rodolphe Jenatton, Julien Mairal, Guillaume Obozinski, Francis Bach

Presented by Bo Chen, 2010, 6.11

Page 2: Proximal Methods for Sparse Hierarchical Dictionary Learning

Outline

• 1. Structured Sparsity

• 2. Dictionary Learning

• 3. Sparse Hierarchical Dictionary Learning

• 4. Experimental Results

Page 3: Proximal Methods for Sparse Hierarchical Dictionary Learning

Structured Sparsity

• Lasso (R. Tibshirani, 1996)

• Group Lasso (M. Yuan & Y. Lin, 2006)

• Tree-Guided Group Lasso (Kim & Xing, 2009)
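To make the difference between the three penalties concrete, the sketch below evaluates each one on a small coefficient vector. The function names, the example groups, and the unit weights are assumptions made for illustration; in particular, Kim & Xing use node-specific weights that are not reproduced here.

```python
import numpy as np

def lasso_penalty(beta):
    """l1 norm: every coefficient is penalized on its own."""
    return float(np.sum(np.abs(beta)))

def group_penalty(beta, groups, weights=None):
    """Sum of weighted l2 norms over index groups.

    Disjoint groups give a group-Lasso-style penalty; nested groups read off
    a tree give a tree-structured penalty.
    """
    if weights is None:
        weights = [1.0] * len(groups)  # assumption: unit weights
    return float(sum(w * np.linalg.norm(beta[list(g)])
                     for g, w in zip(groups, weights)))

beta = np.array([3.0, 4.0, 0.0, 0.0])

# Hypothetical groupings over the four coefficients.
flat_groups = [[0, 1], [2, 3]]
tree_groups = [[0], [1], [2], [3], [0, 1], [2, 3], [0, 1, 2, 3]]

l1_value = lasso_penalty(beta)                  # |3| + |4|
group_value = group_penalty(beta, flat_groups)  # ||(3, 4)|| + ||(0, 0)||
tree_value = group_penalty(beta, tree_groups)   # leaves + internal nodes + root
```

With nested groups, a coefficient is counted once for every group that contains it, which is what lets the penalty favor patterns that follow the tree.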

Page 4: Proximal Methods for Sparse Hierarchical Dictionary Learning

Tree-Guided Structure Example

Tree Regularization Definition:

Kim & Xing, 2009

Multi-task:

Page 5: Proximal Methods for Sparse Hierarchical Dictionary Learning

Tree-Guided Structure Penalty

Introduce two parameters:

Rewrite the penalty term when the number of tasks is 2 (K = 2):

Generally:

Kim & Xing, 2009

Page 6: Proximal Methods for Sparse Hierarchical Dictionary Learning

In Detail

Kim & Xing, 2009

Page 7: Proximal Methods for Sparse Hierarchical Dictionary Learning

Some Definitions about Hierarchical Groups
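As a minimal sketch of how hierarchical groups can be built, assume the convention of the Jenatton et al. paper: each dictionary element is a node of a tree, and each group contains one node together with all of its descendants. The `parent` array encoding and the function name are assumptions for illustration.

```python
def descendant_groups(parent):
    """Build one group per node: the node itself plus all its descendants.

    parent[j] is the index of the parent of node j, or -1 for the root.
    Returns a list of sorted index lists, one per node.
    """
    p = len(parent)
    groups = [{j} for j in range(p)]
    for j in range(p):
        a = parent[j]
        while a != -1:        # walk up to the root, adding j to every ancestor's group
            groups[a].add(j)
            a = parent[a]
    return [sorted(g) for g in groups]

# Hypothetical tree: node 0 is the root, nodes 1 and 2 are its children,
# nodes 3 and 4 are children of node 1.
parent = [-1, 0, 0, 1, 1]
groups = descendant_groups(parent)
```

Any two groups built this way are either disjoint or nested, which is the property the tree-structured norm relies on.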

Page 8: Proximal Methods for Sparse Hierarchical Dictionary Learning

Hierarchical Sparsity-Inducing Norms
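For reference, the hierarchical norm of the Jenatton et al. paper can be written as below. The symbols are assumptions about notation, since the slide's own formula is not available: \alpha is a code vector, \mathcal{G} is the tree-structured set of groups, and \omega_g are positive weights.

```latex
\Omega(\alpha) \;=\; \sum_{g \in \mathcal{G}} \omega_g \,\lVert \alpha_{|g} \rVert,
\qquad \lVert \cdot \rVert \in \{\ell_2,\ \ell_\infty\}, \quad \omega_g > 0,
```

where \alpha_{|g} keeps the entries of \alpha indexed by g. Because each group is a node with its descendants, setting a group to zero removes a whole subtree, so a dictionary element can be active only if its ancestors are active too.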

Page 9: Proximal Methods for Sparse Hierarchical Dictionary Learning

Dictionary Learning

When structure information is introduced, dictionary learning and the group Lasso differ as follows:

1. The group Lasso is a regression problem in which each feature has its own physical meaning. The structure information must therefore be meaningful and correct; otherwise the imposed ‘structure’ will hurt the method.

2. In dictionary learning, the dictionary is unknown, so the structure information acts as a guide that helps learn a structured dictionary.
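The structured dictionary learning problem can be stated as follows. This is a sketch in assumed notation: x_i are the n training signals, \alpha_i their codes, \Omega the structured norm, \lambda the regularization parameter, and \mathcal{C} the set of dictionaries whose columns have \ell_2 norm at most one.

```latex
\min_{D \in \mathcal{C},\; A = [\alpha_1, \dots, \alpha_n]}
\;\frac{1}{n} \sum_{i=1}^{n}
\Big[ \tfrac{1}{2} \lVert x_i - D\alpha_i \rVert_2^2 + \lambda\, \Omega(\alpha_i) \Big]
```

The norm constraint on the columns of D prevents the trivial solution in which D grows while the codes shrink.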

Page 10: Proximal Methods for Sparse Hierarchical Dictionary Learning

Optimization

• Proximal Operator for Structure Norm

With the dictionary D fixed, the objective function is:

Transformed to a proximal problem:

Proximal operator with the structure penalty:
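The proximal problem is min over v of 0.5 * ||u - v||^2 + lam * Omega(v). For tree-structured groups, the Jenatton et al. paper shows that it can be solved exactly by composing elementary group-thresholding steps, visiting each group before any group that contains it. The sketch below does this for the l2 case; the function names and the size-based ordering trick are assumptions for illustration, and it relies on the groups being either disjoint or nested.

```python
import numpy as np

def group_soft_threshold(v, idx, t):
    """Shrink the sub-vector v[idx] towards zero by t in l2 norm (in place)."""
    norm = np.linalg.norm(v[idx])
    scale = max(0.0, 1.0 - t / norm) if norm > 0 else 0.0
    v[idx] *= scale

def tree_prox_l2(u, groups, weights, lam):
    """Prox of lam * sum_g weights[g] * ||v_g||_2 for tree-structured groups.

    Assumes any two groups are disjoint or nested. Sorting by size then
    guarantees that every group is processed before the groups containing it.
    """
    v = np.array(u, dtype=float)
    order = sorted(range(len(groups)), key=lambda k: len(groups[k]))
    for k in order:
        group_soft_threshold(v, groups[k], lam * weights[k])
    return v

# Hypothetical two-node tree: node 0 is the root, node 1 its child.
u = np.array([3.0, 4.0])
groups = [[0, 1], [1]]
v = tree_prox_l2(u, groups, weights=[1.0, 1.0], lam=1.0)
```

Each call costs one pass over the groups, so a single proximal step stays cheap even though the groups overlap.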

Page 11: Proximal Methods for Sparse Hierarchical Dictionary Learning

Learning the Dictionary

Updating D (5 times in each iteration):

Updating A:
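A minimal sketch of the dictionary step follows. It assumes a block-coordinate descent over the columns of D in the style of Mairal et al.'s online dictionary learning, which the slide does not confirm. The default of 5 passes mirrors the "5 times in each iteration" above, and the function name is hypothetical. The codes A would be updated separately with a proximal method for the structured norm; only the D step is shown.

```python
import numpy as np

def update_dictionary(X, D, A, n_passes=5, eps=1e-12):
    """Block-coordinate descent on the columns of D with the codes A fixed.

    Decreases 0.5 * ||X - D A||_F^2 subject to ||d_j||_2 <= 1 for every column.
    X is (m, n), D is (m, p), A is (p, n). D is assumed feasible on entry.
    """
    D = D.copy()
    B = X @ A.T   # (m, p) correlations between signals and codes
    C = A @ A.T   # (p, p) Gram matrix of the codes
    for _ in range(n_passes):
        for j in range(D.shape[1]):
            if C[j, j] < eps:
                continue  # atom j is unused by the current codes; keep it as is
            # Exact minimizer over column j, then projection onto the unit ball.
            u = D[:, j] + (B[:, j] - D @ C[:, j]) / C[j, j]
            D[:, j] = u / max(np.linalg.norm(u), 1.0)
    return D
```

Each column update is an exact constrained minimization given the other columns, so the reconstruction error cannot increase from one pass to the next.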

Page 12: Proximal Methods for Sparse Hierarchical Dictionary Learning

Experiments: Natural Image Patches

• Use the dictionary learned on the training set to impute the missing values in the test samples. Each sample is an 8x8 patch.

• Training set: 50000; Testing set: 25000

• Test 21 balanced tree structures of depth 3 and 4. Also set the number of nodes in each layer.
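The imputation step can be sketched as follows: code each patch using only its observed pixels, then read the missing pixels off the reconstruction. The slide does not say how this is implemented, so everything here is an assumption, including the function names and the default `lam`. The sparse coder is a plain l1 ISTA loop used as a stand-in for the tree-structured proximal method.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code_ista(x, D, lam, n_iter=500):
    """Stand-in sparse coder: ISTA on 0.5 * ||x - D a||^2 + lam * ||a||_1."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

def impute_patch(x, observed, D, lam=1e-3):
    """Fill the unobserved entries of a flattened patch x using dictionary D.

    observed is a boolean mask; entries of x where it is False are ignored.
    """
    a = sparse_code_ista(x[observed], D[observed, :], lam)
    x_hat = x.copy()
    x_hat[~observed] = (D @ a)[~observed]
    return x_hat
```

For an 8x8 patch, `x` has 64 entries and `D` has 64 rows. Only the rows of `D` that correspond to observed pixels take part in the coding step.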

Page 13: Proximal Methods for Sparse Hierarchical Dictionary Learning

Learned Hierarchical Dictionary

Page 14: Proximal Methods for Sparse Hierarchical Dictionary Learning

Experiments: Text Documents

Key points:

Page 15: Proximal Methods for Sparse Hierarchical Dictionary Learning

Visualization of NIPS proceedings

Documents: 1714; Words: 8274

Page 16: Proximal Methods for Sparse Hierarchical Dictionary Learning

Postings Classification

Training set: 1000; Testing set: 425; Documents: 1425; Words: 13312

Goal: classify the postings from the two newsgroups, alt.atheism and talk.religion.misc.