Novel Multiscale Representations of Data Sets for...

Novel Multiscale Representations of Data Sets for Interactive Learning

Mauro Maggioni, Eric Monson, R. BradyDept. of Mathematics and Computer Science

Duke University

FODAVA Annual Review 2010, Georgia Tech12/9/2010

Joint work with: G. Chen.Partial support: NSF/DHS, ONR

Friday, December 10, 2010

Ongoing efforts in several directions

• Using diffusion processes on graphs for (inter)active learning.

• Perform multiscale analysis on graphs: construction of graph-adaptivemultiscale analysis, for graph visualization and exploration, and (inter)activelearning.

• Sparse learning w.r.t. multiscale dictionaries on graphs.

• Construct data-adaptive dictionaries for data-modeling and exploration.

Random walks on data & graphs

• One may connect data points to form a graph, with edges weighted

by the similarity of data points.

• One can then construct a random on the data points, which may be

used for a variety of tasks:

– construct local and global embeddings of the data in low dimen-

sions,

– perform learning tasks such as clustering, classification, regres-

sion, etc..

– diffuse information (e.g. labels) on data

– study geometric properties of data

Start with few labelled

points

Predict (diffusion)

Label a ``well-chosen’’ new point

Active LearningWith E. Monson and R.

Brady [C.S.]

Given: full data set (e.g. a body of text documents).

Goal: learn a categorization of the data (e.g. topics of the text documents).Cost for every label we obtain from an expert. Large scale here means: “labelsare very expensive compared to very large amount of data available”.

Find points whose labels maximize the gain in prediction accuracy. Naturalcandidates: points with highly uncertain predictions + well-distributed on thedata (standard idea) (our contribution) Points actually proceed in a multiscalefashion.

20 30 40 50 60 70 800.5

With E. Monson and R. Brady [C.S.]Active Learning

1147 Science News articles, 8 categories

Cost: # of labeled points

Accuracy of predictions

X is N × D, N documents in RD, compute multiscale dictionary Φ (D ×M)on the D words. If f maps documents to their topic, write f = XΦβ + η andfind β by

argminβ ||f −XΦβ||22 + λ||{2−jγβj,k}||1 ,

which is a form of sparse regression. (λ, γ) are determined by cross-validation.

Example: text documentsWith S. Mukherjee

and J. Guinney

X is N × D, N documents in RD, compute multiscale dictionary Φ (D ×M)on the D words. If f maps documents to their topic, write f = XΦβ + η andfind β by

argminβ ||f −XΦβ||22 + λ||{2−jγβj,k}||1 ,

which is a form of sparse regression. (λ, γ) are determined by cross-validation.

Example: text documentsWith S. Mukherjee

and J. Guinney

Source of data: Nakagawa T, Kollmeyer T, Morlan B, Anderson, S, Bergstralh E, et al, (2008) A tissue biomarker panel predicting systemic progression after PSA recurrence post-definitive prostate cancer therapy, Plos One 3:e2318.

X is N ×D, N patients with D genes (here N ∼ 400 and D ∼ 1000).

With S. Mukherjee and J. Guinney

Example: gene microarray data

Added advantage: the multiscale genes we construct are much interpretable than eigengenes, several of them match important pathways, and moreover both small scale and large scale genelets seem relevant.

Geometric Wavelets: Multiscale Data-Adaptive Dictionaries• Many constructions of “general-purpose” dictionaries [Fourier, wavelets,

curvelets, ...], especially for low-dimensional signals (sounds, images,...).

Motivation: pretend we have rather good tractable models (e.g. functionspaces), construct good dictionaries by hand.

Goals: compression, signal processing tasks (e.g. denoising), etc...

• Recently, many constructions of data-adaptive dictionaries [K-SVD, K-planes, ...].

Motivation: we do not have tractable good models, need to adapt to data.

Goals: as before, albeit hopes for more general types of high-dimensionaldata.

• Important role of sparsity in statistics, learning, design of measurements,...: seek dictionaries that yield sparse representations of the data.

Dictionary Learning

xj+"cj+!

Scalingfunctions

Pt approxat scale j

Finerpt approxat scale j+!

Originaldata pointWavelets

Waveletcorrection

Treestructure

Centers!Graphics by E. Monson

With G. Chen, E. Monson, R. Brady

UI for Geometric WaveletsWith E. Monson and R.

Brady [C.S.]

Open problems & future dir.’s

• Geometric wavelets meet interactive learning.• Multiscale analysis on graphs meets interactive learning.• Better visualization of multiscale analysis of graphs [E. Monson, R. Brady]

• Towards a toolbox of highly robust geometric analysis tools for data sets [A. Little, G. Chen].

• Dynamic graphs [J. Lee].• Wrap up toolboxes; scale part of the code.

Collaborators: E. Monson, R. Brady (Duke C.S.); A. V. Little, K. Balachandrian (Math grad, Duke), J. Lee (Math undergrad, Duke); L. Rosasco (CS, MIT and Universita’ di Genova).Funding: NSF, ONR, Sloan Foundation, Duke.

www.math.duke.edu/~mauro

Novel Multiscale Representations of Data Sets for...

Documents

Convex sets and their integral representations

Multiscale Gaussian network model (mGNM) and multiscale

Machine Learning for Generalized Multiscale Modeling · Machine Learning for Generalized Multiscale Modeling Oliver K. Ernst Multiscale Modeling ML for Model Reduction ML for Multiscale

Sparse Dictionary Learning for Edit Propagation of …openaccess.thecvf.com/content_cvpr_2014/papers/Chen...vised framework to learn multiscale sparse representations of color images

Graph Representations, Two-Distance Sets, and Equiangular ...Graph Representations, Two-Distance Sets, and Equiangular Lines A. Neumaier Znstitut fiir Angewandte Mathemutik Universitiit

Representations of real numbers on fractal setsnumeration/uploads/Main/jiang_20201013.pdf · Representations of real numbers on fractal sets Kan Jiang One World Numeration Seminar

Representations of Molecular Propertiesalpha.chem.umb.edu/chemistry/ch612/documents/Group... · 2020-04-09 · Representations of Molecular Properties • We will routinely use sets

Learning Adaptive Multiscale Approximations to Data and ...wliao/Research/papers/IEEE_2016_MultiscaleGeometryAndLearning.pdfapproximation scheme for sets of points in high-dimensions

PART 2 Fuzzy sets vs crisp sets 1. Properties of α-cuts 2. Fuzzy set representations 3. Extension principle FUZZY SETS AND FUZZY LOGIC Theory and Applications

Intelligent Visualization of Multi Dimension Data Sets · Intelligent Visualization of Multi Dimension Data Sets Faculty of Computers and Information ... (interactive) visual representations

Multiscale representations: fractals, self-similar random …wavelets.ens.fr/PUBLICATIONS/ARTICLES/PDF/329.pdf · Fractals can be traced back to the discovery of continuous non diﬀerentiable

Math Foundations of Multiscale Graph Representations and ...fodava.gatech.edu/files/uploaded/KickOffPres/Duke-maggioni.pdfGraph representations may be adapted to user input, e.g. labels

MULTISCALE GEOMETRIC METHODS FOR DATA SETS II: GEOMETRIC MULTI-RESOLUTION …fodava.gatech.edu/sites/default/files/FODAVA-12-21.pdf · 2012-11-28 · MULTISCALE GEOMETRIC METHODS

Odisha Adarsha Vidyalaya Sangathan · 2019-11-11 · Sets and their representations. Empty set. Finite & Infinite sets. Equal sets. Subsets. Subsets of the set of real numbers.

Diffusion Geometries, and multiscale Harmonic Analysis on graphs and complex data sets. Multiscale diffusion geometries, “Ontologies and knowledge building”

benjaminlharris.weebly.combenjaminlharris.weebly.com › uploads › 1 › 0 › 9 › ...sets_of_reductive_l… · WAVE FRONT SETS OF REDUCTIVE LIE GROUP REPRESENTATIONS BENJAMIN

Multiscale Simulations - Georgia Tech Multiscale Systems ...msse.gatech.edu/CANE/lect05_MultiscaleSims_yanwang.pdf · Multiscale Systems Engineering Research Group Classical MD Based

A panorama on multiscale geometric representations - Laurent … · 2015-11-12 · Personal motivations for 2D directional "wavelets" Oﬀset (traces) Time (smpl) 0 50 100 150 200

Video-Based Object Recognition Using Novel Set-of-Sets ... · Video-based Object Recognition using Novel Set-of-Sets Representations ... provides an estimate of uncertainty. [15]

Multiscale Composites