54
Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections DIGITAL TWIN AI and Machine Learning: Unsupervised Learning Prof. Andrew D. Bagdanov andrew.bagdanov AT unifi.it Dipartimento di Ingegneria dell’Informazione Università degli Studi di Firenze 23 January 2020 AI&ML: Unsupervised Learning A. D. Bagdanov

DIGITAL TWIN AI and Machine Learning: … › bagdanov › pdfs › DTwin › 05-Unsupervised.pdfIntroduction Leftovers: Supervised Learning Unsupervised Learning Reflections Overview

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    DIGITAL TWIN AI and Machine Learning:Unsupervised Learning

    Prof. Andrew D. Bagdanovandrew.bagdanov AT unifi.it

    Dipartimento di Ingegneria dell’InformazioneUniversità degli Studi di Firenze

    23 January 2020

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Outline

    Introduction

    Leftovers: Supervised Learning

    Unsupervised Learning

    Reflections

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Introduction

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Overview

    I Today we will begin with a few leftover topics from supervisedlearning.

    I We will see a few models that produce probabilistic estimates ofclass membership (as opposed to just raw scores).

    I And we will look at some methods for evaluating classifiers.I Then I will talk about three types of unsupervised learning

    approaches that are useful for data modeling and visualization.I Finally, in the second part of today’s lecture we will have a

    laboratory on supervised learning.

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Leftovers: Supervised Learning

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    A topological tangentI Some of you may have noticed that I have been conveniently

    ignoring a few crucial issues.

    4 3 2 1 0 1 2 3x

    4

    2

    0

    2

    y

    Label01

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    A topological tangent (continued)I Some of you may have noticed that I have been conveniently

    ignoring a crucial issue.

    4 3 2 1 0 1 2 3x

    4

    2

    0

    2

    y

    Label01

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    A topological tangent (continued)I Some of you may have noticed that I have been conveniently

    ignoring a crucial issue.

    8 6 4 2 0 2 4 6 8x

    8

    6

    4

    2

    0

    2

    4

    6

    8

    y

    Label01

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    Multiclass problemsI SVMs are a great for binary classification problems.I What do we do, however, if we have three (or more) classes?I This is a Multiclass SVM, and there are two main approaches.I One Versus Rest (OVR):

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    Multiclass problems (continued)I OVR classifiers work well if classes are clustered together.I But, they can leave portions of the space classified as multiple or

    no classes.I The other strategy is One Versus One (OVR):

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    Multiclass problems (continued)

    I In sklearn: sklearn.svm.SVCI SVC handles multiclass problems according to the

    decision_function_shape argument.I It defaults to the One Versus Rest (OVR) since it is the easiest to

    interpret.I Advice: Scikit-learn usually has reasonable default and I

    recommend starting from these and to override defaults only afteryou have a better understanding of the problem.

    AI&ML: Unsupervised Learning A. D. Bagdanov

    https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    Discrete probability distributionsTo specify a discrete random variable, we need a sample space and aprobability mass function:I Sample space Ω: Possible states x of the random variable X

    (outcomes of the experiment, output of the system, measurement).I Discrete random variables have a finite number of states.I Events: Possible combinations of states (subsets of Ω)I Probability mass function P (X = x): A function which tells us how

    likely each possible outcome is:

    P (X = x) = PX(x) = P (x)

    P (x) ≥ 0 for each x∑x∈ΩP (x) = 1

    P (A) = P (x ∈ A) =∑x∈AP (X = x)

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    Discrete probability distributions (continued)

    I Conditional probability: Recalculated probability of event A aftersomeone tells you that event B happened:

    P (A|B) =P (A ∩ B)P (B)

    P (A ∩ B) = P (A|B)P (B)

    I Bayes Rule:

    P (B|A) =P (A|B)P (B)P (A)

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    Expectation and covariance of multivariate distributions

    I Conditional distributions are just distributions which have a(conditional) mean or variance.

    I Note: E(X|Y ) = f (Y ) – If I tell you what Y is, what is the averagevalue of X?

    I Covariance is the expected value of the product of fluctuations:

    Cov(X, Y ) = E ((X − E(X))(Y − E(Y )))= E(XY )− E(X)E(Y )

    Var(X) = Cov(X,X)

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    Multivariate distributions: the same, but different

    I Multivariate distributions are the same as bivariate distributions –just with more dimensions.

    I X, x are vector valued.I Mean: E(X) =

    ∑x xP (x)

    I Covariance matrix:

    Cov(Xi , Xj) = E(XiXj)− E(Xi)E(Xj)Cov(X) = E(XX>)− E(X)E(X)>

    I Conditional and marginal distributions: Can define and calculateany (multi or single-dimensional) marginals or conditionaldistributions we need: P (X1), P (X1, X2), P (X1, X2, X3|X4), etc..

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    Mean, variance, and conditioning of continuous RVs

    I Mostly the same as the discrete case, just with sums replaced byintegrals.

    I Mean: E(X) =∫x xp(x)dx

    I Variance: Var(X) = E(X2)− E(X)2

    I Conditioning: If X has pdf p(x), then X|(X ∈ A) has pdf:

    pX|A(x) =p(x)

    P (A)=

    p(x)∫x∈A p(x)dx

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    The univariate Gaussian (normal) distribution

    I The Univariate Gaussian:

    t ∼ N (µ, σ2)

    p(t|µ, σ2) =1√2πσ2

    exp

    (−1

    2

    (t − µσ

    )2)

    I The Gaussian has mean µ and variance σ2 and precision β = 1/σ2

    I What are the mode and the median of the Gaussian?

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    Gaussian distributionsI As they say, a picture is worth a thousand words:

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    The multivariate Gaussian

    f (x;µ,Σ) =1√

    (2π)k |Σ|exp(−

    1

    2(x− µ)TΣ−1(x− µ))

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Discriminant Analysis: Preliminaries

    Why is ANY of this important?I Well, given data the only thing we can do is estimate P (x|C = c)

    and P (x) – the empirical data distribution.I Plus P (C = c) – the class priors.I But then Bayes Rule tells us the posterior:

    P (C = c |x) =P (x|C = c)P (C = c)

    P (x)

    8 6 4 2 0 2 4 6 8x

    8

    6

    4

    2

    0

    2

    4

    6

    8

    y

    Label01

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Bayesian Discriminant Analysis

    The Math

    I Both the Linear Discriminant Classifier (LDC) and QuadraticDiscriminant Classifier (QDA) can be derived fromclass-conditional distribution of the data for each class.

    I More specifically, for Bayesian discriminant analysis theclass-conditional data distributions are modeled as multivariateGaussian distributions:

    f (x|C = c ;µc ,Σc) =1√

    (2π)k |Σc |exp(−

    1

    2(x− µc)TΣ−1c (x− µc))

    I What’s the difference between quadratic and linear?I Linear: assume all Σc are equal.I Quadratic: estimate Σc independently.

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Bayesian Discriminant Analysis

    The bit picture

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Bayesian Discriminant Analysis

    Bayesian Discriminant Analysis: AnalysisI Bayesian discriminants have a number of advantages:

    I The naturally handle multiclass classification problems – justestimate multiple class-dependent data distributions.

    I Parameter estimation is relatively easy: just a per-class empiricalmean and a single (or n) covariance matrices.

    I They are intuitive since they directly output a probability distributionover classes – something SVMs and KNN classifiers do not easily do.

    I They do have a number of disadvantages:I Not all data is Gaussian, and this assumption can lead to poor

    performance.I Can be sensitive to unbalanced problems (due to poor estimates ofΣc and µc).

    I In sklearn:I sklearn.discriminant_analysis.LinearDiscriminantAnalysisI sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis

    AI&ML: Unsupervised Learning A. D. Bagdanov

    https://scikit-learn.org/stable/modules/generated/sklearn.discriminant_analysis.LinearDiscriminantAnalysis.html#sklearn.discriminant_analysis.LinearDiscriminantAnalysishttps://scikit-learn.org/stable/modules/generated/sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis.html#sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Evaluation

    Evaluating Classifiers: Balanced Problems

    I How to we measure the performance of trained classifiers?I If you have a balanced classification problem (like most we will see

    in the labs), use the classifier accuracy:

    accuracy =# correctly classified test samples

    # test samples

    I Why might this be a bad metric if the problem is unbalanced?

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Evaluation

    Evaluating Classifiers: Confusion Matrices

    I A good tool to get an overview of the types of errors a classifier ismaking is the confusion matrix:

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Evaluation

    Evaluating Classifiers: Unbalanced ProblemsI We begin by dissecting the classifications:

    I True positives (TP): we predicted yes, and sample does belong toclass.

    I True negatives (TN): we predicted no, and the sample does notbelong to class.

    I False positives (FP): we predicted yes, but the sample does notbelong to class – also known as a Type I error.

    I False negatives (FN): we predicted no, but sample does belong tothe class – also known as a Type II error.

    I We can then define some useful metrics for unbalanced problems:

    Precision(c) =TP

    TP+ FP

    Recall(c) =TP

    TP+ FN

    F1(c) = 2Precision(c) ∗ Recall(c)Precision(c)+ Recall(c)

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Evaluation

    Evaluating Classifiers: Analysis

    I The evaluation metric to use is usually clear given the type ofproblem (classification, regression, retrieval) and the distribution oftraining data (balanced, unbalanced).

    I Getting a reliable estimate of a classifier can be tricky as it clearlydepends on your train/test split.

    I We will return to this tomorrow when we look at the modelselection problem.

    I In sklearn: sklearn.metrics

    AI&ML: Unsupervised Learning A. D. Bagdanov

    https://scikit-learn.org/0.16/modules/classes.html#classification-metrics

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Unsupervised Learning

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Overview

    Why unsupervised learning?

    I Unsupervised learning is learning something from data in theabsence of labels or target values.

    I It is an active area of current research.I Note that it is often used even in the context of supervised learning

    problems:I For dimensionality reduction: often not all of the input features are

    needed (or even desirable), and unsupervised techniques can be toreduce the input dimensionality of a problem.

    I For visualization: to get a sense of the data, we may want tovisualize it in some meaningful way – this often uses dimensionalityreduction to reduce high dimensional features to just two or three.

    I In this part of the lecture we will see three useful unsupervisedtechniques: Principal Component Analysis (PCA), Clustering, andt-Distributed Stochastic Neighbor Embedding (t-SNE).

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Principal Component Analysis

    PCA: MotivationI Say we have a classification problem with data distributed like this:

    6 4 2 0 2 4 66

    4

    2

    0

    2

    4

    6

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Principal Component Analysis

    PCA: Motivation (continued)I We know how to train a classifier for linearly separable classes:

    6 4 2 0 2 4 66

    4

    2

    0

    2

    4

    6

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Principal Component Analysis

    PCA: Motivation (continued)I But, let’s take a closer look at this situation.I In the original feature space we need both features to define the

    discriminant dividing the two classes.I But, maybe if we could somehow transform this space so that the

    features are decorrelated. . .

    6 4 2 0 2 4 66

    4

    2

    0

    2

    4

    6

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Principal Component Analysis

    PCA: Motivation (continued)I What if we rotate the feature space so that the principal data

    directions are aligned with the axes?

    6 4 2 0 2 4 63

    2

    1

    0

    1

    2

    3

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Principal Component Analysis

    PCA: Motivation (continued)I Well now we have an easier problem to solve, since the

    discriminant function needs only one feature:

    6 4 2 0 2 4 63

    2

    1

    0

    1

    2

    3

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Principal Component Analysis

    PCA: Motivation (continued)I We have turned a two-dimensional problem into a one-dimensional

    proble.

    6 4 2 0 2 4 6

    2

    0

    2

    6 4 2 0 2 4 6 80.0

    0.2

    0.4

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Principal Component Analysis

    PCA: The MathI How do we find these principal directions of the data in feature

    space?I How do we even define what a principal direction is?I Well, one natural way to define it is iteratively:

    1. The first principal component is the direction in space along whichprojections have the largest variance.

    2. The second principal component is the direction which maximizesvariance among all directions orthogonal to the first.

    3. etc.

    I This defines up to D principal components, where D is the originalfeature dimensionality.

    I If we project onto all principal components we are rotating theoriginal features so that in the new axes the features aredecorrelated.

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Principal Component Analysis

    PCA: The Math (continued)I If we project onto the first d < D principal components, we are

    performing dimensionality reduction.I How do we do any of this? We use the eigenvectors of the data

    covariance matrix Σ.I Recall the multivariate Gaussian:

    f (x;µ,Σ) =1√

    (2π)k |Σ|exp(−

    1

    2(x− µ)TΣ−1(x− µ))

    I Σ this defines the hyperellipsoidal shape of the Gaussian.I If Σ is diagonal, the features are decorrelated, so if we diagonalize

    it we are decorrelating the features.I We do this by finding the eigenvectors of (an estimate of) Σ – the

    data covariance matrix.AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Principal Component Analysis

    PCA: Analysis

    I Principal Component Analysis is used for many purposes:I Dimensionality reduction: in high-dimensional features spaces with

    limited data, PCA can help reduce the problem to a simpler one infewer, decorrelated dimensions.

    I Feature decorrelation: some models assume feature decorrelation(or are simpler to solve with decorrelated features).

    I Visualization: looking at data projected down to the first twoprincipal dimensions can tell you a lot about the problem.

    I In sklearn: sklearn.decomposition.PCA

    from sklearn.decomposition import PCAmodel = PCA()model.fit(xs)new_xs = model.transform(xs)

    AI&ML: Unsupervised Learning A. D. Bagdanov

    https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Clustering

    Clustering: Motivation (continued)

    I What if you have data with structure, but you don’t know whatthis structure is or have any a priori model for it?

    1.0 0.5 0.0 0.5 1.0

    1.0

    0.5

    0.0

    0.5

    1.0

    1.0 0.5 0.0 0.5 1.0 1.5 2.00.75

    0.50

    0.25

    0.00

    0.25

    0.50

    0.75

    1.00

    7.5 5.0 2.5 0.0 2.5 5.0 7.5 10.0

    10

    5

    0

    5

    10

    0.0 0.2 0.4 0.6 0.8 1.0

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Clustering

    Clustering: Motivation (continued)

    I Clustering refers to techniques that learn groups of related datapoints in feature space:

    .01s

    MiniBatchKMeans

    1.96s

    AffinityPropagation

    .10s

    MeanShift

    .33s

    SpectralClustering

    .20s

    Ward

    .06s

    AgglomerativeClustering

    .01s

    DBSCAN

    .02s

    Birch

    .02s 2.14s .06s .70s .09s .07s .01s .02s

    .02s 1.99s .06s .13s .10s .08s .01s .02s

    .03s 1.92s .11s .22s .08s .06s .01s .02s

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Clustering

    Clustering: The k-Means Algorithm

    I k-Means is a very simple algorithm that associates each data pointwith one of k means or centroids in the data space:

    I Given an initial set of k means m11, . . . ,m1k , the alternates between

    two steps:1. Assignment: Sti = { xp | ||xp −mti || ≤ ||xp −mtj ||∀j }.2. Update: mt+1i =

    1|Sti|∑xi∈Sti

    xj .

    I The algorithm converges when the cluster assignments no longerchange.

    I Note: this algorithm is not guaranteed to find the global optimumclusters.

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Clustering

    Clustering: Analysis

    I Clustering is a robust technique (in that it never fails).I Its usefulness depends on the data distribution and which type of

    clustering you perform.I In sklearn:

    I cluster.AffinityPropagationI cluster.AgglomerativeClusteringI cluster.BirchI cluster.DBSCANI cluster.KMeansI cluster.MeanShiftI cluster.SpectralClusteringI See sklearn.cluster for more details.

    AI&ML: Unsupervised Learning A. D. Bagdanov

    https://scikit-learn.org/stable/modules/classes.html#module-sklearn.cluster

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Clustering

    Clustering: Motivation (continued)

    I It’s useful to revisit this plot:

    .01s

    MiniBatchKMeans

    1.96s

    AffinityPropagation

    .10s

    MeanShift

    .33s

    SpectralClustering

    .20s

    Ward

    .06s

    AgglomerativeClustering

    .01s

    DBSCAN

    .02s

    Birch

    .02s 2.14s .06s .70s .09s .07s .01s .02s

    .02s 1.99s .06s .13s .10s .08s .01s .02s

    .03s 1.92s .11s .22s .08s .06s .01s .02s

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    t-SNE

    t-SNE: Motivation

    I Sometimes the structure of data in high dimensional data isobscured by its very high-dimensional nature.

    I Questions like this are related to data design: you often need todiscover if the features you have collected are at all related to thequestions you want to ask about it.

    I In this last section we will look at a very powerful algorithm foruncovering latent structure in high-dimensional data.

    I This approach is called the t-Distributed Stochastic NeighborEmbedding (t-SNE) algorith,

    I It works by reducing dimensionality (usually to just two dimensions)in a way that preserves the distances between samples.

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    t-SNE

    t-SNE: Motivation (continued)I Even in just 64 dimensions data can be complex:

    A selection from the 64-dimensional digits dataset

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    t-SNE

    t-SNE: Motivation (continued)I We can try random projection to two dimensions:

    01

    2

    3

    4

    5

    6

    7

    8

    9

    0

    12

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4 5

    6

    7

    89

    0

    9

    5

    5

    6

    5

    09

    8

    98

    4

    1

    77 3

    5

    1

    0

    0

    2

    2

    7

    8

    2

    0

    1

    2

    6

    33

    7 33

    4

    66

    6

    4

    9

    1 5

    0

    9

    5

    2

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    31

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    53

    6

    9

    6

    1

    7

    5

    44

    7

    2

    8

    2

    2

    5

    7

    9

    5

    4

    8

    84

    9

    0

    8

    98

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    12

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    09 8

    9

    8

    4

    1

    7

    7

    3

    5

    1

    00

    2

    2

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    33

    4

    6

    6

    6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    40

    5

    3

    6

    9

    6

    1

    7

    5

    44

    7

    2

    8

    2

    2

    5

    5

    4

    8 84

    9

    0

    8

    9

    80 1

    2

    3

    4

    5

    6

    7

    8

    9

    01

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    50

    9

    8

    9

    8

    4

    1

    7

    7

    3

    5

    1

    0

    0

    2

    2

    7

    8

    20

    1

    2

    6

    3

    3

    7

    3

    3

    4

    6

    6

    6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    40

    53

    6

    9

    6

    1

    7

    5

    4

    4

    7

    28

    2

    2

    5

    7

    9

    5

    4

    8

    8

    49

    0

    8

    9

    3

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    56

    7

    89

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    0

    98 9

    8

    4

    1

    7

    7

    3

    51

    0

    0

    2

    2

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    3 3

    4

    6

    6

    6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0

    0

    17

    6

    3

    21

    7

    4

    6

    3

    139

    1

    7

    6

    8

    4

    3

    14

    05

    3

    6

    9

    6

    1

    75

    4

    4

    7

    2

    8

    2

    2

    5

    7

    95

    4

    8

    8

    4

    9

    0

    8

    9

    8

    0

    12

    3

    45

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    90

    1

    2

    3

    4

    5

    6

    7

    8

    9

    09

    5

    5

    6

    5

    09

    8

    9

    8

    4 1

    7

    7 3

    5

    1

    00

    2

    2

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    33

    4

    66

    6

    4

    9

    1

    5

    0

    9

    5

    2

    82

    0

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    05

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    8

    2

    25

    7

    9

    5

    4

    88

    4

    9

    0

    8

    9

    80

    12

    3

    4 5

    6

    7

    89

    0 1

    2

    3

    45

    6

    7

    8

    90

    1

    2

    3

    4

    5

    6

    7

    8

    9

    09

    5

    5

    6

    5 0

    9 8

    9

    8

    4

    17

    7

    3

    5

    1

    0 0

    2

    27

    8

    2

    0

    1

    2

    6

    3 37

    33

    4

    6

    6

    6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    84

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5

    4

    4

    72

    8

    22

    5

    7

    9

    5

    4

    8

    8

    4

    9

    0

    8

    9

    8

    01

    2

    3

    4

    5

    6

    7

    89

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    456

    7

    89

    0

    9

    5

    56

    50

    9

    8

    9

    8

    4

    1

    7

    7

    3

    510

    0

    2

    2

    7

    8

    2

    01

    2

    6

    3

    3

    7

    3

    3

    4

    6 6 64

    9 1

    5

    0

    9

    5

    2

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9 1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    544

    7

    2

    8

    2

    2

    5

    7

    9

    5

    4

    8

    8

    4

    9

    0

    8

    9

    8

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    56

    7

    8

    90

    9

    5

    5

    6

    5

    0

    9

    8

    9

    8

    4

    1

    77

    3

    5

    1

    2

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    3

    3

    4

    6 66

    4

    9

    15

    0

    9

    5

    28

    2

    0

    0

    1

    7

    6

    3

    2

    1

    4

    6

    3

    1 3

    9

    17

    6

    84

    31

    4

    05

    3

    6

    96

    17 5

    4

    4

    7

    2

    8

    2

    2

    5

    7

    9

    544

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    01

    2

    3

    4 5

    6

    7

    89

    0

    12

    3

    4

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    7

    3

    5

    1

    0

    0

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    3

    3

    4

    6

    66

    49

    1

    5

    0

    9 5

    2

    82

    00

    1

    7

    6

    3

    2

    1

    74

    6

    3

    1

    3 9

    1

    7

    68

    43

    1

    4

    0 5

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    8

    2

    2

    5

    7

    9

    5

    4 8

    8

    4 9

    08

    9

    801

    2

    3

    4

    5

    6

    7

    8

    9

    01

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    56

    7

    8

    9

    0

    95

    5

    6

    50

    9

    8

    98

    4

    1

    7

    7

    3

    5

    1

    0

    0

    2

    2

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    3

    3

    4

    66

    6

    4

    9

    15

    0

    9

    5

    2

    8

    2

    0

    0 1

    7

    6

    3

    2

    17

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    61

    7

    5

    4

    4

    7

    2

    82

    2

    5

    7

    9

    5

    4

    88

    4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    73

    5

    1

    00

    2

    2

    7

    8

    2

    0

    1

    2

    6

    3

    3

    73

    3

    4

    6

    6

    6

    4

    9

    1

    5

    0

    9

    52

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    05

    3

    6

    9

    6

    1

    7

    5

    44

    7

    2

    8

    2

    2

    5

    7

    9

    5

    48

    8

    4

    9

    0

    8

    0

    1

    2

    3

    4

    5

    6

    7 8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    56

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    7

    3

    5

    1

    0 0

    2

    2

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    3

    3

    4

    666

    4

    9

    1

    5

    09

    5

    2

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    4 6

    3

    1

    39

    1

    7

    6

    8

    4

    3

    1

    4 0

    5

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    82

    2

    5

    7

    9

    5

    4

    8

    8

    4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    12

    3

    4

    5

    6

    9

    0 1

    2

    3

    4 5

    6

    7

    8

    9

    0

    95

    5

    65

    0

    9

    8

    9

    8

    4

    17

    7

    3

    5

    1

    0

    02

    2

    7

    8

    2

    0

    1

    2

    6

    3 3

    7

    3

    3

    4

    6

    6

    6

    4

    9

    1

    5

    0

    9

    52

    8

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    05

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    22

    5

    7

    9

    5

    4

    4

    90

    8

    9

    80

    1

    2

    3

    4

    5

    6

    7 8

    90

    12

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    89

    0

    9

    5

    5

    6

    5

    0

    9

    8

    9

    84

    1

    7

    7

    35

    10 0

    2

    2

    7

    8

    2

    0

    1

    26

    3

    37

    3 3

    4

    6

    6

    6

    4

    9

    1

    5

    0

    95

    2

    8

    2

    00

    1 7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5 3

    6

    9

    6

    1

    75

    4

    4

    7

    28

    2

    25

    7

    9

    5

    4

    8

    8

    4

    9

    0

    8

    9

    8

    Random Projection of the digits

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    t-SNE

    t-SNE: Motivation (continued)I Or projection onto first two principal components:

    0

    12

    3

    4

    5

    6

    7

    89

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    7

    3

    51

    00

    2

    2 7

    8

    2 0

    1

    2

    6

    3

    3

    7

    3

    3

    4

    6 6

    6

    4

    9

    1

    5

    0

    9

    5

    2 8

    2

    0

    01

    7

    6

    3

    2

    17

    4

    6

    3

    1

    39

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5

    44

    7

    2

    82

    25

    7

    9

    5

    4

    8

    8

    4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7 8

    9

    01

    2

    3

    4

    5

    6

    7 8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    50

    9

    8

    9

    8

    4

    1

    7

    7

    3

    51

    0

    0

    22

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    33

    4

    6

    6 6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0

    0

    17

    6

    3

    2

    1

    7

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    05

    3

    6

    9

    6 1

    7

    5

    44

    7

    2

    8

    2

    2

    55

    4

    88

    4

    9

    0

    8

    9

    80

    1

    2

    3

    4

    5

    6

    78

    9

    0

    1

    2

    3

    4

    5

    6

    7 8

    9

    0

    1

    2

    3

    4

    5

    6

    78

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    7

    3

    5

    1

    0

    0

    2

    2

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    3

    3

    4

    6 6

    6

    4

    9

    1

    5

    0

    95

    2

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6 1

    7

    5

    44

    7

    2

    8

    225

    7

    9

    5

    4

    88

    4

    9 08

    9

    3

    0 12

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0 1

    2

    3

    4

    5

    6

    78

    9

    0

    9

    5

    5

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    7

    3

    5

    10

    0

    22

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    33

    4

    6

    66

    4

    9

    1

    5

    0

    9

    528

    2

    00

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    8

    2

    2

    57

    9

    5

    4

    8

    8

    4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0 1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    78

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    7

    3

    5

    1

    0

    0

    2

    2 7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    33

    4

    6

    6

    64

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    46

    3

    1

    39

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    8

    2

    2

    5

    7

    9

    5

    4

    88

    4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9 0

    1

    2 3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    09

    8

    9

    8

    4

    1

    7

    7

    3

    5

    1

    0

    0

    2

    2

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    33

    46

    6

    6

    4

    9

    1

    50

    9

    5

    2

    8

    2

    00

    1

    7

    6

    3

    2

    17

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5

    44

    7

    2

    8

    22

    5

    7

    9

    5

    4

    8

    8

    4

    90

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    01

    2

    3

    4

    5

    6

    7

    8

    9

    0 1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    7

    3

    5

    1

    0

    0

    2

    2

    78

    2

    0

    1

    2

    6

    33

    7

    3

    3

    4

    6

    6

    6

    4

    9

    1

    5

    0

    9

    52

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    05

    3

    6

    9

    6

    1

    75

    44

    72

    8

    2

    2

    57

    9

    5

    4

    88

    4

    9

    0

    8

    9

    8

    1

    2

    3

    4

    5

    6

    7

    8

    9

    01

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    1

    77

    3 5

    1

    2

    7

    82

    0

    1

    2

    6

    33

    7

    3

    3

    4 6

    66

    4

    9

    1

    5

    0

    9

    52

    8

    2

    0 0

    1 7

    6

    3

    2

    1

    46

    3

    1

    39

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    8

    2

    25

    7

    9 5

    4

    4

    90

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    89

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    50

    9

    8

    9

    8

    4

    1

    77

    3

    5

    1

    0

    0

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    3 3

    4

    66

    6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    00

    1

    7

    6

    3

    2

    1

    7

    46

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    05

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    82

    2

    5

    7

    9

    5

    4

    8

    8

    4

    90 8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    78

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7 7

    3

    5

    1

    00

    2

    2

    78

    2

    0

    1

    2

    6

    3

    3

    7

    3

    3

    4

    6

    66

    4

    9

    15

    0

    9

    5

    2

    8

    2

    0 0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    8

    2

    2

    5

    7

    9

    5

    4

    8

    8

    4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    01

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    7

    3

    5

    10 0

    2

    2

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    3

    3

    4

    6 66

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0 0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    61

    75

    44

    7

    2

    8

    22

    5

    7

    9

    5

    4

    88

    4

    9

    0

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    01

    2

    3

    4

    5

    6

    7

    8

    9

    01

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    17

    7

    3

    5

    1

    0

    0

    2

    2

    7

    8

    2

    0

    1

    2

    6

    33

    7

    3

    3

    4

    6

    6

    6

    4

    9

    15 0

    9

    5

    2

    8

    2

    0

    0

    1

    7

    6

    3

    21

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    8

    2

    2

    5

    7

    9

    5

    4

    8

    8

    4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    95

    5

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    7

    3

    5

    1

    0

    0

    2

    2

    7

    8

    2

    0

    1

    2

    6

    33

    7 3

    3

    4

    66

    6

    4

    9

    1

    5

    0

    9

    5

    2

    8 0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    2

    5

    7

    9

    5

    4

    4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    1

    77

    3

    5

    1

    0

    0

    2

    2

    7

    82

    0 1

    2

    6

    3

    3

    7

    3

    3

    4

    6

    6

    6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0

    0

    17

    6

    3

    2

    1

    7

    4 6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    8

    22

    5

    7

    9

    5

    4

    8

    8

    4

    9

    0

    8

    9

    8

    Principal Components projection of the digits (time 0.00s)

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    t-SNE

    t-SNE: Motivation (continued)I What we want is something like:

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    34

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    0

    9

    8

    9

    8

    4

    1

    77

    3

    5

    1

    0 0

    22

    7

    8

    2

    0

    1

    2

    6

    33

    7

    3 3 4

    6

    6 6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    00

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    43

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5 44

    7

    2

    8

    225

    7

    9

    5

    4

    8

    8

    4

    9

    0

    8

    9

    8

    0

    1

    2

    34

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3 4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    0

    9

    8

    9

    8

    4

    1

    77

    3

    51

    0

    0

    22

    7

    8

    2

    0

    1

    2

    6

    33

    7

    33 4

    6 66

    4

    9

    15

    0

    9

    5

    2

    8

    2

    00

    1

    7

    6

    3

    2

    1

    7

    3

    1

    3

    9

    1

    7

    6

    8

    43

    1

    4

    0

    5

    3

    6

    9 6

    1

    7

    5

    44

    7

    2

    8

    2 2

    55

    4

    88

    4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    7

    35

    1

    00

    2 2

    7

    8

    2

    0

    1

    2

    6

    33

    7

    33

    4

    666

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    00

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    54

    4

    7

    2

    8

    22

    5

    7

    95

    48 84

    9

    0

    8

    9

    3

    0

    1

    2

    3

    45

    6

    7

    8

    9

    0

    1

    2

    3

    45

    6

    7

    8

    9

    0

    1

    2

    3

    45

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8 4

    1

    77

    3

    5

    1

    00

    22

    7

    8

    2

    0

    12

    6

    33

    7

    33

    4

    666

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0 0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    84

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    544

    7

    2

    8

    2

    25

    7

    9

    5 4

    88

    4

    9

    0

    8

    9

    8

    0

    12

    3

    4

    5

    6

    7

    8

    9

    0

    12

    3

    4

    5

    6

    7

    8

    9

    0

    12

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    0

    9

    8

    9

    8 4

    1

    77

    35

    1

    0

    0

    22

    7

    8

    2

    0

    12

    63

    3

    7

    33

    4

    66

    6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0 0

    1

    7

    63

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    84

    3

    1

    4

    0

    53

    696

    1

    7

    5

    44

    7

    2

    8

    22

    5

    7

    9

    5

    488 4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    0

    9

    8

    9

    8

    4

    1

    77

    3

    5

    1

    0

    0

    22

    7

    8

    2

    0

    1

    2

    6

    3 3

    7

    33

    4

    66 6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    9

    6

    1

    7

    5

    4

    4

    7

    2

    8

    22

    5

    7

    95

    48

    8

    49

    0

    8

    98

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    84

    1

    7

    7

    3

    5

    1

    00

    22

    7

    8

    2

    0

    1

    2

    6

    33

    7

    33

    4

    6

    66

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8 4

    3

    1

    4

    0

    5

    3

    69 6

    1

    7

    5

    44

    7

    2

    8

    2

    2

    5

    7

    9

    5

    48

    8

    4

    9

    0

    8

    9

    8

    1

    2

    3

    45

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    84

    1

    77

    3 5

    1

    2

    7

    8

    2

    0

    1

    2

    6

    33

    7

    3 3

    4

    666

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    0 0

    1

    7

    6

    3

    2

    1

    4

    6

    3

    1

    39

    1

    7

    6

    84

    3

    1

    4

    0

    5

    3

    69

    6

    1

    7

    5

    4

    4

    7

    2

    8

    2

    2

    5

    7

    9

    5

    4 4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    45

    6

    7

    8

    9

    0

    1

    2

    3

    45

    6

    7

    8

    9

    0

    12

    3

    4

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7 7

    3

    5

    1

    00

    7

    8

    2

    0

    12

    6

    3

    3

    7

    33

    4

    6

    66

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    00

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    84

    3

    1

    4

    0

    5

    369

    6

    1

    7

    54

    4

    7

    2

    8

    2

    2

    5

    7

    9

    5

    4

    8 84

    9

    0

    8

    9

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    345

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    5

    5

    6

    5

    0

    9

    8

    9

    8

    4

    1

    77

    3

    5

    1

    00

    22

    7

    8

    2

    0

    1

    2

    6

    3

    3

    7

    3

    34

    6

    66

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    00

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    6

    96

    1

    7

    5

    4

    4

    7

    2

    8

    22

    5

    7

    9

    5

    4

    88

    4

    9

    0

    8

    9

    8

    0

    1

    2

    3

    45

    6

    7

    8

    9

    0

    12

    3 45

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    1

    77

    3

    5

    1

    00

    22

    7

    82

    0

    1

    2

    6

    3

    3

    7

    33

    4

    666

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    00

    1

    7

    6

    3

    21

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    69 6

    1

    7

    5

    44

    7

    2

    8

    22

    5

    7

    9

    5 4

    88

    4

    9

    0

    8

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8 4

    1

    77

    3 5

    1

    00

    22

    7

    8

    2

    0

    1

    2

    6

    33

    7

    33

    4

    66

    6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    00

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    53

    69 6

    1

    7

    5

    4

    4

    7

    2

    8

    22

    5

    7

    9

    5

    4

    884

    9

    0

    8

    9

    8

    0

    12

    3

    4

    5

    6

    7

    89

    0

    1

    2

    3

    4

    5

    6

    9

    0

    12

    3

    45

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    98

    4

    1

    77

    3

    5

    1

    0

    0

    2

    2

    7

    82

    0

    1

    2

    6

    33

    7

    33

    4

    666

    4

    9

    1

    5

    0

    9

    5

    2

    8

    0

    1

    7

    6

    3

    2

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    4

    3

    1

    4

    0

    5

    3

    69

    6

    1

    7

    5

    44

    7

    2

    2

    5

    7

    9

    5

    4

    4

    9

    0

    8

    9

    8

    0

    1

    2

    345

    6

    7

    8

    9

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    0

    1

    2

    34

    5

    6

    7

    8

    9

    0

    9

    55

    6

    5

    0

    9

    8

    9

    8

    4

    1

    7

    7

    3

    5

    1

    00

    22

    7

    8

    2

    0

    1

    2

    6

    33

    7

    3

    3

    4

    66 6

    4

    9

    1

    5

    0

    9

    5

    2

    8

    2

    00

    1

    7

    6

    32

    1

    7

    4

    6

    3

    1

    3

    9

    1

    7

    6

    8

    43

    1

    4

    0

    53

    6

    96

    1

    7

    5 4 4

    7

    2

    8

    2 2

    5

    7

    9

    5

    4

    8 8

    4

    9

    0

    8

    9

    8

    t-SNE embedding of the digits (time 2.63s)

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    t-SNE

    t-SNE: The Math

    I t-distributed Stochastic Neighbor Embedding (t-SNE) is aalgorithm for visualization.

    I It is a nonlinear dimensionality reduction for embeddinghigh-dimensional data in a low-dimensional space of two or threedimensions.

    I It models each high-dimensional point by a low-dimensional pointso that that similar objects map to nearby points and dissimilarobjects are to distant points with high probability.

    I The short version: points that are close in the high-dimensionalspace are close in the mapped space.

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    t-SNE

    T-SNE: Analysis

    I t-SNE is a very powerful tool for visualization and dimensionalityreduction.

    I It can give you visual feedback that indicates if, on average, thehigh-dimensional features represent your problem well (or not).

    I Note that t-SNE visualization can sometimes be misleading in thatthe associations it learns are not always present in the originalspace.

    I In sklearn: sklearn.manifold.TSNE

    AI&ML: Unsupervised Learning A. D. Bagdanov

    https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html#sklearn.manifold.TSNE

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Reflections

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Supervised learning

    I Given data, the world is full of supervised learning problems.I There are robust and mature techniques for addressing many of

    these.I Here we have only seem a few examples of the variety of models

    that exist.I Those we have seen are some of the simplest, and therefore I

    strongly urge you to try them before looking to more complexmodels.

    I Even these simple models, explicitly or implicitly (via kernelmachine) embedded in a high-dimensional space can also scale incomplexity.

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Unsupervised learning

    I Of course, there is far more unlabeled data than labeled data in theworld.

    I Unsupervised learning techniques can help us visualize andunderstand data distributions in feature space.

    I Clustering is useful to uncover latent structure that is often hiddenin high-dimensional data.

    I Principal Component Analysis (PCA) allows us to reducedimensionality in ways that preserve variance in thehigh-dimensional feature space.

    I Tomorrow we will start talking about the bias/variance tradeoffand the problem of overfitting.

    I These unsupervised tools are often helpful for understanding whenmodels are overfitting and often help avoid overfitting outright.

    AI&ML: Unsupervised Learning A. D. Bagdanov

  • Introduction Leftovers: Supervised Learning Unsupervised Learning Reflections

    Supervised Learning Lab

    I The laboratory notebook for today:

    http://bit.ly/DTwin-ML4

    AI&ML: Unsupervised Learning A. D. Bagdanov

    http://bit.ly/DTwin-ML4