What is Machine Learning

WHAT IS MACHINE

LEARNING

Bhaskara Reddy Sannapureddy, Senior Project Manager @Infosys, +91-7702577769

WHAT IS MACHINE

LEARNING?

Automating automation

Getting computers to program themselves

Writing software is the bottleneck

Let the data do the work instead!

ComputerData

ProgramOutput

Computer

Data

OutputProgram

T R A D I T I O N A L P R O G R A M M I N G V S

M A C H I N E L E A R N I N G

Traditional Programming

Machine Learning

MAGIC?

No, more like gardening

Seeds = Algorithms

Nutrients = Data

Gardener = You

Plants = Programs

SAMPLE APPLICATIONS

Web search Computational biologyFinanceE-commerceSpace explorationRoboticsInformation extractionSocial networksDebugging[Your favorite area]

ML IN A NUTSHELL

Tens of thousands of machine learning algorithms

Hundreds new every year

Every machine learning algorithm has three components:

• Representation

• Evaluation

• Optimization

REPRESENTATION

Decision trees

Sets of rules / Logic programs

Instances

Graphical models (Bayes/Markov nets)

Neural networks

Support vector machines

Model ensembles

Etc.

EVALUATION

AccuracyPrecision and recallSquared errorLikelihoodPosterior probabilityCost / UtilityMarginEntropyK-L divergenceEtc.

OPTIMIZATION

Combinatorial optimization

• E.g.: Greedy search

Convex optimization

• E.g.: Gradient descent

Constrained optimization

• E.g.: Linear programming

TYPES OF LEARNING

Supervised (inductive) learning

• Training data includes desired outputs

Unsupervised learning

• Training data does not include desired outputs

Semi-supervised learning

• Training data includes a few desired outputs

Reinforcement learning

• Rewards from sequence of actions

INDUCTIVE LEARNING

Given examples of a function (X, F(X))

Predict function F(X) for new examples X

• Discrete F(X): Classification

• Continuous F(X): Regression

• F(X) = Probability(X): Probability estimation

Supervised learning

• Decision tree induction

• Rule induction

• Instance-based learning

• Bayesian learning

• Neural networks

• Support vector machines

• Model ensembles

• Learning theory

Unsupervised learning

• Clustering

• Dimensionality reduction

SUPERVISED AND

UNSUPERVISED LEARNING

MACHINE LEARNING

PROBLEMS

ML IN PRACTICE

Understanding domain, prior knowledge, and goals

Data integration, selection, cleaning,

pre-processing, etc.

Learning models

Interpreting results

Consolidating and deploying discovered knowledge

Loop

CLUSTERING STRATEGIES

K-means

• Iteratively re-assign points to the nearest cluster center

Agglomerative clustering

• Start with each point as its own cluster and iteratively merge the closest clusters

Mean-shift clustering

• Estimate modes of pdf

Spectral clustering

• Split the nodes in a graph based on assigned links with similarity weights

As we go down this chart, the clustering strategies have

more tendency to transitively group points even if they are

not nearby in feature space

THE MACHINE LEARNING

FRAMEWORK

Apply a prediction function to a feature representation of the

image to get the desired output:

Slide credit: L. Lazebnik

THE MACHINE LEARNING

FRAMEWORK

y = f(x)

Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)}, estimate the

prediction function f by minimizing the prediction error on the training set

Testing: apply f to a never before seen test example x and output the predicted value y = f(x)

output prediction

function

Image

feature

Slide credit: L. Lazebnik

THANK YOU

Technology

What is Machine Learning