DynaML: Machine Learning in Scala
REPL & API

Mandar Chandorkar
Multiscale Dynamics, CWI Amsterdam

SPLASH, 2016

Mandar. Chandorkar (CWI) DynaML: Machine Learning in Scala SPLASH, 2016 1 / 20

Outline

1 Open Source Machine Learning
    Introduction
    Why Open Source
    Why Scala

2 DynaML
    DynaML: A Tour
    DynaML: Scala Features


Machine Learning
Or how I learned to stop worrying and enjoy the bomb

A set of methodologies/formalisms to algorithmically generate predictions of system behavior or learn structure from data.

Your favourite buzzword.


Research environment

Conducting research in this field requires one to ...

Experiment wildly

Mix and match

Test

Visualize (if possible)

Optimize performance/scale (if you happen to see large data sets)


Mathematical Environment

We must be able to juggle a number of abstract math objects/ideas

Linear Algebra: Matrices, vectors, associated operations $M \cdot x$, $x^T y$, ...

Probability: Random variables, algebraic operations, sampling

Optimization: Iterative methods

Functions: Gamma, Hermite, ...


Language Features

FP

Functional programming constructs such as immutability, map-reduce, transformations on collections.

Object Oriented

Predictive models fit into ontological hierarchies which can be exploited via inheritance and generics.
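
The two styles above can be sketched together in a few lines of plain Scala. This is only an illustration, not DynaML code: `Regressor`, `ConstantRegressor`, `LineRegressor` and `meanSquaredError` are hypothetical names invented for the example. A small generic model hierarchy covers the object oriented side, and an immutable data set processed with `map` and `sum` covers the functional side.

```scala
// Hypothetical names throughout; this is plain Scala, not DynaML code.

// Object oriented: a small hierarchy of predictive models, generic in the input type.
trait Regressor[In] {
  def predict(x: In): Double
}

// Always predicts the same value, whatever the input.
class ConstantRegressor[In](value: Double) extends Regressor[In] {
  def predict(x: In): Double = value
}

// Predicts slope * x + intercept for scalar inputs.
class LineRegressor(slope: Double, intercept: Double) extends Regressor[Double] {
  def predict(x: Double): Double = slope * x + intercept
}

// Functional: an immutable data set of (input, target) pairs.
val data: List[(Double, Double)] = List((0.0, 1.0), (1.0, 3.0), (2.0, 5.0))

// Map each point to its squared error, then reduce with sum.
def meanSquaredError(model: Regressor[Double], points: List[(Double, Double)]): Double =
  points.map { case (x, y) => math.pow(model.predict(x) - y, 2) }.sum / points.length

// The line y = 2x + 1 fits the data exactly; the constant model does not.
val lineError = meanSquaredError(new LineRegressor(2.0, 1.0), data)
val constError = meanSquaredError(new ConstantRegressor[Double](3.0), data)
```

Because `meanSquaredError` only depends on the `Regressor` trait, any new model added to the hierarchy can be evaluated without changing it.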


Language Features Contd.

Static Types & Type Inference

A life saver when working with a plethora of mathematical objects. Ex: A matrix multiplied by a vector gives a vector, $M \cdot x = y$.

Tail Recursion

Many computational procedures are conveniently expressed as recursions. Examples abound: Hermite functions, Fibonacci, blocked matrix factorization ...
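
Both points can be illustrated with a short sketch in plain Scala. The `Vec` and `Mat` types below are hypothetical and kept deliberately minimal; they are not the types DynaML or Breeze use. The first half shows the compiler tracking that a matrix times a vector is a vector, and the second half shows a tail-recursive Fibonacci checked by the standard `@tailrec` annotation.

```scala
import scala.annotation.tailrec

// Hypothetical minimal types for illustration only.
final case class Vec(values: Vector[Double]) {
  // Inner product x^T y: the result type is a scalar.
  def dot(other: Vec): Double = {
    require(values.length == other.values.length, "dimension mismatch")
    values.zip(other.values).map { case (a, b) => a * b }.sum
  }
}

final case class Mat(rows: Vector[Vec]) {
  // Matrix-vector product: the result type is a vector.
  def *(x: Vec): Vec = Vec(rows.map(_.dot(x)))
}

val m = Mat(Vector(Vec(Vector(1.0, 2.0)), Vec(Vector(3.0, 4.0))))
val x = Vec(Vector(1.0, 1.0))
val y = m * x // inferred as Vec, no annotation needed
// val bad: Double = m * x   // rejected at compile time: a Vec is not a Double

// Tail recursion: @tailrec makes the compiler reject the definition
// if the recursive call is not in tail position.
def fibonacci(n: Int): BigInt = {
  require(n >= 0, "n must be non-negative")
  @tailrec
  def loop(k: Int, a: BigInt, b: BigInt): BigInt =
    if (k == 0) a else loop(k - 1, b, a + b)
  loop(n, BigInt(0), BigInt(1))
}
```

The accumulator pair `(a, b)` carries two consecutive Fibonacci numbers, so the recursion runs in constant stack space.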


Architecture

DynaML

Model Classes: LS-SVM, Gaussian Process, Linear Models, Neural Nets

Pipes: Scalers, Encoders, Workflow Library

Optimization: Gradient Based, Global Optimization

Evaluation Metrics: Classification, Regression

Probability: Random Variables, Probability Models

Linear Algebra: Blocked Matrices, Blocked Linear Solvers, Factorizers


In Workflows

// Scala Generics, Infix notation and apply methods
val p = DataPipe((x: Double) => (x * x).toInt)
val q = DataPipe((x: Int) => x % 2)
val pThenq = p > q
// Now run the pipe on an input
pThenq(7.0)
// res3: Int = 1
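
To make the three features named in the comment concrete, here is a from-scratch sketch of how a composable pipe could be written. It is an assumption-laden stand-in, not DynaML's actual `DataPipe`: the trait `Pipe`, its `run` method and the factory in `object Pipe` are hypothetical names chosen for the example.

```scala
// A hypothetical stand-in for a data pipe, written from scratch for illustration.
trait Pipe[A, B] { self =>
  def run(input: A): B

  // apply method: lets a pipe be called like a function, pipe(input).
  def apply(input: A): B = run(input)

  // Infix composition: (this > next) feeds the output of this pipe into next.
  // Generics ensure the output type B of this pipe matches the input type of next.
  def >[C](next: Pipe[B, C]): Pipe[A, C] = new Pipe[A, C] {
    def run(input: A): C = next.run(self.run(input))
  }
}

object Pipe {
  // Generic factory: A and B are inferred from the function passed in.
  def apply[A, B](f: A => B): Pipe[A, B] = new Pipe[A, B] {
    def run(input: A): B = f(input)
  }
}

// Same example as on the slide, using the sketch above.
val square = Pipe((x: Double) => (x * x).toInt)
val parity = Pipe((x: Int) => x % 2)
val squareThenParity = square > parity
val result = squareThenParity(7.0) // 7.0 * 7.0 = 49.0, 49 % 2 = 1
```

Composing pipes whose types do not line up, for example `parity > square`, is rejected by the compiler, which is the benefit of expressing workflows this way.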


Tail Recursion

def hermite(n: Int, x: Double): Double = {
  def hermiteHelper(k: Int, x: Double, a: Double, b: Double): Double =
    k match {
      case 0 => a
      case 1 => b
      case _ => hermiteHelper(k - 1, x, b, x * b - (k - 1) * a)
    }
  hermiteHelper(n, x, 1, x)
}
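
The code above appears to follow the three-term recurrence of the probabilists' Hermite polynomials; that identification is an inference from the starting values `1` and `x`, not something the slide states. Written out, with a hand trace of the helper for n = 3:

```latex
\begin{align*}
He_0(x) &= 1, \qquad He_1(x) = x, \\
He_{n+1}(x) &= x\,He_n(x) - n\,He_{n-1}(x). \\[6pt]
\text{Trace of } (k, a, b) \text{ for } n = 3:\quad
(3,\ 1,\ x)
  &\to (2,\ x,\ x \cdot x - 2 \cdot 1) = (2,\ x,\ x^2 - 2) \\
  &\to (1,\ x^2 - 2,\ x(x^2 - 2) - 1 \cdot x) = (1,\ x^2 - 2,\ x^3 - 3x), \\
\text{result: } b &= x^3 - 3x = He_3(x).
\end{align*}
```

Note that the helper applies the coefficients in descending order (here 2, then 1), so the intermediate value $x^2 - 2$ is not $He_2(x) = x^2 - 1$, yet the final value in this trace still equals $He_3(x)$.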


Model API

trait Model[T, Q, R] {

  /**
   * The training data
   */
  protected val g: T

  def predict(point: Q): R
}
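
As a sketch of how a concrete model might plug into this trait, the example below implements a one-nearest-neighbour regressor. The class `NearestNeighbour` is hypothetical and written only for illustration; it is not part of DynaML. The trait is repeated so the snippet is self-contained.

```scala
// The trait from the slide, repeated so the sketch is self-contained.
trait Model[T, Q, R] {
  protected val g: T
  def predict(point: Q): R
}

// Hypothetical example: 1-nearest-neighbour regression on scalar inputs.
// T = training pairs (input, target), Q = query input, R = predicted target.
class NearestNeighbour(data: Seq[(Double, Double)])
    extends Model[Seq[(Double, Double)], Double, Double] {

  require(data.nonEmpty, "training data must not be empty")

  protected val g: Seq[(Double, Double)] = data

  // Return the target of the training input closest to the query point.
  def predict(point: Double): Double =
    g.minBy { case (x, _) => math.abs(x - point) }._2
}

val knn = new NearestNeighbour(Seq((0.0, 10.0), (1.0, 20.0), (5.0, 30.0)))
val nearOne = knn.predict(1.4)  // closest training input is 1.0
val nearFive = knn.predict(4.0) // closest training input is 5.0
```

The three type parameters keep the trait independent of any particular data representation: the same interface would cover a model trained on vectors, on time series or on a distributed collection.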


A Glimpse


Ecosystem

DynaML builds on the existing Scala ecosystem for scientific computing and data analysis

Breeze: Linear algebra primitives used extensively in DynaML

Apache Spark: RDD implementations for DynaML models can be achieved by extending top level classes.

Spire: Mathematical abstractions like Field, Group, etc.

Ammonite: Powers the DynaML REPL


Summary

Open Source is the undisputed driver of innovation in ML research

Scala is an extremely well designed language for scientific computing

DynaML borrows from the Scala jungle and makes from it something more ...

Future

Advanced Neural Network architectures

Integration with Jupyter, Zeppelin, etc. notebook interfaces


Resources

User Guide: http://mandar2812.github.io/DynaML/

Github Repo: http://github.com/mandar2812/DynaML/
