DynaML: Machine Learning in Scala
REPL & API
Mandar Chandorkar
Multiscale Dynamics, CWI Amsterdam
SPLASH, 2016
Mandar. Chandorkar (CWI) DynaML: Machine Learning in Scala SPLASH, 2016 1 / 20
Outline
1 Open Source Machine Learning
  Introduction
  Why Open Source
  Why Scala
2 DynaML
  DynaML: A Tour
  DynaML: Scala Features
Machine Learning
Or how I learned to stop worrying and enjoy the bomb
A set of methodologies/formalisms to algorithmically generate predictions of system behavior or learn structure from data.
Your favourite buzzword.
Research environment
Conducting research in this field requires one to ...
Experiment wildly
Mix and match
Test
Visualize (if possible)
Optimize performance/scale (if you happen to see large data sets)
Mathematical Environment
We must be able to juggle a number of abstract math objects/ideas
Linear Algebra: Matrices, vectors, associated operations $M x$, $x^T y$, ...
Probability: Random variables, algebraic operations, sampling
Optimization: Iterative methods
Functions: Gamma, Hermite, ...
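To make the probability bullet concrete, here is a minimal sketch of a random variable that supports sampling and simple algebraic operations. It uses only the Scala standard library. The name `RandomVar` and its methods are assumptions made for illustration and are not taken from DynaML or any other library.

```scala
import scala.util.Random

// Hypothetical minimal random variable: a wrapper around a sampling function
final case class RandomVar[A](sample: () => A) {
  // Transform every draw with f
  def map[B](f: A => B): RandomVar[B] = RandomVar(() => f(sample()))

  // Pair each draw with a draw from another random variable
  def zip[B](that: RandomVar[B]): RandomVar[(A, B)] =
    RandomVar(() => (sample(), that.sample()))

  // Draw n samples
  def draw(n: Int): Vector[A] = Vector.fill(n)(sample())
}

val rng = new Random(42L) // fixed seed so that runs are repeatable
val gaussian = RandomVar(() => rng.nextGaussian())

// Algebraic operations: an affine transform and the sum of two draws
val shifted = gaussian.map(z => 2.0 * z + 1.0)
val sumOfTwo = gaussian.zip(gaussian).map { case (a, b) => a + b }

// Sampling: the empirical mean of `shifted` should be close to 1.0
val draws = shifted.draw(10000)
val mean = draws.sum / draws.length
```

Representing a random variable by its sampler keeps the sketch short; a fuller design would also carry densities or moments.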
Language Features
FP
Functional programming constructs such as immutability, map-reduce, transformations on collections.
Object Oriented
Predictive models fit into ontological hierarchies which can be exploited via inheritance and generics.
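The two points above can be sketched in a few lines of plain Scala. All names here (`Predictor`, `Regressor`, `BinaryClassifier`, `Line`, `SignOf`) are hypothetical and chosen for illustration only; they are not DynaML classes.

```scala
// FP: immutable data and collection transformations (map, then reduce)
val xs: List[Double] = List(1.0, 2.0, 3.0, 4.0)
val sumOfSquares: Double = xs.map(x => x * x).reduce(_ + _)

// OO: a hypothetical model hierarchy built with inheritance and generics
trait Predictor[In, Out] {
  def predict(input: In): Out
}

// Regression models map inputs to real numbers
trait Regressor[In] extends Predictor[In, Double]

// Binary classifiers threshold a real-valued score at zero
trait BinaryClassifier[In] extends Predictor[In, Boolean] {
  def score(input: In): Double
  def predict(input: In): Boolean = score(input) > 0.0
}

class Line(slope: Double, intercept: Double) extends Regressor[Double] {
  def predict(input: Double): Double = slope * input + intercept
}

class SignOf(weight: Double) extends BinaryClassifier[Double] {
  def score(input: Double): Double = weight * input
}

val regressed = new Line(2.0, 1.0).predict(3.0)
val classified = new SignOf(1.0).predict(-2.0)
```

Code written against `Predictor[In, Out]` works with both branches of the hierarchy, which is the kind of reuse the slide alludes to.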
Language Features Contd.
Static Types & Type Inference
A life saver when working with a plethora of mathematical objects. Example: a matrix multiplied by a vector gives a vector, $M x = y$.
Tail Recursion
Many computational procedures are conveniently expressed as recursions. Examples abound: Hermite functions, Fibonacci, blocked matrix factorization, ...
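As a small illustration of the static typing point, the sketch below encodes "matrix times vector gives a vector" in a method signature and lets the compiler infer the result type. `Vec` and `Mat` are hypothetical minimal types written for this example; they are not the Breeze or DynaML API.

```scala
// Hypothetical minimal vector type
final case class Vec(values: Vector[Double]) {
  def dot(that: Vec): Double = {
    require(values.length == that.values.length, "dimension mismatch")
    values.zip(that.values).map { case (a, b) => a * b }.sum
  }
}

// Hypothetical minimal matrix type, stored as a sequence of rows
final case class Mat(rows: Vector[Vec]) {
  // Matrix times vector is a vector: the signature states and enforces this
  def *(x: Vec): Vec = Vec(rows.map(_.dot(x)))
}

val m = Mat(Vector(Vec(Vector(2.0, 0.0)), Vec(Vector(0.0, 3.0))))
val x = Vec(Vector(1.0, 1.0))

// The type of y is inferred as Vec; no annotation is needed
val y = m * x

// This line would be rejected at compile time, because Vec defines no `*` method:
// val bad = x * m
```

Shape mismatches are still caught only at run time here (via `require`); the types rule out confusing one kind of object with another.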
Architecture
DynaML
Model Classes: LS-SVM, Gaussian Process, Linear Models, Neural Nets
Pipes: Scalers, Encoders; Workflow Library
Optimization: Gradient Based, Global Optimization
Evaluation Metrics: Classification, Regression
Probability: Random Variables, Probability Models
Linear Algebra: Blocked Matrices, Blocked Linear Solvers, Factorizers
In Workflows
// Scala generics, infix notation and apply methods
val p = DataPipe((x: Double) => (x*x).toInt)
val q = DataPipe((x: Int) => x%2)
val pThenq = p > q
// Now run the pipe on an input
pThenq(7.0)
// res3: Int = 1
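One way such a pipe could be built from generics, an infix `>` method and `apply` is sketched below. `MiniPipe` is a hypothetical stand-in written for this example; it is not the DynaML source, and the real `DataPipe` may differ in its details.

```scala
// Hypothetical stand-in for a pipe from A to B
trait MiniPipe[A, B] { self =>
  // What the pipe does to one input
  def run(a: A): B

  // `apply` lets a pipe be called like a function: pipe(input)
  def apply(a: A): B = run(a)

  // `>` chains two pipes; the type parameters guarantee that the output
  // type of the first pipe matches the input type of the second
  def >[C](next: MiniPipe[B, C]): MiniPipe[A, C] =
    MiniPipe((a: A) => next.run(self.run(a)))
}

object MiniPipe {
  // Build a pipe from an ordinary function
  def apply[A, B](f: A => B): MiniPipe[A, B] = new MiniPipe[A, B] {
    def run(a: A): B = f(a)
  }
}

val square = MiniPipe((x: Double) => (x * x).toInt)
val parity = MiniPipe((x: Int) => x % 2)

// Infix notation for square.>(parity)
val squareThenParity = square > parity
val result = squareThenParity(7.0)
```

Trying to compose `parity > square` would not compile, since `parity` produces an `Int` and `square` expects a `Double`.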
Tail Recursion
// Hermite polynomial of order n evaluated at x, using a tail-recursive helper
def hermite(n: Int, x: Double): Double = {
  def hermiteHelper(k: Int, x: Double, a: Double, b: Double): Double =
    k match {
      case 0 => a
      case 1 => b
      case _ => hermiteHelper(k-1, x, b, x*b - (k-1)*a)
    }
  hermiteHelper(n, x, 1, x)
}
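The sketch below restates the function with the standard `@tailrec` annotation, which makes the compiler reject the helper if its recursive call is not in tail position. It then compares the result with the closed forms of the first few probabilists' Hermite polynomials. The `require` guard, the helper closing over `x`, and the names `he2`, `he3`, `he4` and `close` are additions made for this example.

```scala
import scala.annotation.tailrec

def hermite(n: Int, x: Double): Double = {
  require(n >= 0, "order must be non-negative")

  // @tailrec turns a non-tail call into a compile error,
  // so stack safety is checked statically
  @tailrec
  def hermiteHelper(k: Int, a: Double, b: Double): Double =
    k match {
      case 0 => a
      case 1 => b
      case _ => hermiteHelper(k - 1, b, x * b - (k - 1) * a)
    }

  hermiteHelper(n, 1.0, x)
}

// Closed forms of the first few polynomials, for comparison
def he2(x: Double): Double = x * x - 1.0
def he3(x: Double): Double = x * x * x - 3.0 * x
def he4(x: Double): Double = math.pow(x, 4) - 6.0 * x * x + 3.0

// Compare floating point values with a small tolerance
def close(a: Double, b: Double): Boolean = math.abs(a - b) < 1e-9

val x0 = 1.5
val allMatch =
  close(hermite(2, x0), he2(x0)) &&
  close(hermite(3, x0), he3(x0)) &&
  close(hermite(4, x0), he4(x0))
```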
Model API
trait Model[T, Q, R] {

  /**
   * The training data
   */
  protected val g: T

  def predict(point: Q): R
}
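To show how the three type parameters might be filled in, here is a minimal sketch of a concrete model: ordinary least squares on one-dimensional data. The class `SimpleLinearModel` is hypothetical and written only for this example; the trait is restated so that the block is self-contained.

```scala
// The trait from the slide, restated so the example is self-contained
trait Model[T, Q, R] {
  /** The training data */
  protected val g: T

  def predict(point: Q): R
}

// Hypothetical model: least squares fit of y = slope * x + intercept.
// T = training pairs, Q = a query input, R = the predicted output.
class SimpleLinearModel(data: Seq[(Double, Double)])
    extends Model[Seq[(Double, Double)], Double, Double] {

  require(data.size >= 2, "need at least two training points")

  protected val g: Seq[(Double, Double)] = data

  private val n = g.size.toDouble
  private val meanX = g.map(_._1).sum / n
  private val meanY = g.map(_._2).sum / n

  // slope = cov(x, y) / var(x); assumes the x values are not all equal
  private val slope =
    g.map { case (x, y) => (x - meanX) * (y - meanY) }.sum /
      g.map { case (x, _) => (x - meanX) * (x - meanX) }.sum

  private val intercept = meanY - slope * meanX

  def predict(point: Double): Double = slope * point + intercept
}

// Training points that lie on y = 2x + 1
val model = new SimpleLinearModel(Seq((0.0, 1.0), (1.0, 3.0), (2.0, 5.0)))
val prediction = model.predict(3.0)
```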
Ecosystem
DynaML builds on the existing Scala ecosystem for scientific computing and data analysis
Breeze: Linear algebra primitives used extensively in DynaML
Apache Spark: RDD implementations for DynaML models can be achieved by extending top level classes.
Spire: Mathematical abstractions like Field, Group, etc.
Ammonite: Powers the DynaML REPL
Summary
Open Source is the undisputed driver of innovation in ML research
Scala is an extremely well designed language for scientific computing
DynaML borrows from the Scala jungle and makes from it something more ...
Future
Advanced Neural Network architectures
Integration with Jupyter, Zeppelin, etc. notebook interfaces