80
JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong, Sungwoo Cho, Gyeong-In Yu, Joo Seong Jeong, Dong-Jin Shin, Byung-Gon Chun 1

JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

  • Upload
    others

  • View
    14

  • Download
    0

Embed Size (px)

Citation preview

Page 1: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

JANUS: Fast and Flexible Deep Learningvia Symbolic Graph Execution of Imperative Programs

Eunji Jeong, Sungwoo Cho, Gyeong-In Yu,Joo Seong Jeong, Dong-Jin Shin, Byung-Gon Chun

1

Page 2: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Deep Learning (DL) Frameworks

ExecuteDefine

Images From:http://www.mdpi.com/https://adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks/Going Deeper with Convolutions, 2014, https://towardsdatascience.com/learn-how-recurrent-neural-networks-work-84e975feaaf7Short-Term Load Forecasting Using EMD-LSTM Neural Networks with a Xgboost Algorithm for Feature Importance Evaluation, Energies 2017https://skymind.ai/wiki/generative-adversarial-network-ganhttps://en.wikipedia.org/wiki/Reinforcement_learninghttps://medium.com/@Petuum/intro-to-dynamic-neural-networks-and-dynet-67694b18cb23

2

Page 3: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic DL Frameworks

✓ Build a Symbolic Graph✓ Execute the Graph

def build_graph(g): x = g.input(float) linear = g.add(g.mul(W, x), b)

build_graph(graph)run_graph(graph, x_data)

Imperative DL Frameworks

✓ Directly Execute the Computations

Two Paradigms

x

Mul

Add

W

b

3

def linear(x): return W * x + blinear(x_data)

imperative

Page 4: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic DL Frameworks

✓ Build a Symbolic Graph✓ Execute the Graph

def build_graph(g): x = g.input(float) linear = g.add(g.mul(W, x), b)

build_graph(graph)run_graph(graph, x_data)

Imperative DL Frameworks

✓ Directly Execute the Computations

Two Paradigms

x

Mul

Add

W

b

4

def linear(x): return W * x + blinear(x_data)

imperative

Page 5: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic DL Frameworks

✓ Build a Symbolic Graph✓ Execute the Graph

def build_graph(g): x = g.input(float) linear = g.add(g.mul(W, x), b)

build_graph(graph)run_graph(graph, x_data)

Imperative DL Frameworks

✓ Directly Execute the Computations

Two Paradigms

x

Mul

Add

W

b

5

def linear(x): return W * x + blinear(x_data)

def linear(x): return W * x + blinear(x_data)

imperative

Page 6: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic DL Frameworks

✓ Build a Symbolic Graph✓ Execute the Graph

def build_graph(g): x = g.input(float) linear = g.add(g.mul(W, x), b)

build_graph(graph)run_graph(graph, x_data)

Imperative DL Frameworks

✓ Directly Execute the Computations

Two Paradigms

x

Mul

Add

W

b

6

def linear(x): return W * x + blinear(x_data)

imperative

Page 7: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative DL Frameworks

✓ Directly Execute the Computations

Symbolic DL Frameworks

✓ Build a Symbolic Graph✓ Execute the Graph

def build_graph(g): x = g.input(float) linear = g.add(g.mul(W, x), b)

build_graph(graph)run_graph(graph, x_data)

Two Paradigms

x

Mul

Add

W

b

7

def linear(x): return W * x + blinear(x_data)

imperative

Page 8: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative DL Frameworks

✓ Directly Execute the Computations

Symbolic DL Frameworks

✓ Build a Symbolic Graph✓ Execute the Graph

def build_graph(g): x = g.input(float) linear = g.add(g.mul(W, x), b)

build_graph(graph)run_graph(graph, x_data)

Two Paradigms

x

Mul

Add

W

b

8

def linear(x): return W * x + blinear(x_data)

imperative

Page 9: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic DL Frameworks

+ Easy to Optimize+ Compiler Optimization+ Parallel Execution of Operations+ Deploy on GPU, Cluster, Mobile, ...

- Decoupled View:Hard to Program & Debug

Imperative DL Frameworks

+ Direct Execution:Easy to Program & Debug

- Hard to Optimize

Pros & Cons

Pros

Cons

9

Performance Programmability

Page 10: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic DL Frameworks

+ Easy to Optimize+ Compiler Optimization+ Parallel Execution of Operations+ Deploy on GPU, Cluster, Mobile,...

- Decoupled View:Hard to Program & Debug

Imperative DL Frameworks

+ Direct Execution:Easy to Program & Debug

- Hard to Optimize

Pros & Cons

Pros

Cons

10

Performance Programmability

Page 11: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic DL Frameworks

+ Easy to Optimize+ Compiler Optimization+ Parallel Execution of Operations+ Deploy on GPU, Cluster, Mobile,...

- Decoupled View:Hard to Program & Debug

Imperative DL Frameworks

+ Direct Execution:Easy to Program & Debug

- Hard to Optimize

Pros & Cons

Pros

Cons

11

Performance Programmability

Page 12: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic DL Frameworks

+ Easy to Optimize+ Compiler Optimization+ Parallel Execution of Operations+ Deploy on GPU, Cluster, Mobile,...

- Decoupled View:Hard to Program & Debug

Imperative DL Frameworks

+ Direct Execution:Easy to Program & Debug

- Hard to Optimize

Pros & Cons

Pros

Cons

12

Performance Programmability

Page 13: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic DL Frameworks

+ Easy to Optimize+ Compiler Optimization+ Parallel Execution of Operations+ Deploy on GPU, Cluster, Mobile, ...

- Decoupled View:Hard to Program & Debug

Imperative DL Frameworks

+ Direct Execution:Easy to Program & Debug

- Hard to Optimize

Pros & Cons

Pros

Cons

13

Performance Programmability

Page 14: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic DL Frameworks

+ Easy to Optimize+ Compiler Optimization+ Parallel Execution of Operations+ Deploy on GPU, Cluster, Mobile, ...

- Decoupled View:Hard to Program & Debug

Imperative DL Frameworks

+ Direct Execution:Easy to Program & Debug

- Hard to Optimize

Pros & Cons

Pros

Cons

14

Performance Programmability

Page 15: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic DL Frameworks

+ Easy to Optimize+ Compiler Optimization+ Parallel Execution of Operations+ Deploy on GPU, Cluster, Mobile,...

- Decoupled View:Hard to Program & Debug

Imperative DL Frameworks

+ Direct Execution:Easy to Program & Debug

- Hard to Optimize

What People Want Is...

Pros

Cons

15

ProgrammabilityPerformance

Page 16: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative DL Program

def foo(x): prod = mul(3, x) return add(prod, 2)

JANUS: Combining the Best of Both Worlds

16

Symbolic DL Graph

x

Mul

Add

3

2

“Easy Programmability” “High Performance”

TransparentConversion

Page 17: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

JANUS: Combining the Best of Both Worlds

17

● 11 models in 5 major neural network categories:○ Convolutional Neural Networks (CNN) LeNet, ResNet-50, Inception-v3○ Recurrent Neural Networks (RNN) LSTM, LM○ Recursive Neural Networks (TreeNN) TreeRNN, TreeLSTM○ Generative Adversarial Networks (GAN) GAN, PIX2PIX○ Deep Reinforcement Learning (DRL) A3C, PPO

● Up to 47.6x speedup compared to imperative DL framework,comparable performance (within 4%) to symbolic DL frameworkwith unmodified imperative DL programs

Page 18: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Outline

● Approach

● Challenges

● JANUS

● Evaluation

18

Page 19: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Challenges in Graph Conversion

19

Imperative DL Program

def foo(x): tmp = mul(3, x) return add(tmp, 2)

TransparentConversion

Symbolic DL Graph

x

Mul

Add

3

2

Page 20: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Challenges in Graph Conversion

20

Imperative Python DL Program

def foo(x): tmp = mul(3, x) return add(tmp, 2)

TransparentConversion

Symbolic DL Graph

x

Mul

Add

3

2

De-facto Standard Languagefor DL Programming

Page 21: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Challenges in Graph Conversion

21

Imperative Python DL Program

def foo(x): tmp = mul(3, x) return add(tmp, 2)

TransparentConversion

Symbolic DL Graph

x

Mul

Add

3

2?

Page 22: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Discrepancy between Python Programs and DL Graphs

22

TransparentConversion?

“Dynamic” “Static”

Imperative Python DL Program

def foo(x): tmp = mul(3, x) return add(tmp, 2)

Symbolic DL Graph

x

Mul

Add

3

2

Page 23: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

TransparentConversion

Discrepancy between Python Programs and DL Graphs

23

?Characteristics● determined at runtime● change at runtime

“Dynamic” “Static”

Imperative Python DL Program

def foo(x): tmp = mul(3, x) return add(tmp, 2)

Symbolic DL Graph

x

Mul

Add

3

2

Page 24: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

TransparentConversion

Discrepancy between Python Programs and DL Graphs

24

Symbolic DL Graph

INT, 10x1x

INT, 10x1Mul

INT, 10x1Add

INT3

INT2

Characteristics● must be given

when building a graph

?Characteristics● determined at runtime● change at runtime

“Dynamic” “Static”

Imperative Python DL Program

def foo(x): tmp = mul(3, x) return add(tmp, 2)

Page 25: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative Python DL Program

def foo(x): tmp = mul(3, x) return add(tmp, 2)

TransparentConversion

Discrepancy between Python Programs and DL Graphs

25

Symbolic DL Graph

INT, 10x1x

INT, 10x1Mul

INT, 10x1ADD

INT3

INT2

Characteristics● must be given

when building a graph

?Characteristics● determined at runtime● change at runtime

“Dynamic” “Static”

DST:NEED INFO

SRC: NO INFO

Page 26: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

Example: Recurrent Neural Network (RNN)

26

Correctness & Performance

of Graph Execution

Dynamic Features of Python

√ Dynamic Control Flow

√ Dynamic Types

√ Impure Functions

?

Page 27: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

RNN Example

27

sawThey dogsseq[0]:

sheWas sick?seq[1]:

Dynamic Control Flow Dynamic Types Impure Function

Page 28: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

RNN Example

28

RNNCell

sawThey dogs

state

out[0]

seq[0]:

sheWas sick?seq[1]:

Dynamic Control Flow Dynamic Types Impure Function

Page 29: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

RNN Example

29

RNNCell

RNNCell

sawThey dogs

state

out[1]

out[0]

seq[0]:

sheWas sick?seq[1]:

Dynamic Control Flow Dynamic Types Impure Function

Page 30: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

RNN Example

30

RNNCell

RNNCell

RNNCell

sawThey dogs

state

out[1]

out[0]

out[2]

seq[0]:

sheWas sick?seq[1]:

Dynamic Control Flow Dynamic Types Impure Function

Page 31: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

RNN Example

31

Dynamic Control Flow Dynamic Types Impure Function

Page 32: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

RNN Example

32

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

state

CellSwitch

Merge

i<N

Next

Dynamic Control Flow Dynamic Types Impure Function

Page 33: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

RNN Example

● Correct

33

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

state

CellSwitch

Merge

i<N

Next

Dynamic Control Flow Dynamic Types Impure Function

Page 34: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

RNN Example

● Correct● Slow

34

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

state

CellSwitch

Merge

i<N

Next

Dynamic Control Flow Dynamic Types Impure Function

Page 35: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

RNN Example

● Correct● Slow

● Fast

35

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

state

CellSwitch

Merge

i<N

Next

Cell

state

Cell

Cell

Dynamic Control Flow Dynamic Types Impure Function

Page 36: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

RNN Example

● Correct● Slow

● Fast● Incorrect● Need Info

36

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

state

CellSwitch

Merge

i<N

Next

Cell

state

Cell

Cell

Dynamic Control Flow Dynamic Types Impure Function

Page 37: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

RNN Example

37

RNNCell

RNNCell

RNNCell

sawThey dogs

state state

out[1]

out[0]

out[2]

seq[0]:

sheWas sick?seq[1]:

Dynamic Control Flow Dynamic Types Impure Function

Page 38: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

RNN Example

38

PlaceHoldertype: intshape: ?

PlaceHoldertype: float

shape: ?...

● Correct● Inefficient

Dynamic Control Flow Dynamic Types Impure Function

Page 39: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

RNN Example

39

PlaceHoldertype: int

shape: (3x128)

● Correct● Inefficient

● Fast● Incorrect● Need Info

...

Dynamic Control Flow Dynamic Types Impure Function

PlaceHoldertype: intshape: ?

PlaceHoldertype: float

shape: ?

Page 40: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

RNN Example

40

RNNCell

RNNCell

RNNCell

sawThey dogs

state state

out[1]

out[0]

out[2]

seq[0]:

sheWas sick?seq[1]:

Dynamic Control Flow Dynamic Types Impure Function

Page 41: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

RNN Example

41

RNNCell

RNNCell

RNNCell

sawThey dogs

state state

out[1]

out[0]

out[2]

seq[0]:

sheWas sick?

state

seq[1]:

RNNCell

RNNCell

RNNCell

out[1]

out[0]

out[2]

state

Dynamic Control Flow Dynamic Types Impure Function

Page 42: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

class RNNModel(object):def __call__(self, sequence):

state = self.stateoutputs = []for item in sequence:

state = rnn_cell(state, item)outputs += [state]

self.state = statereturn compute_loss(outputs)

for sequence in sequences:optimize(lambda: model(sequence))

RNN Example

42

?

Dynamic Control Flow Dynamic Types Impure Function

Page 43: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Challenge:

achieving Correctness & Performance at the same time

Challenge Summary

43

Imperative Python DL Programwith Dynamic Features

Correct & FastSymbolic DL Graph?

Page 44: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Challenge:

achieving Correctness & Performance at the same time

Challenge Summary

44

Imperative Python DL Programwith Dynamic Features

CorrectGraph

FastGraph

Slow

Incorrect

Page 45: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Challenge:

achieving Correctness & Performance at the same time

Challenge Summary

45

Imperative Python DL Programwith Dynamic Features

CorrectGraph

FastGraph

Slow

Incorrect

Page 46: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Outline

● Approach

● Challenges

● JANUS

● Evaluation

46

Page 47: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Solution: Speculative Graph Generation and Execution

● Goal: Correctness & Performance

● [Performance] Speculatively Specialize the Graph○ Make reasonable assumptions based on the execution history (Profiling)○ Run specialized graph (Common Case)

● [Correctness] Validate Assumptions○ Fallback if an assumption is broken (Rare Case)

47

Page 48: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Imperative Executor

Pre-defined DL Operations .Python Interpreter .

Overall Workflow on JANUS

48

Fast Path(Common Case)

Correct Path(Rare Case)

Page 49: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Imperative Executor

Pre-defined DL Operations .Python Interpreter .

Profiler

len:3

49

Overall Workflow on JANUS

Modified Python Interpreterfor Transparent Profiling

Fast Path(Common Case)

Correct Path(Rare Case)

Page 50: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Symbolic DL Graph

Pre-defined DL Operations .Python Interpreter .

Profiler

GraphGenerator

50

Overall Workflow on JANUS

Cell

state

Cell

Cell

Optimize Graph withProfile Information

Standard Compiler Pass● Reaching Definition Analysis● Type inference● Constant Propagation● ...

len:3

Fast Path(Common Case)

Correct Path(Rare Case)

Page 51: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Symbolic DL Graph

Pre-defined DL Operations .Python Interpreter .

Profiler

GraphGenerator

51

Overall Workflow on JANUS

Cell

state

Cell

Cell

len == 3?

Assert

Validate Assumptionfor Correctness

len:3

Fast Path(Common Case)

Correct Path(Rare Case)

Page 52: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Symbolic DL Graph

Pre-defined DL Operations .Python Interpreter .

Profiler

GraphGenerator

52

Graph Cache

Overall Workflow on JANUS

Cell

state

Cell

Cell

len == 3?

Assert

len:3

Fast Path(Common Case)

Correct Path(Rare Case)

Page 53: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic Graph Executor

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Symbolic DL Graph

Pre-defined DL Operations .Python Interpreter .

Profiler

GraphGenerator

53

Graph Cache

Overall Workflow on JANUS

Cell

state

Cell

Cell

len == 3?

Assert

len:3

Fast Path(Common Case)

Correct Path(Rare Case)

Page 54: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Cell

state

Cell

Cell

len == 3?

Assert

Symbolic Graph Executor

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Symbolic DL Graph

Pre-defined DL Operations .Python Interpreter .

Profiler

GraphGenerator

54

Graph Cache

Overall Workflow on JANUS

AssumptionFailure

len:3

Fast Path(Common Case)

Correct Path(Rare Case)

Page 55: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic Graph Executor

Cell

state

Cell

Cell

len == 3?

Assert

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Symbolic DL Graph

Pre-defined DL Operations .Python Interpreter .

Profiler

GraphGenerator

55

Graph Cache

Overall Workflow on JANUS

len:3

Fast Path(Common Case)

Correct Path(Rare Case)

Page 56: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative Executor

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Pre-defined DL Operations .Python Interpreter .

56

Overall Workflow on JANUS

len:3

Fast Path(Common Case)

Correct Path(Rare Case)

Profiler

GraphGenerator

Graph Cache

Page 57: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative Executor

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Pre-defined DL Operations .Python Interpreter .

Profiler

GraphGenerator

57

Graph Cache

Overall Workflow on JANUS

len:?

Fast Path(Common Case)

Correct Path(Rare Case)

Page 58: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative Executor

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Symbolic DL Graph

Pre-defined DL Operations .Python Interpreter .

Profiler

GraphGenerator

58

Graph Cache

Overall Workflow on JANUS

state

CellSwitch

Merge

i<N

Next

len:?

Fast Path(Common Case)

Correct Path(Rare Case)

Page 59: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative Executor Symbolic Graph Executor

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Pre-defined DL Operations .Python Interpreter .

Profiler

GraphGenerator

59

Graph Cache

Overall Workflow on JANUS

Cell

state

Cell

Cell

len == 3?

Assert

Symbolic DL Graphlen:3

Page 60: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Imperative Executor

Imperative DL Program

for item in sequence: state = rnn(state, item) outputs += [state]

Pre-defined DL Operations .Python Interpreter .

Profiler

GraphGenerator

60

Additional System Aspects

Imperative Execution for Full Python Coverage

len:3

See our paper for more details!

Python Coverage Global State Consistency

Page 61: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

SetAttr

0xb84c “data”

value

Symbolic Graph Executor

Python Interpreter .

61

Additional System Aspects

Pre-defined DL Operations. .

“Impure”Symbolic DL Graph

“Impure”Imperative DL Program

def foo(obj): obj.data = value do_sth if pred else pass

Python Coverage Global State Consistency

pred?

Assertdo_sth

Python Heap

Page 62: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

ImpureImperative DL Program

def foo(obj): obj.data = value do_sth if pred else pass

SetAttr

0xb84c “data”

value

pred?

Assertdo_sth

Symbolic Graph Executor

Python Interpreter .

62

Additional System Aspects

Pre-defined DL Operations. .

AssumptionFailure

Unsafe to fallback after heap update

ImpureSymbolic DL Graph

?

Python Coverage Global State Consistency

ModifiedPython Heap

Page 63: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Symbolic Graph Executor

Python Interpreter .

63

Additional System Aspects

Pre-defined DL Operations. .

Python Heap Local Copy

ImpureImperative DL Program

def foo(obj): obj.data = value do_sth if pred else pass

SetAttr

0xb84c “data”

value

pred?

Assertdo_sth

ImpureSymbolic DL Graph

Write-back after validating assumptions

See our paper for more details!

Python Coverage Global State Consistency

Page 64: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Outline

● Approach

● Challenges

● JANUS

● Evaluation

√ Correctness and Performance

√ Breakdown of Performance Improvement

64

Page 65: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Evaluation Setup: Frameworks & Environments

● Frameworks

○ JANUS. Implemented on top of TensorFlow and CPython

○ Symbolic. TensorFlow

○ Imperative. TensorFlow Eager

● Hardware & Software Setup

○ 6 machines connected via Mellanox ConnectX-4 cards w/ 100Gbps InfiniBand

○ Each machine w/ 2x(Intel Xeon E5-2695)+6x(NVIDIA GeForce Titan Xp)

○ Ubuntu 16.04, TensorFlow 1.8.0, CUDA 9.0

○ Horovod 0.12.1, NCCL v2.1, OpenMPI v3.0.065

Executors: TensorFlow (TF) + TF Eager (modified)LoC: 4700 LoC / TF diff 771 LoC / CPython diff 1096 LoC

Page 66: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Evaluation Setup: Applications

11 models in 5 categories using various dynamic characteristics of Python

● Convolutional Neural Networks (CNN) LeNet, ResNet-50, Inception-v3

● Recurrent Neural Networks (RNN) LSTM, LM

● Recursive Neural Networks (TreeNN) TreeRNN, TreeLSTM

● Deep Reinforcement Learning (DRL) A3C, PPO

● Generative Adversarial Networks (GAN) AN, PIX2PIX

66

Page 67: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

ImageNet Test Error with ResNet50

67

ImperativeSymbolic

Faster

Time

Test Error (%)

36 GPUs

Page 68: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

ImageNet Test Error with ResNet50

68

ImperativeSymbolic

Time

Test Error (%)

JANUS

Faster

36 GPUs

Page 69: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

ImageNet Test Error with ResNet50

69

ImperativeSymbolic

Time

Test Error (%)

JANUS

Faster

36 GPUs

3.4x Faster Convergence

Page 70: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

ImageNet Test Error with ResNet50

70

ImperativeSymbolic

Time

Test Error (%)

JANUS

Faster

36 GPUs

3.4x Faster Convergence

Overlapping computationand communication

Page 71: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Model Convergence

71 Faster

RNN TreeNN

DRL GAN

ImperativeSymbolic

6 GPUs CPU

4 GPUs 1 GPU

Page 72: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Model Convergence

72

3.1x

Faster

18.4x

2.6x3.2x

ImperativeSymbolic

JANUSRNN TreeNN

DRL GAN

6 GPUs CPU

4 GPUs 1 GPU

Page 73: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

CNN

GAN

DRL

TreeNN

RNN

Normalized Training Throughput

73

LeNet

ResNet-50

Inception-v3

LSTM

LM

TreeRNN

TreeLSTM

A3C

PPO

AN

PIX2PIX

Single machine w/ Single GPU

Imp.

Symbolic

47.6x over Imperative

96.0% ofSymbolic

Imperative JANUS

Page 74: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

CNN

GAN

DRL

TreeNN

RNN

Normalized Training Throughput

74

LeNet

ResNet-50

Inception-v3

LSTM

LM

TreeRNN

TreeLSTM

A3C

PPO

AN

PIX2PIX

Imp.

SymbolicImperative JANUS

}Hand-optimized GPU ops

dominated execution time;

Working on applying further graph optimizations!

Single machine w/ Single GPU

Page 75: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

CNN

GAN

DRL

TreeNN

RNN

JANUS Speedup over Imperative Execution

75

LeNet

ResNet-50

Inception-v3

LSTM

LM

TreeRNN

TreeLSTM

A3C

PPO

AN

PIX2PIX

Imp.

JANUSSingle machine w/ Single GPU

Page 76: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

CNN

GAN

DRL

TreeNN

RNN

JANUS Speedup over Imperative Execution: Breakdown

76

LeNet

ResNet-50

Inception-v3

LSTM

LM

TreeRNN

TreeLSTM

A3C

PPO

AN

PIX2PIX

Imp.

Single machine w/ Single GPUSpeedup

“without” specializationby runtime profiling

Page 77: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

CNN

GAN

DRL

TreeNN

RNN

JANUS Speedup over Imperative Execution: Breakdown

77

LeNet

ResNet-50

Inception-v3

LSTM

LM

TreeRNN

TreeLSTM

A3C

PPO

AN

PIX2PIX

Imp.

Speedup“with” specializationby runtime profiling

Single machine w/ Single GPU

Page 78: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

CNN

GAN

DRL

TreeNN

RNN

JANUS Speedup over Imperative Execution: Breakdown

78

LeNet

ResNet-50

Inception-v3

LSTM

LM

TreeRNN

TreeLSTM

A3C

PPO

AN

PIX2PIX

Imp.

BypassingPython Heap

Control Flow Unrolling

Type Specialization

Composed of CNNs

Single machine w/ Single GPU

Page 79: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Related Works

79

● Imperative to symbolic: one-shot converters

○ TensorFlow: defun, AutoGraph, Swift for TensorFlow, JAX, ...○ PyTorch JIT trace, script○ MXNet Gluon

● Cannot handle the dynamic semantics of Python correctly & efficiently

Page 80: JANUS: Fast and Flexible Deep Learning via Symbolic Graph … · 2019-03-12 · JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs Eunji Jeong,

Conclusion

● Programmability and debuggability of imperative DL frameworkswith the performance of symbolic DL frameworks

● Speculative graph generation and execution with runtime profiling

● Up to 47.6x speedup over imperative DL framework,within up to 4% difference compared to symbolic DL framework,while transparently and correctly executing imperative DL programs

80