
What is Intel Nervana Graph?


Page 1: What is Intel Nervana Graph?

What is Intel Nervana Graph?

@Vengineer

2017/05/22 (updated 2017/07/01 and 08/12)

As usual, I dug around in the source code

Page 2: What is Intel Nervana Graph?

Blog: Vengineerの戯言 http://blogs.yahoo.co.jp/verification_engineer

Twitter: @Vengineer

FPGA Magazine (No. 16/17), "An Invitation to the FPGA Community" http://fpga.cqpub.co.jp/

About me

SlideShare https://www.slideshare.net/ssuser479fa3

Page 3: What is Intel Nervana Graph?

This material is a summary of each company's publicly available information, gathered via Google searches.

Use it at your own risk.

Page 4: What is Intel Nervana Graph?

On August 9, 2016, Intel acquired Nervana Systems for more than $350 million.

Nervana was a two-year-old startup that had raised close to $25 million from investors, which means those investors exited at roughly 10x in two years.

Three hundred million dollars in two years.

SoftBank Group's acquisition of ARM was £24 billion, so this deal is roughly 1/100 of that.

Source: http://jp.techcrunch.com/2016/08/10/20160809intel-buys-deep-learning-startup-nervana-systems-for-a-reported-350-million/

Page 5: What is Intel Nervana Graph?

Nervana Graph Compiler

Source: https://www.nervanasys.com/intel-nervana-graph-preview-release/

・Frontends: neon / TensorFlow / Caffe / Caffe2 / CNTK / MXNet

・Nervana Graph

・Transformers: CPU / GPU (CUDA)

Lowering

Page 6: What is Intel Nervana Graph?

TensorFlow graph

converted to an XLA graph

Code generation

JIT or AOT

Uses LLVM

Lowering

TensorFlow XLA

CPU / GPU (CUDA)

Page 7: What is Intel Nervana Graph?

The Nervana Graph Compiler and

TensorFlow XLA

look pretty much the same, don't they?

Page 8: What is Intel Nervana Graph?

And then this came out:

https://www.intelnervana.com/intel-nervana-graph-and-neon-3-0-updates/

The connection between the XLA and Intel Nervana Graph APIs was quite straightforward given the similar projects’ intent for a compact and explicit intermediate representation.

While today the XLA/Intel Nervana Graph integration is at a pre-alpha level, we’d love for people to take it for a spin and kick the tires. We’re working on ironing out known performance issues and improving op and backend support.

Intel Nervana Graph Beta: 2017/6/22

Page 9: What is Intel Nervana Graph?

Intel neon

Page 10: What is Intel Nervana Graph?

neon https://github.com/NervanaSystems/neon

The latest version is v1.9. It has the same name as ARM's NEON, but there is no relation.

neon is Intel Nervana's reference deep learning framework committed to best performance on all hardware

Page 11: What is Intel Nervana Graph?

Datasets

Images: MNIST, CIFAR-10, ImageNet 1K, PASCAL VOC, Mini-Places2

Text: IMDB, Penn Treebank, Shakespeare Text, bAbI, Hutter-prize

Video: UCF101
Others: flickr8k, flickr30k, COCO

Page 12: What is Intel Nervana Graph?

neon vs cuDNN 4

“Not so fast, FFT”: Winograd (March 3, 2016)

Source: https://www.nervanasys.com/winograd/

Page 13: What is Intel Nervana Graph?

cuDNN 5

Optimizing Recurrent Neural Networks in cuDNN 5 (April 6, 2016)
https://devblogs.nvidia.com/parallelforall/optimizing-recurrent-neural-networks-cudnn-5/

Faster forward and backward convolutions

using the Winograd convolution algorithm;

Page 14: What is Intel Nervana Graph?

Speeding things up with Winograd!

Fast Algorithms for Convolutional Neural Networks
Andrew Lavin, Scott Gray
https://arxiv.org/abs/1509.09308

Going beyond full utilization: The inside scoop on Nervana’s Winograd kernels (June 29, 2016)
https://www.nervanasys.com/winograd-2/

Page 15: What is Intel Nervana Graph?

neon v1.3 vs cuDNN v5.1

Still not slowing down: Benchmarking optimized Winograd implementations (July 25, 2016)

Source: https://www.nervanasys.com/winograd-3/

vs cuDNN v4 / vs cuDNN v5.1

Page 16: What is Intel Nervana Graph?

Scott Gray

https://twitter.com/scottgray76

High-Performance GPU kernels for deep learning
• Fast matrix multiply for small minibatches
• Direct convolution leveraging GEMM advances
• Even faster convolution with Winograd

Nervana (October 2014 – July 2017); now at OpenAI (as of July 2017)

Source: http://on-demand.gputechconf.com/gtc/2016/presentation/s6485-scott-gray-gpu-programming-deep-learning.pdf

Page 17: What is Intel Nervana Graph?

Intel Nervana Graph Compiler

Page 18: What is Intel Nervana Graph?

Nervana Graph Compiler

Source: https://www.nervanasys.com/intel-nervana-graph-preview-release/

・Frontends: neon / TensorFlow / Caffe / Caffe2 / CNTK / MXNet

・Nervana Graph

・Transformers: CPU / GPU (CUDA)

Lowering

Page 19: What is Intel Nervana Graph?

Where the Graph Compiler fits

Source: http://pc.watch.impress.co.jp/docs/news/1034408.html

Page 20: What is Intel Nervana Graph?

MKL-DNN support

Mar 23, 2017: after the acquisition by Intel

"To install with Intel MKL-DNN support, first download MKL-DNN from [here](https://github.com/01org/mkl-dnn) and follow the installation instructions there to install MKL-DNN. Set environment variable MKLDNN_ROOT to point to the installed location and follow the rest of the steps to install Ngraph."

Source: https://github.com/NervanaSystems/ngraph/commit/f3b7306214f40b4c1b4c40e3e223080797afb382

Page 21: What is Intel Nervana Graph?

Transformer API

・Supports CPU and GPU
  Memory usage optimization passes
  Transformers allow users to register an included set of optional compiler passes for debug and visualization (see the sketch below).

・GPU: automatic kernel fusion/compounding for increased performance

・A mechanism much like LLVM passes

Source: https://github.com/NervanaSystems/ngraph/blob/master/README.md
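
The README does not show the registration call itself, so the following is only a sketch of what adding one of the bundled passes might look like. It is based on the graph_passes list that the CPUTransformer code later in this deck builds up; treating that attribute as the registration point, and the SimplePrune import path, are assumptions rather than a documented public API.

    import ngraph.transformers as ngt
    from ngraph.transformers.passes.passes import SimplePrune  # pass listed on the "Pass implementations" slide

    transformer = ngt.make_transformer()
    # Assumption: passes scheduled for the next transformation live in
    # transformer.graph_passes, as in cputransform.py shown later in this deck.
    transformer.graph_passes.append(SimplePrune())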

Page 22: What is Intel Nervana Graph?

Building graphs

・Nervana Graph structure
  Data Dependencies
  Initializers
  Non-data Control Dependencies

・General properties of ops
・Op Hierarchy
・Ops influencing evaluation
・Derivatives (see the sketch below)

Source: https://github.com/NervanaSystems/ngraph/blob/master/doc/source/building_graphs.rst
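
As a concrete illustration of the last two points, here is a minimal sketch that builds a tiny graph and asks for a derivative. It assumes the ng.deriv autodiff helper described in building_graphs.rst and the operator overloading used in the deck's other examples; the commented result is only what those assumptions would give.

    import ngraph as ng
    import ngraph.transformers as ngt

    x = ng.placeholder(())        # input node (a data dependency)
    y = x * x + 2 * x             # arithmetic ops extend the same graph

    dy_dx = ng.deriv(y, x)        # the derivative is just another op in the graph

    transformer = ngt.make_transformer()
    f = transformer.computation([y, dy_dx], x)
    print(f(3.0))                 # expected (15.0, 8.0) under these assumptions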

Page 23: What is Intel Nervana Graph?

Things planned for future support?

・Nervana Graph serialization/deserialization
・Further improvements/abstractions to graph composability for usability/optimization

・Distributed, heterogeneous backend target support

・C APIs for interoperability to enable other languages to create/execute graphs

・Better debugging
・Support for model deployment

Source: https://github.com/NervanaSystems/ngraph/blob/master/README.md

Page 24: What is Intel Nervana Graph?

From here on,

we dig into the source code of

the Intel Nervana Graph Compiler.

ngraph https://github.com/NervanaSystems/ngraph

Page 25: What is Intel Nervana Graph?

Sample code

    import ngraph as ng
    import ngraph.transformers as ngt

    x = ng.placeholder(())
    x_plus_one = x + 1

    transformer = ngt.make_transformer()

    plus_one = transformer.computation(x_plus_one, x)

    for i in range(5):
        print(plus_one(i))

Source: https://github.com/NervanaSystems/ngraph/blob/master/doc/source/overview.rst

Page 26: What is Intel Nervana Graph?

Example with Caffe

    from __future__ import print_function
    import ngraph.transformers as ngt
    from ngraph.frontends.caffe.cf_importer.importer import parse_prototxt

    model = "sum.prototxt"
    op_map = parse_prototxt(model, verbose=True)
    op = op_map.get("D")

    res = ngt.make_transformer().computation(op)()
    print("Result is:", res)

Source: https://github.com/NervanaSystems/ngraph/blob/master/doc/source/caffe.rst

Page 27: What is Intel Nervana Graph?

Example with TensorFlow

    import tensorflow as tf
    import ngraph.transformers as ngt
    # TFImporter import path as in the ngraph TensorFlow frontend (assumed here)
    from ngraph.frontends.tensorflow.tf_importer.importer import TFImporter

    x = tf.constant(1.)
    y = tf.constant(2.)
    f = x + y

    importer = TFImporter()
    importer.import_graph_def(tf.Session().graph_def)

    f_ng = importer.get_op_handle(f)

    transformer = ngt.make_transformer()
    f_result = transformer.computation(f_ng)()
    print(f_result)

Source: https://github.com/NervanaSystems/ngraph/blob/master/doc/source/tensorflow.rst

Page 28: What is Intel Nervana Graph?

Transformers &

Computations

Page 29: What is Intel Nervana Graph?

Transformers

A Transformer converts a graph into a backend-specific executable format. From a graph, a Transformer produces one or more Computations, and the executable objects a Transformer generates are driven through those Computations.

Every Transformer must implement a common abstract interface so that users can switch between backends.

Supported backends:
・CPUs (via NumPy)
・NVIDIA GPUs (via PyCUDA)

Source: https://github.com/NervanaSystems/ngraph/blob/master/doc/source/transformer_usage.rst

Page 30: What is Intel Nervana Graph?

Creating a Transformer

1) Default

    from ngraph.transformers import make_transformer
    transformer = make_transformer()

2) Using a factory

    import ngraph.transformers as ngt

    available_transformers = ngt.transformer_choices()
    if 'gpu' in available_transformers:
        factory = ngt.make_transformer_factory('gpu')
        ngt.set_transformer_factory(factory)

    transformer = ngt.make_transformer()

Source: https://github.com/NervanaSystems/ngraph/blob/master/doc/source/transformer_usage.rst

Page 31: What is Intel Nervana Graph?

Computations

A Computation is created by a Transformer and provides the interface for evaluating a subset of the graph.

The format of the generated Computation depends on the Transformer that executes it.

For example:

・on CPU it is NumPy,
・on GPU it is PyCUDA.

Source: https://github.com/NervanaSystems/ngraph/blob/master/doc/source/transformer_usage.rst

Page 32: What is Intel Nervana Graph?

Creating a Computation

    import ngraph as ng

    a = ng.constant(4)
    b = ng.placeholder(())
    c = ng.placeholder(())
    d = ng.multiply(a, b)
    e = ng.add(d, c)

    example_comp = transformer.computation(e, b, c)

Source: https://github.com/NervanaSystems/ngraph/blob/master/doc/source/transformer_usage.rst

Page 33: What is Intel Nervana Graph?

Executing a Computation

    example_comp = transformer.computation(e, b, c)

 result_e = the value of e

 b = first argument

 c = second argument

    result_e = example_comp(2, 7)    # b = 2, c = 7
    result_e = (4 * b) + c => (4 * 2) + 7 = 15

Source: https://github.com/NervanaSystems/ngraph/blob/master/doc/source/transformer_usage.rst

Page 34: What is Intel Nervana Graph?

Executing a Computation

Multiple return values

    example_comp2 = transformer.computation([d, e], b, c)

 result_d = the value of d, result_e = the value of e

 b = first argument

 c = second argument

    result_d, result_e = example_comp2(2, 7)
    result_d = (4 * b) = (4 * 2) = 8
    result_e = (4 * b) + c => (4 * 2) + 7 = 15

Source: https://github.com/NervanaSystems/ngraph/blob/master/doc/source/transformer_usage.rst
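
Slides 32 through 34 only show fragments, so here they are stitched together into one script that should run end to end. Everything in it comes from those slides; only the make_transformer() call is repeated from slide 30.

    import ngraph as ng
    import ngraph.transformers as ngt

    a = ng.constant(4)
    b = ng.placeholder(())
    c = ng.placeholder(())
    d = ng.multiply(a, b)
    e = ng.add(d, c)

    transformer = ngt.make_transformer()
    example_comp = transformer.computation(e, b, c)          # single result
    example_comp2 = transformer.computation([d, e], b, c)    # multiple results

    print(example_comp(2, 7))     # (4 * 2) + 7 = 15
    print(example_comp2(2, 7))    # 8 and 15, per the slide above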

Page 35: What is Intel Nervana Graph?

Implementing a Transformer

・Creating the Transformer

・Creating Computations

・Initializing the Transformer
  Transformer Passes
  Initialization Computation
  Tensor Description Initialization
  Computation Transformation

・Executing Computations (sketched below)

Source: https://github.com/NervanaSystems/ngraph/blob/master/doc/source/transformer_implementation.rst
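
From the user's side, those stages map onto only a few calls. The sketch below walks through them; transformer.initialize() as the explicit initialization step and the ng.variable(..., initial_value=...) signature are assumptions based on the docs referenced above, not something shown in this deck.

    import ngraph as ng
    import ngraph.transformers as ngt

    w = ng.variable((), initial_value=4)   # a variable, so there is something to initialize (assumed signature)
    x = ng.placeholder(())
    y = w * x

    transformer = ngt.make_transformer()   # 1) create the transformer
    comp = transformer.computation(y, x)   # 2) create a computation
    transformer.initialize()               # 3) initialization: passes run, variables get values (assumed call)
    print(comp(2.5))                       # 4) execute; expect 10.0 under these assumptions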

Page 36: What is Intel Nervana Graph?

Transformer implementations

base.py : Transformer_ABC_Meta
base.py : Transformer (base class)

cputransform.py : CPUTransformer
gputransform.py : GPUTransformer
hetrtransform.py : HetrTransformer

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers

Page 37: What is Intel Nervana Graph?

The Transformer_ABC_Meta class

    class Transformer_ABC_Meta(abc.ABCMeta):
        """
        metaclass for the backend objects
        takes care of registering all the backend subclasses
        """
        def __init__(cls, name, bases, dict_):
            if not hasattr(cls, 'transformers'):
                # First possible transformer class sets things up
                cls.transformers = {}

            # If this transformer has a transformer_name, register it
            transformer_name = getattr(cls, 'transformer_name', None)
            if transformer_name is not None:
                cls.transformers[transformer_name] = cls
            super(Transformer_ABC_Meta, cls).__init__(name, bases, dict_)

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/base.py
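
To see what this metaclass buys in isolation, here is a self-contained toy version of the same registration pattern (plain Python, not ngraph code): every subclass that defines a transformer_name lands in a registry on the base class, which is presumably how transformer_choices() and make_transformer_factory() can look backends up by name.

    import abc

    class RegistryMeta(abc.ABCMeta):
        def __init__(cls, name, bases, dict_):
            if not hasattr(cls, 'registry'):
                cls.registry = {}                # first class in the hierarchy creates the registry
            reg_name = getattr(cls, 'transformer_name', None)
            if reg_name is not None:
                cls.registry[reg_name] = cls     # named subclasses register themselves
            super(RegistryMeta, cls).__init__(name, bases, dict_)

    class Base(metaclass=RegistryMeta):
        pass

    class CPUThing(Base):
        transformer_name = 'cpu'

    class GPUThing(Base):
        transformer_name = 'gpu'

    print(sorted(Base.registry))    # ['cpu', 'gpu']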

Page 38: What is Intel Nervana Graph?

The Transformer class

    class Transformer(with_metaclass(Transformer_ABC_Meta, object)):
        """
        Produce an executable version of op-graphs.

        Computations are subsets of Ops to compute. The transformer determines
        storage allocation and transforms the computations and allocations into
        functions.

        Arguments:
            fusion (bool): Whether to combine sequences of operations into one operation.
            **kwargs: Args for related classes.

        Attributes:
            computations (:obj:`set` of :class:`Computation`): The set of requested computations.
            all_results (:obj:`set` of :class:`ngraph.op_graph.op_graph.Op`): A root set of Ops
                that need to be computed.
            finalized (bool): True when transformation has been performed.
            initialized (bool): True when variables have been initialized/restored.
            fusion (bool): True when fusion was enabled.
            device_buffers (set): Set of handles for storage allocations.
        """

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/base.py

Page 39: What is Intel Nervana Graph?

The Computation class

    class Computation(NameableValue):
        """
        A handle for a computation function.

        Arguments:
            transformer (obj:`Transformer`): The associated transformer.
            returns: If an Op, return the value of the Op, if sequence of Ops,
                return the sequence of values, if a set return a map, if None,
                return None.
            *args: AllocationOps marked input will be arguments to the function.
            **kwargs: Args for related classes.
        """

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers

Page 40: What is Intel Nervana Graph?

The Computation class

    def __init__(self, transformer, computation, **kwargs):
        super(Computation, self).__init__(**kwargs)
        self.transformer = transformer
        self.computation = computation
        self.computation_name = None
        self.executor = None

        self.send_nodes = []
        self.recv_nodes = []
        self.scatter_send_nodes = []
        self.scatter_recv_nodes = []
        self.gather_send_nodes = []
        self.gather_recv_nodes = []
        self.allreduce_nodes = []

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/base.py

Page 41: What is Intel Nervana Graph?

Computation implementations

base.py : Computation (base class)

cputransform.py : CPUComputation
base.py : GPUComputation
hetrtransform.py : HetrComputation

When make_computation is called, the Computation corresponding to each Transformer is created.

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers

Page 42: What is Intel Nervana Graph?

Computation implementations

cputransform.py : CPUComputation

    def make_computation(self, computation):
        return CPUDeviceComputation(self, computation)

base.py : GPUComputation

    def make_computation(self, computation):
        return Computation(self, computation)

hetrtransform.py : HetrComputation

    def make_computation(self, computation):
        return HetrComputation(self, computation)
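
Stripped of the ngraph details, this is an ordinary factory-method pattern: the base class drives the common flow and each backend only overrides make_computation() to hand back its own Computation type. A toy, self-contained illustration (not ngraph code):

    class BaseTransformer(object):
        def computation(self, computation_op):
            # common bookkeeping would live here; the backend-specific part is delegated
            return self.make_computation(computation_op)

        def make_computation(self, computation_op):
            raise NotImplementedError

    class ToyCPUTransformer(BaseTransformer):
        def make_computation(self, computation_op):
            return ("cpu computation for", computation_op)

    class ToyHetrTransformer(BaseTransformer):
        def make_computation(self, computation_op):
            return ("heterogeneous computation for", computation_op)

    print(ToyCPUTransformer().computation("x + 1"))
    print(ToyHetrTransformer().computation("x + 1"))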

Page 43: What is Intel Nervana Graph?

The Computation class

    class Computation(NameableValue):

        def __init__(self, transformer, computation_op, **kwargs):
            super(Computation, self).__init__(**kwargs)
            logging.info("Creating computation with computation_op: %s", computation_op)
            self.transformer = transformer
            self.computation_op = computation_op
            self.computation_name = None
            self.executor = None

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/base.py

Page 44: What is Intel Nervana Graph?

The CPUDeviceComputation class

    class CPUDeviceComputation(Computation):

        def __init__(self, transformer, computation, **kwargs):
            super(CPUDeviceComputation, self).__init__(transformer, computation, **kwargs)
            self.pool_params = dict()
            self.pool_slices = dict()
            self.conv_params = dict()
            self.conv_slices = dict()

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/cputransform.py

Page 45: What is Intel Nervana Graph?

The HetrComputation class

    class HetrComputation(Computation):

        def __init__(self, hetr, computation_op):
            self.child_computations = dict()
            self.transformer = hetr
            self.send_nodes = hetr.send_nodes
            self.computation_op = computation_op

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/hetrtransform.py

Page 46: What is Intel Nervana Graph?

Pass implementations (part 1)

passes.py    GraphPass (base class)
passes.py    GraphBuildingPass
passes.py    GraphRewritePass
passes.py    PeepholeGraphPass
passes.py    RequiredTensorShaping
passes.py    CPUTensorShaping
passes.py    SimplePrune
flexpass.py  FlexDtypePass
flexpass.py  FlexDECPass
flexpass.py  ClearTensorDescriptions
nviz.py      JSONPass(GraphPass)
nviz.py      VizPass(GraphPass)

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/passes/base.py

Page 47: What is Intel Nervana Graph?

Pass implementations (part 2)

layout.py    PruneContiguousPass
layout.py    GenerateLayoutDomains
layout.py    GenerateLayoutConstraints
layout.py    AssignLayouts
layout.py    AddLayoutConversions

cpufusion.py          FusionPass
cpulayout.py          CPUTensorLayout
gpusimplification.py  GPUSubstitution
hetrpasses.py         DeviceAssignPass
hetrpasses.py         CommunicationPass
hetrpasses.py         DistributedPass

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/passes

Page 48: What is Intel Nervana Graph?

Pass implementations (part 3): passes used with MKL-DNN

mkldnnpasses.py  MklCreateOpDescriptors
mkldnnpasses.py  MklAddLayoutConversions

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/passes

Page 49: What is Intel Nervana Graph?

The ComputationGraphTransformer class

    class ComputationGraphTransformer(Transformer):

        def run_registered_graph_passes(self, ops, **kwargs):
            for graph_pass in self.graph_passes:
                graph_pass.wrapped_do_pass(ops=ops, **kwargs)
            return ops

gputransformer.py
    class GPUTransformer(ComputationGraphTransformer):

hetrtransformer.py
    class HetrTransformer(ComputationGraphTransformer):

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/base.py

Page 50: What is Intel Nervana Graph?

The ExecutionGraphTransformer class

extransformer.py
    class ExecutionGraphTransformer(Transformer):

        def run_registered_graph_passes(self, computation_decl, **kwargs):
            op_accessor = ExOpGraphOpAccessor()
            for graph_pass in self.graph_passes:
                graph_pass.wrapped_do_pass(
                    op_accessor=op_accessor,
                    computation_decl=computation_decl,
                    **kwargs)

cputransformer.py
    class CPUTransformer(ExecutionGraphTransformer):

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/extransformer.py

Page 51: What is Intel Nervana Graph?

The GraphPass class

    class GraphPass(with_metaclass(abc.ABCMeta, DelegateOpAccessor)):

        def wrapped_do_pass(self, **kwargs):
            self.begin_pass(**kwargs)
            self.do_pass(**kwargs)
            self.end_pass(**kwargs)

        @abc.abstractmethod
        def do_pass(self, **kwargs):
            pass

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/base.py
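
Writing your own pass then comes down to subclassing GraphPass and implementing do_pass(); begin_pass/end_pass are the hooks wrapped around it. A minimal sketch follows. The import path is taken from the passes.py entry on the "Pass implementations (part 1)" slide, and the ops= keyword matches the ComputationGraphTransformer call shown earlier; both are assumptions about this particular ngraph version rather than documented API.

    from ngraph.transformers.passes.passes import GraphPass

    class CountOpsPass(GraphPass):
        """Illustrative debug pass: report how many root ops the transformer handed us."""

        def do_pass(self, ops=None, **kwargs):
            if ops is not None:
                print("graph has {} root ops".format(len(ops)))
            return ops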

Page 52: What is Intel Nervana Graph?

The CPUTransformer class

    class CPUTransformer(Transformer):

        def __init__(self, **kwargs):
            super(CPUTransformer, self).__init__(**kwargs)
            self.current_computation = None
            self.conv_engine = CPUConvEngine()
            self.init_code = CPUCodeGenerator(self)
            self.allocate_storage_code = CPUCodeGenerator(self)
            self.allocate_code = CPUCodeGenerator(self)
            self.compute_code = CPUCodeGenerator(self)
            self.code = CPUCodeGenerator(self)
            …..

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/cputransform.py

Page 53: What is Intel Nervana Graph?

The CPUTransformer class

Adding passes:

    self.graph_passes = []
    if self.mkldnn.enabled:
        self.graph_passes.append(CPUFusion())
    self.graph_passes += [
        # ExVizPass(view=True, filename="initial"),
        CPUTensorLayout(),
        SimplePrune(),
        RequiredTensorShaping(),
        CPUTensorShaping(),
        DeadCodeEliminationPass(),
    ]

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/cputransform.py

Page 54: What is Intel Nervana Graph?

The CPUTransformer class

    if self.mkldnn.enabled:
        self.graph_passes.append(MklCreateOpDescriptors(mkldnn=self.mkldnn)),
        DeadCodeEliminationPass(),
        self.graph_passes.append(MklAddLayoutConversions(mkldnn=self.mkldnn,
                                                         layoutpass=add_layout_conversion)),
        DeadCodeEliminationPass()
    self.graph_passes += [
        SSAConversion(),
        IndexElision(),
        # DeadCodeEliminationPass(),
        LivenessPass(),
        MemOptimizePass(),
        LivenessPass(),
        MemLayoutPass()
    ]

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/cputransform.py

Page 55: What is Intel Nervana Graph?

The CPUTransformer class

    # from ngraph.transformers.passes.dumpgraphpass import DumpGraphPass
    # self.graph_passes += [DumpGraphPass()]

    # from ngraph.transformers.passes.visualizemem import VisualizeMemPass
    # self.graph_passes += [VisualizeMemPass()]

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/cputransform.py

Page 56: What is Intel Nervana Graph?

The GPUTransformer class

    class GPUTransformer(Transformer):

        def __init__(self, device_id=None, comm=None, **kwargs):
            super(GPUTransformer, self).__init__(**kwargs)
            GPUTransformer.gpu_transformers.add(self)
            …..
            self.graph_passes = [
                SimplePrune(),
                PruneContiguousPass(),
                GPUSubstitution(),
                layout_domain_pass,
                layout_constraints_pass,
                layout_assign_pass,
                layout_convert_pass
            ]

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/gputransform.py

Page 57: What is Intel Nervana Graph?

The HetrTransformer class

    class HetrTransformer(Transformer):

        def __init__(self, device_id=None, comm=None, **kwargs):
            super(HetrTransformer, self).__init__(**kwargs)
            …..
            self.graph_passes = [
                DeviceAssignPass(hetr=self,
                                 default_device=device,
                                 default_device_id=0),
                CommunicationPass(self.send_nodes),
                DistributedPass(self.send_nodes)
            ]

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/hetrtransform.py

Page 58: What is Intel Nervana Graph?

Code generation

Page 59: What is Intel Nervana Graph?

The CPUCodeGenerator class

    class CPUCodeGenerator(PyGen):

        def __init__(self, transformer, **kwargs):
            super(CPUCodeGenerator, self).__init__(prefix="op", **kwargs)
            self.transformer = transformer

        def name(self, x):
            if isinstance(x, CPUDeviceBufferStorage):
                return x.ref_str
            if isinstance(x, CPUDeviceTensor):
                return x.ref_str
            return x

Source: https://github.com/NervanaSystems/ngraph/tree/master/ngraph/transformers/cputransform.py

Page 60: What is Intel Nervana Graph?

Thank you!

Blog: Vengineerの戯言 http://blogs.yahoo.co.jp/verification_engineer

Twitter: @Vengineer

Study sessions organized:
 Xilinx Zynq MPSoC (2016/02/20)
 Altera SDK for OpenCL (2016/06/10)
 Xilinx SDSoC (2017/01/28)
 PYNQ Matsuri (2017/03/04)
 FPGA Deep Learning Hands-on Meetup (2017/05/20)