A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks

Xiaoyi Lu, Md. Wasi-ur-Rahman, Nusrat Islam, and Dhabaleswar K. (DK) Panda
Network-Based Computing Laboratory
Department of Computer Science and Engineering
The Ohio State University, Columbus, OH, USA

A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance … · 2015-07-29 · • To achieve optimal performance, Hadoop RPC needs to be tuned based on cluster and workload


WBDB 2013

Outline

• Introduction and Motivation
• Problem Statement
• Design Considerations
• Micro-benchmark Suite
• Performance Evaluation
• Conclusion & Future work


Big Data Technology

• Apache Hadoop is one of the most popular Big Data technologies
– Provides a framework for large-scale, distributed data storage and processing
• An open-source implementation of the MapReduce programming model
• Hadoop Distributed File System (HDFS) is the underlying file system of Hadoop MapReduce and the Hadoop database, HBase
• Hadoop Core – Common functionalities, e.g. Remote Procedure Call (RPC)

[Figure: Hadoop Framework, with components MapReduce, HBase, HDFS, and Core (RPC, ..)]


Adoption of Hadoop RPC

• Hadoop RPC is increasingly being used with data-center middleware such as MapReduce, HDFS, and HBase because of its simplicity, productivity, and high performance.
– Metadata exchange
– Manage compute nodes and track system status
– Efficient data management operations: get block info, create blocks, etc.
– Database operations: put, get, etc.

[Figure: two deployments connected through High Performance Networks. MapReduce & HDFS: HDFS Clients, Map/Reduce (HDFS Name Node), and HDFS Data Nodes (HDD/SSD). HBase: HBase Clients, HRegion Servers, and Data Nodes (HDD/SSD).]


Common Protocols using Open Fabrics

[Figure: protocol stacks between the Application and the network, organized into the layers Application Interface, Protocol, Adapter, and Switch. Application interfaces: Sockets and Verbs. Stacks shown: 1/10/40 GigE, IPoIB, 10/40 GigE-TOE, SDP, RSockets, iWARP, RoCE, and IB Verbs. Protocol-layer labels: TCP/IP with Ethernet Driver (Kernel Space), Hardware Offload, and RDMA (User space). Adapters: Ethernet, InfiniBand, iWARP, and RoCE. Switches: Ethernet and InfiniBand.]


Can Big Data Processing Systems be Designed with High-Performance Networks and Protocols?

[Figure: three layered designs. Current Design: Application, Sockets, 1/10 GigE Network. Enhanced Designs: Application, Accelerated Sockets, Verbs / Hardware Offload, 10 GigE or InfiniBand. Our Approach: Application, OSU Design, Verbs Interface, 10 GigE or InfiniBand.]

• Sockets not designed for high-performance
– Stream semantics often mismatch for upper layers (Memcached, HBase, Hadoop)
– Zero-copy not available for non-blocking sockets


Hadoop RPC over InfiniBand

[Figure: Applications on top of Hadoop RPC, with two paths labeled with the rpc.ib.enabled parameter. Default: Java Socket Interface, 1/10 GigE, IPoIB Network. Our Design: Java Native Interface (JNI), OSU Design, IB Verbs, InfiniBand.]

• Enables high-performance RDMA communication while supporting the traditional socket interface

Xiaoyi Lu, Nusrat Islam, Md. Wasi-ur-Rahman, Jithin Jose, Hari Subramoni, Hao Wang, Dhabaleswar K. (DK) Panda. “High-Performance Design of Hadoop RPC with RDMA over InfiniBand.” To be presented at the 42nd International Conference on Parallel Processing (ICPP 2013), Lyon, France, October 2013.


Hadoop RPC over IB: Gain in Latency and Throughput

• Hadoop RPC over IB PingPong Latency
– 1 byte: 39 us; 4 KB: 52 us
– 42%-49% and 46%-50% improvements compared with the performance of default Hadoop RPC on 10 GigE and IPoIB (32Gbps) respectively
• Hadoop RPC over IB Throughput
– 512 bytes & 48 clients: 135.22 Kops/sec
– 82% and 64% improvements compared with the peak performance of default Hadoop RPC on 10 GigE and IPoIB (32Gbps) respectively

[Figure: two panels. Latency (us) vs Payload Size (Byte), 1 to 4096; Throughput (Kops/Sec) vs Number of Clients, 8 to 64. Series: RPC-10GigE, RPC-IPoIB(32Gbps), RPCoIB(32Gbps).]


Available in Hadoop-RDMA Software

• High-Performance Design of Hadoop over RDMA-enabled Interconnects
– High performance design with native InfiniBand support at the verbs-level for HDFS, MapReduce, and RPC components
– Easily configurable for both native InfiniBand and the traditional sockets-based support (Ethernet and InfiniBand with IPoIB)
– Current release: 0.9.0
  • Based on Apache Hadoop 0.20.2
  • Compliant with Apache Hadoop 0.20.2 APIs and applications
  • Tested with
    – Mellanox InfiniBand adapters (DDR, QDR and FDR)
    – Various multi-core platforms
    – Different file systems with disks and SSDs
– http://hadoop-rdma.cse.ohio-state.edu


Requirements of Hadoop RPC Benchmarks

• To achieve optimal performance, Hadoop RPC needs to be tuned based on cluster and workload characteristics
• A micro-benchmark tool suite to evaluate Hadoop RPC performance metrics in different configurations is important for tuning and understanding
• For Hadoop developers, this kind of micro-benchmark suite is helpful to evaluate and optimize the performance of new designs


Problem Statement

• Can we design and implement a simple and standardized benchmark suite to let all users and developers in the Big Data community evaluate, understand, and optimize the Hadoop RPC performance over a range of networks/protocols?
• What will be the performance of Hadoop RPC when evaluated using this benchmark suite on high-performance networks?


Design Considerations

• The performance of RPC systems is usually measured by the metrics of latency and throughput
• Performance of Hadoop RPC is determined by:
– Factors related to network configurations: faster interconnects and/or protocols can enhance Hadoop RPC performance
– Controllable parameters in RPC engine-level and benchmark-level: handler/client number, etc.
– Data types: serialization and deserialization issues of different data types in RPC system; BytesWritable, Text, etc.
– CPU Utilization: tradeoff between RPC subsystem performance and the whole system performance
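The two metrics named above can be made concrete with a small timing harness. The following Python sketch is illustrative only: `echo_rpc` is a hypothetical in-process stand-in for an RPC round trip (it is not part of Hadoop or of the benchmark suite), threads stand in for separate client processes, and all function and parameter names are assumptions made for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def echo_rpc(payload: bytes) -> bytes:
    """Hypothetical stand-in for one RPC round trip (no network involved)."""
    return payload


def measure_latency_us(call, payload: bytes, iterations: int) -> list:
    """Time each call separately and return per-call latencies in microseconds."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        call(payload)
        latencies.append((time.perf_counter() - start) * 1e6)
    return latencies


def measure_throughput_kops(call, payload: bytes, clients: int, iterations: int) -> float:
    """Run `clients` concurrent workers, each issuing `iterations` calls, and
    return the aggregate rate in thousands of operations per second."""
    def worker(client_id: int) -> None:
        for _ in range(iterations):
            call(payload)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=clients) as pool:
        # list() forces completion of every worker before the clock stops.
        list(pool.map(worker, range(clients)))
    elapsed = time.perf_counter() - start
    return (clients * iterations) / elapsed / 1e3


if __name__ == "__main__":
    lat = measure_latency_us(echo_rpc, b"x" * 4096, iterations=1000)
    print(f"average latency: {sum(lat) / len(lat):.2f} us")
    kops = measure_throughput_kops(echo_rpc, b"x" * 512, clients=8, iterations=1000)
    print(f"throughput: {kops:.2f} Kops/sec")
```

Because the stub never touches a network, the printed numbers say nothing about any interconnect; the sketch only shows how a per-call latency and an aggregate operations-per-second figure are derived from wall-clock timings.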


Micro-benchmark Suite

• Two different micro-benchmarks:
– Latency: Single Server, Single Client
– Throughput: Single Server, Multiple Clients
• A script framework for job launching and resource monitoring
• Calculates statistics like Min, Max, Average

Latency benchmark parameters:
Component | Network Address | Port | Data Type | Min Msg Size | Max Msg Size | No. of Iterations | Handlers | Verbose
lat_client √ √ √ √ √ √ √
lat_server √ √ √ √

Throughput benchmark parameters:
Component | Network Address | Port | Data Type | Min Msg Size | Max Msg Size | No. of Iterations | No. of Clients | Handlers | Verbose
thr_client √ √ √ √ √ √ √
thr_server √ √ √ √ √ √
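The Min/Max message-size parameters and the Min, Max, Average statistics listed above can be sketched as two small helpers. This is a minimal Python illustration, not the suite's own code: the names `message_sizes` and `summarize` are hypothetical, and doubling the payload size at each step is an assumption suggested by the power-of-two payload sizes on the result plots.

```python
def message_sizes(min_size: int, max_size: int) -> list:
    """Return payload sizes from min_size up to max_size, doubling each step."""
    if min_size < 1 or max_size < min_size:
        raise ValueError("need 1 <= min_size <= max_size")
    sizes = []
    size = min_size
    while size <= max_size:
        sizes.append(size)
        size *= 2
    return sizes


def summarize(samples: list) -> dict:
    """Compute the Min, Max, and Average of a non-empty list of measurements."""
    if not samples:
        raise ValueError("samples must not be empty")
    return {
        "min": min(samples),
        "max": max(samples),
        "avg": sum(samples) / len(samples),
    }


if __name__ == "__main__":
    # Sweep of payload sizes between the two bounds.
    print(message_sizes(1, 4096))
    # Placeholder samples, not measurements from the slides.
    print(summarize([41.0, 39.0, 46.0]))
```

A driver would call `summarize` once per payload size returned by `message_sizes`, feeding it the per-iteration timings collected for that size.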


Experimental Setup

• Hardware
– Intel Westmere Cluster
  • 8 nodes
  • Each node has 8 processor cores on 2 Intel Xeon 2.67 GHz Quad-core CPUs, 24 GB main memory
  • Network: 1GigE, 10GigE, and IPoIB (32Gbps)
• Software
– Enterprise Linux Server release 6.1 (Santiago) at kernel version 2.6.32-131 with OpenFabrics version 1.5.3
– Hadoop 0.20.2 and Sun Java SDK 1.7


RPC Latency for BytesWritable

• Latency for RPC decreases if the underlying interconnect is changed from 1 GigE to IPoIB or 10 GigE.
• With the 10 GigE interconnect, we observe better latency than IPoIB for small payload sizes. For large payload sizes, IPoIB performs better than 10 GigE.
– IPoIB achieves a 27% gain over 10 GigE for a 64 MB payload size, whereas it performs 0.66% worse than 10 GigE for a 4 KB payload size.

[Figure: two panels. Small Messages: Latency (us) vs Payload Size (Byte). Large Messages: Latency (ms) vs Payload Size (Byte), 128K to 64M. Series: 1GigE, 10GigE, IPoIB(32Gbps).]
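The slides quote percentage gains without stating the formula behind them. The Python sketch below assumes the conventional definition, a change relative to the baseline measurement; the function names are hypothetical, and the input values in the usage section are placeholders chosen for illustration, not measurements from the experiments.

```python
def latency_gain_pct(baseline_latency: float, new_latency: float) -> float:
    """Percentage reduction in latency relative to the baseline (lower is better)."""
    if baseline_latency <= 0:
        raise ValueError("baseline_latency must be positive")
    return 100.0 * (baseline_latency - new_latency) / baseline_latency


def throughput_gain_pct(baseline_tput: float, new_tput: float) -> float:
    """Percentage increase in throughput relative to the baseline (higher is better)."""
    if baseline_tput <= 0:
        raise ValueError("baseline_tput must be positive")
    return 100.0 * (new_tput - baseline_tput) / baseline_tput


if __name__ == "__main__":
    # Placeholder inputs only; the slides do not list the raw values.
    print(f"latency gain: {latency_gain_pct(200.0, 146.0):.1f}%")
    print(f"throughput gain: {throughput_gain_pct(20.0, 25.0):.1f}%")
```

A negative result means the new configuration is worse than the baseline, which is how a figure such as "0.66% worse" would appear under this definition.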


RPC Latency for Text

• RPC latency shows similar performance characteristics with the Text data type.

[Figure: two panels. Small Messages: Latency (us) vs Payload Size (Byte), 1 to 4096. Large Messages: Latency (us) vs Payload Size (Byte), 128K to 64M. Series: 1GigE, 10GigE, IPoIB(32Gbps).]


RPC Throughput for BytesWritable

• IPoIB performs better than 10 GigE as payload size is increased.
• At 4 KB, the improvement goes up to 26% for seven handler threads. For small payload sizes, 10 GigE performs better than IPoIB by an average margin of 5-6%.

[Figure: two panels, 7 RPC Server Handlers and 16 RPC Server Handlers. Throughput (Kops/Sec) vs Payload Size (byte), 1 to 4096. Series: 1GigE, 10GigE, IPoIB(32Gbps).]


RPC Throughput for BytesWritable

• Keep the payload size fixed at 4 KB and observe the trend with different handler numbers and different networks
– IPoIB outperforms 10 GigE by 48%, 5%, 45%, and 47% for 1, 4, 16, and 32 handlers respectively.
• The suite can also be used to monitor resource utilization by enabling a parameter in the script framework.

[Figure: two panels. Throughput Comparison for 4 KB payload size: Throughput (Kops/Sec) vs Handler Number (1, 4, 16, 32). CPU utilization for the experiment with 4 handlers: CPU Utilization (%) vs Sampling Point. Series: 1GigE, 10GigE, IPoIB(32Gbps).]


Conclusion and Future Work

• Designed and implemented a micro-benchmark suite to evaluate the performance of standalone Hadoop RPC.
• Provided standard micro-benchmarks to measure the latency and throughput of Hadoop RPC with different data types.
• Illustrated the performance results of Hadoop RPC using our benchmarks over different networks/protocols (1GigE/10GigE/IPoIB).
• Will extend our benchmark suite to help users compare the performance of Hadoop Writable RPC, Avro, Thrift, and Protocol Buffers.
• Will be made available to the big data community via an open-source release.


Thank You!

{luxi, rahmanmd, islamn, panda}@cse.ohio-state.edu

Network-Based Computing Laboratory: http://nowlab.cse.ohio-state.edu/

MVAPICH Web Page: http://mvapich.cse.ohio-state.edu/

Hadoop-RDMA Web Page: http://hadoop-rdma.cse.ohio-state.edu/