
JavaOne2013: Implement a High Level Parallel API - Richard Ning


DESCRIPTION

This session discusses how to implement a high-level parallel API (such as parallel_for, parallel_while, or parallel_scan) and math calculations on top of a thread pool and tasks in OpenJDK, in step with the development of multi-core hardware and parallel computing. With the current Java concurrent package, programmers have to fix a scheduling strategy statically in code instead of choosing it dynamically based on the core count and load balance of the computer. In the design presented in the session, the function parallel_for(array, task) is a high-level API that can divide the task range dynamically, based on the condition of and load on different computers. Presented by Richard Ning at JavaOne 2013.


Page 1: JavaOne2013: Implement a High Level Parallel API - Richard Ning

© 2013 IBM Corporation

Richard Ning – Enterprise Developer

9/24/2013

Implement high-level parallel API in JDK


Page 2: JavaOne2013: Implement a High Level Parallel API - Richard Ning


Important Disclaimers

– THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.

– WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.

– ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES.

– ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE.

– IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE.

– IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.

– NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF: CREATING ANY WARRANTY OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS.

Page 3: JavaOne2013: Implement a High Level Parallel API - Richard Ning

About me

Richard Ning

IBM JDK development

Developing enterprise application software since 1999 (C++, Java)

My contact information:

mail: [email protected]

Page 4: JavaOne2013: Implement a High Level Parallel API - Richard Ning

What should you get from this talk?

■ By the end of this session, you should be able to:

– Understand the implementation of a high-level parallel API in the JDK

– Understand how parallel computing works on multi-core machines

Page 5: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Agenda

1. Introduction: multi-threading, multi-cores, parallel computing

2. Case study

3. Other high-level parallel API

4. Roadmap

Page 6: JavaOne2013: Implement a High Level Parallel API - Richard Ning


Introduction

Multi-Threading

Multi-core computer

Parallel computing


Page 7: JavaOne2013: Implement a High Level Parallel API - Richard Ning


Case study

■ Execute the same task for every element in a loop

■ Use multi-threading for the execution
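The case study can be pictured with a small sketch: the same operation applied to every element, first in a plain loop and then with raw threads that each take one contiguous chunk of the index range. The slide's own code is not in the transcript, so the class name, method names and the squaring operation below are all illustrative assumptions.

```java
import java.util.Arrays;

// Illustrative sketch only; class and method names are hypothetical.
class RawThreadLoop {

    // Sequential baseline: apply the same operation to every element.
    static void squareSequential(int[] data) {
        for (int i = 0; i < data.length; i++) {
            data[i] = data[i] * data[i];
        }
    }

    // Multi-threaded version: split the index range into contiguous chunks,
    // one raw thread per chunk. The caller has to pick threadCount (>= 1).
    static void squareWithThreads(int[] data, int threadCount) {
        Thread[] threads = new Thread[threadCount];
        int chunk = (data.length + threadCount - 1) / threadCount; // ceiling division
        for (int t = 0; t < threadCount; t++) {
            final int from = Math.min(data.length, t * chunk);
            final int to = Math.min(data.length, from + chunk);
            threads[t] = new Thread(() -> {
                for (int i = from; i < to; i++) {
                    data[i] = data[i] * data[i];
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // wait for every chunk to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7};
        int[] b = a.clone();
        squareSequential(a);
        squareWithThreads(b, 3);
        System.out.println(Arrays.equals(a, b)); // both versions should agree
    }
}
```

Note that the thread count is chosen by the caller and the chunks are fixed up front, which is the kind of manual management the following slides set out to remove.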


Page 8: JavaOne2013: Implement a High Level Parallel API - Richard Ning

■ Can it improve performance?

Page 9: JavaOne2013: Implement a High Level Parallel API - Richard Ning

■ Multi-threading on a computer with one core

[Figure: timeline of a single CPU; threads t1 and t2 take turns on it over time]

Page 10: JavaOne2013: Implement a High Level Parallel API - Richard Ning

■ 100% CPU usage with both a single thread and multiple threads

• Performance even decreases because of the extra threading overhead

• Multi-threading can't improve performance on one core

• It is useless to use multi-threading (a parallel API) in this case

Page 11: JavaOne2013: Implement a High Level Parallel API - Richard Ning

■ Multi-threading on a computer with multiple cores

Page 12: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Each thread runs separately on its own core

[Figure: timeline of four cores, Core1 to Core4, each running one of the threads t1 to t4 in parallel]

Page 13: JavaOne2013: Implement a High Level Parallel API - Richard Ning

■ Raw thread

– Users need to create and manage it

Disadvantages

– Not flexible: the number of threads is hard to configure

• > core number: resources are consumed by thread context switching, which can even decrease performance

• < core number: some cores are wasted

– No balance: the calculation can't be allocated to every core equally

Any improvement? Executor

Page 14: JavaOne2013: Implement a High Level Parallel API - Richard Ning

■ Separate the creation and execution of threads

■ Use a thread pool to reuse threads
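A minimal sketch of the two points above, using the standard `java.util.concurrent` executor classes: tasks are submitted without creating threads, and a small pool of threads is reused for all of them. The class name, the pool size of 4 and the squaring task are assumptions made for illustration, not taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch only; the class name and pool size are assumptions.
class ExecutorLoop {

    // Runs one task per element on a pool of poolSize reusable threads and
    // returns the names of the worker threads that actually ran tasks.
    static Set<String> squareWithPool(int[] data, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> workers = ConcurrentHashMap.newKeySet();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int i = 0; i < data.length; i++) {
                final int index = i;
                // Submitting a task does not create a thread; the pool decides
                // which of its threads will execute it.
                pending.add(pool.submit(() -> {
                    data[index] = data[index] * data[index];
                    workers.add(Thread.currentThread().getName());
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface task failures
            }
        } catch (Exception e) {
            throw new IllegalStateException("task execution failed", e);
        } finally {
            pool.shutdown();
        }
        return workers;
    }

    public static void main(String[] args) {
        int[] data = new int[100];
        java.util.Arrays.fill(data, 3);
        Set<String> workers = squareWithPool(data, 4);
        // 100 tasks ran, but on at most 4 threads.
        System.out.println(data[0] + " computed by " + workers.size() + " thread(s)");
    }
}
```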

Page 15: JavaOne2013: Implement a High Level Parallel API - Richard Ning

■ A high-level API: concurrent_for
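The code shown on this slide did not survive in the transcript. As a rough idea of what a first-cut concurrent_for could look like on top of an executor, here is a hedged sketch: the name `concurrentFor`, its signature, the cached thread pool and the one-task-per-index scheme are all assumptions, chosen to be as simple as possible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntConsumer;

// Hypothetical first-cut concurrent_for; not the code from the slides.
class NaiveParallel {

    // Applies task to every index in [begin, end), one pool task per index.
    static void concurrentFor(int begin, int end, IntConsumer task) {
        // A cached pool grows with demand, so its size is unrelated to the
        // number of cores.
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int i = begin; i < end; i++) {
                final int index = i;
                tasks.add(() -> {
                    task.accept(index);
                    return null;
                });
            }
            // invokeAll blocks until every task has completed.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get(); // rethrows a failure from the task body
            }
        } catch (Exception e) {
            throw new IllegalStateException("concurrentFor failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        double[] roots = new double[8];
        // The caller supplies only the range and the per-element operation.
        concurrentFor(0, roots.length, i -> roots[i] = Math.sqrt(i));
        System.out.println(java.util.Arrays.toString(roots));
    }
}
```

The caller no longer touches threads at all, but this naive version sizes its pool independently of the hardware and schedules every single index as its own task.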

Page 16: JavaOne2013: Implement a High Level Parallel API - Richard Ning


Page 17: JavaOne2013: Implement a High Level Parallel API - Richard Ning

The API is easy to use: users only need to supply the task to execute and the data range, and don't need to care how they are executed. However, it still has disadvantages:

1. The number of threads in the thread pool isn't aligned with the core number

2. Each task processes only one entry, which isn't efficient

3. Each task is bound to one thread, which isn't flexible

Page 18: JavaOne2013: Implement a High Level Parallel API - Richard Ning

[Figure: a CPU with cores 1 to 4, a thread pool with threads 1 to n, and tasks 1 to m]

Core: 4, Thread: n, Task: m

Overloading: n >> 4

Not flexible: m > n

Page 19: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Core number doesn't align with thread number: use a fixed thread pool

Thread number = core number

[Figure: a CPU with cores 1 to 4 and a thread pool with threads 1 to 4, one thread per core]
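One way to get "thread number = core number" with the standard library is to size a fixed pool from `Runtime.getRuntime().availableProcessors()`. The sketch below is illustrative only; the class and method names are hypothetical, and it simply counts how many distinct worker threads end up running a batch of trivial tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch only; names are hypothetical.
class CoreSizedPool {

    // Number of processors available to this JVM at the time of the call.
    static int coreCount() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Runs taskCount trivial tasks on a pool with one thread per core and
    // returns how many distinct worker threads were used.
    static int distinctWorkers(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(coreCount());
        Set<String> workers = ConcurrentHashMap.newKeySet();
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                tasks.add(() -> workers.add(Thread.currentThread().getName()));
            }
            pool.invokeAll(tasks); // blocks until all tasks are done
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } finally {
            pool.shutdown();
        }
        return workers.size();
    }

    public static void main(String[] args) {
        System.out.println(coreCount() + " cores, "
                + distinctWorkers(1000) + " worker threads used");
    }
}
```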

Page 20: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Task division: another task division strategy, ForkJoinPool

ForkJoin: divide and conquer

1. Divide a big task into small tasks recursively

2. Execute the same operation for every small task

3. Join the results of the small tasks

[Figure: task tree in which Task1 is divided recursively into the subtasks Task2 to Task7]
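The three divide-and-conquer steps map directly onto `RecursiveTask` from `java.util.concurrent`. The following is a minimal sketch that sums an array; the class name, the summing operation and the threshold of 1,000 elements are assumptions picked for illustration, and the slides' own example is not in the transcript.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative sketch only; the threshold value is an arbitrary assumption.
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // ranges this small are summed directly
    private final int[] data;
    private final int from;
    private final int to;

    SumTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                     // 1. divide: schedule the left half asynchronously
        long rightSum = right.compute(); // 2. same operation on the right half, in this thread
        return left.join() + rightSum;   // 3. join the partial results
    }

    static long sum(int[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data));
    }
}
```

The threshold is hard-coded by the programmer here, which is exactly the static granularity that the talk goes on to criticize.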

Page 21: JavaOne2013: Implement a High Level Parallel API - Richard Ning


Page 22: JavaOne2013: Implement a High Level Parallel API - Richard Ning


Page 23: JavaOne2013: Implement a High Level Parallel API - Richard Ning

■ Better suited to divide-and-conquer problems

■ Balancing: a work queue per thread, plus task stealing

■ Oversubscription and starvation: handled by configuring the thread number

■ Task division is static instead of dynamic; the task granularity isn't adjusted to the running conditions.

■ The task-dividing strategy comes from programmers, who need to design it themselves for each implementation scenario.

Page 24: JavaOne2013: Implement a High Level Parallel API - Richard Ning


New parallel API based on task scheduler


Page 25: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Initial status

■ Tasks are allocated equally

■ One thread per core

■ Every thread maintains its own task queue, which consists of its affiliated tasks

[Figure: a CPU with cores 1 to 4, a thread pool with threads 1 to 4, and task queues holding tasks 1 to 20 divided among the threads]

Page 26: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Unbalanced loading

[Figure: cores 1 to 4, threads 1 to 4, and task queues that now hold only tasks 2, 3, 4, 5, 10 and 15]

Page 27: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Balancing the load by task stealing and by adding new tasks, which may have a different task granularity.

[Figure: cores 1 to 4, threads 1 to 4, and task queues holding tasks 2, 3, 4, 5, 10 and 15 plus the new tasks 21 and 22]
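The stealing step can be illustrated with a deliberately simplified, single-threaded model: one deque per worker, where an idle worker takes a task from the tail of the fullest queue. This is not the scheduler from the talk. The names are hypothetical, the choice of victim and of which end to steal from are assumptions, and a real scheduler would need thread-safe deques.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Single-threaded illustration of the queue discipline only.
// All names are hypothetical.
class StealingDemo {

    // For each empty queue, moves one task over from the fullest queue
    // (if that queue has more than one task). Returns the number stolen.
    static int rebalance(List<Deque<Integer>> queues) {
        int stolen = 0;
        for (Deque<Integer> idle : queues) {
            if (!idle.isEmpty()) {
                continue;
            }
            Deque<Integer> victim = idle;
            for (Deque<Integer> q : queues) {
                if (q.size() > victim.size()) {
                    victim = q;
                }
            }
            if (victim.size() > 1) {
                // Assumption: the owner works from the head of its queue,
                // so the thief takes from the tail.
                idle.addLast(victim.pollLast());
                stolen++;
            }
        }
        return stolen;
    }

    public static void main(String[] args) {
        List<Deque<Integer>> queues = new ArrayList<>();
        for (int w = 0; w < 4; w++) {
            queues.add(new ArrayDeque<Integer>());
        }
        // Uneven load: worker 0 still has five tasks, worker 1 has one,
        // workers 2 and 3 are idle.
        for (int t = 1; t <= 5; t++) {
            queues.get(0).addLast(t);
        }
        queues.get(1).addLast(6);
        int stolen = rebalance(queues);
        System.out.println(stolen + " task(s) stolen, queues now: " + queues);
    }
}
```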

Page 28: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Parallel API with new working mechanism: concurrent_for

■ Range: the range of the data set, [0, n)

■ Strategy: the strategy for dividing the range: automatic, or static with a fixed granularity. In the automatic case, the task granularity may differ between tasks.

■ Task: the task that executes the same operation on every element of the range
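The transcript does not include the actual signature, so the following is only a guess at how the three parameters could fit together, built on the JDK's existing `ForkJoinPool` rather than the new task scheduler from the talk. `Parallel`, `Strategy`, `concurrentFor` and the "eight chunks per worker" rule for the automatic case are all assumptions; in particular, the automatic strategy here picks one grain size up front instead of adapting to load while running.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.function.IntConsumer;

// Hypothetical API shape; the real signatures are not in the transcript.
class Parallel {

    // Strategy for dividing [begin, end): automatic, or static with a fixed grain.
    static final class Strategy {
        final int grain; // 0 means "automatic"

        private Strategy(int grain) {
            this.grain = grain;
        }

        static Strategy automatic() {
            return new Strategy(0);
        }

        static Strategy fixed(int grain) {
            if (grain < 1) {
                throw new IllegalArgumentException("grain must be >= 1");
            }
            return new Strategy(grain);
        }
    }

    // Splits its range in half until a piece is no larger than the grain.
    private static final class RangeAction extends RecursiveAction {
        private final int begin;
        private final int end;
        private final int grain;
        private final IntConsumer task;

        RangeAction(int begin, int end, int grain, IntConsumer task) {
            this.begin = begin;
            this.end = end;
            this.grain = grain;
            this.task = task;
        }

        @Override
        protected void compute() {
            if (end - begin <= grain) {
                for (int i = begin; i < end; i++) {
                    task.accept(i);
                }
                return;
            }
            int mid = begin + (end - begin) / 2;
            invokeAll(new RangeAction(begin, mid, grain, task),
                      new RangeAction(mid, end, grain, task));
        }
    }

    static void concurrentFor(int begin, int end, Strategy strategy, IntConsumer task) {
        ForkJoinPool pool = ForkJoinPool.commonPool();
        int grain = strategy.grain;
        if (grain == 0) {
            // Assumed heuristic: several chunks per worker, so that idle
            // workers have something to steal.
            int workers = Math.max(1, pool.getParallelism());
            grain = Math.max(1, (end - begin) / (workers * 8));
        }
        pool.invoke(new RangeAction(begin, end, grain, task));
    }

    public static void main(String[] args) {
        int[] squares = new int[16];
        concurrentFor(0, squares.length, Strategy.automatic(), i -> squares[i] = i * i);
        System.out.println(java.util.Arrays.toString(squares));
    }
}
```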

Page 29: JavaOne2013: Implement a High Level Parallel API - Richard Ning


Page 30: JavaOne2013: Implement a High Level Parallel API - Richard Ning


Page 31: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Other high-level parallel API

■ concurrent_while: can add to the data set while executing on it concurrently.

■ concurrent_reduce: uses a divide_join based task to return a calculation result.

■ concurrentsort: sorts a data set concurrently.

■ Math calculation: for example, multiplying one matrix by another:

int[5][10] matrix1, int[10][5] matrix2

int[5][5] matrix3 = matrix1 * matrix2

int[5][5] matrix3 = concurrent_multiply(matrix1, matrix2)
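As a hedged illustration of the matrix example, the sketch below computes each row of the product independently, using a parallel `IntStream` from the standard library as a stand-in for whatever the proposed concurrent_multiply would do internally. The class and method names are hypothetical.

```java
import java.util.stream.IntStream;

// Hypothetical stand-in for concurrent_multiply from the slide.
class MatrixOps {

    // Multiplies an (m x k) matrix by a (k x n) matrix, one output row per task.
    static int[][] concurrentMultiply(int[][] a, int[][] b) {
        if (a.length == 0 || b.length == 0 || a[0].length != b.length) {
            throw new IllegalArgumentException("incompatible matrix dimensions");
        }
        int rows = a.length;
        int inner = b.length;
        int cols = b[0].length;
        int[][] c = new int[rows][cols];
        // Each output row depends on one row of a and all of b, and no two
        // rows write to the same cells, so rows can be computed in parallel.
        IntStream.range(0, rows).parallel().forEach(i -> {
            for (int j = 0; j < cols; j++) {
                int sum = 0;
                for (int k = 0; k < inner; k++) {
                    sum += a[i][k] * b[k][j];
                }
                c[i][j] = sum;
            }
        });
        return c;
    }

    public static void main(String[] args) {
        // Same shapes as on the slide: 5x10 times 10x5 gives 5x5.
        int[][] matrix1 = new int[5][10];
        int[][] matrix2 = new int[10][5];
        for (int[] row : matrix1) {
            java.util.Arrays.fill(row, 1);
        }
        for (int[] row : matrix2) {
            java.util.Arrays.fill(row, 2);
        }
        int[][] matrix3 = concurrentMultiply(matrix1, matrix2);
        System.out.println(java.util.Arrays.deepToString(matrix3));
    }
}
```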

Page 32: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Anyway, we can always achieve a performance improvement through parallel computing on multi-cores.

Page 33: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Roadmap

■ Implement a high-level parallel API in the JDK based on the new task scheduler

– Correct

– Portable

– High performance

– Scalable

Page 34: JavaOne2013: Implement a High Level Parallel API - Richard Ning

Review of Objectives

■ Now that you’ve completed this session, you are able to:

– Understand the design of the new task-based parallel API

– Understand what parallel computing is and what it is good for

Page 35: JavaOne2013: Implement a High Level Parallel API - Richard Ning


Q & A


Page 36: JavaOne2013: Implement a High Level Parallel API - Richard Ning


Thanks!
