Applying List Scheduling Algorithms in a Multithreaded Execution Environment

C. A. S. Camargo, G. G. H. Cavalheiro, M. L. Pilla, S. A. C. Cavalheiro, L. Foss

Universidade Federal de Pelotas (BRA)
LUPS – Laboratory of Ubiquitous and Parallel Systems


HPCS 2005



Overview

•  Introduction

•  List Scheduling Algorithms

•  Anahy Multithreaded Execution Model
–  Programming interface
–  Scheduling strategy

•  Analysis of the scheduling strategy
–  Transforming parallel program representations

•  Concluding Remarks

HPCLatam’12


Introduction

[Figure: a single program mapped onto Sequential, SMP, Cluster and NOW architectures]

•  Performance portability

The concurrency of an application can be described regardless of hardware resources


Concurrency >> Parallelism


Introduction

Performance portability

•  Our approach:
–  Dissociate programming from execution

•  Our proposal:
–

•  Our mechanisms:
–  List scheduling and dataflow control at run time


Introduction

•  Many multithreaded runtime environments use list scheduling strategies with good practical results
–  Cilk, OpenMP, Anahy

•  We want to evaluate the theoretical efficiency of greedy list scheduling in multithreaded environments

We built an algorithm that derives DCGs from DAGs, so that we could compare the results of dynamic multithreaded scheduling with static, task-based scheduling for the same programs
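For reference, the classical bound usually cited for greedy list scheduling on p identical processors (Graham) can be written as below. The slides do not state it explicitly, and the notation is an assumption of this note: T_1 is the total work, T_\infty the critical-path length, T_p the length of the list schedule, and T_opt the optimal length on p processors. Communication costs are ignored.

```latex
T_p \;\le\; \frac{T_1}{p} + \left(1 - \frac{1}{p}\right) T_\infty
    \;\le\; \left(2 - \frac{1}{p}\right) T_{\mathrm{opt}}
```

The second inequality follows because T_opt can be no smaller than either T_1 / p or T_\infty.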


List Scheduling

•  Program described as a DAG
•  Task is the scheduling unit
•  A task defines a sequence of instructions and two sets of data: input and output data
•  Tasks are assigned priorities and ordered in a list that is consulted at each scheduling event

[Figure: example task DAG, nodes labeled task/cost: T1/4, T2/1, T3/2, T4/4, T5/5, T6/10, T7/10]


knowing the critical path is paramount
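A minimal sketch of greedy list scheduling with critical-path priorities may help make the slide concrete. The task costs are the ones shown in the figure, but the edges of the DAG did not survive extraction, so `SUCC` below is a hypothetical dependency structure chosen only for illustration. The function names `bottom_levels` and `list_schedule` are likewise made up for this sketch.

```python
import heapq

# Task costs as shown on the slide; the edges are HYPOTHETICAL.
COST = {"T1": 4, "T2": 1, "T3": 2, "T4": 4, "T5": 5, "T6": 10, "T7": 10}
SUCC = {
    "T1": ["T3", "T4"], "T2": ["T3"], "T3": ["T5", "T6"],
    "T4": ["T7"], "T5": ["T7"], "T6": ["T7"], "T7": [],
}


def bottom_levels(cost, succ):
    """Longest cost-weighted path from each task to an exit task."""
    level = {}

    def visit(t):
        if t not in level:
            level[t] = cost[t] + max((visit(s) for s in succ[t]), default=0)
        return level[t]

    for t in cost:
        visit(t)
    return level


def list_schedule(cost, succ, processors):
    """Greedy list scheduling; returns (makespan, start_times)."""
    level = bottom_levels(cost, succ)
    pending = {t: 0 for t in cost}      # unfinished predecessors per task
    for t in cost:
        for s in succ[t]:
            pending[s] += 1
    # Priority list: the task with the longest remaining path goes first.
    ready = [(-level[t], t) for t in cost if pending[t] == 0]
    heapq.heapify(ready)
    running = []                        # heap of (finish_time, task)
    start, now, idle = {}, 0, processors
    while ready or running:
        # Scheduling event: fill idle processors from the priority list.
        while ready and idle > 0:
            _, t = heapq.heappop(ready)
            start[t] = now
            heapq.heappush(running, (now + cost[t], t))
            idle -= 1
        # Advance to the next completion time and release successors.
        now = running[0][0]
        while running and running[0][0] == now:
            _, done = heapq.heappop(running)
            idle += 1
            for s in succ[done]:
                pending[s] -= 1
                if pending[s] == 0:
                    heapq.heappush(ready, (-level[s], s))
    return now, start
```

With these assumed edges, the critical path is T1, T3, T6, T7 (length 26), and calling `list_schedule(COST, SUCC, 2)` yields a schedule of that same length, which is why knowing the critical path matters for the priority order.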


Environment

[Figure: Anahy layered architecture. API (programming interface); applicative scheduling (performance portability); execution pool (multithreading) and communication (active messages), the HW/OS-dependent modules; operating system and hardware (generic architecture)]


Anahy

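Programming Interface

void foo(In x) {
    res = computes(x);
}

void bar(In p) {
    /* Task_A: code before the first create */
    t1 = create(foo, a);
    /* Task_B: code between the two creates */
    t2 = create(foo, b);
    ...
    join(t1, r1);
    /* Task_C: code between the two joins */
    join(t2, r2);
    /* Task_D: code after the last join */
}

[Figure: detailed DCG of the program above]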
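The create/join pattern on this slide can be imitated with a thread pool. The sketch below is not Anahy's API: `create` and `join` are stand-ins built on Python's `concurrent.futures`, `join` returns the result instead of filling an output parameter, and the body of `foo` is a placeholder for `computes(x)`.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in execution pool; NOT the Anahy runtime.
_pool = ThreadPoolExecutor(max_workers=2)


def create(func, arg):
    """Stand-in for create(): start func(arg) and return a handle."""
    return _pool.submit(func, arg)


def join(handle):
    """Stand-in for join(): wait for the thread and return its result."""
    return handle.result()


def foo(x):
    return x * x          # placeholder for computes(x)


def bar(a, b):
    # Task_A: code before the first create
    t1 = create(foo, a)
    # Task_B: code between the two creates
    t2 = create(foo, b)
    r1 = join(t1)
    # Task_C: code between the two joins
    r2 = join(t2)
    # Task_D: code after the last join
    return r1 + r2
```

Each create or join call ends one task of `bar` and begins the next, which is how a single thread decomposes into the sequence Task_A to Task_D in the detailed graph.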


Anahy

Just the thread level (what the scheduler sees)

[Figure: the detailed DCG reduced to thread level, showing the threads foo(), foo() and bar()]


Anahy

Scheduling strategy
–  A list of ready threads, ordered by priority
–  Prioritize threads in the critical path
   •  In the default strategy, the closer a thread is to the root of the DCG, the higher its priority
   •  If more than one thread is at the same level in the DCG, the oldest ready thread has higher priority (for multiple create, ties are broken randomly)
–  No migration, no task preemption
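The default priority rule can be sketched as a heap keyed on (level, age). This is an illustration only: the class `ReadyList` and its methods are hypothetical names, level 0 is assumed to be the root of the DCG, and ties are broken deterministically by arrival order rather than randomly as the slide describes for multiple creates.

```python
import heapq
import itertools


class ReadyList:
    """Ready threads ordered by (DCG level, age); the smallest tuple is served first."""

    def __init__(self):
        self._heap = []
        self._age = itertools.count()   # increasing arrival stamp

    def push(self, thread_id, level):
        # Assumption: level 0 is the DCG root; deeper threads have larger levels.
        heapq.heappush(self._heap, (level, next(self._age), thread_id))

    def pop(self):
        # Closest to the root first; among equals, the oldest ready thread.
        _, _, thread_id = heapq.heappop(self._heap)
        return thread_id

    def __len__(self):
        return len(self._heap)
```

For example, after pushing "foo#1" and "foo#2" at level 1 and then "bar" at level 0, successive `pop()` calls return "bar", "foo#1", "foo#2".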


Analysis of the scheduling strategy

•  DAGs from nine of Graham’s (1976) case studies were transformed into DCGs

•  The resulting DCGs were scheduled according to Anahy’s strategy

•  Schedule lengths were compared with the ones shown by Graham (optimal and non-optimal schedules)


Analysis of the scheduling strategy

Two-step transformation

•  Pre-processing
–  Identify input and output tasks in the DAG and insert them into threads in a proper way
–  Group sequences of tasks that do not correspond to a call to the create or join primitives in a multithreaded program

•  Iterative processing
–  Breadth-first analysis of the DAG
–  Edges visited from left to right
–  Heuristics to solve conflicts
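One plausible reading of the grouping step in pre-processing is a merge of linear chains: consecutive tasks with no fork or synchronisation between them would never be separated by a create or join, so they can be treated as one unit. The slides do not give the exact rule, so the function `group_chains` below and its merge criterion are assumptions made for illustration.

```python
def group_chains(succ):
    """Merge maximal linear chains of a DAG into groups of tasks.

    A task joins its predecessor's group when the predecessor has exactly
    one successor and the task has exactly one predecessor, i.e. no fork
    (create) or synchronisation (join) separates them.
    """
    pred = {t: [] for t in succ}
    for t, outs in succ.items():
        for s in outs:
            pred[s].append(t)

    groups = []
    for t in succ:
        # Skip tasks that merely continue a chain started earlier.
        if len(pred[t]) == 1 and len(succ[pred[t][0]]) == 1:
            continue
        chain = [t]
        while len(succ[chain[-1]]) == 1 and len(pred[succ[chain[-1]][0]]) == 1:
            chain.append(succ[chain[-1]][0])
        groups.append(chain)
    return groups
```

On a hypothetical DAG where A precedes B, B forks to C and D, both feed E, and E precedes F, the groups are [A, B], [C], [D] and [E, F].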


Iterative Processing

[Figure: Transforming DAGs into DCGs]


Iterative Processing

[Figure: Transforming DAGs into DCGs (continued)]


Analysis of the scheduling strategy

Schedule lengths

•  4 optimal schedules

•  2 good schedules
   •  Same length as Graham’s Critical Path heuristic

•  3 bad schedules
   •  But when we added some dependencies to the graph, we obtained optimal schedules


Concluding remarks

•  Conclusions
–  We developed a graph grammar that can successfully map DAG programs to multithreaded applications
–  Anahy’s scheduling strategy can provide a dynamic schedule as efficient as static list strategies, preserving the time bounds
   •  However, the programmer has to be aware of the scheduling policy to take advantage of the runtime environment

•  Future work
–  Analyse the scheduling assuming NUMA architecture models
–  Improve thread priority strategies using attributes derived from DAG level
