
Page 1

CSE539: Advanced Computer Architecture

PART-IV

{Chapter 10, 11}

Software for Parallel Programming Book: “Advanced Computer Architecture – Parallelism, Scalability, Programmability”, Hwang & Jotwani

Sumit Mittu

Assistant Professor, CSE/IT

Lovely Professional University

[email protected]

Page 2

In this chapter…

• Chapter 10

o Parallel Programming Models

o Parallel Languages and Compilers

• Chapter 11

o Parallel Programming Environments

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 2

Page 3

PARALLEL PROGRAMMING MODELS

• Programming Model

o A collection of program abstractions providing a simplified and transparent view of the computer hardware/software system to the programmer.

• Parallel Programming Models

o Specifically designed for:

• Multiprocessor, Multicomputer and Vector/SIMD Computer

o Basic Models

• Shared-Variable Model

• Message-Passing Model

• Data-Parallel Model

• Object-Oriented Model*

• Functional and Logic Models*

Page 4

PARALLEL PROGRAMMING MODELS

• Resources in Programming Systems

o ACTIVE RESOURCES – processors

o PASSIVE RESOURCES – memory and I/O devices

• Processes

o A process is the basic computational unit in a parallel program

o A program is a collection of processes

o Parallelism depends on implementation of IPC (inter-process communication)

• Fundamental issues in parallel programming

o Specification, creation, suspension, reactivation, migration, termination and synchronization of concurrent processes

o Process address space may be shared or restricted by limiting scope and access rights

Page 5

PARALLEL PROGRAMMING MODELS

Shared-Variable Model

• Shared Variable Communication

o Shared variables and mutual exclusion

o Main Issues:

• Protected access of critical sections

• Memory Consistency

• Atomicity of Memory Operations

• Fast synchronization

• Shared data structures

• Fast data movement techniques

o Critical Section (CS)

• Code segments accessing shared variables with atomic operations

• Requirements – Mutual Exclusion, No Deadlock in waiting, Non-preemption, Eventual Entry
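The mutual-exclusion requirement above can be sketched in a few lines. Python's `threading` module is used here purely as an illustration (the book's examples are not in Python); the lock-protected block is the critical section.

```python
import threading

counter = 0                 # shared variable
lock = threading.Lock()     # guards the critical section

def worker(n):
    global counter
    for _ in range(n):
        with lock:          # CS entry: only one thread at a time
            counter += 1    # critical section: update shared state

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000 -- no updates are lost
```

Note how the CS boundary is kept as small as possible (a single increment): a larger CS would serialize more work and limit parallelism, exactly the trade-off discussed on the next slide.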

Page 6

PARALLEL PROGRAMMING MODELS

Shared-Variable Model

• Shared Variable Communication

o Protected access using CS

• CS boundary

o Too large…may limit parallelism

o Too small…may add unnecessary code complexity and software overhead

o Shorten a heavy-duty CS or use conditional CSs to maintain balanced performance

• Binary and Counting Semaphores

• Monitors

• Operational modes used in programming multiprocessor systems

o Multiprogramming, Multiprocessing, Multitasking, Multithreading

o Level of

Page 7

PARALLEL PROGRAMMING MODELS

Shared-Variable Model

• Partitioning and Replication

o Program Partitioning

o Program Replication

o Role of programmer and compiler in partitioning and replication

• Scheduling and Synchronization

o Static Scheduling

o Dynamic Scheduling

• Cache Coherence and Protection

o Multiprocessor Cache Coherence

o Sequential Consistency Model

o Strong and Weak Consistency Models

Page 8

PARALLEL PROGRAMMING MODELS

Message-Passing Model

• Messages may be:

o Instructions, Data, Synchronization Signals or Interrupt Signals

• Communication delays caused by message passing are greater than those caused by accessing shared variables

• Message-Passing Models

o Synchronous Message Passing

o Asynchronous Message Passing

• Critical issue in programming this model

o How to distribute or duplicate program code and data sets over processing nodes?

• Distributing the computations!

o Trade-offs between computation time and communication overhead to be considered

Page 9

PARALLEL PROGRAMMING MODELS

Message-Passing Model

• Synchronous Message Passing

o Sender and receiver processes are synchronized in time and space

o No need for mutual exclusion

o No buffers required

o Blocking communication scheme

o Uniform Communication Delays, in general

• Asynchronous Message Passing

o Sender and receiver processes need not be synchronized in time and space

o Often uses buffers in channels

o Non-blocking communication scheme

o Non-uniform or arbitrary communication delays
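The two styles above can be contrasted with a small sketch. Python's `queue.Queue` stands in for a communication channel (an illustration, not the book's notation): an unbounded queue models the buffered asynchronous channel, while a size-1 queue approximates the blocking behaviour of synchronous passing (a true synchronous channel has no buffer at all).

```python
import queue
import threading

# Asynchronous channel: buffered, the sender never waits for the receiver.
async_ch = queue.Queue()          # unbounded buffer
async_ch.put("msg1")              # non-blocking send
async_ch.put("msg2")

# Synchronous-style channel: the single slot makes a second send block
# until the receiver has drained the previous message.
sync_ch = queue.Queue(maxsize=1)

received = []
def receiver():
    for _ in range(2):
        received.append(sync_ch.get())   # blocking receive

t = threading.Thread(target=receiver)
t.start()
sync_ch.put("a")   # blocks if the slot is still occupied
sync_ch.put("b")
t.join()
print(received)    # ['a', 'b'] -- messages arrive in order
```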

Page 10

PARALLEL PROGRAMMING MODELS

Data-Parallel Model

• SIMD Lockstep operations

• Parallelism explicitly handled by hardware synchronization and flow control

• Synchronization done at compile time rather than at run time

• Data parallel programs require the use of pre-distributed data sets

• Choice of parallel data structure is an important consideration

• Emphasis on local computation and data routing operations

• Applicable to fine-grain problems

• Implemented either on SIMD machines or in SPMD mode

• Leads to high degree of parallelism involving thousands of data operations concurrently
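A minimal SPMD-style sketch of the model above, in Python for illustration: the data set is pre-distributed into per-PE chunks, every "PE" runs the same local computation, and a final data-routing (gather) step combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

# Pre-distributed data set: each of 4 "PEs" owns one chunk.
data = list(range(12))
chunks = [data[i::4] for i in range(4)]

def local_compute(chunk):
    # Same program on every PE, applied to its local data (SPMD).
    return [x * x for x in chunk]

with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(local_compute, chunks))

# Data-routing step: gather the partial results back together.
result = sorted(x for part in partials for x in part)
print(result[:5])  # [0, 1, 4, 9, 16]
```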

Page 11

PARALLEL PROGRAMMING MODELS

Data-Parallel Model

• Data Parallelism

o Illiac-IV processing model

o Connection Machine CM-2 processing model

o Synchronous SIMD programming vs. asynchronous MIMD programming

o SIMD Characterization for data parallelism

• Scalar operations and scalar data operands

• Vector operations and vector data operands

• Constant data operands

• Masking operations

• Data-routing operations

Page 12

PARALLEL PROGRAMMING MODELS

Data-Parallel Model

• Array Language Extensions for data-parallel processing

o Languages:

• Fortran 90 array notation

• CFD for Illiac IV

• DAP Fortran for AMT Distributed Array Processor (DAP)

• C* for TMC Connection Machine

• MPF for MasPar family of MPPs

• Actus for SIMD programming

o Expected Characteristics

• Global Address Space

• Explicit data routing among PEs

• Number of PEs is a function of problem size rather than machine size

Page 13

PARALLEL PROGRAMMING MODELS

Data-Parallel Model

• Compiler support for data-parallel programming

o Array language expressions and their optimizing compilers must be embedded in familiar standards such as Fortran and C, with the aim to:

• Unify program execution model

• Facilitate precise control of massively parallel hardware

• Enable incremental migration to data parallel execution

o Array Sectioning

• Allows a programmer to reference a section (region) of a multidimensional array by specifying a:

o Start Index

o Bound (or upper limit)

o Stride (or step-size)

• Vector-valued subscripts are often used to construct arrays from arbitrary permutations of another array
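The start/bound/stride triplet maps directly onto slicing in many languages. The sketch below uses Python lists as an illustration (mind the off-by-one: Fortran 90 sections are 1-based with an inclusive bound, Python slices are 0-based with an exclusive stop), and shows a vector-valued subscript building a permutation of another array.

```python
a = list(range(10, 110, 10))   # [10, 20, ..., 100]

# Fortran section A(2:8:2) -- elements 2, 4, 6, 8 (1-based, inclusive) --
# corresponds to the Python slice a[1:8:2] (0-based, exclusive stop).
section = a[1:8:2]
print(section)                 # [20, 40, 60, 80]

# Vector-valued subscript: construct a new array from an arbitrary
# permutation of another array.
perm = [3, 0, 2, 1]
b = [a[i] for i in perm]
print(b)                       # [40, 10, 30, 20]
```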

Page 14

PARALLEL LANGUAGES AND COMPILERS

Language Features for Parallelism

• Chang and Smith (1990) classified language features for parallel programming into 6 classes according to functionality:

o Optimization features

o Availability features

o Synchronization/Communication features

o Parallelism Control features

o Data Parallelism features

o Process Management features

• In practice, real languages may incorporate some, all, or none of these features.

o The features act as guidelines for a user-friendly and efficient programming environment

o Compiler support, OS assistance, and integration with the existing environment are required

Page 15

PARALLEL LANGUAGES AND COMPILERS

Language Features for Parallelism

• Optimization features

o Used for program restructuring and compilation directives in converting sequentially coded programs into parallel forms

o Purpose: to match software parallelism with hardware parallelism in target machine

o Automated parallelizer

• Express C, Alliant FX Fortran compiler

o Semi-automated parallelizer

• DINO

o Interactive restructuring support

• MIMDizer from Pacific Sierra

Page 16

PARALLEL LANGUAGES AND COMPILERS

Language Features for Parallelism

• Availability features

o Purpose:

• Enhance user-friendliness

• Make the language portable to a large class of parallel computers

• Expand applicability of software libraries

o Scalability

• in terms of no. of processors available

o Independence

• From hardware topology

o Compatibility

• With an established sequential language

o Portability

• across shared-memory and message-passing computers

Page 17

PARALLEL LANGUAGES AND COMPILERS

Language Features for Parallelism

• Synchronization/Communication features

o Single assignment languages

o Shared variables (locks) for IPC

o Logically shared memory (e.g. tuple space in Linda)

o Send/receive primitives for message passing

o Remote procedure call

o Data flow languages (e.g. ID)

o Support for Barriers, mailbox, semaphores and monitors
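Of the primitives listed above, the barrier is the easiest to demonstrate. A Python sketch, illustrative only: no thread may start its second phase until all three have finished their first, which is exactly the guarantee a barrier provides.

```python
import threading

results = []                      # list.append is safe under the GIL
barrier = threading.Barrier(3)    # all 3 must arrive before any proceeds

def phase_worker(wid):
    results.append(("phase1", wid))
    barrier.wait()                # barrier synchronization point
    results.append(("phase2", wid))

threads = [threading.Thread(target=phase_worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every phase-1 record precedes every phase-2 record.
print(all(p == "phase1" for p, _ in results[:3]))  # True
```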

Page 18

PARALLEL LANGUAGES AND COMPILERS

Language Features for Parallelism

• Parallelism Control features

o Coarse, medium or fine grain parallelism

o Explicit or Implicit parallelism

o Global parallelism

o Loop parallelism

o Task-split parallelism

o Shared task queue

o Divide-and-conquer paradigm

o Shared abstract data types

o Task dependency specification

Page 19

PARALLEL LANGUAGES AND COMPILERS

Language Features for Parallelism

• Data Parallelism features

o Purpose:

• How data are accessed and distributed in SIMD or MIMD computers

o Runtime automatic Decomposition

• Express C

o Mapping Specification

• DINO

o Virtual Processor Support

• PISCES 2 and DINO

o Direct Access to Shared Data

• Linda

o SPMD

• DINO and Hypertasking

Page 20

PARALLEL LANGUAGES AND COMPILERS

Language Features for Parallelism

• Process Management features

o Purpose:

• Efficient creation of parallel processes

• Implement multithreading or multitasking

• Program Partitioning and Replication

• Dynamic Load balancing at runtime

o Dynamic Process creation

o LWP (threads)

o Replicated Workers

o Partitioned networks

o Automatic Load Balancing
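The replicated-workers pattern with a shared task queue combines several of the features listed above (LWPs, replicated workers, automatic load balancing). A Python sketch, for illustration: identical workers pull tasks from one queue, so faster workers naturally take on more work.

```python
import queue
import threading

tasks = queue.Queue()
for n in range(20):
    tasks.put(n)

results = []
res_lock = threading.Lock()

def worker():
    # Replicated worker: same code in every thread; pulling from the
    # shared queue gives dynamic load balancing for free.
    while True:
        try:
            n = tasks.get_nowait()
        except queue.Empty:
            return
        with res_lock:
            results.append(n * n)

workers = [threading.Thread(target=worker) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(sorted(results) == [n * n for n in range(20)])  # True
```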

Page 21

PARALLEL LANGUAGES AND COMPILERS

Parallel Language Constructs

• Fortran 90 Array Notations

o Lower Bound : Upper Bound : Stride

• E1 : E2 : E3

• E1: E2

• E1: * : E3

• E1 : *

• E1

• *

• Parallel Flow Control

o Doall – Endall or Forall – Endall

o Doacross – Endacross or ParDo – ParEnd

o Cobegin – Coend or ParBegin – ParEnd

o Fork – Join
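A Doall loop (and the implicit join of Fork–Join) can be sketched as follows; Python's `concurrent.futures` stands in for the language constructs above, purely as an illustration. Because the loop iterations are independent, they may all run in parallel; leaving the `with` block is the join.

```python
from concurrent.futures import ThreadPoolExecutor

def body(i):
    # Independent loop body: iteration i touches no other iteration's data.
    return 2 * i

with ThreadPoolExecutor() as pool:        # fork: spawn the iterations
    out = list(pool.map(body, range(8)))  # Doall: iterations may run in parallel
                                          # join: implicit on leaving the block
print(out)  # [0, 2, 4, 6, 8, 10, 12, 14]
```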

Page 22

PARALLEL LANGUAGES AND COMPILERS

Optimizing Compilers for Parallelism

• Major phases of a parallelizing compiler

o Phase I: Flow Analysis

• Data Dependence

• Control Dependence

• Reuse Analysis

o Phase II : Program Optimizations

• Vectorization

• Parallelization

• Locality

• Pipelining

o Phase III : Parallel Code Generation

• Granularity

• Degree of Parallelism

• Code Scheduling

[Figure: three compiler phases – Flow Analysis → Program Optimizations → Parallel Code Generation]
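The point of Phase I (flow analysis) is to decide which loops are safe to transform. A small Python illustration of the data-dependence distinction the compiler must make: the first loop has independent iterations and could be vectorized or parallelized; the second has a loop-carried flow dependence (iteration i reads what iteration i-1 wrote) and must stay sequential as written.

```python
a = [1, 2, 3, 4]

# No dependence between iterations: parallelizable / vectorizable.
b = [x * 2 for x in a]

# Loop-carried flow dependence: s[i] reads s[i-1], a sequential chain.
s = [a[0]]
for i in range(1, len(a)):
    s.append(s[i - 1] + a[i])   # prefix sum

print(b, s)  # [2, 4, 6, 8] [1, 3, 6, 10]
```

(A parallelizing compiler can still handle the prefix sum via a scan recurrence, but not by naively running the iterations concurrently.)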

Page 23

PARALLEL LANGUAGES AND COMPILERS

Optimizing Compilers for Parallelism

• Parallelizing Compilers

o Parafrase [1984]

• David Kuck, University of Illinois

• Vectorization and parallelization of Fortran 77 programs

o Parafrase 2

• David Kuck, University of Illinois

• Parallelization of C or Pascal programs

o KAP vectorizer

• Kuck and Associates; based on Parafrase 2

o PFC, PFC+ and Parascope

• Allen and Kennedy (1984)

o Other compilers

• PTRAN, Alliant FX/F, Convex Vectorizing compiler, Cray CFT compiler, VAST vectorizer, etc.

Page 24

PARALLEL PROGRAMMING ENVIRONMENTS

• Constituents of an environment for parallel programming

o Hardware Platforms or Machine Models

o Languages Supported

o OS and Software Tools

o Application Packages

• Software Tools and Environments

o Parallel Languages

o Integrated Environment Components

• Editor

• Debugger

• Performance Monitors

• Program Visualizers

Page 25

PARALLEL PROGRAMMING ENVIRONMENTS

• Fig 11.1

Page 26

PARALLEL PROGRAMMING ENVIRONMENTS

• Types of Integrated Environment

o Basic: Provides simple tools for

• Program tracing facility for debugging and performance monitoring

• Graphic mechanism for specifying dependence graphs

o Limited: Provides tools for

• Parallel Debugging

• Performance monitoring

• Program visualization beyond capability of basic environments

o Well-developed: Provides intensive tools for

• Debugging programs

• Interaction of textual/graphical representations of parallel program

• Visualization support for performance monitoring, program visualization, parallel I/O, etc.

Page 27

PARALLEL PROGRAMMING ENVIRONMENTS

• Environment Features

o Design Issues:

• Compatibility

• Expressiveness

• Ease of use

• Efficiency

• Portability

o Parallel languages may be developed as extensions to existing sequential languages

o A new parallel programming language has the advantage of:

• Using high-level parallel concepts or constructs for parallelism instead of imperative (algorithmic) languages, which are inherently sequential

o Special optimizing compilers detect parallelism and transform sequential constructs into parallel ones.

Page 28

PARALLEL PROGRAMMING ENVIRONMENTS

• Environment Features

o Special optimizing compilers detect parallelism and transform sequential constructs into parallel ones.

o High Level parallel constructs can be:

• implicitly embedded in syntax, or

• explicitly specified by users

o Compiler Approaches

• Pre-processors:

o use compiler directives or macroprocessors

• Pre-compilers:

o include automated and semi-automated parallelizers

• Parallelizing Compilers

Page 29

PARALLEL PROGRAMMING ENVIRONMENTS

• Fig 11.2

Page 30

PARALLEL PROGRAMMING ENVIRONMENTS

• Summary of Important Environment Features

o Control flow graph generation

o Integrated textual/graphical map

o Parallel debugger at source code level

o Performance Monitoring by either software or hardware means

o Performance Prediction Model

o Program visualizer for displaying program structures and data flow patterns

o Parallel I/O for fast data movement

o Visualization support for program development and guidance for parallel computations

o OS support for parallelism in front-end or back-end environments

o Communication support in a network environment

• Refer to Table 11.1 for attributes of Representative Parallel Programming Tools

Page 31

PARALLEL PROGRAMMING ENVIRONMENTS

Case Studies

• Cray Y-MP Software

• Intel Paragon XP/S Software

o Refer to Table 11.2 for its characteristic attributes

• CM-5 Software

o Refer to Figure 11.3 for the software layers of the Connection Machine system

Page 32

PARALLEL PROGRAMMING ENVIRONMENTS

Case Studies

Page 33

That’s All Folks! Do your best in the exams!
