CS-492: Distributed Systems & Parallel Processing. Lecture 7: Sun 15/5/1435. Foundations of designing parallel algorithms and shared memory models. Lecturer: Kawther Abas, [email protected]




Page 1:

CS-492: Distributed Systems & Parallel Processing

Lecture 7: Sun 15/5/1435

Foundations of designing parallel algorithms

and shared memory models

Lecturer: Kawther Abas

[email protected]

Page 2:

Parallelism

• Parallelism is a set of activities that occur at the same time.

Why do we need parallelism?

• Faster, of course
  – Finish the work earlier: same work in less time
  – Do more work: more work in the same time

Page 3:

Parallel Processing

• Parallel processing is the ability to carry out multiple operations or tasks simultaneously. 

Parallel computing 

• Parallel computing is a form of computation in which many calculations are carried out simultaneously.

Page 4:

What level of parallelism?

• Bit-level parallelism: 1970 to ~1985
  – 4-bit, 8-bit, 16-bit, 32-bit microprocessors

• Instruction-level parallelism (ILP): ~1985 through today
  – Pipelining
  – Superscalar
  – VLIW (very long instruction word)
  – Out-of-order execution
  – Limits to benefits of ILP?

• Process-level or thread-level parallelism: mainstream for general-purpose computing? (See the Pthreads sketch below.)
  – Servers are parallel
  – High-end desktop dual-processor PC
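
A minimal sketch of thread-level parallelism using Pthreads (one of the tools listed later in this lecture). The thread count, worker function, and printed messages are illustrative only; compile with -pthread.

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4                 /* illustrative thread count */

/* each thread executes this function on its own share of the work */
void *worker(void *arg) {
    long id = (long)arg;
    printf("thread %ld doing its share of the work\n", id);
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];
    for (long i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);   /* fork the threads */
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);                         /* wait for them all */
    return 0;
}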

Page 5:

Why Multiprocessors?

1. Microprocessors as the fastest CPUs
2. Complexity of current microprocessors
3. Slow (but steady) improvement in parallel software (scientific apps, databases, OS)
4. Emergence of embedded and server markets driving microprocessors

Page 6:

Classification of Parallel Processors

• SIMD – single instruction, multiple data
• MIMD – multiple instruction, multiple data
  1. Message-Passing Multiprocessor: interprocessor communication through explicit “send” and “receive” operations of messages over the network (a minimal MPI sketch follows below).
  2. Shared-Memory Multiprocessor: interprocessor communication by load and store operations to shared memory locations.
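
As a concrete sketch of the message-passing style, the fragment below assumes MPI (also listed later among the programming tools); the value, tag, and process ranks are illustrative. In a shared-memory multiprocessor the same exchange would be an ordinary store by one processor and a load by the other, plus synchronization.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        value = 42;                                           /* data to communicate */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* explicit send to rank 1 */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);                          /* explicit receive from rank 0 */
        printf("process 1 received %d over the network\n", value);
    }
    MPI_Finalize();
    return 0;
}

Run with at least two processes, for example: mpirun -np 2 ./a.out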

Page 7:

Concurrency

• Concurrency is the ability of multiple tasks to be in progress at the same time.

Page 8:

Designing parallel algorithms

Designing a parallel algorithm has two steps:
1. Task Decomposition
2. Parallel Processing

Page 9:

Task Decomposition

• Big idea
  – First decompose for message passing
  – Then decompose for the shared memory on each node

• Decomposition techniques
  – Recursive
  – Data
  – Exploratory
  – Speculative

Page 10:

Recursive Decomposition

• Good for problems which are amenable to a divide and conquer strategy

• Quicksort - a natural fit
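
A minimal sketch of recursive decomposition applied to quicksort, assuming OpenMP tasks; the sample array and the task cutoff are illustrative. Each recursive call on a sub-array becomes a task that another thread can pick up.

#include <stdio.h>

/* Lomuto partition: place the pivot in its final position and return that index */
static int partition(int a[], int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;
    return i;
}

/* divide and conquer: one half is handed to another thread as a task */
static void quicksort(int a[], int lo, int hi) {
    if (lo >= hi) return;
    int p = partition(a, lo, hi);
    #pragma omp task shared(a) if (hi - lo > 100)   /* cutoff avoids spawning tiny tasks */
    quicksort(a, lo, p - 1);
    quicksort(a, p + 1, hi);
    #pragma omp taskwait                            /* wait for the spawned task */
}

int main(void) {
    int a[] = {5, 2, 9, 1, 7, 3, 8, 6, 4, 0};
    int n = (int)(sizeof a / sizeof a[0]);
    #pragma omp parallel
    {
        #pragma omp single      /* one thread starts the recursion; tasks spread the work */
        quicksort(a, 0, n - 1);
    }
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}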

Page 11:

Data Decomposition

• Idea: partitioning of data leads to tasks

• Can partition:
  – Output data
  – Input data
  – Intermediate data
  – Whatever…
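
A minimal sketch of data decomposition, assuming OpenMP: partitioning the output array among threads turns each chunk of elements into an independent task. The arrays and their contents are illustrative.

#include <stdio.h>

#define N 1000

int main(void) {
    static double a[N], b[N], c[N];
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

    /* the output array c is partitioned: each thread owns a block of iterations */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[%d] = %f\n", N - 1, c[N - 1]);
    return 0;
}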

Page 12:

Exploratory Decomposition

• For search space type problems

• Partition search space into small parts

• Look for solution in each part
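
A minimal sketch of exploratory decomposition, assuming OpenMP; the "search space" here is just an array, and the target value is illustrative. The space is split into parts and each thread looks for the solution in its own part.

#include <stdio.h>

#define N 1000000

int main(void) {
    static int space[N];
    for (int i = 0; i < N; i++) space[i] = i * 7 % N;   /* illustrative search space */
    int target = 12345, found_at = -1;

    /* each thread explores its own partition of the search space */
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        if (space[i] == target) {
            #pragma omp critical
            found_at = i;       /* record the solution; other partitions searched in vain */
        }
    }

    printf("found at index %d\n", found_at);
    return 0;
}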

Page 13:

Speculative Decomposition

• Computation gambles at a branch point in the program

• Takes a path before it knows the result

• Win big, or the speculative work is wasted
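
A minimal sketch of speculative decomposition, assuming OpenMP sections; the branch condition and the two alternative computations are illustrative. Both paths after the branch point are evaluated while the condition is still being computed, and only the result of the path actually taken is kept.

#include <stdio.h>

/* illustrative: a slow branch condition and the two alternatives that follow it */
static int  slow_condition(void) { long s = 0; for (long i = 0; i < 50000000; i++) s += i; return s % 2 == 0; }
static long path_if_true(void)   { long s = 0; for (long i = 0; i < 50000000; i++) s += i; return s; }
static long path_if_false(void)  { long s = 1; for (long i = 1; i < 50000000; i++) s ^= i; return s; }

int main(void) {
    int take_true = 0;
    long result_true = 0, result_false = 0;

    /* the gamble: evaluate the condition and both paths at the same time */
    #pragma omp parallel sections
    {
        #pragma omp section
        take_true = slow_condition();
        #pragma omp section
        result_true = path_if_true();     /* speculative */
        #pragma omp section
        result_false = path_if_false();   /* speculative */
    }

    /* keep only the result of the path actually taken; the other work is wasted */
    printf("result = %ld\n", take_true ? result_true : result_false);
    return 0;
}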

Page 14:

Parallel Programming Models

• Data parallelism / Task parallelism
• Explicit parallelism / Implicit parallelism
• Shared memory / Distributed memory
• Other programming paradigms
  – Object-oriented
  – Functional and logic

Page 15:

Parallel Programming Models

Data Parallelism
Parallel programs that emphasize the concurrent execution of the same task on different data elements (data-parallel programs).

Task Parallelism
Parallel programs that emphasize the concurrent execution of different tasks on the same or different data.
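
A minimal sketch contrasting the two styles, assuming OpenMP; the array and the two tasks (sum and maximum) are illustrative.

#include <stdio.h>

#define N 8

int main(void) {
    int a[N], sum = 0, max = 0;

    /* data parallelism: the same task (squaring) applied to different data elements */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = i * i;

    /* task parallelism: different tasks (sum and maximum) on the same data */
    #pragma omp parallel sections
    {
        #pragma omp section
        for (int i = 0; i < N; i++) sum += a[i];
        #pragma omp section
        for (int i = 0; i < N; i++) if (a[i] > max) max = a[i];
    }

    printf("sum = %d, max = %d\n", sum, max);
    return 0;
}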

Page 16:

Parallel Programming Models

• Explicit Parallelism
The programmer directly specifies the activities of the multiple concurrent “threads of control” that form a parallel computation.

• Implicit Parallelism
The programmer provides only a high-level specification of program behavior; the compiler and runtime system decide how to execute it in parallel.
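
A minimal sketch of the difference on the same computation, assuming Pthreads for the explicit version and OpenMP for the implicit one; the arrays, the two-way work split, and the scale factor are illustrative.

#include <pthread.h>
#include <stdio.h>

#define N 1000
static double x[N], y[N];

/* explicit parallelism: the programmer creates threads and divides the work by hand */
static void *half(void *arg) {
    long id = (long)arg;                               /* thread 0 or thread 1 */
    for (int i = (int)(id * N / 2); i < (int)((id + 1) * N / 2); i++)
        y[i] = 2.0 * x[i];
    return NULL;
}

int main(void) {
    for (int i = 0; i < N; i++) x[i] = i;

    pthread_t t;
    pthread_create(&t, NULL, half, (void *)1);   /* second half in a new thread */
    half((void *)0);                             /* first half in the main thread */
    pthread_join(t, NULL);

    /* implicit parallelism: the same loop, specified at a high level;
       the compiler and runtime decide how to create and schedule threads */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        y[i] = 2.0 * x[i];

    printf("y[%d] = %f\n", N - 1, y[N - 1]);
    return 0;
}

Compile with something like gcc -fopenmp -pthread to enable both versions.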

Page 17:

Parallel Programming Models

• Shared Memory
The programmer’s task is to specify the activities of a set of processes that communicate by reading and writing shared memory.

• Distributed Memory
Processes have only local memory and must use some other mechanism to exchange information.

Page 18:

Parallel Programming Models

Parallel Programming Tools:

• Parallel Virtual Machine (PVM)
• Message-Passing Interface (MPI)
• PThreads
• OpenMP
• High-Performance Fortran (HPF)
• Parallelizing compilers

Page 19:

Shared Memory vs. Distributed Memory Programs

• Shared Memory Programming
  – Start a single process and fork threads.
  – Threads carry out work.
  – Threads communicate through shared memory.
  – Threads coordinate through synchronization (also through shared memory).

• Distributed Memory Programming
  – Start multiple processes on multiple systems.
  – Processes carry out work.
  – Processes communicate through message passing.
  – Processes coordinate either through message passing or synchronization (which generates messages).

Page 20:

Shared Memory

• Dynamic threads
  – The master thread waits for work, forks new threads, and when the threads are done they terminate.
  – Efficient use of resources, but thread creation and termination is time consuming.

• Static threads
  – A pool of threads is created and allocated work; the threads do not terminate until cleanup.
  – Better performance, but potential waste of system resources.
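
A minimal sketch of the static-thread (thread pool) approach, assuming Pthreads; the pool size, the work items, and the counter-based "queue" are illustrative. The pool is created once, the threads repeatedly pull work, and they terminate only at cleanup.

#include <pthread.h>
#include <stdio.h>

#define POOL_SIZE 4
#define NUM_ITEMS 20

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int next_item = 0;                 /* shared queue of work items 0..NUM_ITEMS-1 */

/* each pool thread repeatedly takes the next work item until none remain */
static void *pool_worker(void *arg) {
    long id = (long)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        int item = (next_item < NUM_ITEMS) ? next_item++ : -1;
        pthread_mutex_unlock(&lock);
        if (item < 0) break;              /* no more work: thread exits only at cleanup */
        printf("pool thread %ld handles item %d\n", id, item);
    }
    return NULL;
}

int main(void) {
    pthread_t pool[POOL_SIZE];
    /* create the pool once: no per-task thread creation/termination cost */
    for (long i = 0; i < POOL_SIZE; i++)
        pthread_create(&pool[i], NULL, pool_worker, (void *)i);
    /* cleanup: join the pool after all work items have been processed */
    for (int i = 0; i < POOL_SIZE; i++)
        pthread_join(pool[i], NULL);
    return 0;
}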