CS-492: Distributed Systems & Parallel Processing. Lecture 7: Sun 15/5/1435. Foundations of designing parallel algorithms and shared memory models. Lecturer: Kawther Abas, [email protected]




Page 1:

CS-492: Distributed Systems & Parallel Processing

Lecture 7: Sun 15/5/1435

Foundations of designing parallel algorithms

and shared memory models

Lecturer: Kawther Abas

[email protected]

Page 2:

Parallelism

• Parallelism is a set of activities that occur at the same time.

Why do we need parallelism?

• Faster, of course
  – Finish the work earlier: same work in less time
  – Do more work: more work in the same time

Page 3:

Parallel Processing

• Parallel processing is the ability to carry out multiple operations or tasks simultaneously. 

Parallel computing 

• Parallel computing is a form of computation in which many calculations are carried out simultaneously.

Page 4:

What level of parallelism?

• Bit-level parallelism: 1970 to ~1985
  – 4-bit, 8-bit, 16-bit, 32-bit microprocessors

• Instruction-level parallelism (ILP): ~1985 through today
  – Pipelining
  – Superscalar
  – VLIW (very long instruction word)
  – Out-of-order execution
  – Limits to benefits of ILP?

• Process-level or thread-level parallelism: mainstream for general-purpose computing? (See the Pthreads sketch below.)
  – Servers are parallel
  – High-end desktop dual-processor PC
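
A minimal sketch of thread-level parallelism using Pthreads (one of the tools listed later in this lecture). The thread count, worker function, and printed messages are illustrative only; compile with -pthread.

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4                 /* illustrative thread count */

/* each thread executes this function on its own share of the work */
void *worker(void *arg) {
    long id = (long)arg;
    printf("thread %ld doing its share of the work\n", id);
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];
    for (long i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);   /* fork the threads */
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);                         /* wait for them all */
    return 0;
}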

Page 5:

Why Multiprocessors?

1. Microprocessors as the fastest CPUs
2. Complexity of current microprocessors
3. Slow (but steady) improvement in parallel software (scientific apps, databases, OS)
4. Emergence of embedded and server markets driving microprocessors

Page 6:

Classification of Parallel Processors

• SIMD – single instruction, multiple data
• MIMD – multiple instruction, multiple data
  1. Message-Passing Multiprocessor: interprocessor communication through explicit “send” and “receive” operations of messages over the network (a minimal MPI sketch follows below).
  2. Shared-Memory Multiprocessor: interprocessor communication by load and store operations to shared memory locations.
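
As a concrete sketch of the message-passing style, the fragment below assumes MPI (also listed later among the programming tools); the value, tag, and process ranks are illustrative. In a shared-memory multiprocessor the same exchange would be an ordinary store by one processor and a load by the other, plus synchronization.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        value = 42;                                           /* data to communicate */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* explicit send to rank 1 */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);                          /* explicit receive from rank 0 */
        printf("process 1 received %d over the network\n", value);
    }
    MPI_Finalize();
    return 0;
}

Run with at least two processes, for example: mpirun -np 2 ./a.out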

Page 7:

Concurrency

• Concurrency is the ability of multiple tasks to be in progress at the same time.

Page 8:

Designing parallel algorithms

Designing a parallel algorithm has two steps:
1. Task Decomposition
2. Parallel Processing

Page 9:

Task Decomposition

• Big idea
  – First decompose for message passing
  – Then decompose for the shared memory on each node

• Decomposition techniques
  – Recursive
  – Data
  – Exploratory
  – Speculative

Page 10:

Recursive Decomposition

• Good for problems which are amenable to a divide and conquer strategy

• Quicksort - a natural fit
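
A minimal sketch of recursive decomposition applied to quicksort, assuming OpenMP tasks; the sample array and the task cutoff are illustrative. Each recursive call on a sub-array becomes a task that another thread can pick up.

#include <stdio.h>

/* Lomuto partition: place the pivot in its final position and return that index */
static int partition(int a[], int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;
    return i;
}

/* divide and conquer: one half is handed to another thread as a task */
static void quicksort(int a[], int lo, int hi) {
    if (lo >= hi) return;
    int p = partition(a, lo, hi);
    #pragma omp task shared(a) if (hi - lo > 100)   /* cutoff avoids spawning tiny tasks */
    quicksort(a, lo, p - 1);
    quicksort(a, p + 1, hi);
    #pragma omp taskwait                            /* wait for the spawned task */
}

int main(void) {
    int a[] = {5, 2, 9, 1, 7, 3, 8, 6, 4, 0};
    int n = (int)(sizeof a / sizeof a[0]);
    #pragma omp parallel
    {
        #pragma omp single      /* one thread starts the recursion; tasks spread the work */
        quicksort(a, 0, n - 1);
    }
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}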

Page 11:

Data Decomposition

• Idea: partitioning of data leads to tasks

• Can partition:
  – Output data
  – Input data
  – Intermediate data
  – Whatever…
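
A minimal sketch of data decomposition, assuming OpenMP: partitioning the output array among threads turns each chunk of elements into an independent task. The arrays and their contents are illustrative.

#include <stdio.h>

#define N 1000

int main(void) {
    static double a[N], b[N], c[N];
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

    /* the output array c is partitioned: each thread owns a block of iterations */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[%d] = %f\n", N - 1, c[N - 1]);
    return 0;
}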

Page 12:

Exploratory Decomposition

• For search space type problems

• Partition search space into small parts

• Look for solution in each part
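
A minimal sketch of exploratory decomposition, assuming OpenMP; the "search space" here is just an array, and the target value is illustrative. The space is split into parts and each thread looks for the solution in its own part.

#include <stdio.h>

#define N 1000000

int main(void) {
    static int space[N];
    for (int i = 0; i < N; i++) space[i] = i * 7 % N;   /* illustrative search space */
    int target = 12345, found_at = -1;

    /* each thread explores its own partition of the search space */
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        if (space[i] == target) {
            #pragma omp critical
            found_at = i;       /* record the solution; other partitions searched in vain */
        }
    }

    printf("found at index %d\n", found_at);
    return 0;
}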

Page 13:

Speculative Decomposition

• Computation gambles at a branch point in the program

• Takes a path before it knows the result

• Win big, or the speculative work is wasted
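
A minimal sketch of speculative decomposition, assuming OpenMP sections; the branch condition and the two alternative computations are illustrative. Both paths after the branch point are evaluated while the condition is still being computed, and only the result of the path actually taken is kept.

#include <stdio.h>

/* illustrative: a slow branch condition and the two alternatives that follow it */
static int  slow_condition(void) { long s = 0; for (long i = 0; i < 50000000; i++) s += i; return s % 2 == 0; }
static long path_if_true(void)   { long s = 0; for (long i = 0; i < 50000000; i++) s += i; return s; }
static long path_if_false(void)  { long s = 1; for (long i = 1; i < 50000000; i++) s ^= i; return s; }

int main(void) {
    int take_true = 0;
    long result_true = 0, result_false = 0;

    /* the gamble: evaluate the condition and both paths at the same time */
    #pragma omp parallel sections
    {
        #pragma omp section
        take_true = slow_condition();
        #pragma omp section
        result_true = path_if_true();     /* speculative */
        #pragma omp section
        result_false = path_if_false();   /* speculative */
    }

    /* keep only the result of the path actually taken; the other work is wasted */
    printf("result = %ld\n", take_true ? result_true : result_false);
    return 0;
}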

Page 14:

Parallel Programming Models

• Data parallelism / Task parallelism
• Explicit parallelism / Implicit parallelism
• Shared memory / Distributed memory
• Other programming paradigms
  – Object-oriented
  – Functional and logic

Page 15:

Parallel Programming Models

Data Parallelism
Parallel programs that emphasize the concurrent execution of the same task on different data elements (data-parallel programs).

Task Parallelism
Parallel programs that emphasize the concurrent execution of different tasks on the same or different data.
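
A minimal sketch contrasting the two styles, assuming OpenMP; the array and the two tasks (sum and maximum) are illustrative.

#include <stdio.h>

#define N 8

int main(void) {
    int a[N], sum = 0, max = 0;

    /* data parallelism: the same task (squaring) applied to different data elements */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = i * i;

    /* task parallelism: different tasks (sum and maximum) on the same data */
    #pragma omp parallel sections
    {
        #pragma omp section
        for (int i = 0; i < N; i++) sum += a[i];
        #pragma omp section
        for (int i = 0; i < N; i++) if (a[i] > max) max = a[i];
    }

    printf("sum = %d, max = %d\n", sum, max);
    return 0;
}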

Page 16:

Parallel Programming Models

• Explicit Parallelism
The programmer directly specifies the activities of the multiple concurrent “threads of control” that form a parallel computation.

• Implicit Parallelism
The programmer provides only a high-level specification of program behavior; the compiler and runtime system decide how to execute it in parallel.
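
A minimal sketch of the difference on the same computation, assuming Pthreads for the explicit version and OpenMP for the implicit one; the arrays, the two-way work split, and the scale factor are illustrative.

#include <pthread.h>
#include <stdio.h>

#define N 1000
static double x[N], y[N];

/* explicit parallelism: the programmer creates threads and divides the work by hand */
static void *half(void *arg) {
    long id = (long)arg;                               /* thread 0 or thread 1 */
    for (int i = (int)(id * N / 2); i < (int)((id + 1) * N / 2); i++)
        y[i] = 2.0 * x[i];
    return NULL;
}

int main(void) {
    for (int i = 0; i < N; i++) x[i] = i;

    pthread_t t;
    pthread_create(&t, NULL, half, (void *)1);   /* second half in a new thread */
    half((void *)0);                             /* first half in the main thread */
    pthread_join(t, NULL);

    /* implicit parallelism: the same loop, specified at a high level;
       the compiler and runtime decide how to create and schedule threads */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        y[i] = 2.0 * x[i];

    printf("y[%d] = %f\n", N - 1, y[N - 1]);
    return 0;
}

Compile with something like gcc -fopenmp -pthread to enable both versions.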

Page 17:

Parallel Programming Models

• Shared Memory
The programmer’s task is to specify the activities of a set of processes that communicate by reading and writing shared memory.

• Distributed Memory
Processes have only local memory and must use some other mechanism to exchange information.

Page 18:

Parallel Programming Models

Parallel Programming Tools:

• Parallel Virtual Machine (PVM)
• Message-Passing Interface (MPI)
• PThreads
• OpenMP
• High-Performance Fortran (HPF)
• Parallelizing compilers

Page 19:

Shared Memory vs. Distributed Memory Programs

• Shared Memory Programming
  – Start a single process and fork threads.
  – Threads carry out work.
  – Threads communicate through shared memory.
  – Threads coordinate through synchronization (also through shared memory).

• Distributed Memory Programming
  – Start multiple processes on multiple systems.
  – Processes carry out work.
  – Processes communicate through message passing.
  – Processes coordinate either through message passing or synchronization (which generates messages).

Page 20:

Shared Memory

• Dynamic threads
  – The master thread waits for work, forks new threads, and when the threads are done they terminate.
  – Efficient use of resources, but thread creation and termination is time consuming.

• Static threads
  – A pool of threads is created and allocated work; the threads do not terminate until cleanup.
  – Better performance, but potential waste of system resources.
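
A minimal sketch of the static-thread (thread pool) approach, assuming Pthreads; the pool size, the work items, and the counter-based "queue" are illustrative. The pool is created once, the threads repeatedly pull work, and they terminate only at cleanup.

#include <pthread.h>
#include <stdio.h>

#define POOL_SIZE 4
#define NUM_ITEMS 20

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int next_item = 0;                 /* shared queue of work items 0..NUM_ITEMS-1 */

/* each pool thread repeatedly takes the next work item until none remain */
static void *pool_worker(void *arg) {
    long id = (long)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        int item = (next_item < NUM_ITEMS) ? next_item++ : -1;
        pthread_mutex_unlock(&lock);
        if (item < 0) break;              /* no more work: thread exits only at cleanup */
        printf("pool thread %ld handles item %d\n", id, item);
    }
    return NULL;
}

int main(void) {
    pthread_t pool[POOL_SIZE];
    /* create the pool once: no per-task thread creation/termination cost */
    for (long i = 0; i < POOL_SIZE; i++)
        pthread_create(&pool[i], NULL, pool_worker, (void *)i);
    /* cleanup: join the pool after all work items have been processed */
    for (int i = 0; i < POOL_SIZE; i++)
        pthread_join(pool[i], NULL);
    return 0;
}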