8/14/2019 Parallel Comp Overview
http://slidepdf.com/reader/full/parallel-comp-overview 1/37
Parallel Processing: Architecture Overview
Subject Code: 433-498
Rajkumar Buyya
Grid Computing and Distributed Systems (GRIDS) Lab.
The University of Melbourne
Melbourne, Australia
www.gridbus.org
WW Grid
Overview of the Talk
- Why Parallel Processing?
- Parallel Hardware
- Parallel Operating Systems
- Parallel Programming Paradigms
- Grand Challenges
Multi-Processor Computing System
[Figure: a multi-processor computing system. Labels: computing elements (processors P), hardware, microkernel, operating system, threads interface, programming paradigms, applications. Legend: process, processor, thread.]
Two Eras of Computing
[Figure: timeline from 1940 to 2030 showing the Sequential Era and the Parallel Era. In each era, architectures, system software (compilers), applications, and P.S.Es pass through R & D, commercialization, and commodity phases.]
History of Parallel Processing
- PP can be traced to a tablet dated around 100 BC. The tablet has 3 calculating positions.
- We infer that the multiple positions were meant for reliability and/or speed.
Motivating Factors
Just as we learned to fly not by constructing a machine that flaps its wings like a bird, but by applying the aerodynamic principles demonstrated by nature, we modeled PP after biological species.
Motivating Factors
- The aggregate speed with which neurons carry out complex calculations, even though each individual neuron responds slowly (in milliseconds), demonstrates the feasibility of PP.
Why Parallel Processing?
- Computation requirements are ever increasing: visualization, distributed databases, simulations, scientific prediction (e.g. earthquakes), etc.
- Sequential architectures are reaching physical limits (speed of light, thermodynamics).
Human Architecture! Growth Performance
[Figure: growth plotted against age (5 to 45+ years), with regions labelled Vertical and Horizontal.]
Computational Power Improvement
[Figure: C.P.I. plotted against number of processors (1, 2, ...), with one curve for a uniprocessor and one for a multiprocessor.]
Why Parallel Processing?
- The technology of PP is mature and can be exploited commercially; there is significant R & D work on the development of tools and environments.
- Significant developments in networking technology are paving the way for heterogeneous computing.
Why Parallel Processing?
- Hardware improvements such as pipelining and superscalar execution are not scalable and require sophisticated compiler technology.
- Vector processing works well only for certain kinds of problems.
A Parallel Program has and needs ...
- Multiple "processes" active simultaneously, solving a given problem, generally on multiple processors.
- Communication and synchronization among its processes (this forms the core of parallel programming efforts).
Processing Elements Architecture
Processing Elements
- Simple classification by Flynn (by number of instruction and data streams):
  - SISD: conventional
  - SIMD: data parallel, vector computing
  - MISD: systolic arrays
  - MIMD: very general, multiple approaches
- Current focus is on the MIMD model, using general-purpose processors (no shared memory).
SISD: A Conventional Computer
- Speed is limited by the rate at which the computer can transfer information internally.
- Examples: PC, Macintosh, workstations.
[Figure: a single processor driven by one instruction stream, with one data input stream and one data output stream.]
The MISD Architecture
- More of an intellectual exercise than a practical configuration. A few have been built, but none are commercially available.
[Figure: processors A, B and C, each driven by its own instruction stream (A, B, C), operating on a single data input stream and producing a single data output stream.]
SIMD Architecture
- Examples: CRAY vector processing machines, Thinking Machines CM*, Intel MMX (multimedia support).
- Typical operation: Ci <= Ai * Bi
[Figure: processors A, B and C, all driven by a single instruction stream, each with its own data input stream (A, B, C) and data output stream (A, B, C).]
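
The operation Ci <= Ai * Bi can be illustrated with a minimal sketch in plain Python. It only mimics the SIMD idea (one instruction, many data elements) and does not use real vector hardware; the names `simd_apply` and `multiply` are hypothetical and not from the slides.

```python
from typing import Callable, List, Sequence


def simd_apply(instruction: Callable[[float, float], float],
               a: Sequence[float],
               b: Sequence[float]) -> List[float]:
    """Apply ONE instruction to every pair (a[i], b[i]).

    Each index i plays the role of one processing element with its own
    data input streams; the instruction is the same for all of them.
    """
    if len(a) != len(b):
        raise ValueError("data streams must have the same length")
    return [instruction(x, y) for x, y in zip(a, b)]


def multiply(x: float, y: float) -> float:
    # The single instruction broadcast to all processing elements.
    return x * y


a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
c = simd_apply(multiply, a, b)   # Ci <= Ai * Bi
print(c)  # [4.0, 10.0, 18.0]
```

On real SIMD hardware the per-element multiplications happen in lockstep; here they run one after another, so the sketch shows the programming model rather than the speedup.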
MIMD Architecture
- Unlike SISD and MISD machines, a MIMD computer works asynchronously.
- Two forms:
  - Shared memory (tightly coupled) MIMD
  - Distributed memory (loosely coupled) MIMD
[Figure: processors A, B and C, each driven by its own instruction stream (A, B, C), each with its own data input stream and data output stream.]
Shared Memory MIMD machine
- Communication: the source PE writes data to global memory (GM) and the destination PE retrieves it.
- Easy to build; conventional OSes for SISD machines can easily be ported.
- Limitations: reliability and expandability. A failure of a memory component or of any processor affects the whole system.
- Increasing the number of processors leads to memory contention.
- Example: Silicon Graphics supercomputers.
[Figure: processors A, B and C, each connected through a memory bus to a single global memory system.]
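
The communication pattern on this slide (source PE writes to GM, destination PE retrieves it) can be sketched with two threads sharing one dictionary. This is a minimal illustration, assuming Python threads stand in for processing elements and a dict stands in for global memory; the names `global_memory`, `source_pe` and `destination_pe` are hypothetical.

```python
import threading

global_memory = {}               # stands in for the global memory (GM)
data_ready = threading.Event()   # synchronization between the two PEs
received = []                    # what the destination PE read


def source_pe() -> None:
    # The source PE writes its result into global memory ...
    global_memory["x"] = 42
    # ... and signals that the data is available.
    data_ready.set()


def destination_pe() -> None:
    # The destination PE waits until the data is there, then reads it.
    data_ready.wait()
    received.append(global_memory["x"])


threads = [threading.Thread(target=destination_pe),
           threading.Thread(target=source_pe)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(received)  # [42]
```

Without the event, the destination could read before the source has written, which is why communication and synchronization go together in shared-memory programs.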
Distributed Memory MIMD
- Communication: IPC over a high-speed network.
- The network can be configured as a tree, mesh, cube, etc.
- Unlike shared memory MIMD, it is easily and readily expandable.
- Highly reliable (a failure of any one CPU does not affect the whole system).
[Figure: processors A, B and C, each connected through its own memory bus to its own memory system (A, B, C); neighbouring processors are linked by IPC channels.]
Laws of caution.....
- The speed of computers is proportional to the square of their cost, i.e. cost = sqrt(speed), or equivalently speed = cost^2.
- The speedup of a parallel computer increases as the logarithm of the number of processors: speedup = log2(no. of processors).
[Figures: speed S plotted against cost C; speedup S plotted against number of processors P, following log2 P.]
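
A quick numeric sketch of the two laws above. These relations are commonly known as Grosch's law and Minsky's conjecture; the function names and the proportionality constant `k` are assumptions made for illustration, not part of the slides.

```python
import math


def grosch_speed(cost: float, k: float = 1.0) -> float:
    """Speed predicted by speed = k * cost^2 (k is an assumed constant)."""
    return k * cost ** 2


def minsky_speedup(processors: int) -> float:
    """Speedup predicted by speedup = log2(number of processors)."""
    if processors < 1:
        raise ValueError("need at least one processor")
    return math.log2(processors)


# Doubling the budget quadruples the predicted speed.
print(grosch_speed(1.0), grosch_speed(2.0))   # 1.0 4.0

# Adding processors pays off only logarithmically.
for p in (1, 2, 16, 1024):
    print(p, minsky_speedup(p))               # 0.0, 1.0, 4.0, 10.0
```

Under these laws, 1024 processors give a speedup of only 10, which is why they are presented as cautions rather than as guarantees about any particular machine.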
Caution....
- Very fast development in PP and related areas has blurred concept boundaries, causing a lot of terminological confusion: concurrent computing/programming, parallel computing/processing, multiprocessing, distributed computing, etc.
It’s hard to imagine a field
that changes as rapidly as
computing.
Caution....
Computer science is an immature science (it lacks a standard taxonomy and standard terminology).
Caution....
- Even well-defined distinctions such as shared memory versus distributed memory are merging due to new advances in technology.
- Good environments for development and debugging are yet to emerge.
Caution....
- There are no strict delimiters for contributors to the area of parallel processing: CA, OS, HLLs, databases and computer networks all have a role to play.
- This makes it a hot topic of research.
Operating Systems for High Performance Computing
Types of Parallel Systems
- Shared Memory Parallel
  - Smallest extension to existing systems
  - Program conversion is incremental
- Distributed Memory Parallel
  - Completely new systems
  - Programs must be reconstructed
- Clusters
  - Slow-communication form of distributed memory parallel systems
Operating Systems for PP
- MPP systems with thousands of processors require an OS radically different from current ones.
- Every CPU needs an OS:
  - to manage its resources
  - to hide its details
- Traditional systems are heavy and complex, and are not suitable for MPP.
Operating System Models
- An OS model is a framework that unifies the features, services and tasks performed.
- Three approaches to building an OS:
  - Monolithic OS
  - Layered OS
  - Microkernel-based OS (client-server OS), suitable for MPP systems
- Simplicity, flexibility and high performance are crucial for an OS.
Monolithic Operating System
- Better application performance
- Difficult to extend
- Example: MS-DOS
[Figure: application programs in user mode; system services in kernel mode; hardware underneath.]
Layered OS
- Easier to enhance
- Each layer of code accesses the interface of the layer below it
- Low application performance
- Example: UNIX
[Figure: application programs in user mode; in kernel mode, layers for system services, memory and I/O device management, and process scheduling; hardware underneath.]
Traditional OS
[Figure: application programs in user mode; the whole OS, as built by the OS designer, in kernel mode; hardware underneath.]
New trend in OS design
[Figure: application programs and servers in user mode; only the microkernel in kernel mode; hardware underneath.]
OS (for MPP Systems)
- Tiny OS kernel providing basic primitives (process, memory, IPC)
- Traditional services become subsystems
- Application performance competitive with a monolithic OS
- OS = Microkernel + User Subsystems
- Examples: Mach, PARAS, Chorus, etc.
[Figure: in user mode, a client application with its thread library, a file server, a network server and a display server; in kernel mode, the microkernel on top of the hardware. The client and the servers exchange Send and Reply messages through the microkernel.]
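
The Send/Reply interaction between a client and a user-mode server can be sketched as follows. This is a toy model, assuming Python queues stand in for the microkernel's IPC primitive and a thread stands in for the file server; `FileServer` and `send` are hypothetical names and do not correspond to the API of Mach, PARAS or Chorus.

```python
import queue
import threading


class FileServer(threading.Thread):
    """Hypothetical user-mode file server backed by an in-memory dict."""

    def __init__(self, files: dict) -> None:
        super().__init__(daemon=True)
        self.files = files
        self.inbox: queue.Queue = queue.Queue()   # IPC channel to the server

    def run(self) -> None:
        while True:
            request = self.inbox.get()
            if request is None:                   # shutdown message
                break
            path, reply_box = request
            # Reply: hand the result back on the client's reply channel.
            reply_box.put(self.files.get(path, "ERROR: no such file"))


def send(server: FileServer, path: str) -> str:
    """Send a read request to the server and block until the reply arrives."""
    reply_box: queue.Queue = queue.Queue(maxsize=1)
    server.inbox.put((path, reply_box))           # Send
    return reply_box.get()                        # wait for Reply


server = FileServer({"/etc/motd": "hello"})
server.start()
answer = send(server, "/etc/motd")
missing = send(server, "/nope")
server.inbox.put(None)                            # ask the server to stop
server.join()
print(answer, "|", missing)  # hello | ERROR: no such file
```

The client never touches the server's data directly; every service request is a message, which is what lets traditional OS services live outside the kernel as subsystems.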
A Few Popular Microkernel Systems
- Mach (CMU)
- PARAS (C-DAC)
- Chorus
- QNX
- (Windows)