Advanced Computer Architecture The Architecture of Parallel Computers

Advanced Computer Architecture - Baylor ECScs.baylor.edu/~maurer/aida/courses/archintro.pdf · Computer Systems Hardware Architecture Operating System Application No Component Software

Computer Systems

• Hardware Architecture

• Operating System

• Application Software

No Component Can Be Treated in Isolation from the Others

Hardware Issues

• Number and Type of Processors

• Processor Control

• Memory Hierarchy

• I/O devices and Peripherals

• Operating System Support

• Applications Software Compatibility

Operating System Issues

• Allocating and Managing Resources

• Access to Hardware Features

– Multi-Processing

– Multi-Threading

• I/O Management

• Access to Peripherals

• Efficiency

Applications Issues

• Compiler/Linker Support

• Programmability

• OS/Hardware Feature Availability

• Compatibility

• Parallel Compilers

– Preprocessor

– Precompiler

– Parallelizing Compiler

Architecture Evolution

• Scalar Architecture

• Prefetch (Fetch/Execute Overlap)

• Multiple Functional Units

• Pipelining

• Vector Processors

• Lock-Step Processors

• Multi-Processor

Flynn’s Classification

• Consider Instruction Streams and Data Streams Separately.

• SISD - Single Instruction, Single Data Stream

• SIMD - Single Instruction, Multiple Data Streams

• MIMD - Multiple Instruction, Multiple Data Streams

• MISD - (rare) Multiple Instruction, Single Data Stream

SISD

• Conventional Computers.

• Pipelined Systems

• Multiple-Functional Unit Systems

• Pipelined Vector Processors

• Includes most computers encountered in everyday life

SIMD

• Multiple Processors Execute a Single Program

• Each Processor operates on its own data

• Vector Processors

• Array Processors

• PRAM Theoretical Model

MIMD

• Multiple Processors cooperate on a single task

• Each Processor runs a different program

• Each Processor operates on different data

• Many Commercial Examples Exist

MISD

• A Single Data Stream passes through multiple processors

• Different operations are triggered on different processors

• Systolic Arrays

• Wave-Front Arrays

Programming Issues

• Parallel Computers are Difficult to Program

• Automatic Parallelization Techniques are only Partially Successful

• Programming languages are few, not well supported, and difficult to use.

• Parallel Algorithms are difficult to design.

Performance Issues

• Cycle Time = τ (Clock Rate = 1/τ)

• Cycles Per Instruction (Average) = CPI

• Instruction Count = Ic

• Time, T = Ic × CPI × τ

• p = Processor Cycles, m = Memory Cycles, k = Memory/Processor cycle ratio

• Since CPI = p + m × k, T = Ic × (p + m × k) × τ
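The formula above can be worked through numerically. The following is a minimal sketch that reads p and m as per-instruction averages (an assumption about the slide's shorthand); the function name and all sample figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def execution_time(ic, p, m, k, tau):
    """Return (CPI, T) where CPI = p + m * k and T = Ic * CPI * tau.

    ic  : instruction count (Ic)
    p   : processor cycles per instruction (assumed to be an average)
    m   : memory cycles per instruction (assumed to be an average)
    k   : memory/processor cycle ratio
    tau : cycle time in seconds (1 / clock rate)
    """
    cpi = p + m * k
    return cpi, ic * cpi * tau


# Hypothetical workload: 2 million instructions on a 100 MHz clock.
cpi, t = execution_time(ic=2_000_000, p=2, m=0.5, k=4, tau=1 / 100e6)
print(f"CPI = {cpi}, T = {t:.3f} s")
```

With these made-up numbers, half of the cycles per instruction come from the memory term m × k, which is why the memory hierarchy matters as much as the processor itself.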

Performance Issues II

• Ic & p affected by processor design and compiler technology.

• m affected mainly by compiler technology

• τ affected by processor design

• k affected by memory hierarchy structure and design

Other Measures

• MIPS rate - Millions of instructions per second

• Clock Rate for similar processors

• MFLOPS rate - Millions of floating point operations per second.

• These measures are not necessarily directly comparable between different types of processors.
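A small calculation shows why these rates can mislead. This sketch uses the standard relation MIPS = Ic / (T × 10^6); the two machines and their figures are invented for illustration and do not describe any real processor.

```python
def mips_rate(ic, seconds):
    """Millions of instructions per second for a run of `ic` instructions."""
    return ic / (seconds * 1e6)


TAU = 1 / 100e6  # both hypothetical machines run at 100 MHz

# Machine A: fewer, more complex instructions (CPI = 4).
ic_a, cpi_a = 2_000_000, 4
# Machine B: twice as many simpler instructions for the same task (CPI = 2).
ic_b, cpi_b = 4_000_000, 2

t_a = ic_a * cpi_a * TAU
t_b = ic_b * cpi_b * TAU

print(f"A: {mips_rate(ic_a, t_a):.1f} MIPS in {t_a:.3f} s")
print(f"B: {mips_rate(ic_b, t_b):.1f} MIPS in {t_b:.3f} s")
```

Machine B reports twice the MIPS rate of machine A, yet both finish the task in the same time, because B needs twice as many instructions to do the same work.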

Parallelizing Code

• Implicitly

– Write Sequential Algorithms

– Use a Parallelizing Compiler

– Rely on compiler to find parallelism

• Explicitly

– Design Parallel Algorithms

– Write in a Parallel Language

– Rely on Human to find Parallelism
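The two approaches can be contrasted on a simple summation. This is only a sketch: Python is not a parallel language in the sense meant here, and the function names and the choice of a thread pool are assumptions made for illustration. In CPython, threads will not speed up this arithmetic; the point is the structure, where the programmer decides how the data is split and how partial results are combined.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential_sum(values):
    """Implicit style: plain sequential code; any parallelism is left to the tools."""
    total = 0
    for v in values:
        total += v
    return total


def explicit_parallel_sum(values, workers=4):
    """Explicit style: the programmer partitions the data and combines the results."""
    chunk = max(1, (len(values) + workers - 1) // workers)
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, parts))  # one independent task per chunk
    return sum(partial_sums)                       # final sequential combine step


data = list(range(1, 101))
print(sequential_sum(data), explicit_parallel_sum(data))
```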

Multi-Processors

• Multi-Processors generally share memory, while multi-computers do not.

– Uniform Memory Model (UMA)

– Non-Uniform Memory Model (NUMA)

– Cache-Only Memory Model (COMA)

• MIMD Machines

Multi-Computers

• Independent Computers that Don’t Share Memory.

• Connected by High-Speed Communication Network

• More tightly coupled than a collection of independent computers

• Cooperate on a single problem

Vector Computers

• Independent Vector Hardware

• May be an attached processor

• Has both scalar and vector instructions

• Vector instructions operate in highly pipelined mode

• Can be Memory-to-Memory or Register-to-Register

SIMD Computers

• One Control Processor

• Several Processing Elements

• All Processing Elements execute the same instruction at the same time

• Interconnection network between PEs determines memory access and PE interaction
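The organization above can be mimicked in a few lines. This is a toy simulation, not a model of any real machine: the class name, its methods, and the ring topology used for the interconnection network are all assumptions made for the sketch.

```python
import operator


class SimdMachine:
    """Toy SIMD machine: one control unit, several PEs, each holding one local value."""

    def __init__(self, local_data):
        self.pe_data = list(local_data)  # one entry per processing element

    def broadcast(self, op, operand):
        """Control processor issues one instruction; every PE applies it in lock-step."""
        self.pe_data = [op(x, operand) for x in self.pe_data]

    def ring_shift(self):
        """Hypothetical ring network: each PE receives its left neighbour's value."""
        n = len(self.pe_data)
        self.pe_data = [self.pe_data[i - 1] for i in range(n)]


machine = SimdMachine([1, 2, 3, 4])
machine.broadcast(operator.mul, 10)  # same instruction, different data per PE
machine.ring_shift()                 # PEs exchange data only through the network
print(machine.pe_data)
```

A different topology (mesh, hypercube) would change only `ring_shift`, which is the sense in which the network determines how PEs interact.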

The PRAM Model

• SIMD Style Programming

• Uniform Global Memory

• Local Memory in Each PE

• Memory Conflict Resolution

– CRCW - Concurrent Read, Concurrent Write

– CREW - Concurrent Read, Exclusive Write

– EREW - Exclusive Read, Exclusive Write

– ERCW - (rare) Exclusive Read, Concurrent Write
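To make the EREW rule concrete, here is a sketch of a tree-style summation simulated sequentially. In each synchronous step no global memory cell is read or written by more than one PE, which the code checks explicitly. The function name and the array layout are assumptions for the example; it is a simulation on one processor, not a parallel implementation.

```python
def erew_parallel_sum(values):
    """Simulate an EREW PRAM sum; returns (total, number of synchronous steps)."""
    cells = list(values)  # shared global memory
    n = len(cells)
    steps = 0
    stride = 1
    while stride < n:
        # PE i handles the pair (i, i + stride) during this step.
        pairs = [(i, i + stride) for i in range(0, n - stride, 2 * stride)]

        # EREW check: every cell is touched by at most one PE in this step.
        touched = [addr for pair in pairs for addr in pair]
        assert len(touched) == len(set(touched)), "EREW violation"

        # All PEs read first, then all write, as in one synchronous PRAM step.
        results = {i: cells[i] + cells[j] for i, j in pairs}
        for i, value in results.items():
            cells[i] = value

        stride *= 2
        steps += 1
    return (cells[0] if cells else 0), steps


print(erew_parallel_sum(list(range(1, 9))))
```

For n values the loop runs about log2(n) times, compared with n - 1 additions one after another on a single processor.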

The VLSI Model

• Implement Algorithm as a mostly combinational circuit

• Determine the area required for implementation

• Determine the depth of the circuit
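As a rough illustration of the two quantities, consider a balanced tree of two-input gates that reduces n inputs to one output (for example, an n-input AND). This sketch uses gate count as a crude stand-in for area, which is an assumption: real chip area also depends on wiring and layout. The function name is hypothetical.

```python
def reduction_tree_cost(n_inputs):
    """Return (gate_count, depth) for a balanced tree of two-input gates."""
    if n_inputs < 1:
        raise ValueError("need at least one input")
    gates = 0
    depth = 0
    width = n_inputs  # signals still to be combined at the current level
    while width > 1:
        pairs = width // 2
        gates += pairs               # one gate per pair at this level
        width = pairs + (width % 2)  # an odd signal passes through unchanged
        depth += 1
    return gates, depth


for n in (2, 8, 1000):
    print(n, reduction_tree_cost(n))
```

The gate count grows linearly with n while the depth grows only logarithmically, which is the kind of area/depth trade-off the model is meant to expose.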
