High Performance Processor Architecture

Neeraj Goel

2004csz8035

Embedded System Group

Dept. of Computer Science and Engineering

Indian Institute of Technology Delhi

http://embedded.cse.iitd.ernet.in/

HU810 Seminar

Outline

Introduction

History and Future prediction

Pentium 4 features

Pipelining

Superscalar features

Hyper-Threading

Conclusion and future

Moore’s Law

Intel microprocessors (source: www.intel.com)

Intel’s Processors: Past and Current

Processor                     Year of Introduction   Transistors
8008                          1972                   2,500
8080                          1974                   5,000
8086                          1978                   29,000
286                           1982                   120,000
Intel386 processor            1985                   275,000
Intel486 processor            1989                   1,180,000
Intel Pentium processor       1993                   3,100,000
Intel Pentium II processor    1997                   7,500,000
Intel Pentium III processor   1999                   24,000,000
Intel Pentium 4 processor     2000                   42,000,000
Intel Itanium processor       2002                   220,000,000
Intel Itanium 2 processor     2003                   410,000,000
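
As a rough check of Moore's Law against the transistor counts listed for the 8008 and the Pentium 4, the sketch below estimates the average doubling period between two data points. The function name is illustrative only, and the calculation assumes steady exponential growth between the two years.

```python
import math

def doubling_period_years(year_a, count_a, year_b, count_b):
    """Average number of years per doubling between two data points."""
    doublings = math.log2(count_b / count_a)  # how many times the count doubled
    return (year_b - year_a) / doublings

# 8008 (1972, 2,500 transistors) to Pentium 4 (2000, 42,000,000 transistors)
period = doubling_period_years(1972, 2_500, 2000, 42_000_000)
print(f"Transistor count doubled about every {period:.1f} years")
```

Under that assumption the two data points imply a doubling roughly every two years.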

How to increase performance

Pipelining
    Break a large system into a number of stages

Instruction-level parallelism
    Software code is written serially
    Independent instructions can be executed in parallel
    Requires a large number of functional units

Thread-level parallelism
    Applications are written with threads
    The operating system can have threads
    Different applications can run on different threads

How the Pentium 4 achieves high performance

Rapid execution, more pipelining stages

Out of order execution

Speculative execution

Hyper-Threading

Trace cache

Store to load forwarding enhancements

Pipelining

The concept of splitting a job into sub-processes in which the output of one sub-process feeds into the next.

A mechanical example of a pipeline is a washer/dryer system for clothing.

More stages mean higher throughput, but also higher latency.

Issue: all stages should have nearly equal delay; otherwise the slowest stage determines the clock cycle.

Fetch -> Decode -> Execute -> Write-back
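
The effect of unbalanced stages can be sketched with a simple timing model. The stage delays below are hypothetical numbers chosen for illustration, and the model assumes an ideal pipeline with no stalls: the clock period equals the slowest stage, and one instruction completes per cycle once the pipeline is full.

```python
def pipeline_timing(stage_delays_ns, n_instructions):
    """Return (clock, latency, pipelined total, unpipelined total) in ns."""
    clock_ns = max(stage_delays_ns)       # slowest stage sets the clock cycle
    n_stages = len(stage_delays_ns)
    latency_ns = n_stages * clock_ns      # time for one instruction to pass through
    # The first instruction needs n_stages cycles, each later one adds a cycle.
    total_ns = (n_stages + n_instructions - 1) * clock_ns
    unpipelined_ns = n_instructions * sum(stage_delays_ns)
    return clock_ns, latency_ns, total_ns, unpipelined_ns

# Both pipelines do 8 ns of work per instruction (hypothetical delays).
balanced = pipeline_timing([2, 2, 2, 2], 1000)
unbalanced = pipeline_timing([1, 1, 1, 5], 1000)
print("balanced:  ", balanced)
print("unbalanced:", unbalanced)
```

With the same total work per instruction, the unbalanced pipeline runs at the pace of its 5 ns stage, so most of the gain from pipelining is lost.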

Superscalar Architecture

We can have a large number of functional units, but the program is serial

Will fetching multiple instructions per cycle solve the problem?

Issues
    Dependencies
    Branches
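
The dependency issue can be illustrated with a small sketch that groups a serial instruction stream into sets that could issue in the same cycle. The `Instr` type, the register names and the in-order grouping policy are assumptions made for this example; they do not describe the actual Pentium 4 issue logic.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    dest: str     # register written
    srcs: tuple   # registers read

def depends_on(later, earlier):
    """True if 'later' cannot issue in the same cycle as 'earlier'."""
    raw = earlier.dest in later.srcs    # read after write
    waw = earlier.dest == later.dest    # write after write
    war = later.dest in earlier.srcs    # write after read
    return raw or waw or war

def issue_groups(program, width):
    """Greedy in-order grouping: start a new cycle on a conflict or a full group."""
    groups, current = [], []
    for instr in program:
        conflict = any(depends_on(instr, prev) for prev in current)
        if conflict or len(current) == width:
            groups.append(current)
            current = []
        current.append(instr)
    if current:
        groups.append(current)
    return groups

program = [
    Instr("r1", ("r2", "r3")),  # r1 = r2 + r3
    Instr("r4", ("r5", "r6")),  # r4 = r5 + r6 (independent of the first)
    Instr("r7", ("r1", "r4")),  # r7 = r1 + r4 (needs both results)
    Instr("r8", ("r7", "r2")),  # r8 = r7 + r2 (needs r7)
]
groups = issue_groups(program, width=2)
print([len(g) for g in groups])
```

Only the first two instructions can share a cycle; the dependent ones serialize even though two functional units are available.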

Speculative Execution

Situation: a 20-stage pipeline in which every stage waits for a branch to be resolved

Effect: the benefits of pipelining and superscalar execution vanish at branches

Solution: execute the instructions of both the if path and the else path simultaneously

Discard the results of the wrong path once the branch outcome is known
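
A back-of-the-envelope model shows how costly the stall is. The branch fraction is a hypothetical value, and the model assumes that a branch is resolved only in the last of the 20 stages and that speculation hides the stall completely (enough functional units to run both paths at no extra cost). Real processors fall somewhere between the two figures.

```python
def cpi_with_stalls(branch_fraction, stall_cycles):
    """Average cycles per instruction when each branch stalls the pipeline."""
    return 1.0 + branch_fraction * stall_cycles

PIPELINE_STAGES = 20
BRANCH_FRACTION = 0.2                 # hypothetical: one instruction in five is a branch
STALL_CYCLES = PIPELINE_STAGES - 1    # assumes the branch resolves in the last stage

stalled_cpi = cpi_with_stalls(BRANCH_FRACTION, STALL_CYCLES)
speculative_cpi = cpi_with_stalls(BRANCH_FRACTION, 0)  # ideal: no stall at all
print(f"CPI when waiting for every branch: {stalled_cpi:.1f}")
print(f"CPI with ideal speculation:        {speculative_cpi:.1f}")
```

Under these assumptions, waiting for every branch makes the pipeline several times slower than the ideal one instruction per cycle.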

Thread-level parallelism

Multi-processors
    Supercomputers

Chip Multi-Processing
    Dual-core chips like Intel’s Xeon

Simultaneous Multi-threading
    One processor and multiple threads
    Different from multi-programming and multi-tasking

Hyper-threading

Makes a single processor appear as multiple logical processors

Each logical processor keeps its own copy of the architecture state

The OS views the logical processors as physical processors

Logical processors share a single set of physical resources
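
The split between private architecture state and shared physical resources can be sketched as a toy model. The class names and the one-operation-per-cycle round-robin policy are assumptions for illustration only; real Hyper-Threading shares many more resources and can issue from both threads in the same cycle.

```python
from dataclasses import dataclass, field

@dataclass
class LogicalProcessor:
    """Private architecture state: own registers and program counter."""
    program: list                      # list of (register, value_to_add)
    pc: int = 0
    registers: dict = field(default_factory=dict)

    def done(self):
        return self.pc >= len(self.program)

class PhysicalCore:
    """Shared resource: a single execution unit used by all logical processors."""
    def __init__(self, logical_processors):
        self.logical = logical_processors
        self.cycles = 0                # cycles consumed on the shared unit

    def run(self):
        while not all(lp.done() for lp in self.logical):
            for lp in self.logical:    # round-robin between the threads
                if lp.done():
                    continue
                reg, value = lp.program[lp.pc]
                lp.registers[reg] = lp.registers.get(reg, 0) + value
                lp.pc += 1
                self.cycles += 1
        return self.cycles

lp0 = LogicalProcessor([("eax", 1), ("eax", 2)])
lp1 = LogicalProcessor([("eax", 10)])
core = PhysicalCore([lp0, lp1])
total_cycles = core.run()
print(lp0.registers, lp1.registers, total_cycles)
```

Both threads use a register named `eax`, yet each ends with its own value, while the cycle count reflects that they competed for the same execution unit.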

Conclusion and Future

Future processors will need more performance, i.e. higher clock speed

Shrinking device dimensions alone cannot deliver this

Architectural solutions are needed

SMP and CMP will be the solution

More instruction-level parallelism can be exploited using compiler techniques

Thank You

Backup

Source Files

http://www.cse.iitd.ernet.in/~neeraj/doc

Some Definitions

Cache
    An on-chip memory with a very short access time
    More expensive than main memory
    Frequently required data can be placed there

Clock speed
    Stated in MHz or GHz
    MHz: million clock cycles per second

Buses
    Data, address and control
    Bus width: number of bits that can be transferred in parallel
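
Clock speed counts clock cycles, not instructions, so the instruction rate also depends on how many instructions complete per cycle. A minimal sketch of the relation follows; the 2000 MHz clock and the instructions-per-cycle value are hypothetical figures.

```python
def cycle_time_ns(clock_mhz):
    """Length of one clock cycle in nanoseconds."""
    return 1_000.0 / clock_mhz

def mips(clock_mhz, instructions_per_cycle):
    """Millions of instructions per second for a given clock and IPC."""
    return clock_mhz * instructions_per_cycle

clock_mhz = 2000          # hypothetical 2 GHz processor
ipc = 0.5                 # hypothetical average instructions per cycle
print(f"cycle time: {cycle_time_ns(clock_mhz)} ns")
print(f"instruction rate: {mips(clock_mhz, ipc)} MIPS")
```

Two processors with the same clock speed can therefore differ in delivered instruction rate.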

Block Diagram of Pentium 4
