Intel® Core™ Duo Processor Behrooz Jafarnejad Winter 2006

Intel® Core™ Duo Processor

Behrooz Jafarnejad

Winter 2006

2006 PC World World Class Award

July 2006: Intel® Core™ Duo processor named Product of the Year by PC World.

Outline

• Microprocessor After Pentium® Pro
• Intel® Core™ Duo Processor Overview
• Microarchitecture
• Intel Core 2 Duo Vs. AMD AM2
• Resources

Microprocessor Hall Of Fame

1995: Intel® Pentium® Pro Processor

– Released in the Fall of 1995.

– 5.5 million transistors.

– Designed for 32-bit server and workstation applications.

– Packaged with a second speed-enhancing cache memory chip.

Microprocessor Hall Of Fame

1997: Intel® Pentium® II Processor

– 7.5 million transistors.

– Incorporates Intel® MMX™ technology, which is designed specifically to process video, audio and graphics data efficiently.

– Includes a high-speed cache memory chip.

Microprocessor Hall Of Fame

1999: Intel® Pentium® III Processor

– 9.5 million transistors.
– Using 0.25-micron technology.
– 70 new instructions that enhance the performance of:
• Advanced imaging
• 3D
• Streaming audio, video

Microprocessor Hall Of Fame

2000: Intel® Pentium® 4 Processor

– 42 million transistors.

– Circuit lines of 0.18 microns.

– Intel's first microprocessor, the 4004, ran at 108 kHz, compared to the Intel® Pentium® 4 processor's initial speed of 1.5 GHz. If automobile speed had increased similarly over the same period, you could now drive from San Francisco to New York (about 4,100 km) in about 13 seconds.
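
The 13-second figure can be checked with a short calculation. This is a sketch: the two clock rates and the distance come from the slide, while the baseline driving speed of 80 km/h is an assumption chosen for illustration, not a value given in the source.

```python
# Clock rates quoted on the slide.
F_4004_HZ = 108e3      # Intel 4004: 108 kHz
F_P4_HZ = 1.5e9        # Pentium 4 at launch: 1.5 GHz

# Trip length from the slide; the baseline speed is an assumption.
DISTANCE_KM = 4100.0
ASSUMED_BASELINE_KMH = 80.0   # hypothetical average road speed, not from the source


def scaled_trip_seconds(distance_km, baseline_kmh, speedup):
    """Trip time in seconds if the baseline speed were multiplied by `speedup`."""
    hours = distance_km / (baseline_kmh * speedup)
    return hours * 3600.0


speedup = F_P4_HZ / F_4004_HZ
trip_s = scaled_trip_seconds(DISTANCE_KM, ASSUMED_BASELINE_KMH, speedup)
print(f"speedup ~ {speedup:,.0f}x, trip ~ {trip_s:.1f} s")
```

Under that assumed baseline, the clock-rate ratio is roughly 13,900x and the scaled trip takes a little over 13 seconds, which is consistent with the slide's claim.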

Microprocessor Hall Of Fame

2006: The Intel® Core™ Duo processor

– 151 million transistors.
– Using 65 nm technology.
– 2.33–2.50 GHz clock frequency.
– 4-wide, 14-stage pipeline.
– Low power consumption.

Benefits

• New Microarchitecture:
– Low power.
– Higher performance.
• At Home:
– Ultra-quiet.
– Sleek and low-power computing.
• For IT:
– Reduced footprints.
– Lower power.
– Energy efficiency across client and server platforms.
• For Mobile Users:
– Greater computer performance and battery life to enable a variety of small form factors that enable world-class computing "on the go."

Performance for an Enhanced Digital Entertainment Experience

Intel® Core™2 Duo Processor (formerly known by the codename Conroe)

• New generation of technology
• Intel® Core™ microarchitecture
• High energy efficiency
• Revolutionary performance

Five Key Innovations

• Intel® Wide Dynamic Execution
• Intel® Advanced Digital Media Boost
• Intel® Intelligent Power Capability
• Intel® Smart Memory Access
• Intel® Advanced Smart Cache

Five Key Innovations

• Intel® Wide Dynamic Execution:
– 4-wide
– 14-stage pipeline
– Macro-fusion

Five Key Innovations

• Intel® Advanced Digital Media Boost:
– Single-cycle 128-bit SSE

Five Key Innovations

• Intel® Advanced Smart Cache:
– Shared L2 cache

Five Key Innovations

• Intel® Smart Memory Access:
– Advanced Pre-fetch
– Memory Disambiguation

Five Key Innovations

• Intel® Intelligent Power Capability:
– Advanced Power Gating

Intel® Wide Dynamic Execution

• Fetch
• Dispatch: Decode + (Read from Memory)
• Execute
• Retire: Write Back

• Macro-Fusion: combines certain common x86 instruction pairs (for example, a compare followed by a conditional jump) into a single instruction for execution.
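
The idea behind macro-fusion can be illustrated with a toy Python model. This is only a sketch under stated assumptions: the tuple representation, the `macro_fuse` name, and the short lists of fusible mnemonics are hypothetical and do not reproduce Intel's actual fusion rules.

```python
# Toy model: each instruction is a (mnemonic, operands) tuple.
COMPARES = {"cmp", "test"}
COND_JUMPS = {"je", "jne", "jl", "jge"}   # illustrative subset, not Intel's full rule set


def macro_fuse(instructions):
    """Merge each compare that is immediately followed by a conditional jump."""
    fused = []
    i = 0
    while i < len(instructions):
        op, args = instructions[i]
        nxt = instructions[i + 1] if i + 1 < len(instructions) else None
        if op in COMPARES and nxt is not None and nxt[0] in COND_JUMPS:
            # The pair travels through the rest of the pipeline as one operation.
            fused.append((f"{op}+{nxt[0]}", args + nxt[1]))
            i += 2
        else:
            fused.append((op, args))
            i += 1
    return fused


program = [
    ("mov", ("eax", "[mem]")),
    ("cmp", ("eax", "ebx")),
    ("jne", ("target",)),
    ("add", ("eax", "1")),
]
fused_program = macro_fuse(program)
print(len(program), "->", len(fused_program))
```

In this example four instructions become three operations, so a fixed-width pipeline has one fewer slot to fill for the same work.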

Pipeline Concept

In computing, a pipeline is a set of data processing elements connected in series, so that the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time-sliced fashion; in that case, some amount of buffer storage is often inserted between elements.
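
A minimal software sketch of this series-connected idea uses Python generators, where each stage consumes the previous stage's output. The stage names and the tiny text "instruction" format are hypothetical and chosen only for illustration; generators run in a time-sliced fashion, not in parallel as hardware stages do.

```python
def fetch(program):
    """Stage 1: hand out instruction text one item at a time."""
    for text in program:
        yield text


def decode(fetched):
    """Stage 2: split each instruction into an opcode and integer operands."""
    for text in fetched:
        op, *operands = text.split()
        yield op, [int(x) for x in operands]


def execute(decoded):
    """Stage 3: compute the result of each decoded instruction."""
    for op, operands in decoded:
        if op == "add":
            yield sum(operands)
        elif op == "mul":
            result = 1
            for value in operands:
                result *= value
            yield result
        else:
            raise ValueError(f"unknown op: {op}")


def run_pipeline(program):
    # The output of each stage is the input of the next one.
    return list(execute(decode(fetch(program))))


results = run_pipeline(["add 1 2", "mul 3 4", "add 10 20 30"])
print(results)
```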

Intel® Wide Dynamic Execution

• Dynamic execution is a combination of these techniques:
– Data-flow analysis.
– Out-of-order execution (OoOE).
– Speculative execution.
– Superscalar execution.
• Intel first implemented these techniques in the P6 microarchitecture, used in the Pentium® Pro, Pentium® II and Pentium® III processors.
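
Data-flow analysis and out-of-order execution can be sketched with a toy scheduler: an instruction starts as soon as the results it depends on are available, not in program order. The instruction format, the latencies, and the `schedule_out_of_order` name are assumptions made for this example; real hardware scheduling is considerably more involved.

```python
def schedule_out_of_order(instructions, latency, issue_width=4):
    """Return {name: start_cycle}; an instruction starts once its sources are done.

    `instructions` is a list of (name, op, sources), where each source is the
    name of another instruction in the list whose result is needed.
    """
    finish = {}   # name -> cycle at which the result becomes available
    start = {}
    pending = list(instructions)
    cycle = 0
    while pending:
        issued = 0
        for instr in list(pending):
            name, op, sources = instr
            ready = all(s in finish and finish[s] <= cycle for s in sources)
            if ready and issued < issue_width:
                start[name] = cycle
                finish[name] = cycle + latency[op]
                pending.remove(instr)
                issued += 1
        cycle += 1
    return start


# Hypothetical program: two independent load/add chains feeding a final add.
program = [
    ("a", "load", []),
    ("b", "add", ["a"]),
    ("c", "load", []),
    ("d", "add", ["c"]),
    ("e", "add", ["b", "d"]),
]
start = schedule_out_of_order(program, latency={"load": 3, "add": 1})
print(start)
```

Here the second load ("c") starts in cycle 0 alongside the first, even though it comes later in program order, so the final add can start in cycle 4 because the two loads overlap.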

Intel® Wide Dynamic Execution

• It enables delivery of more instructions per clock cycle to improve execution time and energy efficiency.

• Every execution core is 33 percent wider than previous generations, allowing each core to fetch, dispatch, execute and retire up to four full instructions simultaneously.
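
The "33 percent wider" figure follows from moving from three to four instructions per clock. The short calculation below is a sketch; the `min_cycles` bound assumes every issue slot is filled on every clock, which real code rarely achieves.

```python
import math

OLD_WIDTH = 3   # instructions per clock for the earlier designs
NEW_WIDTH = 4   # instructions per clock for each Core microarchitecture core


def relative_gain(new_width, old_width):
    """Fractional increase in width, e.g. 4 vs 3 gives about 0.33."""
    return new_width / old_width - 1.0


def min_cycles(instruction_count, width):
    """Idealized lower bound on cycles: assumes no stalls or dependencies."""
    return math.ceil(instruction_count / width)


gain = relative_gain(NEW_WIDTH, OLD_WIDTH)
print(f"width gain: {gain:.0%}")
print(min_cycles(1200, OLD_WIDTH), min_cycles(1200, NEW_WIDTH))
```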

Intel® Advanced Digital Media Boost

• SIMD:
– In computing, SIMD (Single Instruction, Multiple Data) is a technique employed to achieve data-level parallelism, as in a vector or array processor.
• SSE (Streaming SIMD Extensions):
– A SIMD instruction set designed by Intel and introduced in 1999 in its Pentium III series processors as a reply to AMD's 3DNow! (which had debuted a year earlier).
– Contains 70 new instructions.
– SSE2 and SSE3 are later versions of SSE.
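
To make the SIMD idea concrete, the following Python sketch models a 128-bit register as 16 bytes holding four 32-bit floats, and a single "instruction" that adds all four lanes at once (similar in spirit to a packed single-precision add). The helper names are hypothetical; this models the data layout only and does not execute real SSE instructions.

```python
import struct

LANE_FORMAT = "<4f"   # four little-endian 32-bit floats = 128 bits


def pack128(values):
    """Pack four floats into a 16-byte 'register'."""
    return struct.pack(LANE_FORMAT, *values)


def unpack128(register):
    """Unpack a 16-byte 'register' into a tuple of four floats."""
    return struct.unpack(LANE_FORMAT, register)


def simd_add(reg_a, reg_b):
    """One operation, four data elements: lane-wise addition."""
    lanes = (x + y for x, y in zip(unpack128(reg_a), unpack128(reg_b)))
    return pack128(lanes)


a = pack128([1.0, 2.0, 3.0, 4.0])
b = pack128([10.0, 20.0, 30.0, 40.0])
total = unpack128(simd_add(a, b))
print(len(a) * 8, total)
```

A scalar version would need four separate additions to produce the same four results.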

Intel® Advanced Digital Media Boost

• Enables 128-bit SSE instructions to execute at a throughput rate of one per clock cycle, effectively doubling the speed of execution for these instructions compared to previous generations.

• This feature significantly improves performance when executing Streaming SIMD Extension (SSE/SSE2/SSE3) instructions:
– Video, speech and image (MPEG).
– Photo processing.
– Encryption.

Intel® Advanced Smart Cache

The Intel® Advanced Smart Cache is a multi-core optimized cache that significantly reduces latency to frequently used data. It improves performance and efficiency by increasing the probability that each execution core of a multi-core processor can access data from a higher-performance, more efficient cache subsystem.
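
One way to see why a shared cache can help is a toy simulation in which one core is busy and the other is idle. The sketch below compares a private half-size cache with a shared full-size cache using a simple LRU policy. The cache sizes, the access trace, and the LRU policy are assumptions for illustration and do not describe the real L2 design.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fully associative cache with least-recently-used replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, address):
        if address in self.lines:
            self.lines.move_to_end(address)   # mark as most recently used
            self.hits += 1
            return
        self.misses += 1
        self.lines[address] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)    # evict the least recently used line


# Hypothetical workload: core 0 loops over three lines while core 1 is idle.
trace = [0, 1, 2] * 4

private_l2 = LRUCache(capacity=2)   # core 0's half of a split 4-line L2
shared_l2 = LRUCache(capacity=4)    # the whole 4-line L2 is available to core 0

for address in trace:
    private_l2.access(address)
    shared_l2.access(address)

print(private_l2.hits, shared_l2.hits)
```

With this trace the private cache never hits, because three lines do not fit in two, while the shared cache hits on 9 of the 12 accesses after its three initial misses.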

Intel® Smart Memory Access

• Optimizes the use of the available data bandwidth from the memory subsystem.

• Includes a new capability called "Memory Disambiguation", which increases the efficiency of out-of-order processing by providing the execution cores with the built-in intelligence to speculatively load data for instructions that are about to execute before all previous store instructions are executed.
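
The speculate-then-verify idea behind memory disambiguation can be sketched as follows. This is a toy model under stated assumptions: the function name, the dictionary-based memory, and the replay flag are hypothetical, and the prediction logic that real hardware uses to decide when to speculate is not modelled.

```python
def load_with_disambiguation(memory, older_stores, load_address):
    """Load early, then verify against older stores once they execute.

    Returns (value, replayed). `older_stores` is a list of (address, value)
    pairs that precede the load in program order but have not executed yet.
    """
    # Speculative step: read memory before the older stores have executed.
    speculative_value = memory.get(load_address, 0)

    # Later: the older stores execute and their addresses become known.
    conflict = False
    for store_address, store_value in older_stores:
        memory[store_address] = store_value
        if store_address == load_address:
            conflict = True

    if conflict:
        # Mis-speculation: the load must be replayed to see the stored value.
        return memory[load_address], True
    return speculative_value, False


value_ok, replayed_ok = load_with_disambiguation({100: 7}, [(200, 1)], 100)
value_bad, replayed_bad = load_with_disambiguation({100: 7}, [(100, 9)], 100)
print(value_ok, replayed_ok, value_bad, replayed_bad)
```

When the older store targets a different address, the early load is correct and no time is lost; when it targets the same address, the load is replayed and returns the stored value.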

Intel® Intelligent Power Capability

• A set of capabilities designed to reduce power consumption and design requirements.

• This feature manages the runtime power consumption of all the processor's execution cores and delivers power only to the parts that need it.

Feature / Description / Function / Benefit

• Intel® Advanced Smart Cache
– Description: Up to 4 MB shared, multi-core optimized L2 cache. Higher L2 cache to processor core bandwidth.
– Function: Improves execution core access to data in the high-performance L2 cache. Dynamically allocates cache based on core workload; the entire L2 cache can be allocated to one core (dedicated L2 for each core in PDP and K8 DC).
– Benefit: Better performance on single and multithreaded applications.

• Intel® Advanced Digital Media Boost
– Description: Single-cycle SSE/2/3 instruction execution.
– Function: Allows 128-bit SSE/2/3 instructions to execute in a single clock cycle (versus 2 cycles for PDP, Yonah-DC, and K8 DC).
– Benefit: Better performance on video, gaming and multimedia applications (applications that rely on SSE instructions).

• Intel® Wide Dynamic Execution
– Description: Efficient 4-wide, 14-stage pipeline.
– Function: Executes 4 instructions per clock (versus 3 per clock with PDP, Yonah-DC, and K8 DC).
– Benefit: Better performance on multiple application types and user environments.

• Intel® Intelligent Power Capability
– Description: Powers on processor elements only when needed. More precise control of power to buses and arrays.
– Function: Conroe 65 W desktop mainstream TDP. Merom continues low-power mobile processor direction.
– Benefit: Can help enable quieter, lower-power system designs.

• Intel® Smart Memory Access
– Description: Improved pre-fetchers. Out-of-order memory access.
– Function: Feeds the Intel Wide Dynamic Execution engine (i.e., "fuel injection" for the Core engine). Benefits memory operations by reducing latency.
– Benefit: Better performance on all types of applications and user environments.

New levels of performance and power efficiency based on Intel® Core™ Microarchitecture

Intel Core 2 Duo Vs. AMD AM2

The results are from SYSmark 2004 SE, which simulates real-life workloads for both Internet Content Creation and Office Productivity. The content-creation part uses apps like Photoshop, 3ds Max, Dreamweaver, and more, while the office-productivity tests use typical office apps, such as PowerPoint, Word, and Excel.

Resources

• Intel.com
• PCWorld.com
• ExtremeTech.com
• Wikipedia.org
• Microsoft.com
