
Projects on the Intel Single-chip Cloud Computer (SCC)

Jan-Arne Sobania

Dr. Peter Tröger

Prof. Dr. Andreas Polze

Operating Systems and Middleware Group

Hasso Plattner Institute for Software Systems Engineering

University of Potsdam

Projects on the Intel SCC | Jan-Arne Sobania


Outline

■ Intel SCC Overview

■ Our Contributions

□ Debugging Support: Virtual Serial Ports

□ Understanding the SCC: New SCC Linux

□ More Debugging: Dynamic Voltage and Frequency Scaling

□ SCC Virtualization: The RockyVisor

■ Summary


Intel Single-Chip Cloud Computer

■ 48 x86 cores organized in 24 tiles

■ A 24-router mesh network with 256 GB/s bisection bandwidth

■ 4 integrated DDR3 memory controllers

■ Hardware support for message passing

■ Power management: 24 frequency islands (FIs), 6 voltage islands (VIs)

■ No hardware cache-coherency

Research Processor, not a product


[Photo labels: Management Console PC (MCPC) with Intel Core i7 on an Intel DX58SO board; Intel SCC]


[Photo labels: Intel SCC (“RockyLake” inside here); Management Console PC (MCPC), Intel Core i7]


[Board photo labels:]

■ South Bridge (via FPGA)

■ BMC

■ On-Board I/O

■ SCC (under the heat spreader)

■ Host/MCPC Connection: PCIe x4, BMC-LAN


SCC: Cluster-on-a-chip

■ Heterogeneous cores

□ Memory and I/O bandwidth/latency depend on position on the die

□ 32-bit cores: not all system memory visible all the time

□ Again: no hardware cache coherency


SCC Address Translation

■ “GaussLake” cores derive from the P54C (original Pentium, ~90 MHz)

□ 32-bit Physical Addresses

■ All components accessed via 46-bit System Address

□ Route to tile (x,y coordinates)

□ Destination ID (directly connected to router)

□ 34-bit address per destination

■ Mapping via per-core Look-Up Table (LUT)


SCC Physical -> System Address Mapping

■ Traditional P54C: Physical Address would be seen on FSB

■ On SCC: Physical Address mapped to System Address via Look-Up Table (LUT)

□ Memory access via mesh message

■ 8-bit LUT index translates to:

□ 1-bit bypass

□ 8-bit route

□ 3-bit destination

□ 10-bit address extension

Up to 34 bits (16 GB) per destination: 10-bit address extension plus the remaining 24 physical address bits
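The translation step can be sketched in a few lines. This is a minimal illustration built only from the bit widths listed above; it assumes the 8-bit LUT index is taken from the top 8 bits of the 32-bit physical address, so the low 24 bits pass through unchanged. The names `LutEntry` and `translate` are hypothetical, and the sketch returns the fields separately instead of guessing how the hardware packs them into the 46-bit system address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LutEntry:
    """One LUT entry, using the field widths from the slide (names are hypothetical)."""
    bypass: int    # 1 bit
    route: int     # 8 bits: route to the target tile
    dest: int      # 3 bits: destination attached to that tile's router
    addr_ext: int  # 10 bits: upper bits of the per-destination address


def translate(phys_addr, lut):
    """Map a 32-bit core physical address to (bypass, route, dest, 34-bit address)."""
    index = (phys_addr >> 24) & 0xFF   # assumed: top 8 bits select the LUT entry
    offset = phys_addr & 0xFFFFFF      # low 24 bits are passed through
    entry = lut[index]
    addr34 = (entry.addr_ext << 24) | offset  # 10 + 24 = 34 bits
    return entry.bypass, entry.route, entry.dest, addr34


# Example: a single made-up LUT entry for index 0x80.
lut = {0x80: LutEntry(bypass=0, route=0x12, dest=3, addr_ext=0x3FF)}
result = translate(0x80001234, lut)
```

Because each core has its own table, two cores can map the same physical address to different destinations, which is how a 32-bit core reaches more memory than it can address at once.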


SCC Debugging: Virtual Serial Ports

■ How to debug operating systems on SCC cores?

□ Virtual Ethernet requires full network driver stack (oops…)

□ Regular x86 Linux: 16550A UART (“serial port”)

□ SCC does not have a chipset, no UART implemented

■ Our contribution: Virtual Serial Ports

□ 4x 16550A-compatible UARTs per SCC core (192 total)

□ Fully transparent for SCC Linux; use stock driver

  □ But without interrupts…

□ Implemented in kernel driver on MCPC

□ Connected to device on MCPC via virtual Null-Modem Cable

  □ minicom -D /dev/crbif0rb0c9ttyS0 (Core 9’s /dev/ttyS0)
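Without interrupts, a 16550A has to be driven by polling its Line Status Register (LSR). The sketch below shows that polling pattern against a toy loopback device; the register offsets and LSR bits are the standard 16550A ones, while `LoopbackUart`, `send_polled` and `receive_polled` are hypothetical names that stand in for the real virtual UART and driver.

```python
REG_DATA = 0           # offset 0: RBR on read, THR on write
REG_LSR = 5            # offset 5: Line Status Register
LSR_DATA_READY = 0x01  # LSR bit 0: a received byte is waiting
LSR_THR_EMPTY = 0x20   # LSR bit 5: transmit holding register is empty


class LoopbackUart:
    """Toy stand-in for a virtual UART: every byte written comes straight back."""

    def __init__(self):
        self.rx = []

    def read_reg(self, offset):
        if offset == REG_LSR:
            return LSR_THR_EMPTY | (LSR_DATA_READY if self.rx else 0)
        if offset == REG_DATA:
            return self.rx.pop(0) if self.rx else 0
        return 0

    def write_reg(self, offset, value):
        if offset == REG_DATA:
            self.rx.append(value & 0xFF)


def send_polled(uart, data):
    """Transmit bytes by busy-waiting on THR-empty instead of a TX interrupt."""
    for byte in data:
        while not uart.read_reg(REG_LSR) & LSR_THR_EMPTY:
            pass
        uart.write_reg(REG_DATA, byte)


def receive_polled(uart):
    """Drain all bytes currently flagged as ready in the LSR."""
    out = bytearray()
    while uart.read_reg(REG_LSR) & LSR_DATA_READY:
        out.append(uart.read_reg(REG_DATA))
    return bytes(out)


uart = LoopbackUart()
send_polled(uart, b"hi")
echoed = receive_polled(uart)
```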


Understanding the SCC: New SCC Linux

■ Intel-provided SCC Linux:

□ Based on 2.6.16 (released 2006)

  □ Outdated, no virtualization

□ Various patches for SCC scattered throughout the kernel

  □ Hard to understand

  □ Hard to port to newer versions

■ Our contribution: New patches based on Linux’s subarchitectures

□ SCC BIOS: minimal BIOS “emulation”, no more patches

□ Re-engineered other patches into subarchitecture callbacks

□ New drivers for hardware features (global TSC, SCC /proc, …)

Merged with new Intel patches in September (on-site)

To be published with next SCC firmware in December

Maintained by our group at HPI
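The idea behind subarchitecture callbacks can be illustrated with a small table of hooks: generic code calls through the table, and a platform overrides only the entries it needs instead of patching the generic code in place. This is a conceptual sketch only; the hook names and return strings are hypothetical and do not correspond to real kernel symbols or to the actual SCC patches.

```python
# Default hooks used by a regular PC platform (names are illustrative).
DEFAULT_HOOKS = {
    "detect_memory": lambda: "default: detect_memory",
    "init_timer": lambda: "default: init_timer",
    "init_irq": lambda: "default: init_irq",
}

# A platform such as the SCC overrides only what differs.
SCC_HOOKS = {
    "detect_memory": lambda: "scc: detect_memory",
    "init_timer": lambda: "scc: init_timer",
}


def build_platform(overrides):
    """Start from the defaults and replace the hooks the platform provides."""
    hooks = dict(DEFAULT_HOOKS)
    hooks.update(overrides)
    return hooks


def boot(hooks):
    """Generic boot code: unchanged no matter which platform is selected."""
    return [hooks[name]() for name in ("detect_memory", "init_timer", "init_irq")]


boot_log = boot(build_platform(SCC_HOOKS))
```

Keeping all platform-specific behaviour behind such a table is what makes the patch set easier to carry forward to newer kernel versions.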


More Debugging: Dynamic Voltage and Frequency Scaling (DVFS)

■ SCC has programmable Voltage Regulator Controller (VRC) and Clock Generators

□ 6 Voltage Islands: 2x2 tiles each

□ 24 Frequency Islands: one per tile (100-800 MHz)

□ Reconfigurable at runtime (aside from mesh and routers)
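The island layout can be expressed as a simple mapping from tile coordinates. The sketch assumes the 24 tiles form a grid of 6 columns by 4 rows, which matches six 2x2 voltage islands; the island numbering used here is hypothetical and is not meant to match the hardware's own island IDs.

```python
MESH_COLS, MESH_ROWS = 6, 4  # assumed tile grid: 6 x 4 = 24 tiles


def frequency_island(x, y):
    """One frequency island per tile (illustrative numbering, row by row)."""
    return y * MESH_COLS + x


def voltage_island(x, y):
    """Tiles are grouped into 2x2 blocks; each block is one voltage island."""
    return (y // 2) * (MESH_COLS // 2) + (x // 2)


tiles = [(x, y) for y in range(MESH_ROWS) for x in range(MESH_COLS)]
islands = {}
for x, y in tiles:
    islands.setdefault(voltage_island(x, y), []).append((x, y))
```

A consequence of this grouping is that lowering the voltage of one island affects all four of its tiles, so the island voltage has to be sufficient for the fastest tile clock in that block.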


Fixing undocumented JTAG Sequences

■ Implementation Bug: Tile frequencies cannot be changed

□ Depending on startup configuration (initial tile, mesh and memory clock), reprogramming tile clock generators fails silently

  □ Register settings change, but frequency does not

  □ Time measurement on cores unreliable

  □ Accidental under-volting possible

■ Reason: Fixed Frequency Bits/“Override Settings” set on startup

□ Undocumented bits in per-tile register control which settings can be changed by software

■ Our contribution: Fixed undocumented JTAG sequences

Approved by Intel, next SCC firmware will include our patches


Software-Managed Coherency


Shared Memory Driver: Lessons learned

■ Shared Memory Driver simulates coherency for applications

□ Still requires changes to management layer (malloc, mmap,…)

Not sufficient for SMP simulation for legacy software

■ Simulating an SMP (really: Single System Image/SSI) system

□ User mode/custom libc: simulate system calls (many…)

□ Kernel mode (upper level): simulate SMP-style API (many…)

□ Kernel mode (lower level): synchronize data structures

  □ Breaks OS layering (+ brittle)

□ Hypervisor mode: simulate SMP platform

  □ Unmodified Applications and OS

  □ Hypervisor implements SMP interfaces the real HW lacks

That’s what the RockyVisor does on SCC


SCC Virtualization: The RockyVisor

■ A Hypervisor for the Intel SCC Platform (“RockyLake”)

□ Hosted Virtual Machine, Hypervisor runs as process in host OS

□ “Virtual Machine Monitor” and “Hypervisor” used as synonyms

■ Multiple independent nodes, each having local memory

■ Host OS: One instance per node (“Level 1 OS”)

□ Default SCC operating mode


SCC Virtualization: The RockyVisor

■ Separate hypervisor process per LV1 instance

■ Hypervisors communicate via on-die message passing

■ Responsibilities

□ Virtual devices (console, network, …) belong to single HV

□ Virtual (cache-coherent) Shared Memory

□ VM-global state in shared memory (virtual interrupt controller)

□ Virtual CPU per HV (registers, TLB, shadow page tables)


SCC Virtualization: The RockyVisor

■ Cooperating Hypervisors provide illusion of SMP hardware

■ Cache coherency simulated via DSM-like protocol

□ SCC already supports shared memory, so no replication

□ Explicit cache flushes if memory needed by other node

□ Use x86 Page Protection to prevent harmful interference
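A DSM-like scheme of this kind can be sketched as page ownership tracking. The model below is a deliberate simplification and an assumption, not the RockyVisor's actual protocol: each page has exactly one owner at a time, a node that touches a page it does not own takes a (simulated) page fault, and the previous owner flushes its cache and loses access before ownership moves. All class and method names are hypothetical.

```python
class Node:
    """One SCC core with its hypervisor instance (simplified)."""

    def __init__(self, name):
        self.name = name
        self.accessible = set()  # pages whose page protection is currently lifted
        self.flushes = 0

    def flush_and_revoke(self, page):
        # Write back cached data for the page, then protect the page again.
        self.flushes += 1
        self.accessible.discard(page)


class CoherencyDirectory:
    """Tracks which node currently owns each shared page."""

    def __init__(self):
        self.owner = {}  # page number -> Node

    def access(self, node, page):
        """Model `node` touching `page`; returns 'hit' or 'fault'."""
        if page in node.accessible:
            return "hit"                     # no hypervisor involvement needed
        previous = self.owner.get(page)
        if previous is not None and previous is not node:
            previous.flush_and_revoke(page)  # explicit flush on the old owner
        self.owner[page] = node
        node.accessible.add(page)
        return "fault"


a, b = Node("core0"), Node("core1")
directory = CoherencyDirectory()
trace = [
    directory.access(a, 7),  # first touch by core0
    directory.access(a, 7),  # core0 already owns the page
    directory.access(b, 7),  # core1 takes over, core0 must flush
    directory.access(a, 7),  # ownership moves back
]
```

Since the memory itself is already shared on the SCC, only ownership and cache contents move between nodes; no page data is copied.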


SCC Virtualization: The RockyVisor

■ Cooperating Hypervisors provide illusion of SMP hardware

■ Each hypervisor adds one Virtual CPU per physical CPU

□ VCPUs communicate just like in real SMP

□ Inter-Processor-Interrupts (IPI) via Inter-HV messages
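Delivering an IPI through inter-hypervisor messages might look like the following sketch. It assumes each hypervisor has an inbox that stands in for the on-die message passing mechanism and that it polls this inbox before resuming its VCPU; the class, the message format and the vector value are all hypothetical.

```python
from collections import deque


class Hypervisor:
    """One hypervisor instance hosting one virtual CPU (simplified)."""

    def __init__(self, vcpu_id):
        self.vcpu_id = vcpu_id
        self.inbox = deque()          # stands in for on-die message passing
        self.pending_vectors = set()  # interrupts waiting to be injected

    def send_ipi(self, peers, target_vcpu, vector):
        """Guest wrote to the virtual interrupt controller: forward as a message."""
        peers[target_vcpu].inbox.append(("ipi", self.vcpu_id, vector))

    def poll_messages(self):
        """Drain the inbox and mark received IPIs as pending for the local VCPU."""
        while self.inbox:
            kind, _sender, vector = self.inbox.popleft()
            if kind == "ipi":
                self.pending_vectors.add(vector)


hypervisors = {i: Hypervisor(i) for i in range(3)}
hypervisors[0].send_ipi(hypervisors, target_vcpu=2, vector=0xFD)
pending_before = set(hypervisors[2].pending_vectors)
hypervisors[2].poll_messages()
pending_after = set(hypervisors[2].pending_vectors)
```

From the guest's point of view nothing changes: it programs its virtual interrupt controller as on real SMP hardware, and the message exchange stays hidden inside the hypervisors.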


SCC Virtualization: The RockyVisor

■ Cooperating Hypervisors provide illusion of SMP hardware

□ Allow standard SMP OS (“Level 2 OS”) in VM

□ LV2 kernel needs to support virtual peripherals


Summary

■ Intel SCC

A prototype for future many-core architectures

■ Our contributions

□ Debugging Support for cores: Virtual Serial Ports

□ New Linux patches for easy porting to new kernel versions

□ Fixed JTAG sequences to enable Frequency Scaling

Both will become an official part of the SCC software distribution

□ Distributed Hypervisor to support symmetric multiprocessing operating systems (SMP OS): Our RockyVisor
