1
Hardware design for Cloud-scale datacenters
USENIX LISA14
2
Public Cloud
Disaster Recovery / Business Continuity
3
2.4+ million emails per day
200+ Cloud Services
1+ billion customers · 20+ million businesses · 90+ markets worldwide
5.8+ billion worldwide queries each month
1 in 4 enterprise customers
50+ billion minutes of connections handled each month
48+ million users in 41 markets
50+ million active users
400+ million active accounts
250+ million active users
8.6+ trillion objects in Microsoft Azure storage
4
Design
Scale | <10K (SMB/Enterprise) | 100K (Hosters) | 1M (Cloud-Scale)
# SKUs | Several | Limited | Extremely limited
Redundancy model | Hardware based (Hot-*) | Software based (Local datacenter) | Software based (Geo-distributed)
HW availability | 99.999% or higher | 99.9% - 99.999% | 99% - 99.9%
HW type | Enterprise SKU | Off-the-shelf design, custom integration | Custom designs, custom integration
Infrastructure co-design | None | Limited integration with Datacenter and Network | OS, Datacenter, Server and Network tightly integrated
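The HW availability figures in the Design comparison are easier to compare as downtime per year. The sketch below is illustrative arithmetic only: the function names are hypothetical, and the replica calculation assumes independent failures, which the slides do not state.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Expected unavailable hours per year for an availability fraction."""
    return (1.0 - availability) * HOURS_PER_YEAR


def replicated_availability(availability: float, replicas: int) -> float:
    """Availability of a service that stays up while at least one of
    `replicas` copies is up. Assumes failures are independent."""
    return 1.0 - (1.0 - availability) ** replicas


if __name__ == "__main__":
    for a in (0.99, 0.999, 0.99999):
        print(f"{a:.3%} available -> {downtime_hours_per_year(a):.2f} h/year down")
    # Three software-managed replicas of 99% hardware.
    print(f"3 replicas at 99%: {replicated_availability(0.99, 3):.6%}")
```

Under these assumptions, 99% hardware is down roughly 87.6 hours a year and 99.999% hardware roughly 5 minutes a year, which is the gap that software-based redundancy has to absorb at cloud scale.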
5
Operations
Scale | <10K (SMB/Enterprise) | 100K (Hosters) | 1M (Cloud-Scale)
Break/fix support | 24 hours x 7 days | 8 hours x 5 days | Up to 1-2 weeks
Issue triage model | IT admin | Some automation, Admin support | Fully automated, Machine learning
OOB HW management | Full command set, BMC required | Basic feature set, BMC required | Power On/Off only, No BMC
Management domain scale | 100's of servers | 1000's of servers | 10's of 1000's of servers
FRU granularity | Hot-swappable components | Component replacement | Entire server replacement
6
[Diagram: two Storage Stamps, each with the following layers]
Partition Layer
Stream Layer
Intra-Stamp Replication
7
…
Write
Commit operation
Erasure Coding operations
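The slide names erasure coding operations without giving the scheme. As a minimal illustration of the idea, the sketch below uses single XOR parity, a deliberately simplified stand-in and not the code used by the storage system described here. All function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode(fragments: List[bytes]) -> bytes:
    """Return one parity fragment: the XOR of all data fragments."""
    return reduce(xor_bytes, fragments)


def recover(fragments: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing fragment (marked None) from the parity."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one lost fragment")
    survivors = [f for f in fragments if f is not None]
    if not missing:
        return survivors
    # XOR of parity with every surviving fragment yields the lost one.
    rebuilt = reduce(xor_bytes, survivors, parity)
    restored = list(fragments)
    restored[missing[0]] = rebuilt
    return restored


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]
    parity = encode(data)
    print(recover([b"abcd", None, b"ijkl"], parity))
```

Single parity tolerates one lost fragment at the cost of one extra fragment of storage; production schemes trade more parity fragments for tolerance of more simultaneous failures.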
8
Query performance is measured as an aggregate of ALL compute nodes
Source: “Web search using mobile cores”, ISCA 2010
Query distribution
Index unit 1, Index unit 2, Index unit …, Index unit n
Index unit 1: Partition 11, Partition 12, Partition 1m
Index unit 2: Partition 21, Partition 22, Partition 2m
Index unit n: Partition n1, Partition n2, Partition nm
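One way to read the query distribution picture is as a scatter-gather: a query fans out to every partition of every index unit, and the answer is merged from all partial results, so the result is only complete once the slowest partition has replied. The sketch below is a toy model under that assumption; the in-memory index, the function names, and the scoring are all hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Hit = Tuple[float, str]            # (score, document id)
Partition = Dict[str, List[Hit]]   # toy inverted index: term -> hits


def search_partition(partition: Partition, term: str, k: int) -> List[Hit]:
    """Top-k hits for one term within a single partition."""
    return heapq.nlargest(k, partition.get(term, []))


def search(index_units: List[List[Partition]], term: str, k: int = 3) -> List[Hit]:
    """Fan the query out to every partition, then merge the partial top-k lists."""
    partitions = [p for unit in index_units for p in unit]
    with ThreadPoolExecutor(max_workers=len(partitions) or 1) as pool:
        # map() waits for every partition, so the slowest one bounds latency.
        partials = list(pool.map(lambda p: search_partition(p, term, k), partitions))
    return heapq.nlargest(k, (hit for hits in partials for hit in hits))


if __name__ == "__main__":
    index_units = [
        [{"cloud": [(0.9, "doc-a")]}, {"cloud": [(0.4, "doc-b")]}],
        [{"cloud": [(0.7, "doc-c")]}, {"server": [(0.8, "doc-d")]}],
    ]
    print(search(index_units, "cloud", k=2))
```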
9
10
Performance
Power
Cost
Reliability
Uniformity
Customization
Agility
Simplicity
11
Architecture should adapt to a variety of cloud workloads
Support for global datacenter operating environments
Standards referenced: (CISPR, ANSI, IEC), (UL, IEC, CSA)
12
Design Principles
Standardization & Modularization
Design Simplicity
Operations Excellence
13
Open CloudServer (OCS) design
Open Source Code: Chassis management, Operations Toolkit
Board Files & Gerbers: Power Distribution Backplane, Tray Backplane
Mechanical CAD Models: Chassis, Blade, Mezzanines
Specifications: Chassis, Blade, Mezzanines, Management APIs, Certification Requirements
http://www.opencompute.org/wiki/Server/SpecsAndDesigns
14
Shared infrastructure for efficiency and TCO optimization
Compute blade
Signal backplane
Shared power
Shared management
JBOD expansion
Shared fans
12U Shared Chassis, EIA Rack Mountable
15
Blind-mated signal connectivity
Simplified installation and repair
Cable free design for significantly fewer operator errors during servicing
Reduces need for cabling reseats
Blind-mated connectors (12V Power, Ethernet, SAS, Management)
Signal Backplane
[Chart: Network Repairs; legend: H/W Replaced, Reseated; values shown: 25%, 75%]
16
HDDs are #1 failure item
AFR increases with temperature [1]
Simplified fan control cools HDDs
HDDs in front of hot motherboard
Closed loop fan moderates temperatures
[1] DSN 2011: Impact of Temperature on Hard Disk Drive Reliability in Large Datacenters
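The slide does not describe the fan control algorithm. As a rough illustration of what closed-loop control means here, the sketch below uses a simple proportional controller: measured temperature feeds back into fan duty cycle. The target temperature, gain, and duty limits are made-up values, not figures from the OCS design.

```python
def fan_duty(temp_c: float,
             target_c: float = 40.0,   # hypothetical HDD temperature target
             gain: float = 5.0,        # hypothetical % duty per degree C of error
             min_duty: float = 20.0,
             max_duty: float = 100.0) -> float:
    """Return a fan PWM duty cycle (percent) for a measured temperature.

    Proportional control: the further the reading is above the target,
    the harder the fans run, clamped to the allowed duty range.
    """
    error = temp_c - target_c
    duty = min_duty + gain * error
    return max(min_duty, min(max_duty, duty))


if __name__ == "__main__":
    for reading in (35.0, 45.0, 50.0, 70.0):
        print(f"{reading:.0f} C -> {fan_duty(reading):.0f}% duty")
```

In a real chassis the loop would run periodically against live sensor readings; a proportional term alone is shown only to keep the example short.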
17
Secure OOB management
Low-cost embedded x86 SoC
REST API for machine management
CLI interface for human operations
Hard-wired management
On/Off to blade power cut-off circuit
IPMI-over-serial out of band communication
Fan and PSU control and monitoring
Remote switch and CM power control
Chassis Manager (CM)
[Block diagram labels: PDB; X86 SoC; COM1 to COM6; 1GbE (x2); CPLD Serial Multiplexer (x2); Serial to I2C (x2); Blade Address; TX/RX; RS232 Serial to/from blades; Blade Enable ON/OFF; 6 PSU; 6 Fans; Fan Control; GPIO; I2C Mux; PMBUS; PWM; Remote Power Control ON/OFF]
18
Security at all layers: Hardware, UEFI, APIs, User Management
Trusted Platform Module v1.2: Blades and Chassis manager
UEFI Firmware v2.3.2: Secure BIOS and Boot
Chassis manager interfaces: TLS (SSL) and IPsec for communication encryption
User Management: Active Directory integration and authentication
[Diagram labels: UEFI 2.3.2, TPM, TLS/SSL, IPsec, Role Based Management, Active Directory Integration]
19
BMC-Lite
Included:
IPMI basic mode over Serial
I2C Master (SDR)
UART I/O
System Event Log
Power Control
Not included:
KVM, Video drivers
Ethernet, Network Stack or SOL
USB
Full IPMI Command Set
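For readers unfamiliar with IPMI basic mode over serial, the sketch below shows how a message might be framed on the wire. The start, stop, and escape byte values and the two's-complement checksum reflect the IPMI specification's basic mode as best understood here and should be verified against the specification; the function names are hypothetical, and nothing below is taken from the OCS firmware.

```python
START, STOP, ESCAPE = 0xA0, 0xA5, 0xAA

# Bytes that may not appear raw inside a frame, and their escaped codes.
_ENCODE = {0xA0: 0xB0, 0xA5: 0xB5, 0xA6: 0xB6, 0xAA: 0xBA, 0x1B: 0x3B}
_DECODE = {code: raw for raw, code in _ENCODE.items()}


def checksum(data: bytes) -> int:
    """8-bit two's-complement checksum: data plus checksum sums to 0 mod 256."""
    return (-sum(data)) & 0xFF


def frame(message: bytes) -> bytes:
    """Wrap a message in start/stop bytes, escaping reserved values."""
    out = bytearray([START])
    for b in message:
        if b in _ENCODE:
            out += bytes([ESCAPE, _ENCODE[b]])
        else:
            out.append(b)
    out.append(STOP)
    return bytes(out)


def unframe(packet: bytes) -> bytes:
    """Reverse frame(): strip start/stop bytes and undo escaping."""
    if len(packet) < 2 or packet[0] != START or packet[-1] != STOP:
        raise ValueError("missing start or stop byte")
    out = bytearray()
    body = iter(packet[1:-1])
    for b in body:
        if b == ESCAPE:
            code = next(body, None)
            if code not in _DECODE:
                raise ValueError("invalid escape sequence")
            out.append(_DECODE[code])
        else:
            out.append(b)
    return bytes(out)


if __name__ == "__main__":
    payload = bytes([0x20, 0xA0, 0x1B])
    print(frame(payload).hex(" "))
```

Byte-level framing of this kind needs only a UART, which fits a management controller that omits a network stack.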
20
Targeted for deployment and production support
Features
http://github.com/MSOpenTech/OCSOperationsToolKit
21
Identify defective components by physical location
Summarize data for quick repairs
22
View configuration command - View-WcsConfig
23
View-Disk, View-Dimm, View-Nic, View-Fru, etc
24
Check, clear, and log the Windows System Event Log and BMC SEL
View contents of BMC SEL
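To make the BMC SEL contents concrete, the sketch below decodes one 16-byte system event record. The field layout follows the IPMI specification's SEL event record as best understood here and should be checked against the specification; it is not the toolkit's own implementation, and the class and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class SelEvent(NamedTuple):
    record_id: int
    record_type: int     # 0x02 denotes a system event record
    timestamp: int       # seconds, as stored by the BMC
    generator_id: int
    sensor_type: int
    sensor_number: int
    deassertion: bool    # top bit of the event dir/type byte
    event_type: int      # low 7 bits of the event dir/type byte
    event_data: bytes    # three event data bytes


# Little-endian: id(2) type(1) time(4) generator(2) rev(1)
# sensor type(1) sensor number(1) dir/type(1) data(3) = 16 bytes.
_LAYOUT = struct.Struct("<HBIHBBBB3s")


def parse_sel_record(raw: bytes) -> SelEvent:
    """Decode one raw 16-byte SEL record into named fields."""
    if len(raw) != _LAYOUT.size:
        raise ValueError(f"expected {_LAYOUT.size} bytes, got {len(raw)}")
    rid, rtype, ts, gen, _rev, stype, snum, dir_type, data = _LAYOUT.unpack(raw)
    return SelEvent(rid, rtype, ts, gen, stype, snum,
                    bool(dir_type & 0x80), dir_type & 0x7F, data)


if __name__ == "__main__":
    raw = _LAYOUT.pack(1, 0x02, 1000, 0x0020, 0x04, 0x01, 0x30, 0x81, b"\x57\xff\xff")
    print(parse_sel_record(raw))
```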
25
Commands…
Example: Update-WcsConfig Command
26
27
Q & A