Upload
duongcong
View
235
Download
1
Embed Size (px)
Citation preview
Abstract
Tile: Intel Rack Scale Architecture
• This talk provides an overview of Intel Rack Scale Architecture and discusses how this architecture addresses underutilized and stranded resources in a Data center through resource pooling!
• We will also specifically discuss concept of a pooled system, storage node, pooling of PCIe as well as NVMe based storage.
• The impact of pooling on latency, radix and failure domains will also be discussed.
• Further pooling introduces a need for composition of the platform. We will also discuss the characteristics of such platform composition how software can emerge to take advantage of these capabilities!
Intel Confidential
Rack Scale Architecture Mohan J. Kumar
Intel Fellow, Data Center Group
Data Center Challenges Infrastructure has not kept up with increasing business demands
3
Business Needs • Reduce operational and capital expenses. • Deliver new services in minutes, not months. • Optimize data center based on real-time analytics. • Address application workload needs with agility. • Scale capacity without interruption.
Less than 50% server utilization2
Inefficiency
Data growth doubles every 18 months1
Growth
Agility New services can take a week or more to provision1
1 Worldwide and Regional Public IT Cloud Services 2013–2017 Forecast. IDC (August 2013) 2 IDC’s Digital Universe Study, sponsored by EMC, December 2012
4
Today’s Architecture
• Proprietary and preconfigured
• Upgrade as a system
• Limited flexibility
5
What’s Next? A seismic shift in how data centers are built and managed—powered by Intel
All infrastructure delivered as a service
Resources automatically tuned to application workloads
Hyper-scalable to keep up with business demands
6
Software Defined Infrastructure Software Defined Infrastructure
SAN
NAS
Network Appliance
Telco Appliance
…
Dedicated Appliances
7
Simplified Platform Management
Disaggregate
Pool & Compose
Intel® Rack Scale Architecture Logical architecture for efficiently building and managing cloud infrastructure—and providing the simplest path to a software defined data center.
User-Defined Performance
Maximum Utilization
Interoperable Solutions
Increase performance per TCO$ & accelerate cloud adoption
8
Rack Scale – Architecture Framework
8
Modular scalable management architecture
1. Pooled systems 2. Network fabric management 3. Pod-wide storage 4. Pod management
Pod-wide Management
Scalability
v
Pod wide storage Network fabric
9
Management Software Framework Flexible management architecture allowing for range of implementation options
• Asset & location discovery • Disaggregated resource management • Composable system support
• Support compute, network, and storage
Comprehensive management architecture v
Network Switch
Rack Scale API
Rack Scale POD Management API
Rack Scale Pooled System Platform
• Supports range of server processors
• Pooled NVM
• Supports Ethernet fabric
• Service Model requires Full node replacement
• Supports range of server processors • Direct Attach Storage • Pooled NVM • Supports Ethernet fabric • Redundant networks, scalable DAS storage, sub
node FRUs
Storage Node Compute Node
Rack Scale Pooled NVMe Controller (PNC)
Assign Drive2 to Node1
Server Node 1
Drive 2
Logical effect of the assignment
Server Node 2
Drive 1
Pooled NVMe
Controller
… Drive 1 Drive 2 Drive n
… Server Node 1
Server Node 2
Server Node n
PSME Mgt Port
Drive 3
Drive 3
Assign Drive1, Drive 3 to Node 2
Rack Scale Pooled NVMe Controller (PNC)
Assign ½ capacity of Drive1 to Node1
Pooled NVMe
Controller
… … Drive 1 Drive 2 Drive n
… Server Node 1
Server Node 2
Server Node n
PSME Mgt Port
Server Node 1
Drive 1’
Logical effect of the assignment
Server Node 2
Drive 1”
Capacity: ½ TB Capacity: ½ TB
Capacity: 1TB
Assign ½ capacity of Drive1 to Node2
13
Local vs. Pooled Storage
Avg IOPS/SVR < Local Capacity IOPS/SVR > Local Capacity
Pool
ing
Valu
e
Consolidation of Storage
Consolidation of compute
Pooled Infrastructure cost
Op Ex Value of pooling
less than capacity deployed
More than capacity deployed
WL/SVR Local Deployment model?
Avg IOPS/SVR = Local Capacity
Diverse Workload deployment
…
Rack Scale Pooled NVMe Controller (PNC) • Enable pooling and disaggregation of PCIe devices
away from compute / storage nodes
• Enable disaggregation of PCIe devices including Storage, FPGA
• Assign high performance storage to nodes based on workload demand
• Allow full and partial drive assignment
• Prevent SPOF through host failover
• Enables ease of workload migration in hyperscale environment
• Enables better utilization of Data Center resources by allowing composable high performance IO capacity
Pooled NVMe
Controller
… Drive 1 Drive 2 Drive n
… Server Node 1
Server Node 2
Server Node n
PSME Mgt Port
Drive 3
Rack Scale Pooled NVMe Controller (PNC) • Enable disaggregation of NVM Express
devices
• Utilizes NVMeOF to expand radix of pooling
• Assign storage to Compute or Storage nodes based on workload demand
• Prevent SPOF through host failover
• Enables ease of workload migration in hyperscale environment
• Enables better utilization of DC resources through composition
PNC
…
…
Management link
Server Node 1
PSME
Server Node 2
Server Node n
Drive 1 Drive 2 Drive n
Ethernet
16
Local vs. Pooled Resources Resource Local to a Node Pooled Resource
Device Attach and Capacity
Limited by Physical constraints
Not constrained by node volumetrics
Device Availability (Failure Domain)
Node is SPOF Pooled Fabric
Utilization Limited by Local use No stranded capacity/capability
Latency Local Access Incurs Additional Pooling Latency
Radix Local Limited by Pooling Fabric (one rack to multiple racks)
Refresh and Life Cycle Management
Node based Component based
Composable Infrastructure – Software Implications
17
Orchestration that comprehends composition capabilities
Location aware placement of workloads
Location aware placement in hyperscale storage
Monitoring software that knows the physical bounds of the hardware
Software (OS, VMM, App) capability to take advantage of dynamically added resources
Composed System 1
Composed System n
Compose
Release
Pool of Resources
Summary
18
INTEROPERABLE SOLUTIONS
MAXIMUM UTILIZATION
• Tailor performance to meet application SLAs by selecting from pooled compute, storage & network resources
• Easily scale capacity with modular, buy-as-you-go architecture
• Autonomously manage compute, network & storage pools to virtually eliminate stranded resources
• Interoperable system architecture simplifies data center operations and integration of multi-vendor solutions
USER-DEFINED PERFORMANCE
“Buy What You Need. Use What You Buy”
Intel Confidential — Do Not Forward