Enabling Energy Efficient Data Centers for New York and the Nation
Center Vision: To create electronic systems that are self-sensing and self-regulating, and are optimized for energy efficiency at any desired performance level.
E3S works in partnership with industry and academia to develop systematic methodologies for operating information technology, telecommunications, and electronic systems and their cooling equipment.
IEEC: Electronics Packaging
E3S: Energy-Efficient Electronic Systems
CASP: Autonomous Solar
CAMM: Flexible Electronics
E3S is part of the NYS Center of Excellence in Small Scale Systems Integration and Packaging
New electronics that improve the way people interact with their surroundings
Supportive analytical and diagnostics infrastructure
S3IP: academia, industry, and government in collaboration
Binghamton University is a leader in microelectronics R&D in collaboration with government and industry. The new $30 million Center of Excellence building will house the I/UCRC in Energy-Efficient Electronic Systems, including a new data center research laboratory.
Binghamton University: Bahgat Sammakia, Kanad Ghose, Bruce Murray
The University of Texas at Arlington: Dereje Agonafer
Villanova University: Alfonso Ortega, Amy Fleischer
Additional Collaborators:
The Georgia Institute of Technology: Yogendra Joshi
History of successful partnerships with industry
Research collaborations have a 10+ year history
Faculty are leaders in this research effort, with expertise in:
◦ Thermal management techniques at all levels: chips to racks to entire data centers
◦ Fast models for thermal analysis and trend prediction
◦ Scalable control systems for co-managing IT and cooling systems
◦ Sensing-controls and model adjustments as needed
E3S Members:
Endicott Interconnect Technologies
Corning, Inc.
IBM
General Electric
Emerson Network Power
Panduit
Microsoft
Bloomberg
NYSERDA
Advanced Electronics
Commscope
Sealco
Verizon
Comcast
Steel Orca
DVL
Collaborating Partner: Facebook
Lifecycle: 90% of energy is expended during operations
Breakdown of operating costs:
◦ Equipment energy consumption: 45%
◦ HVAC: 50% (about 25% to 30% in water chilling facilities)
◦ Remainder: 5%
Breakdown inside a modern server:
◦ CPU energy: 20% to 25%
◦ Memory energy: 30% to 45%
◦ Interconnection energy: 10% to 15%
Total energy consumption:
◦ 2.8% of total national electricity consumption in the US
◦ 1.6% for Western Europe
[Pie charts: server energy breakdown (CPU 22%, Memory 38%, Interconnect 12%, Others 28%) and operating cost breakdown (Equipment 45%, HVAC 50%, Others 5%).]
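Taken together, the operating-cost percentages imply a power usage effectiveness (PUE) of roughly 2.2, a back-of-the-envelope figure not stated in the slides. PUE is the standard ratio of total facility energy to IT equipment energy, so with equipment at 45% of the total:

```latex
\mathrm{PUE} = \frac{E_{\text{total}}}{E_{\text{IT}}} \approx \frac{1}{0.45} \approx 2.2
```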
Server overprovisioning is a common practice
The situation is becoming worse with smaller form factors
[Figure: server power and energy efficiency versus individual server utilization (0% to 100%), with the typical operating region at low utilization.]
Performance is not proportional to power
Poor energy efficiency (performance per watt) on average
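To see why operating in the low-utilization region is inefficient, here is a minimal sketch assuming a hypothetical linear server power model; the idle and peak wattages are illustrative, not measured values from this work:

```python
# Illustrative model: server power grows linearly with utilization,
# but idle power is a large fixed cost, so performance-per-watt
# is poor in the typical low-utilization operating region.

P_IDLE = 150.0  # watts at 0% utilization (assumed)
P_PEAK = 300.0  # watts at 100% utilization (assumed)

def power(util):
    """Power draw (W) for utilization in [0, 1]."""
    return P_IDLE + (P_PEAK - P_IDLE) * util

def efficiency(util):
    """Relative performance per watt (performance ~ utilization)."""
    return util / power(util)

for u in (0.1, 0.3, 0.5, 1.0):
    print(f"util={u:>4.0%}  power={power(u):6.1f} W  "
          f"perf/W={efficiency(u):.5f}")
```

With these assumed numbers, a server at 10% utilization still draws over half of peak power, so its performance per watt is roughly five times worse than at full load.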
HVAC equipment typically uses water chillers to supply cold air: it takes a long time to adjust cooling system output (water has a very high thermal capacity)
- Difficult for the cooling system to track instantaneous load changes
- Overprovisioning is common here as well, to permit handling of sudden load surges
- Using sensed temperature to adjust cooling system performance is a viable first step, but not enough (a minimal sketch follows)
- Solutions for improving the energy efficiency of data centers have been fairly isolated
System-level, holistic solutions are a MUST
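As a concrete illustration of the reactive first step mentioned above, the sketch below adjusts CRAC output in proportion to the sensed inlet-temperature error; the setpoint, gain, and output limits are illustrative assumptions, not values from the slides:

```python
# Minimal reactive control sketch: adjust CRAC cooling output in
# proportion to the error between a sensed rack-inlet temperature
# and its setpoint. Purely illustrative; real CRAC control loops
# must account for chiller lag, stability, and actuator limits.

SETPOINT_C = 25.0   # target rack-inlet temperature, deg C (assumed)
KP = 8.0            # proportional gain, % output per deg C (assumed)

def crac_output(sensed_temp_c, current_output_pct):
    """Return new CRAC supply fan output (%) from sensed temperature."""
    error = sensed_temp_c - SETPOINT_C
    new_output = current_output_pct + KP * error
    return max(40.0, min(100.0, new_output))  # clamp to a safe range

print(crac_output(27.5, 60.0))  # sensed 2.5 C over setpoint -> 80.0
```

Because chilled-water plants respond slowly, a purely reactive loop like this lags sudden load surges, which is exactly why the slides call it a first step rather than a solution.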
Addresses energy-efficiency improvements of computing and telecommunications systems at multiple spatial and temporal scales:
◦ Synergistic techniques for simultaneously controlling IT load allocation and thermal management in real time, to minimize overall energy expenditure; this includes the development of the infrastructures and standards needed for synergistic control and actuation mechanisms at the system level
◦ From chip-level systems such as processors and operating systems to data centers, including IT equipment hardware, software, and cooling systems
◦ Multi-scale control systems, including scalability and stability issues, for both traditional reactive control mechanisms and proactive control mechanisms based on reduced-order models that predict workload and thermal trends (a minimal sketch follows this list); the resulting system is self-sensing and self-regulating
◦ Mechanisms for utilizing waste heat
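A minimal sketch of the proactive control idea above: forecast near-term load with an exponentially weighted moving average (a trivial stand-in for the reduced-order predictive models described) and size cooling for the predicted load; all constants are illustrative assumptions:

```python
# Proactive control sketch: predict the next workload sample with an
# exponentially weighted moving average (a trivial reduced-order
# model) and set cooling output ahead of the predicted load, rather
# than reacting after temperatures rise.

ALPHA = 0.5  # smoothing factor for the load predictor (assumed)

class ProactiveCooler:
    def __init__(self):
        self.predicted_load = 0.0  # fraction of peak IT load

    def observe(self, measured_load):
        """Fold a new load measurement into the prediction."""
        self.predicted_load = (ALPHA * measured_load
                               + (1 - ALPHA) * self.predicted_load)

    def cooling_output(self):
        """CRAC output (%) sized for the *predicted* load plus margin."""
        return min(100.0, 40.0 + 60.0 * self.predicted_load)

ctrl = ProactiveCooler()
for load in (0.2, 0.4, 0.8, 0.8):   # a rising load trace
    ctrl.observe(load)
    print(f"predicted={ctrl.predicted_load:.2f} "
          f"cooling={ctrl.cooling_output():.0f}%")
```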
Five-Year Goals for thermal management infrastructure:
• Determining key intrinsic energy-consumption inefficiencies at every level, from devices to entire systems
• Techniques for managing data centers synergistically using predictive models for computing workload and thermal trends, driven by live transient models in the control loop
• Airflow management techniques based on fast, compact models that are continuously validated with live data, and using these models to reduce overall energy expenditure (a sketch of such online model correction follows)
• Techniques for improving the energy efficiency of buildings and containers
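One simple way to realize the "continuously validated with live data" goal above is an online parameter update for a compact model. In the sketch below, the one-parameter airflow-to-inlet-temperature model, its constants, and the sensor samples are all illustrative assumptions:

```python
# Online model-correction sketch: a compact model predicts rack-inlet
# temperature from CRAC airflow via one thermal-resistance-like
# parameter; each live sensor reading nudges that parameter so the
# model keeps tracking the real room (one-parameter gradient update).

T_SUPPLY = 18.0   # CRAC supply air temperature, deg C (assumed)
ETA = 0.05        # learning rate for the parameter update (assumed)

class CompactModel:
    def __init__(self, k=0.5):
        self.k = k  # deg C of inlet rise per kW per (m^3/s); assumed form

    def predict(self, it_load_kw, airflow_m3s):
        """Predicted rack-inlet temperature (deg C)."""
        return T_SUPPLY + self.k * it_load_kw / airflow_m3s

    def update(self, it_load_kw, airflow_m3s, measured_temp):
        """Nudge k to shrink the prediction error on a live reading."""
        error = self.predict(it_load_kw, airflow_m3s) - measured_temp
        self.k -= ETA * error * (it_load_kw / airflow_m3s)

model = CompactModel()
# Feed live samples: (IT load kW, airflow m^3/s, measured inlet deg C)
for load, flow, temp in [(10, 4.0, 21.5), (12, 4.0, 22.1), (12, 4.0, 22.0)]:
    model.update(load, flow, temp)
    print(f"k={model.k:.3f} predict={model.predict(load, flow):.2f} C")
```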
[Five-year roadmap, 2011–2015, organized into four thrusts: Modeling; Validation & Verification; Infrastructures, Decision & Control; and Center Management. Milestones include component validation; multi-scale subsystem models (e.g., aisle); multi-scale system models (e.g., room/building); cyber models (compute loads, etc.); subsystem validation; subsystem and system testbed demos; benchmarks & metrics; integrated energy management algorithms; benchmark/assessment cycles; evaluation of gaps & refinement; and periodic IAB portfolio assessments.]
IAB = Industrial Advisory Board
Holistic exergy-based system modeling
Back-end waste energy recovery modeling
Scalable control algorithms
Airflow & water management
Software job scheduling
Hardware & software infrastructures
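For context on the exergy-based and waste-energy-recovery modeling items above: the maximum work recoverable from a heat stream Q̇ at temperature T, rejected to an environment at T₀, is the standard exergy (Carnot-factor) expression:

```latex
\dot{X} = \dot{Q}\left(1 - \frac{T_0}{T}\right)
```

Because data center exhaust air is only modestly above ambient, its exergy is small, which is why back-end recovery modeling matters for deciding which reuse strategies are worthwhile.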
Typical raised floor data center configuration with alternating hot and cold aisle arrangement.
Plan view of the rack layout for the data center under study.
• Model Details:
▫ CFD software: FloTHERM
▫ 20 racks; 4 CRACs
▫ k-ε turbulence model
▫ Boussinesq approximation
▫ 230k grid cells
▫ Plenum depth: 0.3 m
▫ Tile resistance: 50%
▫ 6 monitor points at rack inlet
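For reference, the Boussinesq approximation listed in the model details treats density as constant everywhere except in the buoyancy term, where it varies linearly with temperature:

```latex
\rho \approx \rho_0\left[1 - \beta\,(T - T_0)\right],
\qquad
\vec{f}_{\text{buoyancy}} = -\rho_0\,\beta\,(T - T_0)\,\vec{g}
```

This linearized buoyancy force is what drives the hot-aisle/cold-aisle convection in the CFD model.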
3D View of the room
Team:
Sammakia and Murray (BU), Agonafer (UTA)
Goals:
Develop verified thermal analysis tools that provide real-time assessment of data center operating conditions, enabling operation as dynamic, self-sensing and self-regulating systems
Enable data centers to run at or near optimal conditions, minimizing energy consumption for a broad range of specified performance metrics
[Plots: Rack A1 inlet temperature (°C) versus time (sec), and CRAC supply (%) versus time (sec).]
Outcomes: data showing measured thermal capacity values for different servers; a modeling methodology for data center transient response; design guidelines for thermal management of data centers accounting for both steady-state and dynamic loads.
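The transient behavior these outcomes describe is commonly captured by a first-order lumped-capacitance balance; a generic form (the symbols here are standard, not taken from the slides) is:

```latex
C\,\frac{dT}{dt} = P(t) - \frac{T - T_{\text{in}}}{R},
\qquad
\tau = R\,C
```

where C is the effective thermal capacity, R the thermal resistance to the inlet air, and τ the time constant governing how fast the equipment tracks a load step.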
Purpose:
◦ Eliminate hot air recirculation.
◦ Eliminate short-circuiting of cold air.
◦ Energy efficiency!
Containment designs have previously run into problems with fire codes.
Assessing the Effectiveness of Cold Aisle Containment on the Efficiency of the Data Center Cooling Infrastructure
Diagram of a data center with hot air recirculation and cold air short circuiting. (Caption courtesy of www.42U.com)
The aisle cannot be totally sealed, due to pressure and noise problems.
Three configurations are used for the perforated ceiling.
The aim is to keep the rack inlet temperature for a cold aisle uniform.
Assessing the Effectiveness of Cold Aisle Containment on the Efficiency of the Data Center Cooling Infrastructure
(a) 3D model in FloTHERM showing perforated ceiling. (b) 3 different configurations used for perforated ceiling.
Rapid drop in inlet temperature when sealing the cold aisle.
Very little variation in inlet temperature per rack.
Tile airflow uniformity and lower hot air recirculation
Assessing the Effectiveness of Cold Aisle Containment on the Efficiency of the Data Center Cooling Infrastructure
Maximum rack temperature for various perforations.
Temperature difference between the maximum and minimum temperatures across each rack for various perforations.
All racks powered at 32 kW; CRAC supply at 100%.
Average variation of 8 °C with no containment
Configuration I: average temperature variation T_v is 0.6 °C for a 20% open ceiling
Assessing the Effectiveness of Cold Aisle Containment on the Efficiency of the Data Center Cooling Infrastructure
Temperature profiles at a height of 1750 mm.
Fan and system curves for CRACs and servers
Steady state and transient models
Impact of thermal capacity
Impact of airflow leaks
Overall thermodynamic models
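Fan and system curves, as in the first item above, determine the delivered airflow at their intersection. A minimal sketch, assuming a quadratic fan curve and a quadratic system-resistance curve with illustrative coefficients:

```python
# Operating-point sketch: a fan curve (pressure falls with flow) and a
# system curve (pressure loss rises with flow squared) intersect at
# the delivered airflow. Coefficients below are illustrative only.

FAN_DP0 = 120.0   # fan static pressure at zero flow, Pa (assumed)
FAN_B = 2.0e3     # fan curve rolloff coefficient, Pa/(m^3/s)^2 (assumed)
SYS_K = 4.0e3     # system resistance coefficient, Pa/(m^3/s)^2 (assumed)

def fan_dp(q):
    """Fan pressure rise (Pa) at volumetric flow q (m^3/s)."""
    return FAN_DP0 - FAN_B * q * q

def sys_dp(q):
    """System pressure drop (Pa) at volumetric flow q (m^3/s)."""
    return SYS_K * q * q

# Intersection: FAN_DP0 - FAN_B*q^2 = SYS_K*q^2  =>  q = sqrt(DP0/(B+K))
q_op = (FAN_DP0 / (FAN_B + SYS_K)) ** 0.5
print(f"operating flow = {q_op:.3f} m^3/s, "
      f"pressure = {sys_dp(q_op):.1f} Pa")
```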
Development of a Compact Model for a Server
Top view of the server showing the details of the thermocouple locations (red dots).
Pictures of thermocouples attached to the server.
[Plots: measured component temperatures (°C) versus time (s) for the graphics card, HDD, RAM 1/2, and CPU 1/2 heatsink top/bottom, alongside the server compact-model predictions for the same components; fan speeds (RPM) for fans 1–4 versus time; measured power (W) versus time.]
Measured power. Measured fan speed.
Development of a Compact Model for a Server
Temperature measurements of different components inside the server. Zoomed-in plots of: (a) fan speed, (b) RAM temperature.
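A server compact model of this kind is often a small network of lumped thermal masses; below is a minimal single-node sketch, where all parameter values are assumptions for illustration, not the measured values from this work:

```python
# Single-node lumped compact model: one thermal mass driven by the
# server's power draw and cooled toward the inlet air temperature.
# Forward-Euler integration of C*dT/dt = P(t) - (T - T_in)/R.

R = 0.15      # thermal resistance to inlet air, K/W (assumed)
C = 800.0     # lumped thermal capacity, J/K (assumed)
T_IN = 22.0   # rack inlet air temperature, deg C (assumed)
DT = 1.0      # integration time step, s

def step(temp, power):
    """Advance the node temperature by one time step."""
    return temp + DT * (power - (temp - T_IN) / R) / C

temp = T_IN
for t in range(9000):
    power = 150.0 if t < 4000 else 300.0   # a step in server load (W)
    temp = step(temp, power)
print(f"final temperature = {temp:.1f} C")  # -> near T_IN + 300*R = 67 C
```

The time constant τ = RC (120 s with these assumed values) is what the "analyze time constants" task below would extract from measurements like the ones plotted above.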
Analyze time constants for variable power
Analyze the impact of airflow and temperature fluctuations
Analyze the impact of CRAC failures
Collaborate with Georgia Tech and its partners on infrastructure measurements
Collaborate with UT on developing a simple control model for airflow
The research agenda is unique in its use of techniques that synergistically combine (a) computing, (b) innovative cooling solutions and thermal management techniques, and (c) adaptive, proactive, and scalable control system concepts in devising a system-level solution.
Relevant research is already supported by many sources, including NSF and industry.
Designation as an NSF I/UCRC is formalizing and expanding current collaborations with industry and enabling the development of practical, deployable, high-impact solutions.
Bahgat Sammakia
Interim Vice President for Research
Director
The NSF I/UCRC in Energy-Efficient Electronic Systems
Kanad Ghose
Chair, Department of Computer Science
Site Director
The NSF I/UCRC in Energy-Efficient Electronic Systems
Binghamton University
Binghamton, NY 13902-6000
http://s3ip.binghamton.edu