Enabling Energy Efficient Data Centers for New York and the Nation Collaborative Partner


Page 1: E3 s binghamton

Enabling Energy Efficient Data Centers for New York and the Nation

Collaborative Partner

Page 2: E3 s binghamton

Center Vision: To create electronic systems that are self sensing and regulating, and are optimized for energy efficiency at any desired performance level.

E3S works in partnership with industry and academia to develop systematic methodologies for operating information technology, telecommunications, and electronic systems and cooling equipment

Page 3: E3 s binghamton

IEEC: Electronics Packaging

E3S: Energy-Efficient Electronic Systems

CASP: Autonomous Solar

CAMM: Flexible Electronics

E3S is part of the NYS Center of Excellence in Small Scale Systems Integration and Packaging

New electronics that improve the way people interact with their surroundings

Supportive Analytical and Diagnostics Infrastructure

Page 4: E3 s binghamton

S3IP: academia, industry and government in collaboration

Page 5: E3 s binghamton

Binghamton University is a leader in microelectronics R&D in collaboration with government and industry. A new $30 million Center of Excellence building will house the I/UCRC in Energy-Efficient Electronic Systems, including a new data center research laboratory.

Page 6: E3 s binghamton

Binghamton University

Bahgat Sammakia, Kanad Ghose, Bruce Murray

The University of Texas at Arlington

Dereje Agonafer

Villanova University

Alfonso Ortega, Amy Fleischer

Additional Collaborators:

The Georgia Institute of Technology

Yogendra Joshi

Page 7: E3 s binghamton

History of successful partnerships with industry

Research collaborations have 10+ year history

Faculty are leaders in this research effort with expertise in:

◦ Thermal management techniques at all levels: chips to racks to entire data centers

◦ Fast models for thermal analysis and trend prediction

◦ Scalable control systems for co-managing IT and cooling systems

◦ Sensing-controls and model adjustments as needed

Page 8: E3 s binghamton

Endicott Interconnect Technologies

Corning, Inc.

IBM

General Electric

Emerson Network Power

Panduit

GE

Microsoft

Bloomberg

NYSERDA

Advanced Electronics

Commscope

Sealco

Verizon

Comcast

Steel Orca

DVL


Page 9: E3 s binghamton

Emerson

DVL

E3S Members

Collaborating Partner Facebook

Page 10: E3 s binghamton

Lifecycle: 90% of energy is expended during operations

Breakdown of operating costs:
◦ Equipment energy consumption: 45%
◦ HVAC: 50% (about 25% to 30% in water chilling facilities)
◦ Remainder: 5%

Breakdown inside a modern server:
◦ CPU energy: 20% to 25%
◦ Memory energy: 30% to 45%
◦ Interconnection energy: 10% to 15%

Total energy consumption:
◦ 2.8% of total national electricity consumption in the US
◦ 1.6% for Western Europe

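[Pie chart: energy breakdown inside a server. CPU 22%, Memory 38%, Interconnect 12%, Others 28%]

[Pie chart: breakdown of operating costs. Equipment 45%, HVAC 50%, Others 5%]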
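The shares on this slide can be combined with simple arithmetic. The sketch below is illustrative only: it assumes the operating-cost shares can be read as energy shares and that all equipment energy is drawn by servers, neither of which the slide states. The function names are hypothetical. The first ratio is analogous to the commonly used power usage effectiveness (PUE) metric, total facility energy divided by IT equipment energy.

```python
# Shares copied from the slide. Reading cost shares as energy shares is an assumption.
FACILITY_SHARES = {"equipment": 0.45, "hvac": 0.50, "other": 0.05}
# Pie-chart values for the energy breakdown inside a server.
SERVER_SHARES = {"cpu": 0.22, "memory": 0.38, "interconnect": 0.12, "other": 0.28}


def overhead_ratio(facility_shares):
    """Total facility energy divided by IT equipment energy (PUE-like ratio)."""
    total = sum(facility_shares.values())
    return total / facility_shares["equipment"]


def share_of_facility(component, facility_shares, server_shares):
    """Fraction of total facility energy attributed to one server component.

    Assumes all equipment energy is consumed by servers with this breakdown.
    """
    return facility_shares["equipment"] * server_shares[component]
```

Under these assumptions, roughly 2.2 units of energy enter the facility for every unit delivered to IT equipment, and memory alone accounts for about 17% of the facility total.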

Page 11: E3 s binghamton

Server overprovisioning is a common practice

Situation becoming worse with smaller form factors

[Figure: power and energy efficiency plotted against individual server utilization from 0% to 100%, with the typical operating region marked]

Performance is not proportional to power

Poor energy efficiency (performance per Watt) on average
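One way to see why low utilization hurts efficiency is a linear power model, in which a server draws a fixed idle power plus a part that grows with utilization. This model and the 50% idle fraction are assumptions for illustration, not figures from the slides; the function names are hypothetical.

```python
def power_fraction(utilization, idle_fraction=0.5):
    """Power draw as a fraction of peak power, under an assumed linear model."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_fraction + (1.0 - idle_fraction) * utilization


def relative_efficiency(utilization, idle_fraction=0.5):
    """Performance per watt, relative to performance per watt at full load.

    Performance is taken to be proportional to utilization (an assumption).
    """
    return utilization / power_fraction(utilization, idle_fraction)
```

With these assumed numbers, a server at 20% utilization draws 60% of peak power and delivers about one third of its full-load performance per watt.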

Page 12: E3 s binghamton

HVAC equipment typically uses water chillers to supply cold air. Adjusting the cooling system output takes a long time because water has a very high thermal capacity.

- Difficult for the cooling system to track instantaneous load changes

- Overprovisioning is common here as well, to permit handling of sudden load surges

- Using sensed temperature to adjust cooling system performance is a viable first step, but not enough

- Solutions for improving the energy efficiency of data centers have been fairly isolated

System-level, holistic solutions are a MUST

Page 13: E3 s binghamton

Addresses energy-efficiency improvements of computing and telecommunications systems at multiple spatial and temporal scales:

◦ Synergistic techniques for simultaneously controlling IT load allocation and thermal management in real time, to minimize the overall energy expenditures. This includes the development of infrastructures and standards needed for synergistic control and actuation mechanisms at the system level

◦ From chip-level systems such as processors and operating systems to data centers, including IT equipment hardware, software, and cooling systems

◦ Multi-scale control systems, including scalability and stability issues for both traditional reactive control mechanisms and proactive control mechanisms based on reduced-order models for predicting workload and thermal trends. The resulting system is self-sensing and self-regulating.

◦ Mechanisms for utilizing waste heat
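The proactive control idea above can be sketched with a very small predictor. The example below stands in a least-squares linear trend for the reduced-order model, which is an assumption made purely for illustration; the slides do not specify the model. The function names, the cooling-per-load ratio, and the safety margin are all hypothetical.

```python
def predict_linear_trend(samples, steps_ahead=1):
    """Extrapolate equally spaced samples with a least-squares straight line."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2.0
    mean_y = sum(samples) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


def cooling_setpoint_kw(predicted_load_kw, cooling_per_load=1.0, margin=0.1):
    """Cooling capacity to request ahead of time for a predicted IT load.

    cooling_per_load and margin are hypothetical tuning values.
    """
    return predicted_load_kw * cooling_per_load * (1.0 + margin)
```

A reactive controller would wait for temperatures to rise before acting. Here the cooling request is derived from the predicted load, so a slow cooling plant can start adjusting before the load arrives.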

Page 14: E3 s binghamton

Five-Year Goals for thermal management infrastructure
• Determining key intrinsic energy consumption inefficiencies at every level, from devices to entire systems
• Techniques for managing data centers synergistically using predictive models for computing workload and thermal trends, driven by live transient models in the control loop
• Airflow management techniques based on fast, compact models that are continuously validated with live data, and use of these models to reduce overall energy expenditures
• Techniques for improving the energy efficiency of buildings and containers

Page 15: E3 s binghamton

[Roadmap chart, 2011 to 2015, with recurring IAB markers. IAB = Industrial Advisory Board]

Tracks: Modeling; Validation & Verification; Infrastructures, Decision & Control; Center Management

Items shown: Component Validation; Multi-scale subsystem models (e.g., aisle); Multi-scale system models (e.g., room/bldg); Subsystem Validation; System Testbed Demos; Subsystem Testbed Demos; Cyber models (compute loads, etc.); Benchmarks & Metrics; Portfolio Assessment; Benchmark/Assessment; Integrated Energy Mgmt Algorithms; Eval Gaps & Refinement; Holistic exergy-based system modeling; Back-end waste energy recovery modeling; Scalable Control Algorithms; Airflow & Water Management; Software Job Scheduling; Hardware & Software Infrastructures

Page 16: E3 s binghamton
Page 17: E3 s binghamton

Typical raised floor data center configuration with alternating hot and cold aisle arrangement.

Plan view of the rack layout for the data center under study.

• Model Details:
▫ CFD software: FloTHERM
▫ 20 racks; 4 CRACs
▫ k-ε turbulence model
▫ Boussinesq approximation
▫ 230k grid cells
▫ Plenum depth: 0.3 m
▫ Tile resistance: 50%
▫ 6 monitor points at rack inlet
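For reference, the Boussinesq approximation listed in the model details treats air density as constant everywhere except in the buoyancy term, where it varies linearly with temperature. The symbols are the standard textbook ones and are not taken from the slides: \rho_0 and T_0 are a reference density and temperature, and \beta is the volumetric thermal expansion coefficient.

```latex
\rho \approx \rho_0 \left[ 1 - \beta \, (T - T_0) \right]
```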

3D View of the room

Page 18: E3 s binghamton

Team:

Sammakia and Murray (BU), Agonafer (UTA)

Goals:

Develop verified thermal analysis tools to provide real time assessment of Data Center operating conditions to enable operation as dynamic self-sensing and self-regulating systems

Enable Data Centers to run at optimal or near optimal conditions in order to minimize energy consumption for a broad range of specified performance metrics

[Plots for Rack A1: inlet temperature versus time (sec) and % supply versus time (sec)]

Outcomes: Data showing measured thermal capacity values for different servers. Modeling methodology for data center transient response. Design guidelines for thermal management of data centers, accounting for both steady-state and dynamic loads.
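To illustrate why measured thermal capacity matters for transient response, a server can be approximated as a single lumped thermal mass. This single-node model is an assumption for illustration and is not the project's modeling methodology; the parameter names and the sample values in the note below are hypothetical.

```python
import math


def steady_state_temp_c(inlet_temp_c, power_w, resistance_k_per_w):
    """Temperature the lumped mass settles at under constant power."""
    return inlet_temp_c + power_w * resistance_k_per_w


def transient_temp_c(t_s, start_temp_c, inlet_temp_c, power_w,
                     resistance_k_per_w, capacity_j_per_k):
    """Temperature at time t_s after a step to constant power.

    Solves C * dT/dt = P - (T - T_inlet) / R, whose time constant is R * C.
    """
    tau_s = resistance_k_per_w * capacity_j_per_k
    t_final = steady_state_temp_c(inlet_temp_c, power_w, resistance_k_per_w)
    return t_final + (start_temp_c - t_final) * math.exp(-t_s / tau_s)
```

With a hypothetical resistance of 0.1 K/W and capacity of 2000 J/K, the time constant is 200 s, so a larger thermal capacity slows the response to a load step in direct proportion.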

Page 19: E3 s binghamton

Purpose: ◦ Eliminate hot air recirculation. ◦ Eliminate short circuiting of cold air. ◦ Energy efficiency!

Previous problems with fire codes.

Assessing the Effectiveness of Cold Aisle Containment on the Efficiency of the Data Center Cooling Infrastructure

Diagram of a data center with hot air recirculation and cold air short circuiting. (Caption courtesy of www.42U.com)

Page 20: E3 s binghamton

Cannot totally seal aisle due to pressure and noise problems.

3 configurations are used for perforated ceiling.

Aim is to keep rack inlet temperature for a cold aisle uniform.

Assessing the Effectiveness of Cold Aisle Containment on the Efficiency of the Data Center Cooling Infrastructure

(a) 3D model in FloTHERM showing perforated ceiling. (b) 3 different configurations used for perforated ceiling.

Page 21: E3 s binghamton

Rapid drop in inlet temperature when sealing the cold aisle.

Very little variation in inlet temperature per rack.

Tile airflow uniformity and lower hot air recirculation

Assessing the Effectiveness of Cold Aisle Containment on the Efficiency of the Data Center Cooling Infrastructure

Maximum rack temperature for various perforations.

Difference between maximum and minimum temperature across each rack for various perforations.

All racks powered at 32 kW and CRAC supply is 100%
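For a sense of scale, the airflow needed to carry away a rack's heat follows from an energy balance on the air stream. The sketch below uses approximate room-temperature air properties, and the 15 K air temperature rise in the note is a hypothetical value, not one from the study.

```python
AIR_DENSITY_KG_M3 = 1.2   # approximate density of air near room temperature
AIR_CP_J_KG_K = 1005.0    # approximate specific heat of air


def required_airflow_m3_s(rack_power_w, delta_t_k):
    """Volumetric airflow that removes rack_power_w with an air temperature rise of delta_t_k."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    return rack_power_w / (AIR_DENSITY_KG_M3 * AIR_CP_J_KG_K * delta_t_k)
```

For a 32 kW rack and an assumed 15 K rise across the rack, this gives roughly 1.8 cubic meters of air per second.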


Page 22: E3 s binghamton

Average variation of 8 °C with no containment

Configuration I: average T_v is 0.6 °C for 20% open ceiling.
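The slides do not define the temperature variation metric. A plausible reading, assumed here, is the spread between the hottest and coolest inlet reading on each rack, averaged over all racks. The function names are hypothetical.

```python
def rack_variation_c(inlet_temps_c):
    """Spread between the hottest and coolest inlet reading on one rack."""
    return max(inlet_temps_c) - min(inlet_temps_c)


def average_variation_c(racks):
    """Mean per-rack spread over a mapping of rack name to inlet readings."""
    variations = [rack_variation_c(temps) for temps in racks.values()]
    return sum(variations) / len(variations)


# Hypothetical readings, six monitor points per rack, for illustration only.
EXAMPLE_RACKS = {
    "A1": [18.0, 19.0, 21.0, 23.0, 25.0, 26.0],
    "A2": [19.0, 19.2, 19.5, 20.0, 20.5, 21.0],
}
```

A smaller average spread indicates more uniform rack inlet temperatures, which is the stated aim of containment.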

Assessing the Effectiveness of Cold Aisle Containment on the Efficiency of the Data Center Cooling Infrastructure

Temperature profile at height of 1750 mm.

Temperature profile at height of 1750 mm.


Page 23: E3 s binghamton

Fan and system curves for CRACs and servers

Steady state and transient models

Impact of thermal capacity

Impact of airflow leaks

Overall thermodynamic models
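The first topic in this list, fan and system curves, comes down to finding the flow at which the pressure a fan delivers equals the pressure the flow path requires. The sketch below assumes a quadratic fan curve and a quadratic system curve, which are common simplifications and not the curves used in this work; all parameter names are hypothetical.

```python
import math


def fan_pressure_pa(flow_m3_s, shutoff_pressure_pa, free_flow_m3_s):
    """Assumed fan curve: full pressure at zero flow, zero pressure at free flow."""
    return shutoff_pressure_pa * (1.0 - (flow_m3_s / free_flow_m3_s) ** 2)


def system_pressure_pa(flow_m3_s, k_system):
    """Assumed system curve: pressure drop proportional to flow squared."""
    return k_system * flow_m3_s ** 2


def operating_flow_m3_s(shutoff_pressure_pa, free_flow_m3_s, k_system):
    """Flow at which the assumed fan and system curves intersect."""
    return math.sqrt(
        shutoff_pressure_pa / (k_system + shutoff_pressure_pa / free_flow_m3_s ** 2)
    )
```

Raising the system resistance, for example by adding containment or denser tiles, moves the intersection toward lower flow.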

Page 24: E3 s binghamton

Development of a Compact Model for a Server

Top view of the server showing the details of the thermocouple locations (red dots).

Pictures of thermocouples attached to the server.

Page 25: E3 s binghamton

[Plot: temperature (°C) versus time (s), 0 to 10000 s. Series: Measured Graphic Card, Measured HDD, Measured RAM 1, Measured RAM 2, CPU 1 Heatsink Bottom, CPU 1 Heatsink Top, CPU 2 Heatsink Bottom, CPU 2 Heatsink Top, Server Graphic Card, Server HDD, Server RAM 1, Server RAM 2, Server CPU 1, Server CPU 2]

[Plot: fan speed (RPM) versus time (s). Series: Fan 1, Fan 2, Fan 3, Fan 4. Caption: Measured fan speed.]

[Plot: power (W) versus time (s). Caption: Measured power.]

Development of a Compact Model for a Server

Temperature measurements of different components inside the server. Zoomed-in plot of: (a) fan speed, (b) RAM temperature.

Page 26: E3 s binghamton

Analyze time constants for variable power

Analyze the impact of air flow and temperature fluctuations

Analyze the impact of CRAC failures

Collaborate with Georgia Tech and its partners on infrastructure measurements

Collaborate with UTA on developing a simple control model for airflow
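The time-constant analysis planned above could start from measured step responses. The sketch below estimates a first-order time constant as the time at which a response has covered 63.2% of its total change. It assumes the response is roughly first order and that the last sample has reached steady state; the function name is hypothetical.

```python
def estimate_time_constant_s(times_s, values):
    """Time at which a step response covers 63.2% of its total change.

    Uses linear interpolation between the two samples around the crossing.
    """
    start, end = values[0], values[-1]
    if start == end:
        raise ValueError("response shows no change")
    target = start + 0.632 * (end - start)
    rising = end > start
    for i in range(1, len(values)):
        crossed = values[i] >= target if rising else values[i] <= target
        if crossed:
            t0, t1 = times_s[i - 1], times_s[i]
            v0, v1 = values[i - 1], values[i]
            if v1 == v0:
                return t1
            return t0 + (target - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("response never reaches 63.2% of its total change")
```

Applied to temperature and power logs, estimates like this can be compared across components to see which parts of a server respond fastest to a change in load.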

Page 27: E3 s binghamton

Research agenda is unique in its use of techniques that synergistically combine (a) computing, (b) innovative cooling solutions and thermal management techniques, and (c) adaptive, proactive, and scalable control system concepts in devising a system-level solution.

Relevant research is already supported by many sources, including NSF and industry.

Designation as an NSF I/UCRC is formalizing and expanding current collaborations with industry and is enabling development of practical, deployable, high-impact solutions.

Page 28: E3 s binghamton

Bahgat Sammakia

Interim Vice President for Research

Director

The NSF I/UCRC in Energy-Efficient Electronic Systems

Kanad Ghose

Chair, Department of Computer Science

Site Director

The NSF I/UCRC in Energy-Efficient Electronic Systems

Binghamton University

Binghamton, NY

13902-6000

http://s3ip.binghamton.edu