Grid computing and emerging HPC trends in e-Science

Dr. Fabrizio Gagliardi, EMEA Director, External Research, Microsoft Research
Kasetsart University, Bangkok, September 19th, 2008
Outline

• Research e- and cyber-infrastructures for e-Science
• The experience of the Grid
• Examples beyond e-Science
• Issues: complexity, cost, security, standards
• Emerging HPC trends: Green IT, cloud computing, virtualisation, data centers, Software as a Service, multi-core architectures…
• Cost analysis: Grid vs Cloud computing
• Conclusions
Research e-Infrastructures for e-Science

EC co-funded e-Infrastructures in Europe:
• Research network infrastructure:
  – GEANT pan-European network interconnecting NRENs
• Computing Grid infrastructure:
  – Enabling Grids for E-sciencE (EGEE project)
  – Transition to the sustainable European Grid: EGI_DS project
• Data & knowledge infrastructure:
  – Digital Libraries (D4Science) and repositories (DRIVER-II)
• A series of other projects:
  – Middleware interoperation, applications, policy and support actions, etc.

Cyber-infrastructures around the world:
• Similar activities in the US and Asia/Pacific
19/9/2008, Kasetsart University, Thailand
The EGEE Production Grid Infrastructure

Steady growth / improved reliability: ~80,000 cores, ~400 PB, 261 sites, >270 VOs (stats by GStat, 09/09/2008)

[Charts: number of EGEE sites (0 to ~300) and number of cores (0 to ~80,000), growing steadily from Apr 2004 to Apr 2008]

Courtesy of Erwin Laure
EC continues to invest in Grid-based e-Infrastructures

• The EC is expected to further invest in e-Infrastructures, and Grids in particular*:
  – Call 4: GEANT3 and Data interoperability call, closed earlier in September (€113M)
  – Call 5: small call (~€10M) on scientific publications repositories, ERA-NET and international cooperation, due March 2009
  – Call 6: ICT-based e-Infrastructures (EGI plus other Grid infrastructures and Virtual Communities), due Autumn 2009; estimated €75M for around 2 years (€50M for EGI plus other Grids, €25M for Virtual Communities)
  – Call 7: ICT-based e-Infrastructures (scientific data infrastructures), due Spring 2010
  – Call 8: estimated for Spring 2012
• EC consultation on the 2010 Work Programme on Grids, Data and Virtual Communities taking place in November (by invitation only)
• The next e-IRG workshop (21-22 October, Paris) is expected to promote the use of e-Infrastructures by the ESFRI facilities (ESFRI preparatory projects invited): www.e-irg.eu

* Main source: EC Work Programme 2008 (WP2009 will update the above and be released in October)
The experience of Grid / HPC 1/3

• Grids for e-Science: mainly a success story!
  – Several maturing Grid middleware stacks
  – Many HPC applications using the Grid
    • Some (HEP, Bio) in production use
    • Some still in a testing phase: more effort required to make the Grid their day-to-day workhorse
  – e-Health applications are also part of the Grid
  – Some industrial applications:
    • CGG, Energy
The experience of Grid / HPC 2/3

• Grids beyond e-Science?
  – Slower adoption: these users prefer different environments and tools, and have different TCOs
    • Intra-grids, internal dedicated clusters, cloud computing
  – e-Business applications
    • Finance, ERP, SMEs and Banking!
  – Industrial applications
    • Energy, Automotive, Aerospace, Pharmaceutical industry, Telecom
  – e-Government applications
    • Earth Observation, Civil protection: e.g. the CYCLOPS project
The experience of Grid / HPC 3/3

• The IT industry has demonstrated interest in becoming an HPC infrastructure provider and/or user (intra-grids):
  – On-demand infrastructures:
    • Cloud and elastic computing, pay as you go…
    • Data centers: data getting more and more attention
  – Service hosting: outsourced integrated services
    • Software as a Service (SaaS)
  – Virtualisation being exploited in cloud and elastic computing (e.g. Amazon EC2 virtual instances)
• "Pre-commercial procurement"
  – Research-industry collaboration in Europe to achieve new leading-edge products
    • Example: PRACE building a PetaFlop supercomputing centre in Europe
Examples beyond e-Science

Citigroup (Citigroup Inc., operating as Citi, is a major American financial services company based in New York City) adopted Grid computing:
http://www.americanbanker.com/usb_article.html?id=20080825IXTFW8BS

• Citi chose Platform Computing's Symphony grid product to consolidate its computing assets into a single resource pool with increased utilization.
• Since the grid was implemented, individual business units at Citi are charged for the processing power they use, creating a shared-services environment.
• Citi is now using nearly 20,000 CPUs, and there are periods of the day when the utilization rate is 100 percent.
• Citi is planning to use the cloud in cases where its data centers do not suffice (overflow model or cooperative data centers).
Examples beyond e-Science

EU BEinGRID: Computational Fluid Dynamics
Examples beyond e-Science

CYCLOPS: forest fire propagation
Examples beyond e-Science

EGEODE VO: seismic processing based on the Geocluster application by the CGG company (France)
In summary…

• Grid computing has delivered an affordable high-performance computing infrastructure to scientists all over the world, letting them solve compute- and storage-intensive problems within constrained research budgets.
• It has also been used effectively by industry to increase the utilization of their HPC infrastructure and reduce Total Cost of Ownership (TCO).
• Grid is not only aggregating computing resources but also leveraging international research networks to deliver an effective and irreplaceable channel for international collaboration.
The flip side…

• Major issues with wide adoption of Grid computing in e-Science, e-Business, industry etc. have to do with:
  – Cost of operations and complexity of management
  – Not a solution for all problems (latency-sensitive and fine-grained parallelism are difficult)
  – Difficult to use for the average single scientist
  – Security and reliability in general
• Power consumption and heat dissipation are becoming a limiting factor for consumer-based distributed systems
• We are observing the limits of Moore's Law
Switching gears: "To Distribute or Not To Distribute"

• Prof. Satoshi Matsuoka, TITech; keynote at the Mardi Gras Conference, Baton Rouge, Jan. 31, 2008
• In the late 90s, petaflops were considered very hard and at least 20 years off…
• …while grids were supposed to happen right away
• After 10 years (around now), petaflops are "real close", but there's still no "global grid"
• What happened:
  – It was easier to put together massive clusters than to get people to agree about how to share their resources
  – For tightly coupled HPC applications, tightly coupled machines are still necessary
  – Grids are inherently suited to loosely coupled apps, or to enabling access to machines and/or data
• With Gilder's Law*, bandwidth to the compute resources will promote a thin-client approach
• Example: the Tsubame machine in Tokyo

* "Bandwidth grows at least three times faster than computer power." This means that if computer power doubles every eighteen months (per Moore's Law), then communications power doubles every six months.
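The footnote's arithmetic can be checked with a tiny calculation. This is only a sketch of the quoted rule of thumb; the 18-month and 6-month doubling periods are the figures given above, not measured data.

```python
# Exponential growth given a doubling period, per the quoted
# Moore's-Law (18 months) and Gilder's-Law (6 months) figures.
def growth_factor(doubling_period_months: float, elapsed_months: float) -> float:
    """Multiplicative growth after `elapsed_months`, doubling every period."""
    return 2.0 ** (elapsed_months / doubling_period_months)

elapsed = 36  # three years
compute = growth_factor(18, elapsed)   # computer power: 2^(36/18) = 4x
bandwidth = growth_factor(6, elapsed)  # bandwidth:      2^(36/6) = 64x
print(compute, bandwidth)  # prints 4.0 64.0
```

After three years, bandwidth has pulled ahead of compute by a factor of 16, which is the dynamic behind the thin-client argument above.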
Supercomputing Reached the Petaflop

IBM RoadRunner at Los Alamos National Lab
Emerging trends: Green IT, pay per CPU/GB, virtualisation, and/or HPC in every lab? (1/4)

• The Green Grid consortium (www.thegreengrid.org: AMD, APC, Dell, HP, IBM, Intel, Microsoft, Rackable Systems, Sun Microsystems and VMware), IBM Project Big Green (a $1 billion investment to dramatically increase the efficiency of IBM products) and other IT industry initiatives try to address the energy and environmental limits HPC currently faces
• Computer and data centers in locations with favorable energy and environmental conditions are becoming important
• Elastic computing, computing on the cloud, data centers and service hosting (Software as a Service) are becoming the new emerging solutions for HPC applications
• Many-/multi-core processors and CPU accelerators are promising potential breakthroughs
Emerging trends: Cloud computing and storage on demand (2/4)

• Cloud computing: http://en.wikipedia.org/wiki/Cloud_computing
• Amazon, IBM, Google, Microsoft, Sun and Yahoo are major potential 'Cloud Platform' providers:
  – They operate compute and storage facilities around the world
  – They have developed middleware technologies for resource sharing and software services
• The first services are already operational. Examples:
  – Amazon Elastic Compute Cloud (EC2) and Simple Storage Service (S3)
  – Google Apps: www.google.com/a
  – Sun Network.com: www.network.com ($1/CPU-hour, no contract cost)
  – IBM Grid solutions: www.ibm.com/grid
  – GoGrid, a division of the ServePath company (www.gogrid.com), beta released (pay-as-you-go and pre-paid plans, server manageability)
Emerging trends: Cloud computing and storage on demand (3/4)

http://www.itjungle.com/bns/bns100807-story02.html
http://www.itjungle.com/tug/tug050307-story05
http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=mainframes_and_supercomputers&articleId=9073758&taxonomyId=67&intsrc=kc_top
Emerging trends: Virtualisation (4/4)

Virtualization is the isolation of one computing resource from the others (not only the virtual machine):
• Virtual presentation: presentation layer separate from the process
• Virtual storage: storage and backup over the network
• Virtual network: localizing dispersed resources
• Virtual machine: an OS can be assigned to any desktop or server
• Virtual applications: any application on any computer, on demand

Slide courtesy of Neil Sanderson, UK Product Manager, Virtualisation and Management, Microsoft Ltd.
Virtualisation market: the game is only starting. Only ~5% of servers are virtualised.

[Chart: worldwide virtualization adoption. Non-virtualized servers: 93.00%; VMware: 4.90%; others: 1.75% and 0.35%]

• Computerworld:
  – "Although virtualization has been the buzz among technology providers, only 6% of enterprises have actually deployed virtualization on their networks, said Levine, citing a TWP Research report. That makes the other 94% a wide-open market."
  – "We calculate that roughly 6% of new servers sold last year were virtualized and project that 7% of those sold this year will be virtualized and believe that less than 4% of the X86 server installed base has been virtualized to date."
• Pat Gelsinger, Intel VP, Sept. 2007:
  – "Only 5% of servers are virtualized."

Slide courtesy of Neil Sanderson
EGEE cost estimation (1/2)

Capital expenditures (CAPEX):
a. Hardware costs: 80,000 CPUs, in the order of €120M (range €80-160M).
   Depreciating the infrastructure over 4 years: €30M per year (€20M to €40M).
b. Cooling and power installations (assuming existing housing facilities are available):
   25% of hardware costs, i.e. €30M, depreciated over 5 years: €6M per year.

Total: ~€36M per year (€26-46M)

Slide courtesy of Fotis Karayannis
EGEE cost estimation (2/2)

Operational expenditures (OPEX):
a. €20M per year for all EGEE costs (including site administration, operations, middleware etc.).
b. Electricity: ~10% of hardware costs, i.e. €12M per year (other calculations lead to similar results).
c. Internet connectivity: assuming no connectivity costs (existing over-provisioned NREN connectivity).*

Total: €32M per year

CAPEX + OPEX = €68M per year (€58-78M)

* If another model is used (constructing the service from scratch), network costs should be taken into account.

Slide courtesy of Fotis Karayannis
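The arithmetic of the two cost slides can be reproduced as a quick back-of-the-envelope script. This is only a sketch of the slides' own estimates; all figures are in millions of euros and come from the text above.

```python
# EGEE back-of-the-envelope cost model (figures in million euros,
# all taken from the two cost-estimation slides above).
hw = 120.0                         # 80,000 CPUs, estimated 80-160M
capex_hw = hw / 4                  # 4-year depreciation -> 30M/year
capex_cooling = (0.25 * hw) / 5    # 25% of h/w, 5-year depreciation -> 6M/year
capex = capex_hw + capex_cooling   # ~36M/year

opex_operations = 20.0             # site admin, operations, middleware
opex_power = 0.10 * hw             # electricity ~10% of h/w -> 12M/year
opex = opex_operations + opex_power  # ~32M/year (connectivity assumed free)

total = capex + opex
print(capex, opex, total)  # prints 36.0 32.0 68.0
```

Varying `hw` across the quoted €80-160M range reproduces the €58-78M/year spread given on the slide.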
EGEE if performed with Amazon EC2 and S3

In the order of ~€50M, probably more cost-effective than EGEE's actual cost, depending on the promotion of the EC2/S3 service.

Slide courtesy of Bob Jones
Cloud mature enough for big sciences?

Probably not yet, as it was not designed for them, and it does not support complex scenarios: "S3 lacks in terms of flexible access control and support for delegation and auditing, and it makes implicit trust assumptions."

http://www.symmetrymagazine.org/breaking/2008/05/23/are-commercial-computing-clouds-ready-for-high-energy-physics/
http://www.csee.usf.edu/~anda/papers/dadc108-palankar.pdf
Challenge: High Productivity Computing

"Make high-end computing easier and more productive to use. Emphasis should be placed on time to solution, the major metric of value to high-end computing users… A common software environment for scientific computation encompassing desktop to high-end systems will enhance productivity gains by promoting ease of use and manageability of systems."

2004 High-End Computing Revitalization Task Force, Office of Science and Technology Policy, Executive Office of the President
The target 1: the data pipeline

Data gathering → discovery and browsing → science exploration → domain-specific analyses → scientific output

• Data gathering: "raw" data includes sensor output, data downloaded from agency or collaboration web sites, and papers (especially for ancillary data).
• Discovery and browsing: "raw" data browsing for discovery (do I have enough data in the right places?), cleaning (does the data look obviously wrong?), and lightweight science via browsing.
• Science exploration: "science variables" and data summaries for early science exploration and hypothesis testing. Similar to discovery and browsing, but with science variables computed via gap filling, unit conversions, or simple equations.
• Domain-specific analyses: "science variables" combined with models, other specialized code, or statistics for deep science understanding.
• Scientific output: scientific results via packages such as MatLab or R; special rendering packages such as ArcGIS; paper preparation.

Courtesy of Catherine Van Ingen, MSR
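The pipeline described above can be sketched as a simple chain of stage functions. The names and the toy transforms below are illustrative only, not from the slide; a real pipeline would plug in actual cleaning rules, unit conversions and analysis code at each stage.

```python
from typing import List

def gather(sources: List[str]) -> List[dict]:
    """Stage 1 - data gathering: collect 'raw' records (sensors, web sites, papers)."""
    return [{"source": s, "value": float(i)} for i, s in enumerate(sources)]

def browse_and_clean(records: List[dict]) -> List[dict]:
    """Stage 2 - discovery and browsing: drop obviously wrong records."""
    return [r for r in records if r["value"] >= 0]

def to_science_variables(records: List[dict]) -> List[dict]:
    """Stage 3 - science exploration: gap filling, unit conversion, simple equations."""
    return [{**r, "science_var": r["value"] * 2.0} for r in records]

def analyze(records: List[dict]) -> float:
    """Stage 4 - domain-specific analysis: here just a trivial summary statistic."""
    return sum(r["science_var"] for r in records)

result = analyze(to_science_variables(browse_and_clean(gather(["a", "b", "c"]))))
print(result)  # prints 6.0
```

The point of the slide, preserved here, is that each stage consumes the previous stage's output, so a common environment across stages (the "High Productivity Computing" challenge above) pays off along the whole chain.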
The target 2: explosion of data, from everywhere to everywhere

[Chart: data volumes in petabytes from experiments, archives, literature and simulations, doubling every 2 years]
Conclusion (1/2)

• We are at an inflection point in the evolution of distributed computing (nothing new under the sun…)
• Grid remains a good solution for a reduced number of communities (and often for social/political reasons)
• Cloud computing and hosted services are emerging as the next incarnation of distributed computing, with some obvious additional advantages (think of data centres located in Iceland or Siberia)
Conclusion (2/2)

• Many changes are happening in the basic underpinning technology (parallel everywhere!)
• New boundary constraints, above all energy, are becoming the limiting factor for the otherwise still valid Moore's Law…
• If we are able to harness the enormous potential power of parallel computing (not such a good story so far), then we might be able to provide better computing solutions for research in energy, and eventually better energy solutions for our computing needs!
Thanks

Thanks to ThaiGrid for the kind invitation, and to all of you for your attention.

Contact me at: Fabrig @ microsoft.com