Introduction to High Performance Computing
Advanced Research Computing
Outline
• What constitutes high performance computing (HPC)?
• When to consider HPC resources
• What kind of problems are typically solved?
• What are the components of HPC?
• What resources are available?
• Overview of HPC Resources at Virginia Tech
Should I Pursue HPC for my Problem?
• Are local resources insufficient to meet your needs?
  – Very large jobs
  – Very many jobs
  – Large data
• Do you have national collaborators?
  – Share projects between different entities
  – Convenient mechanisms for data sharing
Who Uses HPC?
Physics (91): 19%
Molecular Biosciences (271): 17%
Astronomical Sciences (115): 13%
Atmospheric Sciences (72): 11%
Materials Research (131): 9%
Chemical, Thermal Sys (89): 8%
Chemistry (161): 7%
Scientific Computing (60): 2%
Earth Sci (29): 2%
Training (51): 2%
• >2 billion cpu-hours allocated
• 1400 allocations
• 350 institutions
• 32 research domains
Learning Curve
• Linux: Command-line interface
• Scheduler: Shares resources among multiple users
• Parallel Computing:
  – Need to parallelize code to take advantage of supercomputer’s resources
  – Third-party programs or libraries make this easier
Popular Software Packages
• Molecular Dynamics: Gromacs, LAMMPS
• CFD: OpenFOAM, Ansys
• Finite Elements: Deal II, Abaqus
• Chemistry: VASP, Gaussian
• Climate: CESM
• Bioinformatics: Mothur, QIIME, MPIBLAST
• Numerical Computing/Statistics: R, Matlab
• Visualization: ParaView, Ensight
WHAT IS PARALLEL COMPUTING?
Parallel Computing 101
• Parallel computing: use of multiple processors or computers working together on a common task.
  – Each processor works on its section of the problem
  – Processors can exchange information
[Figure: the grid of the problem to be solved, in the x-y plane, is divided into four areas. CPUs #1 through #4 each work on one area, and the CPUs exchange information with one another.]
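The idea of splitting a grid among processors can be sketched in a few lines. This is a minimal illustration, not code from the slides: it uses a 1D grid instead of the 2D grid pictured, simulates the four CPUs one after another inside a single Python process, and the function names `smooth_serial` and `smooth_decomposed` are hypothetical. Each simulated CPU keeps one extra "ghost" cell on each side and refreshes it from its neighbor before every step; that refresh is the exchange.

```python
def smooth_serial(u, steps):
    """Reference version: one worker updates the whole grid.

    Each interior point is replaced by the average of its two neighbors;
    the two end points stay fixed.
    """
    u = list(u)
    for _ in range(steps):
        new = u[:]
        for i in range(1, len(u) - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])
        u = new
    return u


def smooth_decomposed(u, steps, nworkers=4):
    """Same update, with the grid split into nworkers contiguous sections.

    Assumes len(u) is divisible by nworkers. Each section carries one
    ghost cell at each end, holding a copy of the neighbor's edge value.
    """
    n = len(u)
    chunk = n // nworkers
    sections = [[0.0] + list(u[w * chunk:(w + 1) * chunk]) + [0.0]
                for w in range(nworkers)]
    for _ in range(steps):
        # Exchange: copy each neighbor's edge value into the ghost cells.
        for w in range(nworkers):
            if w > 0:
                sections[w][0] = sections[w - 1][-2]
            if w < nworkers - 1:
                sections[w][-1] = sections[w + 1][1]
        # Compute: every worker updates only the points it owns.
        for w in range(nworkers):
            loc = sections[w]
            new = loc[:]
            for j in range(1, len(loc) - 1):
                global_index = w * chunk + (j - 1)
                if global_index == 0 or global_index == n - 1:
                    continue  # fixed end points of the whole grid
                new[j] = 0.5 * (loc[j - 1] + loc[j + 1])
            sections[w] = new
    # Gather the owned points (without ghost cells) back into one list.
    return [x for loc in sections for x in loc[1:-1]]


grid = [0.0] * 15 + [1.0]
print(smooth_decomposed(grid, steps=10) == smooth_serial(grid, steps=10))
```

On a real cluster the exchange step would be a message between processes (for example with MPI) rather than a list copy, but the bookkeeping is the same.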
Why Do Parallel Computing?
• Limits of single CPU computing
  – performance
  – available memory
  – I/O rates
• Parallel computing allows one to:
  – solve problems that don’t fit on a single CPU
  – solve problems that can’t be solved in a reasonable time
• We can solve…
  – larger problems
  – faster
  – more cases
A Change in Moore’s Law
Parallelism is the New Moore’s Law
• Power and energy efficiency impose a key constraint on design of micro-architectures
• Clock speeds have plateaued
• Hardware parallelism is increasing rapidly to make up the difference
WHAT DOES A MODERN SUPERCOMPUTER LOOK LIKE?
Essential Components of HPC
• Supercomputing resources
• Storage
• Visualization
• Data management
• Network infrastructure
• Support
Blade : Rack : System
• 1 node : 2 x 8 cores = 16 cores
• 1 chassis : 10 nodes = 160 cores
• 1 rack (frame) : 4 chassis = 640 cores
• system : 10 racks = 6,400 cores
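The core counts above are a running product, which a short sketch makes explicit. The constant and function names are hypothetical; the numbers are the ones on the slide.

```python
# Hardware hierarchy from the slide: sockets -> node -> chassis -> rack -> system.
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 8
NODES_PER_CHASSIS = 10
CHASSIS_PER_RACK = 4
RACKS_PER_SYSTEM = 10


def core_counts():
    """Return the number of cores at each level of the hierarchy."""
    node = SOCKETS_PER_NODE * CORES_PER_SOCKET
    chassis = node * NODES_PER_CHASSIS
    rack = chassis * CHASSIS_PER_RACK
    system = rack * RACKS_PER_SYSTEM
    return {"node": node, "chassis": chassis, "rack": rack, "system": system}


for level, cores in core_counts().items():
    print(f"{level}: {cores} cores")
```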
Shared and distributed memory
Shared memory
• All processors have access to a pool of shared memory
• Access times vary from CPU to CPU in NUMA systems
• Example: SGI UV, CPUs on same node

Distributed memory
• Memory is local to each processor
• Data exchange by message passing over a network
• Example: Clusters with single-socket blades

[Figure: left, several processors (P) attached to one shared Memory; right, processors (P) each with their own memory (M), connected by a Network.]
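The two models can be contrasted with a small sketch. It is only an analogy under stated assumptions: Python threads inside one process stand in for processors, a `queue.Queue` stands in for the network, and the function names are hypothetical. Real distributed-memory codes would typically use MPI across nodes.

```python
import queue
import threading


def shared_memory_sum(data, nworkers=4):
    """Shared-memory style: all workers read one list and update one total."""
    total = [0]
    lock = threading.Lock()  # protects the shared total from concurrent updates

    def work(w):
        partial = sum(data[w::nworkers])  # reads the shared data directly
        with lock:
            total[0] += partial

    threads = [threading.Thread(target=work, args=(w,)) for w in range(nworkers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def message_passing_sum(data, nworkers=4):
    """Distributed-memory style: private copies in, results out as messages."""
    inbox = queue.Queue()  # stand-in for the network

    def work(local_data):
        inbox.put(sum(local_data))  # "send" the partial result

    threads = [
        threading.Thread(target=work, args=(list(data[w::nworkers]),))
        for w in range(nworkers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # "Receive" one message per worker and combine them.
    return sum(inbox.get() for _ in range(nworkers))


numbers = list(range(1, 101))
print(shared_memory_sum(numbers), message_passing_sum(numbers))
```

In the first function the workers need a lock because they touch the same variable; in the second they share nothing, so the only coordination is the messages themselves.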
HPC Trends
Architecture    Code
Single core     Serial
Multicore       OpenMP, Pthreads
GPU             CUDA, OpenACC
Cluster         MPI
How are accelerators different?
                  Intel Xeon E5-2670 (CPU)   Intel Xeon Phi 5110P (MIC)   Nvidia Tesla K20X (GPU)
Cores             8                          60                           14 SMX
Logical Cores     16                         240                          2,688 CUDA cores
Frequency         2.60 GHz                   1.05 GHz                     0.74 GHz
GFLOPs (double)   333                        1,010                        1,317
Memory            64 GB                      8 GB                         6 GB
Memory B/W        51.2 GB/s                  320 GB/s                     250 GB/s
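The double-precision GFLOPs figures can be roughly reproduced as cores × clock × floating-point operations per cycle. The sketch below rests on assumptions that are not stated on the slide: the CPU figure is taken to cover a two-socket node (consistent with the 2 x 8 core nodes described earlier), the E5-2670 is assumed to perform 8 double-precision operations per core per cycle, and the Xeon Phi 16. The function name is hypothetical.

```python
def peak_gflops(sockets, cores_per_socket, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFLOPs: every core issues its maximum each cycle."""
    return sockets * cores_per_socket * clock_ghz * flops_per_cycle


# Assumed: 2 sockets per node, 8 double-precision flops per core per cycle.
cpu_peak = peak_gflops(sockets=2, cores_per_socket=8,
                       clock_ghz=2.60, flops_per_cycle=8)

# Assumed: one card, 16 double-precision flops per core per cycle.
mic_peak = peak_gflops(sockets=1, cores_per_socket=60,
                       clock_ghz=1.05, flops_per_cycle=16)

print(f"CPU node: about {cpu_peak:.0f} GFLOPs")
print(f"Xeon Phi: about {mic_peak:.0f} GFLOPs")
```

With the rounded 1.05 GHz clock the Xeon Phi estimate lands slightly below the 1,010 in the table. These are upper bounds; real applications usually achieve a fraction of them.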
Multi-core systems
• Current processors place multiple processor cores on a die
• Communication details are increasingly complex
  – Cache access
  – Main memory access
  – QuickPath / HyperTransport socket connections
  – Node to node connection via network
[Figure: multi-core nodes, each with its own memory, connected by a network.]
Accelerator-based Systems
• Calculations made in both CPUs and Graphics Processing Units (GPUs)
• No longer limited to single precision calculations
• Load balancing critical for performance
• Requires specific libraries and compilers (CUDA, OpenCL)
• Co-processor from Intel: MIC (Many Integrated Core)
[Figure: four nodes, each with a GPU and its own memory, connected by a network.]
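Why load balancing matters can be seen with a rough sketch: if a CPU and a GPU get equal shares of the work, the faster device sits idle while the slower one finishes. The example splits work in proportion to throughput. Using peak GFLOPs (333 for the CPU node, 1,317 for the GPU) as the throughput is an assumption for illustration only, since real rates depend on the application, and `split_work` is a hypothetical helper.

```python
def split_work(n_items, rates):
    """Split n_items among devices in proportion to their integer rates."""
    total = sum(rates)
    shares = [n_items * r // total for r in rates]
    # Hand out any items lost to rounding, one at a time.
    leftover = n_items - sum(shares)
    for i in range(leftover):
        shares[i % len(shares)] += 1
    return shares


def finish_time(shares, rates):
    """Time until the slowest device is done (items / rate, arbitrary units)."""
    return max(s / r for s, r in zip(shares, rates))


rates = [333, 1317]   # assumed throughputs: CPU node, GPU
n_items = 1650

balanced = split_work(n_items, rates)
even = [n_items // 2, n_items // 2]

print("balanced shares:", balanced, "time:", finish_time(balanced, rates))
print("even shares:    ", even, "time:", round(finish_time(even, rates), 2))
```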
Batch Submission Process
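[Figure: a user connects over the Internet via ssh to the Login Node and submits a job with qsub. The job waits in the Queue, then runs on a Master Node and compute nodes C1, C2, C3.]
• Queue: Job script waits for resources.
• Master: Compute node that executes the job script, launches all MPI processes.
• Launch commands shown: mpirun -np # ./a.out, ibrun ./a.out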
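What "waits for resources" means can be illustrated with a toy model. This is not how any real scheduler is implemented: it assumes a strict first-in, first-out queue with no priorities or backfilling, and the function name and job tuples are hypothetical.

```python
from collections import deque


def run_fifo_queue(jobs, total_nodes):
    """Return the start time of each job under strict FIFO scheduling.

    jobs is a list of (name, nodes_needed, runtime) in submission order.
    A job starts only when it is at the head of the queue and enough
    nodes are free; otherwise everything behind it waits too.
    """
    waiting = deque(jobs)
    running = []          # (end_time, nodes) for jobs currently executing
    free = total_nodes
    now = 0
    starts = {}
    while waiting:
        name, need, runtime = waiting[0]
        if need > total_nodes:
            raise ValueError(f"job {name} can never run on {total_nodes} nodes")
        if need <= free:
            waiting.popleft()
            free -= need
            running.append((now + runtime, need))
            starts[name] = now
        else:
            # Advance the clock to the next job completion and free its nodes.
            running.sort()
            end, nodes = running.pop(0)
            now = max(now, end)
            free += nodes
    return starts


jobs = [("a", 2, 10), ("b", 2, 5), ("c", 3, 1)]
print(run_fifo_queue(jobs, total_nodes=4))
```

Here job c asks for three nodes, so it cannot start until both earlier jobs have released theirs, even though it only runs for one time unit.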
ARC OVERVIEW
Advanced Research Computing (ARC)
• Unit within the Office of the Vice President of Information Technology
• Provide centralized resources for:
  – Research computing
  – Visualization
• Staff to assist users
• Website: http://www.arc.vt.edu/
Goals
• Advance the use of computing and visualization in VT research
• Centralize resource acquisition, maintenance, and support for research community
• Provide support to facilitate usage of resources and minimize barriers to entry
• Enable and participate in research collaborations between departments
Personnel
• Associate VP for Research Computing: Terry Herdman
• Director, HPC: Vijay Agarwala
• Director, Visualization: Nicholas Polys
• Computational Scientists
  – Justin Krometis
  – James McClure
  – Brian Marshall
  – Srinivas Yarlanki
  – Srijith Rajamohan
Personnel (Continued)
• System Administrators
  – Tim Rhodes
  – Chris Snapp
  – Brandon Sawyers
• Vis & Virtual Reality Specialist: Wole Oyekoya
• Business Manager: Alana Romanella
• User Support GRAs: Umar Kalim and Di Zhang
Computational Resources
Name                        BlueRidge                             HokieSpeed            HokieOne       Ithaca
Key Features, Uses          Large-scale CPU or MIC                GPU                   Shared Memory  Beginners, MATLAB
Available                   March 2013                            Sept 2012             Apr 2012       Fall 2009
Theoretical Peak (TFlop/s)  398.7                                 238.2                 5.4            6.1
Nodes                       408                                   201                   N/A            79
Cores                       6,528                                 2,412                 492            632
Cores/Node                  16                                    12                    N/A*           8
Accelerators/Coprocessors   260 Intel Xeon Phi, 8 Nvidia K40 GPU  408 Nvidia Tesla GPU  N/A            N/A
Memory Size                 27.3 TB                               5.0 TB                2.62 TB        2 TB
Memory/Core                 4 GB*                                 2 GB                  5.3 GB         3 GB*
Memory/Node                 64 GB*                                24 GB                 N/A*           24 GB*
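Several rows of the table follow from others, which is useful when sizing a job. A small sketch, assuming the per-node values apply uniformly (the asterisks in the table suggest there are exceptions, so treat the results as typical values). HokieOne is left out because it has no per-node figures, and the dictionary layout and function name are hypothetical.

```python
# Per-node figures copied from the table; the uniformity is an assumption.
SYSTEMS = {
    "BlueRidge": {"nodes": 408, "cores_per_node": 16, "mem_per_node_gb": 64},
    "HokieSpeed": {"nodes": 201, "cores_per_node": 12, "mem_per_node_gb": 24},
    "Ithaca": {"nodes": 79, "cores_per_node": 8, "mem_per_node_gb": 24},
}


def derived(spec):
    """Return (total cores, memory per core in GB) for one system."""
    cores = spec["nodes"] * spec["cores_per_node"]
    mem_per_core = spec["mem_per_node_gb"] / spec["cores_per_node"]
    return cores, mem_per_core


for name, spec in SYSTEMS.items():
    cores, mem_per_core = derived(spec)
    print(f"{name}: {cores} cores, {mem_per_core:g} GB per core")
```

A job that needs more memory per process than the per-core figure has to leave some cores on each node idle.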
Visualization Resources
• VisCube: 3D immersion environment with three 10′ by 10′ walls and a floor of 1920×1920 stereo projection screens
• DeepSix: Six tiled monitors with combined resolution of 7680×3200
• ROVR Stereo Wall
• AISB Stereo Wall
Getting Started on ARC Systems
1. Review ARC’s system specifications and choose the right system(s) for you
   a. Specialty software
2. Apply for an account online at the Advanced Research Computing website
3. When your account is ready, you will receive confirmation from ARC’s system administrators
Resources
• ARC Website: http://www.arc.vt.edu
• ARC Compute Resources & Documentation: http://www.arc.vt.edu/resources/hpc/
• New Users Guide: http://www.arc.vt.edu/userinfo/newusers.php
• Frequently Asked Questions: http://www.arc.vt.edu/userinfo/faq.php
• Linux Introduction: http://www.arc.vt.edu/resources/software/unix/
Thank you.
Questions?