12
Computational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday, July 23, 2012

Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

Computational Challenges in Cold QCD

Bálint Joó, Jefferson LabComputational Nuclear Physics Workshop

SURAWashington, DCJuly 23-24, 2012

Monday, July 23, 2012

Page 2: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 20250.1

1

10

100

1000

10000

Cum

ulat

ive

cycl

es (P

F ye

ars)

2 x 20 PF & Moore's law

Cycles from Titan, Mira

5

Our science requires that we advance computational capability 1000x over the next decade Mission: Deploy and operate the computational resources required to tackle global challenges

Vision: Maximize scientific productivity and progress on the largest scale computational problems

•! Deliver transforming discoveries in climate, materials, biology, energy technologies, etc.

•! Ability to investigate otherwise inaccessible systems, from regional climate impacts to energy grid dynamics

•! Providing world-class computational resources and specialized services for the most computationally intensive problems

•! Providing stable hardware/software path of increasing scale to maximize productive applications development

Cray XT5 2+ PF Leadership system for science

OLCF-3: 10-20 PF Leadership system with some HPCS technology

2009 2012 2015 2018

OLCF-5: 1 EF

OLCF-4: 100-250 PF based on DARPA HPCS technology

Exascale era

this is a guesstimate...peak

performances

Monday, July 23, 2012

Page 3: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 20250.1

1

10

100

1000

10000

Cum

ulat

ive

cycl

es (P

F ye

ars)

2 x 20 PF & Moore's law5% allocation, 10% of peak efficiency10% allocation, 10% of peak efficiency

Cycles from Titan, Mira

Milestone

Monday, July 23, 2012

Page 4: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 20250.1

1

10

100

1000

10000

Cum

ulat

ive

cycl

es (P

F ye

ars)

2 x 20 PF & Moore's law5% allocation, 10% of peak efficiency10% allocation, 10% of peak efficiency

Initial cost: 60 PF y = 12 PF y generation + 48 PF y analysis

Sooner is better

Milestone

Monday, July 23, 2012

Page 5: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 20250.1

1

10

100

1000

10000

Cum

ulat

ive

cycl

es (P

F ye

ars)

2 x 20 PF & Moore's law5% allocation, 10% of peak efficiency10% allocation, 10% of peak efficiency

Initial cost: 60 PF y = 12 PF y generation + 48 PF y analysis

Improve Algorithms

Multigrid: 10x in analysis

Milestone

Monday, July 23, 2012

Page 6: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 20250.1

1

10

100

1000

10000

Cum

ulat

ive

cycl

es (P

F ye

ars)

2 x 20 PF & Moore's law5% allocation, 10% of peak efficiency10% allocation, 10% of peak efficiency

Initial cost: 60 PF y = 12 PF y generation + 48 PF y analysis

Improve Algorithms

Multigrid: 10x in analysis

Projection: 3x in generation

Milestone

Monday, July 23, 2012

Page 7: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 20250.1

1

10

100

1000

10000

Cum

ulat

ive

cycl

es (P

F ye

ars)

2 x 20 PF & Moore's law5% allocation, 10% of peak efficiency10% allocation, 10% of peak efficiency

Initial cost: 60 PF y = 12 PF y generation + 48 PF y analysis

Improve Performance

Multigrid: 10x in analysis

Projection: 3x in generation

Performance: 15% of peak

Milestone

Monday, July 23, 2012

Page 8: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

Required Partnerships

CS & AM LCFs

Vendors

AlgorithmsProg. Models ArchitecturesPerformance

I/OTools

Systems

SciDACINCITEALCC

Early access

relationship basedpartnerships

Lattice QCD

Monday, July 23, 2012

Page 9: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

Algorithms and Architectures• Improve algorithms & extract more performance

– shift the cost down, to meet milestones– implement currently known techniques (e.g. MultiGrid)

• Partnerships with CS & AM: SciDAC• Partnerships with Vendors

– Co-design activity with IBM (BG/L & BG/Q)– Partnership with NVIDIA (GPUs)– Partnership with Intel (Xeon, MIC)

• Partnerships with Leadership Computing Facilities– Partnership with OLCF (TitanDev),– Partnership with ALCF (VEAS/VESTA, Mira)– Partnership with LLNL (Sequoia, Edge, uBGL) – Partnership with NSF Centers (Blue Waters, Stampede)

• Greatest impact is from software: need people

Monday, July 23, 2012

Page 10: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

Exascale Era• Concurrency:

– In principle no problem: • O(10M) way now (lattice sites), O(100M-1000M) soon, V4 growth

– Difficulty is “portably efficient” mapping to hardware• Diverse architectures (Homogeneous Multicore, Heterogeneous Manycore)• Few vendor neutral performance portable programming models

– Heterogeneity• Current LQCD GPU approaches ignore CPU • Need to develop truly heterogeneous approaches

• Resiliency– Some computations can be checked (e.g. Solves)– I/O & data management may become (already are?) issues– Will we need to change our methodology (?)

Monday, July 23, 2012

Page 11: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

Staffing and Resource• HPC is becoming an integral part of Nuclear Physics

– lack of HPC training is already a disadvantage• Need long term career structure for computational scientists

– end “computing doesn’t lead to physics career” stereotype• Useful to coordinate academic & technical positions

– hire theoretical and computational staff together• e.g. faculty at local university, supporting computational staff at nearby center

• Industry and Leadership Facilities can play a strong role:– studentships, internships, training workshops

• Develop skill-set, interest and experience early on

– liaisons• Maintain awareness of QCD • Provide guidance for architectures, systems, and policies for QCD needs

Monday, July 23, 2012

Page 12: Computational Challenges in Cold QCDComputational Challenges in Cold QCD Bálint Joó, Jefferson Lab Computational Nuclear Physics Workshop SURA Washington, DC July 23-24, 2012 Monday,

Summary & Conclusions• In order to meet the Scientific Milestones QCD will need:

– increased levels of HPC (cycles)– capability on both homogeneous & heterogeneous H/W– investment in algorithmic and programming model research– strong links between domain, AM & CS communities– strong partnerships with LCFs and Vendors to anticipate

and influence architectures and systems• Meeting these needs requires an increase in staffing

– both short and long term– to cope with architectural diversity– to develop algorithms and methodology towards exascale– to enable scientists to ‘choose computational’ as a career

Monday, July 23, 2012