DRIVING PURDUE UNIVERSITY'S NEXT GIANT LEAP IN HPC AND DATA SCIENCE

PRESTON SMITH, EXECUTIVE DIRECTOR OF RESEARCH COMPUTING

[email protected]

Purdue University

Outline

Topics Today

▪ Community Cluster Program at Purdue University

▪ Integrative Data Science Initiative

▪ “Gilbreth” Supercomputer for AI, ML, and HPC

Research Computing History at Purdue

1967 – CDC 6500

• Purdue became one of the first academic institutions with a supercomputer, a Control Data Corp 6500

• Peak performance of 1/3 of a megaflop!

1983 – Cyber 205

• Purdue acquired a Cyber 205, one of the most powerful systems operated by a university at the time.

Research Computing History at Purdue

Early 90s – Intel Paragon XPS

Late 90s – IBM SP

Landscape, Early 2000s

... Privately-run clusters were proliferating in labs and newly-made datacenters across campus!

Community Clusters

The Early Years

• Without a large capital acquisition by the university, providing cutting-edge computing capabilities for researchers was not possible.

• Many faculty were getting funding to acquire, host and operate HPC resources for themselves

• Solution: pool these funds to operate clusters for researchers!

– The faculty no longer have to devote a grad student to managing their cluster, or persuade their dean to renovate computing space for them!

Community Cluster Program

The Rules

• You get out at least what you put in

– Buy 1 node or 100, you get a queue that guarantees access up to that many resources

• But wait, there’s more!!

– What if your neighbor isn’t using his queue?

– You can use it, but your job has to run in 4-hour chunks if he wants to run.

• You don’t have to do the work

– Your grad student gets to do research rather than run your cluster.

– Nor do you have to provide space in your lab for computers.

– Central IT provides data center space, systems administration, application support.

– Just submit jobs!
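
The rules above amount to a simple admission policy, which can be sketched in a few lines of Python. This is only an illustrative model, not the scheduler configuration used at Purdue: the class `OwnerQueue`, the functions `owner_job_fits` and `standby_job_fits`, and the lab names are all hypothetical. Only the "guaranteed up to what you bought" rule and the 4-hour limit on borrowed nodes come from the slide.

```python
from dataclasses import dataclass
from typing import Sequence

# From the slide: jobs running on a neighbor's idle nodes run in 4-hour chunks.
STANDBY_MAX_HOURS = 4.0


@dataclass
class OwnerQueue:
    """One research group's queue (hypothetical model, not a real scheduler API)."""
    name: str
    purchased_nodes: int    # nodes the group bought into the cluster
    nodes_in_use: int = 0   # nodes currently running the group's own jobs

    def idle_nodes(self) -> int:
        return self.purchased_nodes - self.nodes_in_use


def owner_job_fits(queue: OwnerQueue, nodes: int) -> bool:
    """Owners are guaranteed access up to the number of nodes they bought."""
    return queue.nodes_in_use + nodes <= queue.purchased_nodes


def standby_job_fits(neighbors: Sequence[OwnerQueue], nodes: int, hours: float) -> bool:
    """Idle neighbor nodes may be borrowed, but only by jobs of 4 hours or less."""
    if hours > STANDBY_MAX_HOURS:
        return False
    return sum(q.idle_nodes() for q in neighbors) >= nodes


lab_a = OwnerQueue("lab_a", purchased_nodes=1, nodes_in_use=1)
lab_b = OwnerQueue("lab_b", purchased_nodes=100, nodes_in_use=20)

print(owner_job_fits(lab_b, 80))                 # lab_b stays within its 100 nodes
print(standby_job_fits([lab_b], 50, hours=4))    # lab_a borrows lab_b's idle nodes
print(standby_job_fits([lab_b], 50, hours=12))   # too long to run on borrowed nodes
```

The short limit on borrowed nodes is what lets an owner reclaim purchased capacity after a bounded wait.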

#166 RICE 2015

#302 BROWN 2017

Take Giant Leaps – Sustainable Planet

Dr. Jian Jin

Jian Jin uses hyperspectral imaging to collect and analyze data about plant phenotypes.

Jin’s lab can generate 10 TB of image data each day, and image processing can be greatly accelerated by the V100 GPUs on Gilbreth.

Integrative Data Science Initiative

IDSI Vision

Interdisciplinary Approach to Research

• Support structured research efforts in Data Science Theory and Fundamentals, Data-Driven Discovery, and Data Science Applications.

Pervasive Inclusion of Data Science in Education

• Establish a Data Science Education Ecosystem incorporating data science across campus. GOAL: every undergraduate completes her or his studies with relevant professional skills in data science.

• Create a physical presence for IDSI and the Educational Ecosystem to promote creative collaboration through proximity and physical interaction.

Corporate & Non-Profit Engagement

• Increase data science research and education collaborations with business and industry.

Launched in 2018

Integrative Data Science Initiative

Research

• 3 Broad Areas:

– Data science theory and fundamentals

– Data-driven discovery

– Data science applications

• Themes of Excellence:

– Health and life sciences

– Agriculture

– Manufacturing

– Transportation and civil infrastructure

40+ Faculty, 22 Hired Since 2015

150+ Faculty, 83 Hired Since 2015

Data Science Fundamentals

Data Science Applications

INTERNAL FUNDING – RESEARCH

Eight data science research projects launched by IDSI

The selected projects create synergies among researchers to collaborate and explore data science questions at the nexus of:

▪ health care;

▪ defense ethics;

▪ society and policy;

▪ and fundamentals, methods and algorithms

52 proposal teams comprised of 172 faculty across 48 departments, and colleges

Nearly $2M awarded

Integrative Data Science Initiative

Funded Research Projects

• Fingerprints of the Human Brain: A Data Science Perspective

• Quantum Machine Learning for Data Analytics and Optimization

• A Relational-Based Measure of State Legislator Consequence

Integrative Data Science Initiative

Education

• Core Data Science and Related Fields

– Increase the number and quality of graduates with core skills in data science

• Infuse Data Science Across Disciplines

– GOAL: every undergraduate completes her or his studies with relevant professional skills in data science.

• The Data Mine

– A data science application-centric living-learning community for all disciplines

• Data Science Applications Certificate Program

– A formal certificate program designed for all Purdue students

Office of the Provost funded 12 proposals as a part of the Education Ecosystem sector of IDSI

Integrative Data Science Initiative

The Data Mine Learning Community

20 NEW LEARNING COMMUNITIES

For more information, contact Mark Daniel Ward ([email protected])

A VISION

A PLACE

AN EXPERIENCE

AN INNOVATIVE ENGINE

The first large-scale living learning community for undergraduates from all majors focused on Data Science for All

Corporate partners, faculty and TAs are mentors for teams of 4-6 students, who develop practical solutions to open-ended, data-driven problems

Hillenbrand Hall, 800-student capacity, 100% committed to The Data Mine, with dedicated co-working space

Interdisciplinary teams bring creativity and new perspectives to tough problems, where data science is a key part of the solution

300+ HOURS STUDENTS INVEST ON PARTNER PROJECTS Per Academic Year

33% WOMEN

616 STUDENTS DEDICATED TO DATA MINE PROJECTS

Take Giant Leaps - Health and Longevity

Dr. Wen Jiang

Dr. Wen Jiang’s research team operates multiple cryo-electron microscopy (cryo-EM) microscopes.

The research team uses community clusters, Data Depot, and the V100 GPUs on Gilbreth to accelerate cryo-EM applications like RELION and cryoSPARC.

The Landscape, early 2018

We’ve gone full circle

▪ Now with a University strategic effort in data science,

▪ ... we see privately-run clusters and GeForce GPU workstations proliferating in labs and offices across campus

Gilbreth - Community Cluster

Gilbreth

• GPU-based system ideal for machine learning, AI, big data science, as well as FEA, Chemistry, MD

• 50 nodes, 100 GPUs

– 1 PF of single-precision!

• 2-3 PB parallel filesystem storage

• Flash storage

• Annual subscription fee for access

Over 40 faculty investors in one year

Prof. Lillian Moller Gilbreth
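
As a quick sanity check on the figures above, the per-node and per-GPU numbers they imply can be worked out directly. This is a back-of-the-envelope sketch only: it assumes the quoted 1 PF is attributed entirely to the 100 GPUs and that 1 PF = 1000 TF, neither of which the slide states. The variable names are illustrative.

```python
# Figures quoted on the slide.
NODES = 50
GPUS = 100
AGGREGATE_SINGLE_PRECISION_PF = 1.0

# Assumption: the GPUs are spread evenly across the nodes.
gpus_per_node = GPUS / NODES

# Assumption: the 1 PF figure counts GPU throughput only (1 PF = 1000 TF).
implied_tf_per_gpu = AGGREGATE_SINGLE_PRECISION_PF * 1000 / GPUS

print(f"GPUs per node: {gpus_per_node:.0f}")
print(f"Implied single-precision TF per GPU: {implied_tf_per_gpu:.0f}")
```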

Take Giant Leaps - Aerospace

Dr. Alina Alexeenko

Developed an implementation of the multi-species Discontinuous Galerkin Fast Spectral (DGFS) method for solution of the multi-species monoatomic full Boltzmann equation on multi-GPU/multi-CPU architectures.

Using 36 nodes of Gilbreth, Alexeenko’s research team saw a parallel efficiency of 0.95, which would allow CFD simulations to complete in less than a day versus months on CPU platforms.
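
To put the efficiency figure in context: parallel efficiency is conventionally defined as E = S / N, where S is the speedup and N the number of workers, so the speedup follows as S = E × N. The sketch below applies that definition to the numbers on the slide. It assumes the baseline is a single Gilbreth node, which the slide does not specify, and the function names are illustrative.

```python
def speedup(efficiency: float, workers: int) -> float:
    """Parallel efficiency is E = S / N, so the speedup is S = E * N."""
    return efficiency * workers


def efficiency(time_one_worker: float, time_n_workers: float, workers: int) -> float:
    """Recover E from measured run times: E = (T1 / TN) / N."""
    return (time_one_worker / time_n_workers) / workers


NODES = 36          # from the slide
EFFICIENCY = 0.95   # from the slide

s = speedup(EFFICIENCY, NODES)
print(f"Implied speedup over one node (assumed baseline): {s:.1f}x")

# Round trip with made-up timings: a run taking 100 time units on one node
# would take 100 / s units on 36 nodes at this efficiency.
print(f"Recovered efficiency: {efficiency(100.0, 100.0 / s, NODES):.2f}")
```

An efficiency this close to 1 means almost none of the added hardware is lost to communication or idle time at this scale.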

http://www.cfd.tu-berlin.de/~panek/cfd/jet2.png

The next Giant Leap?

The Goal

Enable discovery at Purdue that previously was not possible!

▪ By bringing the proliferated labs of GPU workstations in from the cold and into the campus ecosystem

EA/EOU

THANK YOU

www.rcac.purdue.edu

[email protected]