76
Introduction to Compute Canada ARC resources Grigory Shamov, UManitoba / Westgrid Oct 18, 2017 Thanks to: Alex Razoumov and John Simpson

Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Introduction to Compute Canada ARC resources

Grigory Shamov, UManitoba / WestgridOct 18, 2017

Thanks to: Alex Razoumov and John Simpson

Page 2: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Introduction & Outline

● Who we are

● What are the ARC resources

● What resources are available from ComputeCanada

2

Page 3: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

What is ComputeCanada

3

Page 4: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Regional Consortia

4

Page 5: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

What is WestGrid

5

Page 6: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

WestGrid Members and Partners

6

Page 7: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

WestGrid Site at UofM.

○ARC Support staff : Ali Kerrache , Grigory Shamov ○Sysadmin team (UofM IST): Daryl Lorimer, Sean Kalynuk

●Legacy WestGrid system: Grex○https://www.westgrid.ca/support/systems/grex

7

Page 8: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Advanced Research Computing

What are the Advanced Research Computing Resources?

○Advanced means: something that has to scale out of a PC/workstation

○Compute, Data storage, Network○Scientific Platforms, Visualization○Support, Advocacy and Consultation

8

Page 9: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

●High Performance Computing Clusters ○“A cluster of workstations, with interconnect and under command of a batch system”. ○Resource supply is managed by RM/Scheduler○Batch mode (Mainframe from 60s-70s).○Scheduling gives high utilization and “traffic control” for jobs○Also “fairness” and accounting of the systems usage○Single, curated HPC software stack●Cloud Computing○Infrastructure as service (but also “cloud is a business model”) ○Create and manage your own Virtual Machine instances.○Natural for servers, but can be used for compute ○Most flexible, can use any OS or software stack; VMs can be portable.

9

Computing that scales: HPC / Cloud

Page 10: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

HPC System Example: Grex

Page 11: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

HPC Scheduler

●Scheduler is the component that does the policing

○Resource management/allocation is done through the Scheduler○Resource Allocations set priority targets ○Dynamically changed priorities based on recent usage○Allocations are not static reservations!○Needs steady supply of jobs throughout the year to fulfill the RAC targets

○Very large or very long jobs may need special treatment○Burst load also might be harder to accommodate.

Page 12: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

But what Scheduler?

●A BIG CHANGE: from Torque to SLURM !○All the legacy Westgrid systems were running Torque RM with Moab scheduler #PBS –l walltime=1:00:00,nodes=1:ppn=4,mem=4gb#PBS –A abc-678-aaqsub test.job○New National sites (Cedar, Graham) are using SLURM#SBATCH ???#SBATCH ??? squeue test.job

●Kamil’s (UofA) workshop for scheduling is coming this Fall●Any interest in a local workshop about SLURM?

12

Page 13: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

ComputeCanada HPC offerings

○https://docs.computecanada.ca/wiki/Cedar○https://docs.computecanada.ca/wiki/Graham

CPUs GPUs RAM per node

Storage Note

Cedar 27.6K x E5-2683v4 146 x 4 P100 128GB – 3TB 3.7PB /scratch9.3PB /project

Fully online! CPUs to be expand in 2018

Graham 33.4K x E5-2683v4 160 x 2 P100 128GB – 3TB 3.2PB /scratch11PB /project

Fully online! Storage to be expanded in 2018

Niagara TBD, ~60K 0 TBD TBD SciNet, expected by Apr 1, 2018

GP4 TBD TBD TBD TBD

13

Page 14: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

HPC Software stack

●Number-crunching software environment: ○Compilers, BLAS/LAPACK/PETSc, MPI, OpenMP●Dynamic languages and libraries○R, Python, Perl, Julia, ...●Domain-specific applications and packages○Engineering, Chemistry, MachineLearning, ...●Biomolecular pipelines, genomics etc.

●BigData (Hadoop, Spark, ElasticSearch)?

●Centralized CC software stack, distributed over CVMFS 14

Page 15: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

HPC Software stack

●Distributed over CVMFS to all National HPC systems○Using “modules” (LMod); module load abinit/8.2.2○https://docs.computecanada.ca/wiki/Available_software ○Module usage is tracked.

15

Page 16: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

●High Performance Computing Clusters ○“A cluster of workstations, with interconnect and under command of a batch system”. ○Resource supply is managed by RM/Scheduler○Batch mode (Mainframe from 60s-70s).○Scheduling gives high utilization and “traffic control” for jobs○Also “fairness” and accounting of the systems usage○Single, curated HPC software stack●Cloud Computing○Infrastructure as service (but also “cloud is a business model”) ○Create and manage your own Virtual Machine instances.○Natural for servers, but can be used for compute ○Most flexible, can use any OS or software stack; VMs can be portable.

16

Computing that scales: HPC / Cloud

Page 17: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

OpenStack Logical Diagram

Page 18: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

WestCloud example

Page 19: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

ComputeCanada Cloud offering

Current default limits: ○2 VMs, 4 VCPUs, 15G RAM, 50Gb persistent storage, 1 IP.○https://docs.computecanada.ca/wiki/Cloud_resources

Nodes RAM per node Storage Note

Arbutus 240, 2 x E5-2680v4 256GB on most 500TB Ceph Due to expansion, will include big data nodes

West Cloud 32, 2 x E5-2650v2 256GB on most Now part of Arbutus

East Cloud 36, 2 x E5-2650v2 128 Gb 100 TB Ceph U.de Scherbrooke

Cloud partitions of Cedar and Graham

TBD; for Graham ~56

TBD;for Graham 256GB

TBD

19

Page 20: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Storage: Large, Fast and Scalable

●Parallel, globally consistent filesystems for HPC ○Lustre, (10PB + 12 PB, operational) ○IBM GPFS

●Object storage filesystems; filesystems for clouds ○Ceph, DDN WOS

●Long-term storage (Tape-and-Disk HSM)○~30PB/site , in progress

○https://docs.computecanada.ca/wiki/National_Data_Cyberinfrastructure 20

Page 21: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Networks and Data Transfer

●CANARIE ○100TB/s at new National hosting sites; at least 10Gb/s to every major site.

●ComputeCanada GlobusOnline○http://globus.computecanada.ca ○Fast transfers, GridFTP○Batch, fire-and-forget○Identity management○Organizational endpoints○Personal endpoints!

○Institutional offerings and issues: Firewalls, security domains 21

Page 22: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Other Data Services and Projects

●ownCloud: another online storage○Dropbox-like; free for Westgrid users ○Allows for easy data sharing; up to 50GB quota○http://owncloud.westgrid.ca

22

Page 23: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Other CC Data Services and Projects

●Research Data Management Project ○https://portagenetwork.ca/frdr-dfdr in limited production now!○CC, USask, UBC, CARL, and Globus

23

FRDR is a single platform from which research data can be ingested, curated, preserved, discovered, cited and shared.

● Federated Storage● Federated Support● Scalable Platform

Page 24: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Portals/Platforms and Viz

●Portals: not only batch processing○Portals, Workflows, and Databases-as services○GenAp, Galaxy; JuPyTer Hub, WebMO○User-developed portals and RPP competition

●Visualization: ○Specific software & hardware & Network requirements: GPU's, Monitors, communication libs like VNC or NX(X2go), low-latency and high-bandwidth networks, interactive nodes○generic (Paraview) and domain-specific (VMD, MaterialSudio, ANSYS). ○Interactive computing like iPython/JuPyTer

24

Page 25: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Accessing ComputeCanada● Step 1:

○ Faculty member registers in the Compute Canada Database (CCDB)○ http://ccdb.computecanada.ca

● Step 2: ○ Once account is approved, students / colleagues can register & be

added as group members

25

Page 26: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Accessing Regional Resources

26

Page 27: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Accessing Support

27

●ComputeCanada support contacts:○[email protected] for the general support○specific support contacts like [email protected] ○Regional consortia might have their own support systems ○Westgrid: [email protected] ○Local UofM staff can be reached through IST Services Catalogue as well○http://umanitoba.ca/ist/service_catalogue/

●Documentation and Training○ComputeCanada: https://docs.computecanada.ca ○Westgrid Website: https://www.westgrid.ca ○Westgrid Training Events calendar: https://www.westgrid.ca//events

Page 28: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Getting more than Default Share

28

●Default RAC (Rapid Access Service now)○A fraction of resources (compute, storage) is left for it.○Groups can start using it right away, no application needed○The fraction is not large, so you might want to take part in:

●Annual ComputeCanada Resource Competitions ○Resources for Research Groups (RRG), formerly RAC○Research Portals and Platforms (RPP) ○In detail today by Patrick Mann, Westgrid’s Director of Operations!

Page 29: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and
Page 30: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Compute Canada’s Resource Allocation Competition (RAC):

BEST PRACTICESPatrick Mann, Director of Operations

(University of Manitoba Version)

Page 31: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Introduction

1. What is the RAC?2. Manitoba Stats3. Overall and Admin Details4. Tips and Best Practices

a. Reviewersb. Basic Best Practicesc. RRG specificsd. RPP specifics

5. Questions and Discussion

Page 32: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

WG Best Practices

This document is a WestGrid-provided best practices guide for the Compute Canada Resource Allocation Competition (RAC). It does not replace the official RAC documents and PI’s should carefully review the Compute Canada official documentation. In cases of conflict between this Best Practices Guide and the official RAC documents, the official documents take precedence.

Page 33: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

What is the RAC?

Resource Allocation Competition for allocations on Compute Canada national services

About 20% reserved for opportunistic use.● Available to all CC users - Rapid Access Service (RAS).● Any researcher/student at a Canadian institution can become a user.

○ No application required beyond request for an account

~80% allocated through a competitive process.

Two competitions in RAC 2018:○ Research Platforms and Portals (RPP) Competition○ Resources for Research Groups (RRG) Competition

Page 34: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Manitoba RAC 2017 Results● 100% success rate for 2017!

○ All 10 who applied were successful, receiving resources valued at $434,662. ○ 2-RPP, 3-Engineering, 1-Chemistry, 1-Bioinformatics, 1-Earth Sciences, 1-Nano,

1-Astronomy/Subatomic Physics

● Received 7.9% of the total RAC awards within WestGrid

○ Higher than the total % of users within WestGrid (6.5%)

● 3.3/5 average science score; average CPU requests scaled 55%

○ Equal to national averages

● However Manitoba researchers did receive the lowest CPU allocation relative to their request, receiving 52% of the core years they requested, compared to the WestGrid and Compute Canada average of 58%.

Page 35: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Value & Total Awards

Page 36: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Allocations by Research Area

Page 37: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RAS vs RAC - What do you need?

RAS: Rapid Access Service (default resources available to any user with a Compute Canada account)

RAC - Resource Allocation Competitions (annual competitive process to allocate resources based on scientific excellence)

Page 38: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

2018 RAC Structure

Page 39: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

2018 RAC Key Dates

RAC 2018 Start FinishFast Track submission Oct 3, 2017 Nov 2, 2017

RRG & RPP full application submission Oct 3, 2017 Nov 16, 2017RPP Progress Report submission Jan 30, 2017 Mar 14, 2018

Award letters sent* Mid Mar, 2018 Late Mar, 2018

RAC 2018 Allocations implemented* Mid Apr, 2018 Early May, 2018

* Final dates to be confirmed.

Page 40: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Under Pressure!

Note: does not include Challenge 1 awards or RRG out-of-round applications.

Page 41: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Good news...System Current Status

Cedar(GP2, SFU)

Current: 27,696 CPU cores, 146 GPU nodes (4 x NVidia P100)● Major expansion in progress● ~doubling in size. Includes storage.

IN OPERATION (June 30, 2017)Expansion Dec.31/2017RAC 2017

Graham(GP3, Waterloo)

Current: 33,472 CPU cores, 160 GPU nodes (2 x NVidia P100)● Storage expansion

IN OPERATION (June 30, 2017)Expansion Dec.31/2017RAC 2017

Niagara(LP1, Toronto)

Large parallel system ~60,000 cores● Current: final contract negotiations with vendor

Expected online early 2018RAC 2018

GP4 New general purpose system (like cedar/graham)● Includes third “deep” storage site

Early stage RFP developmentLater in 2018Not included in RAC 2018

But: ~20 legacy clusters will be defunded -> in total slightly fewer cores, but they should be faster.

Page 42: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Competitive Process

● CFI mandates allocations based on excellence

● Peer-review process to determine SCIENCE SCORE

● High Score = Less Scaling

***Scaling is required due to insufficient resources, which makes the RAC a very competitive process.***

Page 43: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Admin Changes● Fast Track eligibility criteria have been relaxed to favour more users: 296

eligible PIs - largest number ever invited.○ Invitations have been sent out.○ If you are reasonably happy with your 2017 allocation then fast-track is

very straightforward and requires no further work on your part.○ Fast Track applications cannot be delegated: PIs must complete the

application themselves.● PIs can delegate the completion of a RPP or RRG application to a sponsored

user on CCDB.● Notice of Intent (NOI) is not a prerequisite for applying to any RAC

competition this year.● Storage types (e.g., /project, /nearline, etc) are shown only when the resource

you requested has that type of storage available for allocation (see list of allocatable resources for RAC 2018)

Page 44: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

CCV

● Required for RRG and RPP full applications (update not required for Fast-Tracks or progress reports).● Note: Applicants with a CCV on CCDB - need to delete and resubmit. Carefully read the instructions in

the CCV Submission Guide (section 6)

Keep your CCV up to date!1. CCV is used by the review committees - quality of the PI and his/her research.

Keep your bibliography up-to-date to ensure that reviewers are aware of the latest (and greatest!) work.2. CC relies on the CCV for bibliographic analyses used both for annual reporting, and for funding proposals.

The Field Weighted Citation Index (FWCI) is a key component of the metrics presented by CC.

“For future allocation requests, please consider adding references to published work in the proposal, progress made in the previous year, and clear description of how HQP will be involved in achieving the milestones.”

“Progress over the past year is missing. Based on the CCV, most of the group's publications were enabled through Compute Canada resources, so that information should have been accessible and would have been relevant to include.”

Page 45: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Out-of-Round & Early-Stage

New faculty can apply for an Out-of-Round● Send request to [email protected]

Use the RAS for tests and prototypes1. Early-stage or new users get an account and learn about the systems.2. Run prototypes or test jobs, and if possible production jobs3. Acquire performance statistics4. Predict future requirements5. Use the above information to create a well-justified RAC application.

(out-of-round or regular)

Page 46: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Tips & Best Practices

Page 47: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

No Appeal Process

It is worth noting right away that there is no appeal process.

If the reviewers misunderstand a proposal (which does happen) there is no way of correcting or amplifying the proposal.

● Need to get it right the first time.

Page 48: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

The Reviewers

● Science reviewers are experts in their discipline○ For details see list of committees and chairs on CC RAC pages

● But - not necessarily experts in the specific sub-discipline or area of any particular project.

● Very wide range of RAC proposals do not allow for area experts, so you cannot assume that your proposal is going to be read by someone who works directly in the area or field of the project.

Don’t have lots of detailed jargon. That’s very hard for someone outside the specific area. Write for a discipline expert, but not for the specific area.

Page 49: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Use the Templates

RRG Regular Stream ● Compute between 50*-1999 Core Years● Storage between 10-999 TBs● GPUs between 10-199 GPU years● Cloud Compute between 80-499 VCPUs● Persistent Cloud between 10-99 VCPUs

RRG Large Stream Anything larger than the above

We strongly recommend that PIs use the templates.● RRG Application Template (Regular Stream) (Large Stream)● RPP Application Template

Summaries and details are in the following sections.

Page 50: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Justify your Request

● Address the evaluation criteria (details to follow)● Provide details on significance and impact of research.● Citation rates of recent work help justify science.● Justify why/how the resources requested will be used to accomplish/support

the science.● Poor proposals generally do not provide sufficient information, or have mixed

technical requirements into the research justification. This is very difficult to decipher for both science and technical reviewers, and results in poor scores.

“Please improve motivation of why the proposed calculations are important, and what is to be learned and/or what other science depends on the results.”

Page 51: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Provide Adequate Details

● Clearly explain WHAT the science is, using specific details. ● Use tables to provide resource details by project● If it is difficult to predict usage then emphasize the areas of uncertainty.

Project Team Members

Estimated Core-Years

/project Storage

Memory/ core Comments

Project 1 Student X 10,000 100TB 4 GB ...

Project 2 Students Y, Z 5,000 50TB 32 GB ...

Totals: 15,000 150TB

“The technical justification did not show a calculation of the computing and storage needs. Providing a table with this information, as suggested in the guidelines, would have made this section stronger.”

Page 52: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Be Concise

● Provide ALL the information asked for, but ONLY what is asked for. ○ Use the templates!

● Answer the specific questions and provide only those details requested.● Do not re-use submissions from other proposals or competitions.

○ Students and post-docs can write sections (they are the experts) but the overall proposal needs an experienced, guiding hand.

● Be clear, avoid jargon - reviewers are not experts in all sub-domains.● Take time to edit and review the application before submitting it.

“The proposal is very long. The very detailed scientific justification is more reminiscent of a NSERC proposal. For next year, I recommend to shorten the science part, and to work out more clearly the purpose of the CC usage, and the results that will be enabled by the CC Allocation.”

Page 53: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

HQP Impact

● Ideally provide a table showing the expected HQP at each level (undergraduate, masters, doctorate, etc).

● Highlight the contribution of any excellent graduate students.● Mention any training opportunities beyond that of normal academic

teaching.

“For future allocation requests… describe in detail the involvement of HQP in past projects that utilized Compute Canada resources. The current proposal states that one PDF will be involved and reviewers

were wondering if there were plans for this PDF to mentor junior graduate or undergraduate students. If so, mention this in the proposal.”

***“This proposal was very clearly articulated with an integrated HQP training plan. The number of HQPs is

impressive.”

Page 54: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Updated CCV & Past Year Progress

● Keep your CCV up to date!● Reference citations in the proposal (“Progress over the Past Year” section)

which were or will continue to be supported by the resources requested.

“For future allocation requests, please consider adding references to published work in the proposal, progress made in the previous year, and clear description of how HQP will be involved in achieving the

milestones.”***

“Progress over the past year is missing. Based on the CCV, most of the group's publications were enabled through Compute Canada resources, so that information should have been accessible and

would have been relevant to include.”

Page 55: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Narrative

● This was emphasized by reviewers!

Science impact → solution techniques and challengesSolution techniques and challenges → the technologies and resource ask.

● Team (HQP, RPP management, ..) supports the science and the technologies/ask.

There should be a thread or narrative used to present a well-connected and justified story.

Page 56: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Discrepancies in the Asks

Very important: Ensure there are NO discrepancies between the resources requested in your technical justification

document and the online application.

● Seems to happen a number of times every year.● Requests on the form are not consistent with the technical justification.● Please be careful - we have missed or mistaken the resource requests in

the past.

Page 57: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RRG

Resources for Research Groups● Aimed at Job-based use of the big clusters.

Emphasis1. Quality of Science - including impact and technical

justification2. Quality of Applicant(s)

Page 58: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RRG Evaluation CriteriaQuality of Science

60% Research Problem and Justification● originality and innovation● significance and expected contributions to research● clarity and scope of objectives● impact of the research

Technical Justification● clarity and appropriateness of methodology● feasibility● discussion of relevant issues

Training and Support of HQP

Quality of the Applicant(s)

40% Research Problem and Justification● knowledge, expertise, and experience● quality of contributions to, and impact on, the proposed and other research areas● importance of contributions

Training and Support of HQP

Blue headers are from the templates

Page 59: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RRG TemplateSection Regular Large

Research Problem and Justification

1 section - 3 pages 2 subsections:● Research Problem (1 page)● Research Justification (2-3 pages)

Training and Support of HQP

1 page 1 page

Technical Justification 2 pagesCompute Request (3 sections)

● Code details, performance and utilization

● Memory● Compute

Storage Requests (1 section)

3 pagesCompute Request (5 subsections)

● System selection● Code details● Code performance and utilization● Memory requirements● Size of compute request

Storage Requests (3 subsections)● Details● Performance and utilization● Size

Progress over last year ½ page Length not specified. About a page is reasonable.

Page 60: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Technical Justification I

Provide performance estimates.

Explain the projects being worked on and the proposed research plan.

● RAS - default resources available for testing and prototyping.● Performance tables really help the reviewers, and make the proposal look good!

● Tabulate the number and size of the needed runs/jobs/virtual machines● Again tables and breakdowns really add to the proposal.

○ Sub-tasks or sub-projects, phases, team members, ..

● If it is difficult to predict usage then emphasize the areas of uncertainty.○ early research issues where details are still to be worked out○ administrative issues like onboarding graduate students or postdocs who have not yet worked

out a detailed research plan.

Page 61: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Technical Justification II

Provide details of the numerical and computational techniques used.

● Describe the numerical and computational techniques. Expect reviewers to be computationally literate, but not experts in your particular field.

● Explain the way these techniques will address the science and the challenges.● Explain any development of new or revised approaches.● Provide details of any software packages required. ● Show that the users are experienced and familiar with the performance

characteristics of such packages.● If the packages are new to the research team then describe the approach for

familiarization. If possible describe any tests that the team may undertake.

Page 62: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Resource Ask

● A critical section!● Tables showing your performance/requirements.● Tables showing a project breakdown.● Tie the previous techniques, science, HQP sections

together into the final ask (the thread!).

Carefully justify the resource ask.

Both science and technical reviewers are very aware of inflated, unjustified asks and have unilaterally decreased asks in the past.

Page 63: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Scaling

We really appreciate an upfront, clear presentation of the impact of scaling on the project. What would you as a researcher do if you were given some percentage of your ask?

● Impact on hiring graduate students.● Impact on hiring post-docs.● Issues with collaborative projects for the other co-PI’s or co-workers.● Critical or short-term projects that are necessary for further activities.● Identification of lower-priority projects or activities that could be postponed.● Projects which are critical to, for instance, an early-stage researcher who is

applying for tenure.

Page 64: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RPP/Cloud

RPP is aimed at Platforms and Portals1. Portal - reasonably straightforward web server2. Platform - larger-scale

a. Analysis platform (job creation, job submission, virtual cluster, ..)b. Data dissemination (metadata, search, object storage, ..)

3. Computea. Unique or specialized software stacksb. International agreements requiring multi-year

There have been quite enthusiastic ideas with big asks, but in practice the user community is quite specialized and uptake is dependent on the user interfaces and the services offered. Such asks are critically reviewed and the allocation decreased.

Page 65: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RPP Review Overview

Review Emphasis:1. Impact

a. what is the user community?b. what is being provided?

2. Development and Managementa. Is the team capable of developing and managing a major platform/portal

system in the cloud.b. Are the requested resources reasonable for the predicted user community?

The criteria and templates are more detailed compared to RRG.The current (RAC 2018) cloud resources are IaaS (“Infrastructure as a

Service”) resources. So it is completely up to the project team to design and implement a suitable architecture.

Page 66: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RPP Evaluation Criteria ICriteria Weight Notes

Goal and Alignment

5 ● “Project goal is clearly stated”● “Area of focus is of importance to Canada, and will generate positive

benefits.

Use and development of a Platform/Portal

5 ● “Clearly explained the added value .. of the proposed portal”● “Driven by the targeted research community”● “Details level of interaction between Canadian and international research

groups”● Sharing data sets is well detailed.● “Addresses any potential accessibility issues”

Expected Impact 5 ● “Impacts have been clearly articulated”● Means by which impacts will be measured.● Clear timeline.● Outcomes are of relevance and importance, and will benefit the partners

and research community

Page 67: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RPP Evaluation Criteria II

Criteria Weight Notes

Management and Operation

5 ● Management is well-defined and will provide broad access to research community

● Resource access process is well-defined.● Credible plan to maintain or increase the community.● If applicable scientific evaluation of projects accessing the portal will

result in excellent research.● Information collected will provide ability to track impacts (scientify,

HQP, social, ..) and the participants are maximizing the benefits of the resources accessed.

● The team has the right combination of skills.

Highly Qualified Personnel

5 ● Significant benefits to HQP● HQP have been clearly delineated across academic levels● Platform/portal will create unique training opportunities● Identify strong examples of cross-pollination between HQP disciplines

Page 68: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RPP Evaluation Criteria IIIAppropriateness of Resources Requested

5 ● good understanding of the resources required● provided a justified request for resources.● resources are acceptable for the level of activities and anticipated

deliverables.● well justified over the entire duration of the request.● distribution of resources over the entire duration of the request is well

defined and includes any periods of high demand.● If applicable – locations of the resources requested are appropriate given

the capacity of the Compute Canada systems.● If applicable – storage requested has been appropriately categorized as

long-term/archival storage or a rapid accessibility storage solution.● For applications involving international participation, the application has

indicated the domestic proportion of resources for use across Canada compared to the international proportion and distribution.

● The applicant has provided a clear indication of the impacts of scaling down the resource allocation and has tied these impacts to the expected outcomes.

Page 69: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RPP Template Overview

Resource Justification

Portals - 3 pagesPlatforms - 6 pages

● Resource request summary - a table● Expected Resource Usage and Needs● Support required from CC.

Strategic Plan

Portals - 3 pagesPlatforms - 15 pages

● Goal and alignment with CC strategic plan● Need for and Use of the Platform/Portal● Expected Impacts● Management and Operation of the Platform/Portal● Highly Qualified Personnel (HQP)

Resource Justification next slide

Page 70: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RPP Template I

Resource JustificationPortals - 3 pagesPlatforms - 6 pages

● Resource request summary - a table● Expected Resource Usage

○ Description of storage and compute needs.○ Rationale for the choice of specific computing and storage resources (if

applicable).○ Description of high demand periods (If applicable).○ Description of who is expected to use the resources (domestic proportion

across country, international proportion and distribution).● Support required from CC.

Page 71: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RPP Template II

Strategic PlanPortals - 3 pagesPlatforms - 15 pages

Goal and alignment with CC strategic planUse of a Platform/Portal

● need for a platform/portal and added value created for the user community● context of the demand by the research community to consolidate efforts into a platform/portal● level of interaction between Canadian and international research groups and platforms● approach to sharing data sets across or within the platform/portal

Expected Impacts● expected impacts of the research resulting from use of the platform/portal● list of the outcomes resulting from the use of Compute Canada resources and the expected schedule for

achieving these outcomes● Explain how other partners, researchers or scientific areas will benefit from the outcomes of the research

Management and Operation of the Platform/Portal● Details of the management system and how the resources will be shared/distributed across the

platform/portal● Explain how the research community will access the resources and how the platform/portal will

maintain/increase the population accessing resources● evaluation mechanisms used to provide scientific rigor in selecting projects● information collected and the regularity of the project reporting to track the progress towards the

anticipated expected outcomesHighly Qualified Personnel (HQP)

● Students and post-docs● Canadian and International collaborators

Page 72: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RPP Phased Plans

We encourage phased plans, with for instance performance indicators showing a well-justified and strong understanding if the issues involved with not only technical development, but also marketing and communications.

If possible define performance and success metrics. It’s always useful to have a nice table defining your Key Performance Indicators (KPIs)

Especially startups - lots of large, enthusiastic but unjustified estimates.● Start small.● Include KPI’s to measure performance.● Briefly address marketing and communications.

Page 73: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RPP Impact Best Practices

1. Identify the audience/community.a. Who would be interested in using the proposed platform or portal?b. Why are they interested?c. Is the community national in scope? International?

2. What is the size of the community?a. justify any such estimates.

3. What kinds of research would you expect the community to carry out?a. examples of exciting projects that would make use of the portal/platform.

4. Are there any agreements in place that would put conditions on the request?a. For instance the Atlas High-Energy physics project is part of the Large

Hadron Collider collaboration, and allocations must satisfy international agreements.

Page 74: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

RPP Dev & Management1. Carefully describe the server architecture.

a. Feeds into the resources ask.

2. Describe any security/privacy concerns or requirementsa. steps that will be taken to satisfy such requirements.

3. Is the architecture scalable?a. Refer to the size of the community to justify any scalability issues or plans.

4. Demonstrate the expertise and capability of your staff.a. A portal or platform requires system management so identify the staff resources necessary to

both develop and operationally manage your proposed system.

5. Special requirements?a. Cloud or tape based backupsb. Particularly large memory requirementsc. Performance or response requirementsd. Redundancy and Reliability requirements. For instance multi-site architectures.e. ...

Page 75: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

GPUs

● Both Cedar and Graham have GPUs (P100’s)○ See docs.computecanada.ca for details.

● Currently phase 2 discussions about GPU’s○ Cedar expansion (2x): should it include GPU nodes?○ GP4: what mix of GPU’s? Volta? Tesla? Maybe even gamer versions?○ How many GPU’s per node?○ Tradeoff: cheap cpu nodes vs more expensive GPU nodes

● Now is the time to make special requests to the GP4 and Cedar teams.○ Send comments to me: [email protected]

Page 76: Introduction to Compute Canada ARC resources and RAC at UManitoba OCT-2017 .… · If you are reasonably happy with your 2017 allocation then fast-track is very straightforward and

Resource Links

● Ask a question:○ About RAC ~ [email protected] ○ About Compute Canada CCV ~ [email protected]

● Find an answer:○ General Competition Information○ RPP Guide

■ Application Template○ RRG Guide

■ Application Template (Regular Stream) (Large Stream)○ Technical Glossary○ Frequently Asked Questions○ CCV Guide

● Presentations & Guides○ WestGrid’s Summary of Best Practices for RAC 2018○ The Full WestGrid Best Practices Guide○ Compute Canada RAC Q&A October 2017

Contact us anytime:[email protected]

www.westgrid.cadocs.computecanada.ca

GOOD LUCK!

Questions?