An ESG Walkthrough -ESG Federation website -- DCC File system for ESG Muhammad Atif


An ESG Walkthrough

• ESG Federation website
• DCC file system for ESG

Muhammad Atif

ESG-NCI Gateway

• Latest news
• Search
• We highly recommend subscribing to our tweets

Search and Access Data

• You can search without logging in, but you cannot download.
• Quick Links → Create Account
  – Follow the on-screen instructions.
  – You will receive an email to confirm your registration; the administrators are notified at the same time.
  – An administrator has to validate your account before you can download data.
• After the confirmation email from the administrator, log in to the NCI node.
• Account → Apply for Group Membership
  – Mk-3.6
  – CMIP-5 research (not in our control; the request goes to PCMDI)
  – Requests to NCI are usually quick; requests to other groups may take one to two days.
• The same ID can be used on all the gateways.
• Your OpenID: https://esg.nci.org.au/esgcet/myopenid/<username>
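The OpenID pattern above can be written as a small helper. This is a minimal Python sketch: only the URL prefix comes from the slide, while the function name `openid_url` and the username `jsmith` are made up for illustration.

```python
from urllib.parse import quote

# URL prefix taken from the slide; the helper name is hypothetical.
OPENID_PREFIX = "https://esg.nci.org.au/esgcet/myopenid/"


def openid_url(username: str) -> str:
    """Return the NCI gateway OpenID URL for a username."""
    if not username:
        raise ValueError("username must not be empty")
    # Percent-encode anything that is not safe in a URL path segment.
    return OPENID_PREFIX + quote(username, safe="")


# "jsmith" is a made-up username.
print(openid_url("jsmith"))
```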

Searching for data

• We recommend that you browse the website and familiarize yourself with it.
• If asked for authentication, you may use a temporary OpenID:
  – OpenID: https://esg.nci.org.au/esgcet/myopenid/dcc000
  – Password: abc123
• Please note that this OpenID will be removed after the workshop; we highly recommend that you create your own OpenID.

Download data from the Gateway

Three download methods:
• Using the web browser
• A set of wget scripts
• Via GridFTP (Data Mover Lite)

Download Method 1 (Web browser based)

• Intuitive but slow.
• Follow the on-screen instructions and nothing can go wrong.
• Works like a normal download from the browser of your choice, i.e. click to download.
• IE, Firefox, Chrome and Safari are supported.
• Works well if you are after a couple of files.

Download Method 2 (Wget scripts)

• Ability to download multiple files
  – Select the files (variables) you are interested in.
  – The gateway presents a wget-download.sh script, which you need to save and run.
  – Command-line based; no GUI.
• Two methods of authentication
  – Authorization token (deprecated): PCMDI gateway only. No login is required; however, the token expires in 24 hours and the script is of no use after that.
  – MyProxy login (official): needs a separate step for authentication. You need to run the Java applet or the MyProxyClient software. If authentication expires, just run the MyProxyClient software again.
• Note: if you are interested in doing lots of downloads, we can provide a custom script to speed up the process on the DCC. An example follows later.
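The 24-hour token lifetime can be checked before re-running an old script. This is a minimal Python sketch that assumes the clock starts when the script was generated; the slides do not say how the gateway actually counts the lifetime. The function name and the example timestamps are made up.

```python
from datetime import datetime, timedelta

TOKEN_LIFETIME = timedelta(hours=24)  # lifetime stated on the slide


def token_expired(issued_at: datetime, now: datetime) -> bool:
    """Return True once the 24-hour token lifetime has elapsed."""
    return now - issued_at >= TOKEN_LIFETIME


# Made-up timestamps for illustration.
issued = datetime(2012, 1, 1, 9, 0)
print(token_expired(issued, datetime(2012, 1, 1, 18, 0)))  # 9 hours old
print(token_expired(issued, datetime(2012, 1, 2, 9, 30)))  # 24.5 hours old
```

If the token has expired, the script has to be regenerated from the gateway, or the MyProxy login method used instead.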

Process for MyProxyLogon Download on the DCC

• Log in to the DCC with X11 forwarding:
    ssh [email protected] -Y
• Download the MyProxyLogon-ESG.jar file:
    wget http://esg.nci.org.au/esgcet/webstart/myProxyLogon/MyProxyLogon-ESG.jar
• Run the MyProxyLogon-ESG.jar file (instructions are also provided in the wget download script that you downloaded via the web browser):
    module load java
    java -jar MyProxyLogon-ESG.jar
• It writes the certificates to your $HOME/.esg folder.
• Run the wget-download.sh script.
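Before running wget-download.sh it can help to confirm that the logon step actually left something in $HOME/.esg. This is a minimal Python sketch; the function name is hypothetical. Because the slides do not name the certificate files, it only checks that the folder holds at least one file and does not verify that the certificate is still valid.

```python
from pathlib import Path


def has_esg_credentials(home: Path) -> bool:
    """Return True if <home>/.esg exists and contains at least one file."""
    esg_dir = home / ".esg"
    if not esg_dir.is_dir():
        return False
    return any(entry.is_file() for entry in esg_dir.iterdir())


if __name__ == "__main__":
    if has_esg_credentials(Path.home()):
        print("Credentials found; wget-download.sh can be run.")
    else:
        print("Nothing in ~/.esg; run MyProxyLogon-ESG.jar first.")
```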

Download Method 3 (DML)

• Parallel downloads
  – DML Preferences → Concurrency
• Faster than wget
  – Uses GridFTP
• Caveat: not available on all ESG nodes
  – NCI is one of the few that has the facility
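The concurrency setting limits how many transfers run at the same time. This Python sketch illustrates that idea only: DML itself uses GridFTP, which is not shown here, and `transfer_all` and `fake_transfer` are hypothetical names, the latter standing in for a real transfer.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def transfer_all(files: Iterable[str],
                 transfer: Callable[[str], int],
                 concurrency: int = 4) -> Dict[str, int]:
    """Run transfer(name) for every file, at most `concurrency` at a time."""
    names = list(files)
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() preserves input order, so results line up with names.
        sizes = list(pool.map(transfer, names))
    return dict(zip(names, sizes))


def fake_transfer(name: str) -> int:
    """Placeholder for a real transfer; pretends each file is len(name) bytes."""
    return len(name)


print(transfer_all(["a.nc", "bb.nc", "ccc.nc"], fake_transfer, concurrency=2))
```

Raising `concurrency` helps only until the network or the remote node becomes the bottleneck.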

ESG Data on the DCC

• IPCC AR5/CMIP5
  – CSIRO-QCCCE Mk-3.6
  – CAWCR ACCESS
  – Replicated data from the other ESG nodes
• Other data
  – CMIP3
  – Observational data
  – Processed data

DCC File System Organization

• All ESG data is in /projects/ESG:
  – Authoritative
  – Unofficial-ESG-replica
  – CAWCR_CVC_processed
• /projects/ESG/Authoritative
  – Serves data using the policies of the ESG Federation.
  – This is the directory that our ESG software serves data from.
  – All data is the current official copy.
• User example: log in to the DCC and have a look.

Unofficial Replica

/projects/ESG/Unofficial_Replica contains:
• IPCC
  – Where you can reference data that we have downloaded from other nodes (though it is not an official replica).
  – The subdirectories may hold partial or complete datasets.
• IPCC_tmp_flat
  – Direct symlinks to files in a flat directory structure.
• tmp
  – You can download your data here, in your $USER folder.
  – We can provide you with scripts to help download data here.
• GlobalObs_and_Reanalysis
  – Data sourced from various places that Lawrie Rikus and Ben Hu have been maintaining.
  – Also served through a THREDDS service for remote access.
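A flat directory of symlinks like IPCC_tmp_flat could be built roughly as follows. This Python sketch is an illustration, not the script used on the DCC. The function name is hypothetical, the flat directory is assumed to lie outside the tree being scanned, and when two files share a name only the first one found is linked.

```python
import os
from pathlib import Path
from typing import List


def link_flat(tree_root: Path, flat_dir: Path) -> List[Path]:
    """Create one symlink in flat_dir for every file under tree_root."""
    flat_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for dirpath, _dirnames, filenames in os.walk(tree_root):
        for name in sorted(filenames):
            link = flat_dir / name
            if link.exists() or link.is_symlink():
                # Name already taken: keep the existing link, skip this file.
                continue
            link.symlink_to(Path(dirpath, name).resolve())
            created.append(link)
    return created
```

Calling `link_flat` a second time on the same directories creates nothing new, so it can be re-run after further downloads.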

Unofficial Replica (continued)

/projects/ESG/Unofficial-ESG-replica/IPCC
• Populated by user downloads using wget scripts or DML.
• Partial data: not all of the data is downloaded.
• Does not necessarily contain the most up-to-date version.
  – Data may have been changed by the remote node since the last download.
  – ESG (and the official replica directory) always has the latest version.
• Organised according to the Data Reference Syntax (DRS).
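Because the local copy may lag behind the remote node, it is useful to know which version is the newest one held locally. This Python sketch assumes version directories are named vYYYYMMDD, as in the example path on the Data Reference Syntax slide (v20110901). The function name and the second version string are made up, and comparing against the remote node is not shown.

```python
import re
from typing import Iterable, Optional

VERSION_RE = re.compile(r"^v(\d{8})$")  # assumed pattern, e.g. v20110901


def latest_version(names: Iterable[str]) -> Optional[str]:
    """Return the newest vYYYYMMDD name, or None if none match."""
    versions = [n for n in names if VERSION_RE.match(n)]
    # Fixed-width digit strings sort chronologically as plain text.
    return max(versions, default=None)


# "v20120215" and "notes.txt" are made-up directory entries.
print(latest_version(["v20110901", "v20120215", "notes.txt"]))
```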

Data Reference Syntax (How files are organized at the DCC)

• This is how the directory tree looks compared to the DRS.
• DRS:
    cmip5.<product>.<institute>.<model>.<experiment>.<time_frequency>.<realm>.<cmor_table>.<ensemble>
• File system:
    /projects/ESG/unofficial-ESG-replica/IPCC/CMIP5/output1/NCC/NorESM1-M/historical/mon/seaIce/OImon/r1i1p1/v20110901/sic/<FILE>
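The mapping from a DRS identifier to a directory can be sketched in Python. The rules here are inferred from the single example above and are assumptions: the first component is upper-cased, and the version and variable directories are appended after the ensemble. The function name `drs_to_path` and the field label `activity` are hypothetical.

```python
from pathlib import PurePosixPath

# Labels follow the DRS template on the slide; "activity" is our own
# label for the leading "cmip5" component.
DRS_FIELDS = ("activity", "product", "institute", "model", "experiment",
              "time_frequency", "realm", "cmor_table", "ensemble")


def drs_to_path(dataset_id: str, version: str, variable: str,
                root: str = "/projects/ESG/unofficial-ESG-replica/IPCC"
                ) -> PurePosixPath:
    """Build the directory that holds the files of one variable."""
    parts = dataset_id.split(".")
    if len(parts) != len(DRS_FIELDS):
        raise ValueError(f"expected {len(DRS_FIELDS)} components, got {len(parts)}")
    # The example path spells the first component in upper case (CMIP5).
    return PurePosixPath(root, parts[0].upper(), *parts[1:], version, variable)


print(drs_to_path("cmip5.output1.NCC.NorESM1-M.historical.mon.seaIce.OImon.r1i1p1",
                  "v20110901", "sic"))
```

Returning a `PurePosixPath` keeps the sketch free of file system access, so it can be tried away from the DCC.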

Downloading Data to the DCC File System

• If you would like a significant amount of data that we don't have, please contact us.
• Reasons:
  – It may already be downloaded but not linked.
  – Downloading data is still tricky.
  – Space management.
• That said, we would like to facilitate downloads of priority data.
• How? Let's do it.

Demo

• Download the wget file from the ESG gateway.
• ssh -Y [email protected]
• java -jar MyProxyLogon-ESG.jar
• Copy the wget file to the DCC (scp or copy and paste) into a new folder.
• ./esg-download.py wget-download.sh
• View the directory; it should contain a number of wget-split-* files.
• ./esg-qsub-download.py -i wget-split-
• Press "y".
• Check the files after some time.

We will be streamlining this further.
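The slides do not show how esg-download.py works, so the following Python sketch illustrates only the general idea of dividing one long download list into several smaller ones that can run as separate jobs. The function name, the round-robin strategy, and the two-digit suffix are all assumptions; only the wget-split- prefix comes from the demo.

```python
from typing import List, Sequence


def split_round_robin(entries: Sequence[str], n_chunks: int) -> List[List[str]]:
    """Distribute entries over at most n_chunks lists, one entry at a time."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    chunks: List[List[str]] = [[] for _ in range(n_chunks)]
    for index, entry in enumerate(entries):
        chunks[index % n_chunks].append(entry)
    # Drop empty chunks when there are fewer entries than chunks.
    return [chunk for chunk in chunks if chunk]


# Made-up file names; the numeric suffix format is an assumption.
files = ["f1.nc", "f2.nc", "f3.nc", "f4.nc", "f5.nc"]
for number, chunk in enumerate(split_round_robin(files, 2)):
    print(f"wget-split-{number:02d}", chunk)
```

Round-robin assignment keeps the chunks close to equal in length, which helps the jobs finish at about the same time when the files are of similar size.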

Help

• The DCC and ESG are both evolving continuously.
• Comments and suggestions are always welcome.
• Help desks
  – Anything related to the ESG Federation website, or to other models that are not native to NCI: [email protected]
  – Anything related to the DCC compute cluster: [email protected]

• Downloads managed by us
  – GridFTP
  – Fast
• Downloads managed by you as a user

Controlled vocabulary: http://esg-pcmdi.llnl.gov/internal/esg-data-node-documentation/cmip5_controlled_vocab.txt










