CMIP5 Download Tutorial Jennifer M. Adams 12 January 2012 /data/cmip5/extras/CMIP5_Tutorial.pptx

Vocabulary Lesson

/data/cmip5/docs/data_reference_syntax.pdf

Eleven keywords uniquely describe a data set: Project, Institute, Model, Experiment, Frequency, Realm, MIP Table, Product, Variable, Ensemble, Version.

Project (Activity): cmip5

Institute: The institute responsible for model results, e.g. MOHC, NOAA-GFDL, NCAR, MPI-M, NASA-GMAO, …

Model: The name of the model used, e.g. HadCM3, GFDL-ESM2M, CCSM4, MPI-ESM-LR, …

Experiment: The name of the experiment family and type, e.g. amip, amip4xCO2, decadal1960, piControl, historical, …

Frequency: The interval between time samples. Options are: yr, mon, day, 6hr, 3hr, subhr, monClim, fx

Realm: The high level modeling component. Options are: atmos, ocean, land, landIce, seaIce, aerosol, atmosChem, ocnBgchem

MIP Table: A spreadsheet* entry for realm and variable components, e.g. Amon, Omon, Lmon, LImon, day, …

* /shared/cmip5/docs/standard_output.xlsx

Variable: A short name that identifies a physical quantity, e.g. pr, ps, psl, tas, tauu, tauv, uas, vas, …

Product: A designation of CMIP5 files. Options are: output, output1, output2, unsolicited

Ensemble: A name that distinguishes among closely related simulations and includes 3 numbers: realization (rN), initialization method (iM), and perturbed physics (pL), e.g. r1i1p1, r7i1p1, r1i3p1, r1i1p2, …

Version: Uniquely identifies a particular version of the data set, e.g. v20110923, v20111208, v1, v2, …
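The ensemble label packs three numbers into one string, so it can help to pull them apart when sorting or filtering data sets. The following is a minimal sketch; parse_ensemble is a hypothetical helper written for this illustration and is not part of any CMIP5 or ESGF tool.

```shell
# Hypothetical helper: split an ensemble label such as r1i2p1 into its
# realization, initialization method, and perturbed physics numbers.
parse_ensemble() {
    # Reject anything that is not r<digits>i<digits>p<digits>.
    if ! printf '%s\n' "$1" | grep -Eq '^r[0-9]+i[0-9]+p[0-9]+$'; then
        echo "not an ensemble label: $1" >&2
        return 1
    fi
    # Rewrite the label as three name=value pairs.
    printf '%s\n' "$1" |
        sed -E 's/^r([0-9]+)i([0-9]+)p([0-9]+)$/realization=\1 initialization=\2 physics=\3/'
}

parse_ensemble r1i2p1
```

For r1i2p1 this prints realization=1 initialization=2 physics=1; a string such as v20110923 is rejected with a non-zero exit status.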

Desired & Acquired Data Sets

/project/cmip5/jma/desired.txt
This list, based on user requests, is managed by Tim and Jennifer.

/project/cmip5/jma/acquired.txt
This list, based on contents of /data/cmip5/data, is auto-updated.

Each Downloader will be assigned an item from the desired list and must grab all available models and ensemble members for the given Experiment/Realm/Frequency/MIP Table.
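One way to see what is still outstanding is to subtract the acquired list from the desired list. The sketch below is only an illustration: the layout of desired.txt and acquired.txt is not described in this tutorial, so it assumes one entry per line in an invented Experiment/Realm/Frequency/MIP Table form, and it works on scratch copies rather than the real files.

```shell
# ASSUMPTION: one entry per line; the entry format below is invented
# for illustration and may not match the real desired.txt/acquired.txt.
work=$(mktemp -d)
printf '%s\n' 'historical/atmos/mon/Amon' 'piControl/ocean/mon/Omon' \
    'amip/atmos/day/day' > "$work/desired.txt"
printf '%s\n' 'historical/atmos/mon/Amon' > "$work/acquired.txt"

# comm requires sorted input.
sort "$work/desired.txt" > "$work/desired.sorted"
sort "$work/acquired.txt" > "$work/acquired.sorted"

# comm -23 keeps lines that appear only in the first file:
# desired entries that have not been acquired yet.
comm -23 "$work/desired.sorted" "$work/acquired.sorted" > "$work/remaining.txt"
cat "$work/remaining.txt"
```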

ESGF Gateways

For Downloading:
http://pcmdi3.llnl.gov/esgcet/home.htm
http://cmip-gw.badc.rl.ac.uk/home.htm
http://ipcc-ar5.dkrz.de/home.htm

For Searching:
http://esg.prototype.ucar.edu/home.htm

Always Use Firefox!

Gateway Login

Authentication

OpenID: https://pcmdi3.llnl.gov/esgcet/myopenid/jennifer

Username: jennifer
Password: sdf,WER.5

Gateway Search

Gateway Search Results

Dataset URL

Dataset ID

Select Files

Download Wget Script

Rename Wget Script

mv wget-download.sh wget.cmip5.output1.MOHC.HadCM3.decadal2000.mon.atmos.Amon.r1i2p1.v20110708.sh

IT IS VERY IMPORTANT THAT YOU DO THIS CORRECTLY!

Because:
1. Otherwise there is no way to tell what dataset wget-download.sh is configured to grab
2. I need the keywords in the new script name to enable several automation scripts
3. Some of the metadata (esp. version number) can’t be captured any other way
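Typing the long script name by hand is error-prone, so the name can be assembled from the Dataset ID and version shown on the Gateway. This is a sketch under assumptions: wget_script_name is a hypothetical helper, and the check for nine dot-separated keywords is inferred from the single example name above, not from a documented rule.

```shell
# Hypothetical helper: build the renamed wget script name from the
# Dataset ID and the version string.
wget_script_name() {
    dataset_id="$1"   # e.g. cmip5.output1.MOHC.HadCM3.decadal2000.mon.atmos.Amon.r1i2p1
    version="$2"      # e.g. v20110708

    # ASSUMPTION: the example name carries nine keywords before the version.
    nfields=$(printf '%s\n' "$dataset_id" | awk -F. '{ print NF }')
    if [ "$nfields" -ne 9 ]; then
        echo "expected 9 dot-separated keywords, got $nfields" >&2
        return 1
    fi

    # The version must start with v followed by a digit (v20110708, v1, ...).
    case "$version" in
        v[0-9]*) ;;
        *) echo "version should look like v20110708 or v1" >&2; return 1 ;;
    esac

    printf 'wget.%s.%s.sh\n' "$dataset_id" "$version"
}

wget_script_name cmip5.output1.MOHC.HadCM3.decadal2000.mon.atmos.Amon.r1i2p1 v20110708
```

The printed name can then be used as the target of the mv command, for example mv wget-download.sh "$(wget_script_name <dataset id> <version>)".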

Set Up Work Environment

Create a top level working directory (you will need my help with this):
mkdir /shared/working/cmip5/jma

Make several subdirectories under your top level directory, one for each wget:
cd /shared/working/cmip5/jma
mkdir a01 a02 a03 a04 a05 a06 a07 a08 a09 a10 a11 a12 a13 a14 a15 a16
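The sixteen subdirectories can also be created with a loop instead of a typed list. In this sketch the TOP variable is an assumption added for safety: it defaults to a scratch directory so the commands can be tried anywhere, and it can be set to /shared/working/cmip5/jma to match the layout above.

```shell
# ASSUMPTION: TOP is a stand-in for your top level working directory.
TOP="${TOP:-$(mktemp -d)}"

# Create a01 through a16, zero-padded to two digits.
i=1
while [ "$i" -le 16 ]; do
    mkdir -p "$TOP/$(printf 'a%02d' "$i")"
    i=$((i + 1))
done

ls "$TOP"
```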

Use sftp to copy the renamed wget script (wget.renamed.sh) from your laptop to /shared/working/cmip5/jma/a01/

Certificates:
mkdir $HOME/.esg
cd $HOME/.esg
cp /data/cmip5/extras/MyProxyLogon-ESG.jar .
cp /homes/jma/.esg/update .
./update

Launch the Wget

Log in to a server (cpu1-cpu6):
cd /shared/working/cmip5/jma/a01

/project/cmip5/jma/dorun.sh &
This script filters out unwanted files, notes the wget format, runs the edited wget script, captures output in a log file, and monitors the script’s progress. Run it in the background.

/project/cmip5/jma/ckrun.sh
Once dorun.sh is no longer running, this script evaluates the success of the wgets, checks whether all desired files are present, and returns the status of the download.

Repeat As Necessary

If ckrun.sh determines that the download is complete, it will create a file called ‘done’. Take a moment to celebrate, then move on to next data set.

If ckrun.sh determines that the download is incomplete, there was an error. Look for a file in the working subdirectory named ‘formatA’, ‘formatB’, or ‘formatC’.

If format=A:
• Go back to the Gateway and get a new wget script
• Copy the new wget.renamed.sh into the working subdirectory
• Rerun dorun.sh

If format=B or C:
• You may need to update your certificates
• You do not need a new wget script; just rerun dorun.sh
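The decision described above depends only on which marker files are present in the working subdirectory, so it can be sketched as a small function. next_step is a hypothetical helper, not one of the tutorial's scripts, and the final branch for a directory with no marker files is an assumption.

```shell
# Hypothetical helper: report the next action for one working subdirectory
# based on the 'done' and 'formatA/B/C' marker files.
next_step() {
    dir="$1"
    if [ -e "$dir/done" ]; then
        echo "complete: move on to the next data set"
    elif [ -e "$dir/formatA" ]; then
        echo "incomplete, format A: get a new wget script from the Gateway, then rerun dorun.sh"
    elif [ -e "$dir/formatB" ] || [ -e "$dir/formatC" ]; then
        echo "incomplete, format B or C: update certificates if needed, then rerun dorun.sh"
    else
        # ASSUMPTION: no marker files means the scripts have not run here yet.
        echo "no 'done' or format file found in $dir"
    fi
}

# Demonstration in a scratch directory standing in for a01.
demo=$(mktemp -d)
touch "$demo/formatB"
next_step "$demo"
```

The 'done' file is checked first so that a finished download is reported as complete even if a format file is still present.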

Pitfalls & Shortcuts

• Watch out for zombies, a.k.a. processes that are hung and can’t be killed
• Use dataset URLs to circumvent the search
• Start by downloading only these high-priority models:

Model        Gateway
CCSM4        http://www.earthsystemgrid.org
GFDL-CM3     http://pcmdi3.llnl.gov/esgcet
GISS-E2-H    http://pcmdi3.llnl.gov/esgcet
HadCM3       http://cmip-gw.badc.rl.ac.uk
MPI-ESM-LR   http://ipcc-ar5.dkrz.de
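For the first pitfall in the list above, hung processes can be spotted by their state code in ps output. This is a sketch under assumptions: it treats states beginning with D (uninterruptible sleep) or Z (defunct) as candidates for the "zombies" the tutorial warns about, and filter_stuck and list_stuck are hypothetical helper names.

```shell
# Hypothetical helper: keep only lines whose second column (process state)
# starts with D (uninterruptible sleep) or Z (defunct).
filter_stuck() {
    awk '$2 ~ /^[DZ]/'
}

# Hypothetical helper: apply the filter to every process on this machine.
# Defined here but not called; the 'stat' column name is supported by
# Linux and BSD ps.
list_stuck() {
    ps -A -o pid=,stat=,comm= | filter_stuck
}

# Demonstration on sample text in the same pid/state/command layout.
printf '101 S bash\n102 D wget\n103 Z+ wget\n' | filter_stuck
```

On a download server, running list_stuck shows which wget processes, if any, are in one of these states.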