16
Australian Ocean Data Network utilises Amazon Web Service Batch processing for gridded data Sebastien Mancini, Roger Proctor, Peter Blain, Kate Reid University of Tasmania 5 November 2018 Australia’s Integrated Marine Observing System (IMOS)

Australia’s Integrated Marine Observing System (IMOS)

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Australian Ocean Data Network utilises Amazon Web Service Batch processing for gridded dataSebastien Mancini, Roger Proctor, Peter Blain, Kate ReidUniversity of Tasmania

5 November 2018

Australia’s Integrated Marine Observing System (IMOS)

12 minutes, plus 3 minutes for questions

1. About IMOS

2. Gridded datasets

3. Cloud computing and AWS Batch

4. Helpdesk and user statistics

5. Future improvements

Outline of the talk

IMOS Facilities

https://portal.aodn.org.au

All data discoverable, accessible, usable and reusable

IMOSgriddeddatasets

• SeaSurfaceTemperature(SST)products• OceanColourproducts• Satellitealtimetryproducts• Coastalradarproducts• Climatology• Bathymetry

Whatdoestheuserwant?

• Visualisedatabeforedownloading• RetrieveatimeseriesataparticularlocationanddownloaddatainCSVformat

• SubsetandaggregatedataandgenerateoutputfileinnetCDFformat

• MostofthemdonotwanttoaccessdataonTHREDDS

Whatdoestheuserwant?MostdownloadeddatasetsontheAODNPortalfortheperiod01/2016to04/2018usingGoogleAnalyticsRank EventLabel TotalEvents1 IMOS- AustralianNationalMooringNetwork(ANMN)Facility- Currentvelocitytime-series 9442 IMOS- ArgoProfiles 8533 IMOS- AustralianNationalMooringNetwork(ANMN)Facility- Temperatureandsalinitytime-series 7764 IMOS- SRSSatellite- SSTL3S- 01daycomposite- nighttime 7655 IMOS- SRS- MODIS- 01day- Chlorophyll-aconcentration(OC3model) 6796 IMOS- AustralianNationalFacilityforOceanGliders(ANFOG)- delayedmodegliderdeployments 5017 IMOS- SRSSatellite- SSTL3S- 06daycomposite- daytime 4588 IMOS- OceanCurrent- Griddedsealevelanomaly- Delayedmode 439

9IMOSNationalReferenceStation(NRS)- Salinity,Carbon,Alkalinity,OxygenandNutrients(Silicate,Ammonium,Nitrite/Nitrate,Phosphate) 420

10 IMOS- SRSSatellite- SSTL3S- 1monthcomposite- dayandnighttimecomposite 40411 IMOS- SRSSATELLITE- SSTL3S- 01daycomposite- dayandnighttimecomposite 37612 IMOS- OceanCurrent- Griddedsealevelanomaly- Nearrealtime 36815 IMOS- SRS- MODIS- 01day- OceanColour- SST 32417 IMOS- SRSSATELLITE- SSTL3S- 03daycomposite- nighttime 28019 IMOS- SRSSatellite- SSTL3S- 03daycomposite- dayandnighttimecomposite 22720 IMOS- SRSSatellite- SSTL3S- 01daycomposite- daytime 20922 IMOS- SRSSatellite- SSTL3S- 06daycomposite- dayandnighttimecomposite 16923 IMOS- SRS- MODIS- 01day- Chlorophyll-aconcentration(GSMmodel) 15726 MARVL3- Australianshelftemperaturedataatlas 14427 IMOS- SRSSatellite- SSTL3S- 1monthcomposite- daytime 141

Serverlessarchitecture

• Serverdetailsgetabstractedaway• Serversonlyrunwhenneeded• Leaveservermanagementtoacompanythatdoesthatastheirbreadandbutter

• Focusonwhatmattersinstead- thecodeanddata

Image:Creator:JohnVoo,Url:https://www.flickr.com/photos/138248475@N03/Licence:CCBY2.0

ArchitectureoverviewofAWSbatch

AODNPortal

UserorAdmin

Helpdesk:QueueMonitoring

Helpdesk:QueueMonitoring

Jobdetails:• E-mailaddress• Datasetcollection• Filtersapplied

Userstatistics:Sumologic dashboard

Userstatistics:FebruarytoSeptember2018

240uniqueusers

42userspermonth

1600downloadsovertheentireperiod

350downloadsinJune

32faileddownloadsovertheentireperiod

Processtimeof<8minfora1000jobs10%jobshaveanaveragedprocessedtimeof900min

Queuetimeof<7minfora1000jobs10%jobshaveanaveragedqueuedtimeof550min

Costcomparison:

AWSBatch Normalapproach

EC2SPOTinstance+localstorage+Lambda

EC2instance(ifreserved)+Localstorage

80$permonth 350$permonth

Additionalcostforbothapproaches:• Storageforoutputfile• S3operations(dataaccess)

vs

VendorLock-in

+majorityof aggregationcodewrittenasagenericlibrary/utility

- requesthandlerLambdahassomeAWSspecificcode,butcouldberefactoredtoamoregenericAPIhandler

- statusserviceLambdaisquitespecifictoAWSBatch/S3currently,butcouldbemademoreabstract

- thecomponentsaregluedtogetherbyCloudFormationanddomakeassumptionsaboutrunningonLambda,howeverthemajorityoftheclassesaregeneric,soifrunningoutsideofAWSwasarequirement,theprojectcouldberefactoredtomakeconceptslikestorageandAPIsmoregeneric

Futureimprovements

• Implementcloudcomputingtootherelementsofourinfrastructure:• Developmentofourapplicationstack(AODNPortal,Geonetwork,Geoserver,Geowebcache,ncWMS …)

• Improvesubsetting/aggregatingcode:• Downloadmultiplepointtimeseriesatthesametime• Improveefficiencytoproduceresultquicker

• Useofsystemanalyticstoimprovequeuedesign• Multiplequeuesdependingonsizeofjobs,typeofusers…

• ApplyAWSBatchforothertypeofdatadownloads:• LargeCSVdownloadsusingWFSrequestsfromGeoserver• SubsetGeotiff (e.g.Bathymetrydata)