Accessing Cloud Computing to Support Water Resources Modeling


Scott D. Christensen, sdc50@byu.net

Nathan R. Swain, nathan.swain@byu.net

E. James Nelson, jimn@byu.edu

Norman L. Jones, njones@byu.edu

This material is based upon work supported by the National Science Foundation under Grant No. 1135483.

Panels: Background; Applications; Tethys Platform Integration; CondorPy and TethysCluster; Summary

Advances in water resources modeling are providing us with better information; however, they require more computational power to run. Cloud computing enables universal access to cost-effective computing, yet there still remains a significant technical barrier to accessing these resources. Here we present a set of Python tools, TethysCluster and CondorPy, that have been developed to lower the barrier to modeling in the cloud by providing:

(1) programmatic access to dynamically scalable computing resources

(2) a batch scheduling system to queue and dispatch the jobs to the computing resources

(3) data management for job inputs and outputs

(4) the ability to dynamically create, submit, and monitor computing jobs

While TethysCluster and CondorPy can be used independently to provision computing resources and perform large modeling tasks, they have also been integrated into Tethys Platform, a development platform for water resources web apps, to enable computing support for modeling workflows and decision support systems deployed as web apps.

Two Python modules have been developed to lower the technical barrier to accessing cloud computing for performing large modeling tasks. TethysCluster automates the process of provisioning diverse cloud resources and configuring them with HTCondor. CondorPy interfaces with HTCondor to enable computing jobs to be created, submitted, and monitored programmatically.

CondorPy and TethysCluster have been integrated into Tethys Platform, enabling web apps to easily perform large computing tasks.

Stochastic Analysis

Uncertainty is inherent to hydrologic modeling and is often accounted for by performing a stochastic analysis, which requires running hundreds or thousands of model simulations. For a spatially distributed, physics-based model such as GSSHA, running thousands of simulations may take months or even years. TethysCluster and CondorPy enable this type of analysis to be done much faster using cloud computing.
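As a rough illustration of how many independent runs combine into a probabilistic result, the sketch below runs a toy model repeatedly with random rainfall and reports, for each cell, the fraction of runs in which it flooded. The model is a hypothetical stand-in and not GSSHA; the function names, the rainfall distribution, and the cell elevations are all assumptions made for the example.

```python
import random


def toy_flood_model(rainfall_mm, elevations_m):
    """Hypothetical stand-in for a real model run: a cell floods when the
    water level implied by the rainfall exceeds its elevation."""
    water_level_m = rainfall_mm / 100.0
    return [water_level_m > z for z in elevations_m]


def flood_probabilities(elevations_m, n_runs, seed=0):
    """Run the toy model n_runs times with random rainfall and return,
    for each cell, the fraction of runs in which it flooded."""
    rng = random.Random(seed)
    counts = [0] * len(elevations_m)
    for _ in range(n_runs):
        rainfall_mm = rng.uniform(0.0, 300.0)  # assumed input distribution
        for i, flooded in enumerate(toy_flood_model(rainfall_mm, elevations_m)):
            counts[i] += flooded  # True counts as 1, False as 0
    return [c / n_runs for c in counts]


elevations_m = [0.5, 1.0, 2.0, 3.5]  # made-up cell elevations
probabilities = flood_probabilities(elevations_m, n_runs=1000)
print(probabilities)
```

Each pass through the loop is independent of the others, which is what makes this kind of analysis a natural fit for a pool of computing resources: every run could be dispatched as its own job and the counts combined afterwards.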

Job Manager

CondorPy has been integrated into the Tethys Platform Python SDK in the form of a job manager that enables developers to define computing jobs and submit them to HTCondor pools to offload large computing tasks.

CondorPy

HTCondor is a software system that enables High Throughput Computing (HTC) by managing computing resources and scheduling computing jobs. It enables diverse computing systems to be linked together into a unified computing pool. CondorPy serves as a cross-platform, high-level interface for HTCondor and allows jobs to be created, submitted, and monitored from a Python scripting environment. This interface facilitates the use of HTCondor in a web environment like Tethys Platform (see panel D).
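To make the idea of creating jobs from Python more concrete, the following minimal sketch renders a dictionary of job attributes as the text of an HTCondor submit description. It uses only the standard library and does not call CondorPy; the function name `build_submit_description` and the script and file names are hypothetical. The submit commands shown (`universe`, `executable`, `arguments`, `queue`) and the `$(Process)` macro are standard HTCondor syntax.

```python
def build_submit_description(attributes, queue=1):
    """Render a dict of submit commands as HTCondor submit-file text."""
    lines = [f"{key} = {value}" for key, value in attributes.items()]
    lines.append(f"queue {queue}")
    return "\n".join(lines) + "\n"


# Hypothetical job: run one model script per queued process.
attributes = {
    "universe": "vanilla",
    "executable": "run_model.py",
    "arguments": "$(Process)",
    "transfer_input_files": "model_inputs.zip",
    "should_transfer_files": "YES",
    "when_to_transfer_output": "ON_EXIT",
    "output": "run_$(Process).out",
    "error": "run_$(Process).err",
    "log": "run.log",
}

submit_text = build_submit_description(attributes, queue=3)
print(submit_text)
```

Text like this could be written to a file and handed to `condor_submit`. A high-level interface saves the developer from assembling and tracking such files by hand.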

TethysCluster

Large modeling tasks often require a large amount of computing resources. Commercial cloud providers such as Amazon Web Services (AWS) and Microsoft Azure provide on-demand, scalable resources; however, configuring them with HTCondor can prove challenging. StarCluster is a Python module that automatically provisions and configures Linux computing resources with AWS. TethysCluster is an adaptation of StarCluster that expands its functionality to work with both Linux and Windows resources on AWS as well as Azure.

ci-water.github.io/condorpy


Tethys Platform

Tethys Platform is a water resources web development platform that lowers the barrier to creating web apps. Tethys Platform provides open source web GIS and visualization tools, all integrated into a unified Python SDK.

Cluster Management

Cloud computing resources are easy to provision through the admin site of Tethys Portal, the web interface of Tethys Platform. TethysCluster works behind the scenes to automatically configure the cloud resources into an HTCondor computing pool.
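One way to think about dynamically scalable resources is as a sizing rule that compares the job queue with the current pool. The sketch below is a hypothetical policy written for illustration only; it is not TethysCluster's logic, and the function name and parameters (`slots_per_node`, `max_nodes`) are assumptions.

```python
import math


def nodes_to_provision(queued_jobs, running_nodes, slots_per_node, max_nodes):
    """Return how many nodes to add (positive) or remove (negative) so
    that capacity matches the queue, within the max_nodes limit."""
    needed = math.ceil(queued_jobs / slots_per_node)
    target = min(needed, max_nodes)
    return target - running_nodes


# 52 queued jobs, 8 job slots per node, 2 nodes already running.
print(nodes_to_provision(queued_jobs=52, running_nodes=2, slots_per_node=8, max_nodes=10))
# Empty queue with 5 nodes running: all 5 can be released.
print(nodes_to_provision(queued_jobs=0, running_nodes=5, slots_per_node=8, max_nodes=10))
```

A negative result corresponds to de-provisioning, so idle cloud resources are released once the queue drains.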


Ensemble Forecast Processing

TethysCluster and CondorPy are used by the Streamflow Prediction Tool (a Tethys web app) to automatically process a 52-member ensemble forecast produced by the European Centre for Medium-Range Weather Forecasts. Every 12 hours, when a new forecast is available, a scheduled Python script uses CondorPy to create 52 jobs, one for each ensemble member. TethysCluster can be used to automatically provision and de-provision cloud computing resources.
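The one-job-per-member pattern can be sketched as follows. This is an illustration under stated assumptions, not the Streamflow Prediction Tool's code: the job naming scheme, the argument format, the forecast date string, and the min/mean/max summary step are all hypothetical, and the flow values are made up.

```python
from statistics import mean

N_MEMBERS = 52  # ensemble size stated in the text


def make_member_jobs(forecast_date):
    """Return one hypothetical job description per ensemble member."""
    return [
        {
            "name": f"forecast_{forecast_date}_member_{member:02d}",
            "arguments": f"--date {forecast_date} --member {member}",
        }
        for member in range(1, N_MEMBERS + 1)
    ]


def summarize(member_flows):
    """Reduce per-member flow series to min/mean/max at each time step."""
    return [
        {"min": min(step), "mean": mean(step), "max": max(step)}
        for step in zip(*member_flows)
    ]


jobs = make_member_jobs("20150101.00")
# Made-up flows (m^3/s) for three members over two time steps.
summary = summarize([[10.0, 12.0], [14.0, 9.0], [12.0, 15.0]])
print(len(jobs), jobs[0]["name"])
print(summary)
```

Because the members do not depend on one another, all 52 jobs can be queued at once and run as resources become available.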

Hierarchical Modeling

Running high-fidelity models over large domains often requires powerful computers and lots of time. One way to alleviate this problem is to partially parallelize the computation by decomposing the domain into smaller models. This results in a series of hierarchical models whose execution must be coordinated. CondorPy facilitates running this type of workflow with HTCondor in a parallel computing environment.
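The coordination problem can be illustrated with a small dependency graph. The sketch below groups sub-basin models into stages using the standard library's `graphlib` (Python 3.9 or later), so that each model runs only after the upstream models it depends on. The watershed layout and the function name `parallel_stages` are hypothetical, and the sketch does not use CondorPy or HTCondor.

```python
from graphlib import TopologicalSorter

# Hypothetical watershed: each sub-basin maps to the upstream sub-basins
# whose outflow it needs as input.
upstream = {
    "outlet": {"mid_a", "mid_b"},
    "mid_a": {"head_1", "head_2"},
    "mid_b": {"head_3"},
    "head_1": set(),
    "head_2": set(),
    "head_3": set(),
}


def parallel_stages(dependencies):
    """Group models into stages; models within a stage are independent
    and could run at the same time."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = parallel_stages(upstream)
for number, stage in enumerate(stages, start=1):
    print(number, stage)
```

Here the three headwater models form the first stage, the two mid-basin models the second, and the outlet model runs last. In an HTCondor pool, this kind of ordering is what a workflow tool such as DAGMan enforces.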

ci-water.github.io/TethysCluster


Probabilistic flood map resulting from 5000 model runs using the spatially distributed, physics-based hydrologic model GSSHA.

Top: large watershed shown divided into hierarchical sub-basins. Bottom: diagram showing the parallelization and hierarchy of the models.

Screenshot of a Tethys web app, the Streamflow Prediction Tool, which uses CondorPy and TethysCluster to process ensemble forecasts.
