ESRI UC 2010 - ArcGIS Server Virtualization and High-Performance Computing

ArcGIS Server Virtualization and High-Performance Computing

Sherwin Faria <seftch@rit.edu>

Sidney Pendelberry <slpits@rit.edu>

Presented at the ESRI User Conference 2010

July 11-16, 2010

San Diego, California

Agenda

The Problem

The Solution

Use of Virtualization

To Use or Not to Use

What to Expect

What’s Next?

The Problem

Large datasets

Testing and debugging are a nightmare due to lengthy processing times, even on a subset of the data

The same tasks must be run over and over again

Time constraints

Too much productivity is lost constantly waiting for data

Projects fall too far behind schedule

Repetitive tasks: why can’t they be run in parallel?

The Solution: HPC and ArcGIS

HPC runs tasks in parallel

HPC can be spread across a huge pool of nodes

Not only can we run tasks in parallel:

We can run multiple unrelated jobs at the same time

Multiple users can use the same HPC resources if desired

End result: more efficient use of available resources and less lost productivity
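The idea of splitting a repetitive task into independent pieces can be sketched in Python. This is a minimal illustration under stated assumptions, not the presenters' setup: `process_tile` is a hypothetical placeholder for one geoprocessing step, and a local thread pool stands in for the HPC scheduler that would hand each chunk to a separate node.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, roughly equal chunks."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when items are fewer than chunks
            chunks.append(items[start:end])
        start = end
    return chunks


def process_tile(tile_id):
    """Hypothetical stand-in for one unit of geoprocessing work."""
    return tile_id * tile_id


def process_chunk(chunk):
    """Process every tile in one chunk; this is what a single node would run."""
    return [process_tile(tile_id) for tile_id in chunk]


def run_in_parallel(items, n_workers):
    """Run the chunks concurrently and return results in the original order."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        per_chunk = list(pool.map(process_chunk, chunks))  # map preserves order
    return [result for chunk in per_chunk for result in chunk]
```

The approach only pays off when the chunks do not depend on one another, which is why repetitive, per-record or per-tile tasks are the natural candidates.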

Managing HPC with Excel!

Delegation of Privileges

Integration with Active Directory

Simple user and group management

Allows extension of services to other departments or organizations

Other institutions, partners, customers, etc.

The HPC Management Interface

Use of Virtualization

If the virtual infrastructure is already there, why not take advantage of it?

No additional power overhead, network connectivity, or rack space necessary

If the environment is large enough, it can easily handle several dozen nodes or more

Simplify additional deployments

Gain the ability to quickly clone an existing machine with all of its configuration intact

To Use or Not to Use

This is not a multithreaded ArcGIS process

Works best with:

Extremely large datasets

Repetitive processes

Examine the costs and benefits

Setting up a brand new environment

Management

Licensing

What To Expect…

Things to take into consideration

Managing configuration for each node

Managing multiple outputs

Tracking job status
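One way to keep track of job status and output locations is a small table that every submission writes to. The sketch below uses Python's built-in `sqlite3` module; the table layout, the state names, and the function names are all assumptions made for illustration, and the code only records state. Actually stopping a running job would still be the scheduler's responsibility.

```python
import sqlite3

# Assumed set of job states for this sketch.
VALID_STATES = ("queued", "running", "finished", "failed", "cancelled")


def open_tracker(path=":memory:"):
    """Open (or create) the tracking database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "node TEXT NOT NULL, "
        "output_path TEXT NOT NULL, "
        "state TEXT NOT NULL)"
    )
    return conn


def submit(conn, node, output_path):
    """Record a new job as queued and return its id."""
    cur = conn.execute(
        "INSERT INTO jobs (node, output_path, state) VALUES (?, ?, 'queued')",
        (node, output_path),
    )
    conn.commit()
    return cur.lastrowid


def set_state(conn, job_id, state):
    """Update the recorded state of one job."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state}")
    conn.execute("UPDATE jobs SET state = ? WHERE id = ?", (state, job_id))
    conn.commit()


def cancel(conn, job_id):
    """Mark a job cancelled unless it already ended; return True if it changed."""
    cur = conn.execute(
        "UPDATE jobs SET state = 'cancelled' "
        "WHERE id = ? AND state IN ('queued', 'running')",
        (job_id,),
    )
    conn.commit()
    return cur.rowcount == 1


def summary(conn):
    """Return a {state: count} overview of all recorded jobs."""
    rows = conn.execute("SELECT state, COUNT(*) FROM jobs GROUP BY state")
    return dict(rows.fetchall())
```

Storing the output path next to the state means the same table answers both "is it done?" and "where did the result go?".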

Managing HPC

Control methods

Command line arguments

Database driven

Control file

Tracking and managing jobs

How do you track and perhaps cancel a job after submission?

Output

Sent to the filesystem and combined somehow?

Added to a table in the database?

Maybe stored in HPC and retrieved upon completion?
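Two of the options above, command-line arguments with an optional control file for control and per-job files on the filesystem for output, can be sketched together in Python. Everything named here is an assumption for illustration: the argument names, the JSON control-file keys, the `part_*.csv` naming scheme, and the placeholder result written for each record.

```python
import argparse
import csv
import json
from pathlib import Path


def parse_args(argv=None):
    """Parse the arguments that tell this worker which slice to process."""
    parser = argparse.ArgumentParser(description="Run one slice of a larger job.")
    parser.add_argument("--control-file", type=Path, help="JSON file with settings")
    parser.add_argument("--start", type=int, help="first record to process")
    parser.add_argument("--end", type=int, help="one past the last record")
    parser.add_argument("--output-dir", help="directory for this job's output")
    return parser.parse_args(argv)


def load_settings(args):
    """Start from defaults, apply the control file, then explicit arguments."""
    settings = {"start": 0, "end": 0, "output_dir": "output"}
    if args.control_file is not None:
        settings.update(json.loads(args.control_file.read_text()))
    for key in ("start", "end", "output_dir"):
        value = getattr(args, key)
        if value is not None:
            settings[key] = value
    return settings


def write_part(settings):
    """Write this slice's results to its own file so nodes never share a file."""
    out_dir = Path(settings["output_dir"])
    out_dir.mkdir(parents=True, exist_ok=True)
    part = out_dir / f"part_{settings['start']:08d}.csv"
    with part.open("w", newline="") as handle:
        writer = csv.writer(handle)
        for record_id in range(settings["start"], settings["end"]):
            writer.writerow([record_id, record_id * 2])  # placeholder result
    return part


def combine_parts(output_dir, combined_name="combined.csv"):
    """Concatenate all part files, in name order, into a single file."""
    out_dir = Path(output_dir)
    combined = out_dir / combined_name
    with combined.open("wb") as target:
        for part in sorted(out_dir.glob("part_*.csv")):
            target.write(part.read_bytes())
    return combined
```

Each node would run the worker with its own `--start` and `--end`, and a final step would call `combine_parts` once every job has finished. Zero-padding the start index in the file name keeps the parts in record order when sorted by name.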

What’s Next?

Enable Web submission with Shibboleth

Enable job submission, management and control through a web interface

Questions?
