ArcGIS Server Virtualization and High-Performance Computing
Sherwin Faria <[email protected]>
Sidney Pendelberry <[email protected]>
Presented at the ESRI User Conference 2010
July 11-16, 2010
San Diego, California
Agenda
The Problem
The Solution
Use of Virtualization
To Use or Not to Use
What to Expect
What’s Next?
The Problem
Large datasets
Testing and debugging is a nightmare due to lengthy
processing times, even on a subset of the data
The same tasks must be run over and over again
Time constraints
Too much productivity is lost constantly waiting for data
Projects fall too far behind schedule
Repetitive tasks, why can’t they be run in parallel?
The Solution: HPC and ArcGIS
HPC runs tasks in parallel
HPC can be spread across a huge pool of nodes
Not only can we run tasks in parallel:
We can run multiple unrelated jobs at the same time
Multiple users can use the same HPC resources if desired
End result: more efficient use of available resources and less
lost productivity
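As a rough illustration of the idea above, the sketch below splits a list of independent work items across a pool of nodes and runs the chunks concurrently. It is not the presenters' implementation: `split_work`, `process_chunk`, and `run_job` are hypothetical names, and local threads stand in for cluster nodes only so the example can run on one machine. On a real cluster each chunk would be submitted to the scheduler as a separate process.

```python
from concurrent.futures import ThreadPoolExecutor


def split_work(items, n_nodes):
    """Deal items out round-robin so each node gets a similar share."""
    return [items[i::n_nodes] for i in range(n_nodes)]


def process_chunk(chunk):
    """Hypothetical stand-in for one repetitive task (one result per item)."""
    return [item * item for item in chunk]


def run_job(items, n_nodes=3):
    """Run every chunk concurrently and collect the results per node."""
    chunks = split_work(items, n_nodes)
    # Assumption: threads simulate nodes; a real HPC job would use the scheduler.
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        return list(pool.map(process_chunk, chunks))


if __name__ == "__main__":
    print(run_job(list(range(10))))
```

Because the items are independent, the same pattern also covers unrelated jobs from different users sharing one pool.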
Managing HPC with Excel!
Delegation of Privileges
Integration with Active Directory
Simple user and group management
Allows extension of services to other departments or
organizations
Other institutions, partners, customers, etc.
The HPC Management Interface
Use of Virtualization
If the virtual infrastructure is already there, why not take
advantage of it?
No additional power overhead, network connectivity, or
rack space necessary
If the environment is large enough, it can easily handle several
dozen nodes or more
Simplify additional deployments
Gain the ability to quickly clone an existing machine with all
of its configuration intact
To Use or Not to Use
This is not a multithreaded ArcGIS process
Works best with:
Extremely large datasets
Repetitive processes
Examine the costs and benefits
Setting up a brand new environment
Management
Licensing
What To Expect…
Things to take into consideration
Managing configuration for each node
Managing multiple outputs
Tracking job status
Managing HPC
Control methods
Command line arguments
Database driven
Control file
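Two of the control methods listed above can be combined in a single worker script. The following is a minimal sketch under stated assumptions: the flag names (`--control-file`, `--input`, `--output`), the JSON control-file format, and the function `load_job_config` are all hypothetical and are not taken from the presentation. A database-driven variant would fill the same dictionary from a query instead.

```python
import argparse
import json


def load_job_config(argv):
    """Build a job configuration from a control file and/or command-line flags."""
    parser = argparse.ArgumentParser(description="Hypothetical node worker")
    parser.add_argument("--control-file", help="JSON file describing the job")
    parser.add_argument("--input", help="dataset to process")
    parser.add_argument("--output", help="where to write results")
    args = parser.parse_args(argv)

    config = {}
    if args.control_file:
        # Assumption: the control file is a flat JSON object of settings.
        with open(args.control_file) as handle:
            config.update(json.load(handle))

    # Explicit command-line values override anything in the control file.
    for key in ("input", "output"):
        value = getattr(args, key)
        if value is not None:
            config[key] = value
    return config


if __name__ == "__main__":
    print(load_job_config(["--input", "parcels.gdb", "--output", "results"]))
```

Letting command-line values override the control file keeps one shared file per job while still allowing per-node differences.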
Tracking and managing jobs
How do you track and perhaps cancel a job after submission?
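One possible answer is to record every submitted job with an ID and a state that can be queried or changed later. The sketch below is an in-memory illustration only: `JobTracker`, `JobState`, and their methods are hypothetical names, and a real deployment would rely on the scheduler's own job records rather than on this class.

```python
import itertools
from enum import Enum


class JobState(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    FINISHED = "finished"
    CANCELLED = "cancelled"


class JobTracker:
    """Hypothetical in-memory record of submitted jobs and their states."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def submit(self, description):
        """Register a new job and return its ID."""
        job_id = next(self._ids)
        self._jobs[job_id] = {"description": description, "state": JobState.QUEUED}
        return job_id

    def set_state(self, job_id, state):
        self._jobs[job_id]["state"] = state

    def status(self, job_id):
        return self._jobs[job_id]["state"]

    def cancel(self, job_id):
        """Cancel a job that has not finished; return True on success."""
        if self._jobs[job_id]["state"] in (JobState.QUEUED, JobState.RUNNING):
            self._jobs[job_id]["state"] = JobState.CANCELLED
            return True
        return False


if __name__ == "__main__":
    tracker = JobTracker()
    job = tracker.submit("buffer all parcels")
    print(tracker.status(job), tracker.cancel(job), tracker.status(job))
```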
Output
Sent to the filesystem and combined somehow?
Added to a table in the database?
Maybe stored in HPC and retrieved upon completion?
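For the filesystem option, one way to combine per-node outputs is to concatenate them and write the header only once. This is a sketch that assumes each node writes a CSV file with identical columns; the CSV format and the function name `combine_csv_parts` are assumptions, not details from the presentation. Open file objects can be passed in place of the in-memory streams used here.

```python
import csv
import io


def combine_csv_parts(parts, out):
    """Concatenate per-node CSV streams into `out`, writing one header row."""
    writer = None
    rows_written = 0
    for part in parts:
        reader = csv.DictReader(part)
        for row in reader:
            if writer is None:
                # Take the column names from the first part that has data.
                writer = csv.DictWriter(
                    out, fieldnames=reader.fieldnames, lineterminator="\n"
                )
                writer.writeheader()
            writer.writerow(row)
            rows_written += 1
    return rows_written


if __name__ == "__main__":
    node_a = io.StringIO("id,area\n1,10\n")
    node_b = io.StringIO("id,area\n2,20\n")
    combined = io.StringIO()
    combine_csv_parts([node_a, node_b], combined)
    print(combined.getvalue())
```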
What’s Next?
Enable Web submission with Shibboleth
Enable job submission, management and control through
a web interface
Questions?