Grid Platform for Geospatial Applications & Fine Granule Scheduler. Presented by Bin Zhou. Bin Zhou, Jibo Xie, Chaowei Yang, Joint Center for Intelligent Spatial Computing, George Mason University.


Page 1: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Grid Platform for Geospatial Applications

& Fine Granule Scheduler

Presented by Bin Zhou

Bin Zhou, Jibo Xie, Chaowei Yang

Joint Center for Intelligent Spatial Computing

George Mason University

Page 2: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Agenda

➲ Grid Computing Introduction
➲ CISC & SURA Grid
➲ Geospatial Applications Require Grid
➲ CISC Fine Granule Scheduler
➲ Architecture, Strategy
➲ Progress Status

Page 3: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Grid Computing Introduction

➲ Definition
    Grid computing is an emerging computing infrastructure that treats all resources as a collection of manageable entities with common interfaces to such functionality as lifetime management, discoverable properties and accessibility via open protocols (Wikipedia)

➲ Popular Grid Middleware
    Condor
    Globus
    Condor-G
    Unicore

Page 4: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

[Figure: CISC Grid network diagram. Labels: CISC Grid Manager, Administrator, User, CISC Grid Portal, Key Server, GMU Certificate Server, Certificate Server, Manager, File Server, CISC Computing Pool (Workers), GMU, NASA, SURA Grid, Other Universities, Virtual Organization, Ethernet, High Speed Connection at 1 Gbit/s (can be upgraded to 10 Gbit/s)]

Page 5: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

GMU grid environment

• SURAgrid

CISC GMU Grid can access the computing resources contributed by SURAgrid member universities

Page 6: Grid Platform for Geospatial Applications   & Fine Granule Scheduler
Page 7: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

GMU grid environment

• LambdaRail

CISC Grid can set up a 1-10 Gbps connection to any of the LambdaRail-supported universities, agencies, and centers, such as GSFC & SDSC

Page 8: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

CISC Computing Pool

Page 9: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Geospatial Requirements

➲ Large Data Set
    Map data, sensor data, in terabytes

➲ Reliability, Interoperability
    Collaboration

➲ Intensive Computation
    More complex algorithms
    Adaptive algorithms
    Intelligent processing

Page 10: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Grid Computing Could Satisfy These Requirements

➲ Reliable File Transfer
➲ Resource Management and Allocation
➲ Authorization & Control
➲ Job Control
➲ Web Service Oriented

Page 11: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Detecting Watersheds from multi-scale DEM

➲ Watershed boundaries are not known before processing massive data
➲ Extract coarse watershed boundaries from multi-scale DEM
➲ Use the boundaries to decompose the massive data with some redundancy

[Figure: resample and extraction steps (Xie 2006)]

Page 12: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Use 24 units to test the speed up (each unit is 3.08M) (Xie 2006)

Page 13: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

CISC Test Applications

Real Time Routing Test Result:

Job Amount   CPUs   Executing Time   Speed Up   Efficiency
30           1      1686s            1          1
30           10     374s             4.5        0.45
30           20     322s             5.2        0.26
30           30     293s             5.75       0.19

Efficiency decreases as the number of CPUs grows because overhead increases, but the major problem is that Condor cannot handle small jobs efficiently. This demonstrates the need for a fine granule scheduler.
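
The speed-up and efficiency figures in the routing test are consistent with the usual definitions: speed-up is the single-CPU time divided by the parallel time, and efficiency is speed-up divided by the number of CPUs. A minimal sketch that recomputes them from the reported executing times (the function and variable names are illustrative, not from the slides):

```python
def speedup_and_efficiency(serial_seconds, parallel_seconds, cpus):
    """Return (speed-up, efficiency) using the standard definitions."""
    speedup = serial_seconds / parallel_seconds
    efficiency = speedup / cpus
    return speedup, efficiency


# (CPUs, executing time in seconds) as reported in the routing test
runs = [(1, 1686), (10, 374), (20, 322), (30, 293)]
serial = runs[0][1]  # the 1-CPU run is the baseline

results = {cpus: speedup_and_efficiency(serial, t, cpus) for cpus, t in runs}

for cpus, (s, e) in results.items():
    print(f"{cpus:>2} CPUs: speed-up {s:.2f}, efficiency {e:.2f}")
```

Going from 10 to 30 CPUs cuts the executing time by only about a fifth, which is why efficiency falls below 0.2.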

Page 14: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Specific Applications: Fine-Grained Near Real Time Jobs

➲ Fine-Grained
    Very short executing time
    Huge amount
    Job similarity

➲ Near Real Time
    Sensitive to scheduling latency
    Examples: real-time routing, short-time stock prediction

"Condor cannot be used for tasks that require less than 3.5 min to complete" (Gregg Cooke, IT Technical Council, "Evaluating Condor for Enterprise Use: A UBS Case Study")

Page 15: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

CISC Scheduler

➲ Purpose
    Improve near real time job response time
    Improve mass fine-granularity job throughput

➲ Scheduling Strategy
    Short communicating message
    Simple match-making function
    Dynamic index
    Multi-dispatch
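
The slides do not spell out how match-making and multi-dispatch work, so the following is only a sketch under stated assumptions: a job matches a worker when the worker has a free CPU and enough free memory, and "multi-dispatch" is read here as grouping several jobs for the same worker into one batch so that a single short message carries them. The `Job`, `Worker`, `matches`, and `multi_dispatch` names and their fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Job:
    job_id: int
    memory_mb: int  # hypothetical resource requirement


@dataclass
class Worker:
    name: str
    free_cpus: int
    free_memory_mb: int


def matches(job: Job, worker: Worker) -> bool:
    """Simple match-making (assumed rule): a free CPU and enough memory."""
    return worker.free_cpus > 0 and worker.free_memory_mb >= job.memory_mb


def multi_dispatch(jobs: List[Job], workers: List[Worker]) -> Tuple[Dict[str, List[int]], List[Job]]:
    """Assign each job to a matching worker and batch the job ids per worker."""
    batches: Dict[str, List[int]] = {}
    unmatched: List[Job] = []
    for job in jobs:
        candidates = [w for w in workers if matches(job, w)]
        if not candidates:
            unmatched.append(job)  # stays queued for a later scheduling pass
            continue
        # Prefer the worker with the most free CPUs (an assumed ranking).
        best = max(candidates, key=lambda w: w.free_cpus)
        best.free_cpus -= 1
        best.free_memory_mb -= job.memory_mb
        batches.setdefault(best.name, []).append(job.job_id)
    return batches, unmatched


workers = [Worker("w1", 2, 1024), Worker("w2", 1, 512)]
jobs = [Job(i, 256) for i in range(4)]
batches, unmatched = multi_dispatch(jobs, workers)
print(batches)                          # one batch (one message) per worker
print([j.job_id for j in unmatched])    # jobs that found no free worker
```

Batching keeps the number of messages proportional to the number of workers and not to the number of jobs, which matters when each job runs for only a few seconds.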

Page 16: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

System Architecture

[Figure: layered system architecture diagram. Labels: Lib Services (TCP/UDP Socket, File Transfer, Process, Other); Abstract Interface / APIs; System Function (Message passing, Memory); Dispatcher, Collector, Container, Resource Manager, Submitter, Algorithm module; Central Manager, Worker, User Interface]

Page 17: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Components

[Figure: component diagram. Labels include: Server Daemon, Scheduler, Collector, Dispatcher, Dynamic Index, Job Queue, Machine Queue, Match Queue, CPU, Poll or Event Driven, Multi-Dispatch, Priority, Deadline, Sort, Dependency, Rank, Availability, Memory, Disk, Others, Job & Worker info, Job Parser, Job File Server; Client Daemon, Submitter, Resource Manager, Job Package, File Fetcher, Worker Status, Resource Info, Shield Local Running Env]

Page 18: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Job Work Flow

[Figure: job workflow diagram. Stage labels: Submit, InQueue, Schedule, Running, Dispatch, Staging out, Finished. Other labels: Resource, Scheduling Algorithm, Remove From Queue, If Error (twice)]
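
The workflow labels can be read as a small state machine. The sketch below is an interpretation, not the authors' implementation: it assumes the stage order Submit, InQueue, Schedule, Dispatch, Running, Staging out, Finished, and it assumes that an error at any stage removes the job from the queue (the slide shows two "If Error" branches but the extracted text does not say where they attach). All names are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    SUBMIT = "Submit"
    IN_QUEUE = "InQueue"
    SCHEDULE = "Schedule"
    DISPATCH = "Dispatch"
    RUNNING = "Running"
    STAGING_OUT = "Staging out"
    FINISHED = "Finished"
    REMOVED = "Removed from queue"


# Assumed happy-path order of the stages.
NEXT_STATE = {
    JobState.SUBMIT: JobState.IN_QUEUE,
    JobState.IN_QUEUE: JobState.SCHEDULE,
    JobState.SCHEDULE: JobState.DISPATCH,
    JobState.DISPATCH: JobState.RUNNING,
    JobState.RUNNING: JobState.STAGING_OUT,
    JobState.STAGING_OUT: JobState.FINISHED,
}

TERMINAL = (JobState.FINISHED, JobState.REMOVED)


def advance(state, error=False):
    """Move a job one step forward, or remove it if an error occurred (assumption)."""
    if state in TERMINAL:
        raise ValueError(f"{state.value} is a terminal state")
    if error:
        return JobState.REMOVED
    return NEXT_STATE[state]


def run_to_completion(fail_at=None):
    """Walk a job through the workflow, optionally failing at one stage."""
    state = JobState.SUBMIT
    history = [state]
    while state not in TERMINAL:
        state = advance(state, error=(state == fail_at))
        history.append(state)
    return history


print([s.value for s in run_to_completion()])
print([s.value for s in run_to_completion(fail_at=JobState.RUNNING)])
```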

Page 19: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Prototype Overhead Test

➲ Test Case
    Insertion sort of 200,000 integers
    Dataset: 5.56M
    Executable file: 1.8M

➲ Test Platform
    OS: Ubuntu 6.10
    Network: 100Mbps
    CPU: Celeron M 1.6G
    Memory: 1G

Job Amount   File Transfer Time   Job Executing Time   Other Overhead   Communicating Overhead   Efficiency
1            1s                   27s                  0.4s             18ms                     95.1%
5            4s                   154s                 1.2s             20ms                     98.9%

Page 20: Grid Platform for Geospatial Applications   & Fine Granule Scheduler

Thanks

Questions?