G-JavaMPI: A Grid Middleware for Distributed Java Computing with
MPI Binding and Process Migration Supports
Lin Chen, Cho-Li Wang, Francis C. M. Lau and Ricky K. K. Ma
Department of Computer Science and Information Systems
The University of Hong Kong
{lchen2+clwang+fcmlau+kk1ma}@csis.hku.hk
GCC2002 Presentation, Lin Chen, CSIS, HKU (Dec. 26, 2002)
Outline
Motivation
Overall system architecture
Detailed Issues
Related works
Conclusion & Future Work
Motivation
Grid computing: large-scale resource sharing, high performance
Globus Project: basic services required for building and using a Grid (authentication, security, resource allocation, remote data access, information services, etc.)
However:
long-running applications: continuous computation
better utilization of resources: scheduling and load balancing
Java process migration: architecture-independent bytecode makes migration easier
Motivation
Let the programmer write a grid application easily:
no need to care about the difference between inter-site and intra-site communication (we must care about it if we use the Globus communication libraries directly)
SPMD: one program can be executed in multiple places or sites
MPI paradigm:
a group of distributed processes that can do peer-to-peer or collective communication
communication source and destination addresses are independent of the real physical network addresses (adaptable)
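The last point, that MPI addresses are logical ranks rather than physical network addresses, can be illustrated with a small lookup table. This is a minimal sketch and not G-JavaMPI code: the class `RankDirectory`, its methods and the host names are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: maps logical MPI ranks to physical endpoints.
// Application code only names a rank; the endpoint can change underneath it.
class RankDirectory {
    private final Map<Integer, String> endpointByRank = new HashMap<>();

    // Register (or re-register after a migration) where a rank currently runs.
    void bind(int rank, String endpoint) {
        endpointByRank.put(rank, endpoint);
    }

    // Resolve a rank to its current endpoint at send time.
    String resolve(int rank) {
        String endpoint = endpointByRank.get(rank);
        if (endpoint == null) {
            throw new IllegalArgumentException("unknown rank " + rank);
        }
        return endpoint;
    }

    public static void main(String[] args) {
        RankDirectory directory = new RankDirectory();
        directory.bind(0, "siteA-node01:5000"); // made-up host names
        directory.bind(1, "siteA-node02:5000");
        System.out.println("rank 1 -> " + directory.resolve(1));
        directory.bind(1, "siteB-node07:5000"); // rank 1 moves to another site
        System.out.println("rank 1 -> " + directory.resolve(1));
    }
}
```

A sender that always addresses "rank 1" is unaffected by the rebinding, which is the property that makes migration possible without changing application code.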
System Overview
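[Figure: grid sites connected over a WAN, each with a Gatekeeper and an LS node; Java-MPI processes run in JVMs, each containing a migration module (M). The markers (1), (2), (3), (1*), (2*), (3*) and (*) label communication routes.]
Figure labels:
Migrating: restarting a new process through a Globus remote job request with delegated user credentials and Java-MPI job credentials
Java-MPI communication
Some legacy messages are redirected during migration
Migration module resides in each JVM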
System overview
Components:
Globus Toolkit libraries
Java-MPI communication daemons
Local schedulers
Java-MPI processes
Migration modules
Figure legend:
Java-MPI process (before migration)
Java-MPI process (after migration)
(1) – (2) – (3): MPI communication route before migration
(1*) – (2*) – (3*): MPI communication route after migration
(*): Java-MPI communication daemons redirect legacy messages that should go to the migrated process
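The redirection of legacy messages by the communication daemons can be sketched as a forwarding table. This is a simplified, single-JVM model under stated assumptions: `CommDaemon` and its methods are hypothetical names, messages are plain strings, and no real MPICH-G2 or Globus call is involved.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in for a per-site Java-MPI communication daemon.
class CommDaemon {
    private final String site;
    // Messages waiting for ranks that currently run at this site.
    private final Map<Integer, Queue<String>> localQueues = new HashMap<>();
    // Ranks that left this site, mapped to the daemon that now serves them.
    private final Map<Integer, CommDaemon> forwarding = new HashMap<>();

    CommDaemon(String site) {
        this.site = site;
    }

    // Start serving a rank at this site.
    void host(int rank) {
        localQueues.put(rank, new ArrayDeque<>());
        forwarding.remove(rank);
    }

    // Hand a rank over to another daemon, passing on anything still queued.
    void migrate(int rank, CommDaemon destination) {
        Queue<String> pending = localQueues.remove(rank);
        if (pending == null) {
            throw new IllegalStateException("rank " + rank + " is not hosted at " + site);
        }
        destination.host(rank);
        forwarding.put(rank, destination);
        for (String message : pending) {
            destination.deliver(rank, message);
        }
    }

    // Deliver locally if the rank is here; otherwise redirect to its new site.
    void deliver(int rank, String message) {
        Queue<String> queue = localQueues.get(rank);
        if (queue != null) {
            queue.add(message);
            return;
        }
        CommDaemon next = forwarding.get(rank);
        if (next == null) {
            throw new IllegalStateException("rank " + rank + " is unknown at " + site);
        }
        next.deliver(rank, message);
    }

    // Remove and return every message queued for a locally hosted rank.
    List<String> drain(int rank) {
        Queue<String> queue = localQueues.get(rank);
        if (queue == null) {
            throw new IllegalStateException("rank " + rank + " is not hosted at " + site);
        }
        List<String> messages = new ArrayList<>(queue);
        queue.clear();
        return messages;
    }

    public static void main(String[] args) {
        CommDaemon siteA = new CommDaemon("siteA");
        CommDaemon siteB = new CommDaemon("siteB");
        siteA.host(1);
        siteA.deliver(1, "sent before migration");
        siteA.migrate(1, siteB);
        siteA.deliver(1, "legacy message addressed to the old site");
        System.out.println(siteB.drain(1));
    }
}
```

Senders that have not yet learned the new location keep using the old daemon, and the forwarding entry ensures that those messages still reach the migrated process.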
Layered design
[Figure: layered architecture diagram. Recoverable labels, roughly from top to bottom:]
Java-MPI Applications
Java-MPI API & Java API (Java-MPI API Layer)
Restorable MPI Comm Layer and Load Balancing Module: Restorable Communication Services, Authentication, Control Block, Message Queues, DLB Policy, Info. Update
Migration Layer: Execution State Probe & Migration Plug-in, Migration Instructions
MPICH-G2, Globus Services, JVM, JVMDI
OS
Hardware
Java-MPI binding
Restorable communication layer:
Daemon: a running MPICH-G2 process that provides MPI communication services
Communicates with the Java-MPI process through IPC
Post-migration message redirection
[Figure labels: Restorable Communication, Process space]
Java Process Migration
State capturing: a probe attached to each JVM saves the process context through JVMDI (JVM Debug Interface)
All runtime data: PC register, stack frames, objects, method area (local variables), etc.
Event notification: method_entry, frame_pop, etc.
Use object serialization to package all reachable objects in the heap
The new JDK 1.4.0 and 1.4.1, released in Aug. 2002, support “full-speed debugging”
[Figure: the probe obtains (1) execution state data and (2) event notifications from the JVM through JVMDI]
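Packaging all reachable objects with object serialization can be shown with the standard `java.io` streams. This is a minimal sketch: the class `HeapSnapshot` and its `Node` type are hypothetical, and the JVMDI part of state capturing (frames, PC register) is not covered here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper: serializes an object graph to bytes and back.
class HeapSnapshot {
    // Example application object; "next" links it to other heap objects.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final int value;
        Node next;

        Node(int value) {
            this.value = value;
        }
    }

    // Writes the root and everything reachable from it into a byte array.
    static byte[] capture(Serializable root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds the object graph; shared references and cycles are preserved.
    static Object restore(byte[] image) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(image))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class missing at the destination", e);
        }
    }

    public static void main(String[] args) {
        Node head = new Node(1);
        head.next = new Node(2);
        head.next.next = head; // a cycle, to show that the graph survives intact
        Node copy = (Node) restore(capture(head));
        System.out.println(copy.value + " -> " + copy.next.value
                + ", cycle kept: " + (copy.next.next == copy));
    }
}
```

The `ClassNotFoundException` branch hints at why the destination JVM must be able to load the same classes before the heap image is read back.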
Java process migration
State restoration:
An exception handler is inserted into the bytecode (pre-processing before execution) to restore local variables and “jump” to the original execution point
Objects are re-allocated when the JVM restarts
Dynamic class loading
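The idea of reloading local variables and continuing from the original execution point can be shown with a source-level analogue. This is only a sketch under stated assumptions: G-JavaMPI rewrites bytecode and uses JVMDI, whereas the hypothetical `ResumableSum` below passes the saved locals in as a parameter and simulates the suspension with an exception.

```java
// Source-level analogue only: the mechanism described above works on bytecode.
class ResumableSum {
    // Captured local variables of sumTo at the suspension point.
    static final class Frame {
        final int i;
        final int total;

        Frame(int i, int total) {
            this.i = i;
            this.total = total;
        }
    }

    // Thrown to unwind the stack when the computation is suspended.
    static final class Suspended extends RuntimeException {
        private static final long serialVersionUID = 1L;
        final transient Frame frame;

        Suspended(Frame frame) {
            super("suspended");
            this.frame = frame;
        }
    }

    // Sums 1..n. If saved is non-null, the locals are reloaded and the loop
    // continues from the saved iteration. A suspendAt below 1 never suspends.
    static int sumTo(int n, Frame saved, int suspendAt) {
        int i = 1;
        int total = 0;
        if (saved != null) { // restoration prologue
            i = saved.i;
            total = saved.total;
        }
        for (; i <= n; i++) {
            if (i == suspendAt) {
                throw new Suspended(new Frame(i, total));
            }
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        Frame frame = null;
        try {
            sumTo(5, null, 3); // simulated migration request at i == 3
        } catch (Suspended s) {
            frame = s.frame;
        }
        System.out.println("resumed result: " + sumTo(5, frame, -1));
    }
}
```

The resumed call produces the same result as an uninterrupted run because the loop restarts with exactly the locals it had when it was suspended.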
Information update
[Figure: sequence diagram between the migration source site, the migration destination site and other sites. Recoverable labels:]
Migration begins
Notify other sites (including the destination site)
The process arrives at the safe migration point (all legacy messages consumed)
Update the local site with the process’s new location
Begin process state capture
Process Restart
Original process:
creates a new user certificate proxy (proxy_init_cred)
delegates it to the remote site
gets the resource allocation
The new process can then be started (similar to a normal Globus job submission)
Newly started process:
JVM initialization; the probe is started at the same time
The process is suspended at the beginning, and the probe reads the context from the dump file
The execution context is restored
The process resumes and continues from the last point
Experiment Results
Hardware:
32-node cluster “ostrich”, configured as two grid points of 16 nodes
733 MHz Pentium III processor
392 MB of memory
connected by a 24-port Fast Ethernet switch
Software:
Linux 2.2.14
Globus 2.0
Sun JDK 1.4.0_02 (supporting JVMDI with full-speed debugging mode)
MPICH 1.2.4 (MPICH-G2)
Experiment results: Bandwidth
[Figure: bandwidth (Kbyte/s, axis from 0 to 5000) versus message size (8 to 2048 bytes); series: intra-site bandwidth, inter-site bandwidth]
Bandwidth comparison between inter-site and intra-site communication with the installation of the MPI communication layer.
Experiment results: Latency
[Figure: latency (s, axis from 0 to 0.6) versus message size (1 to 2048 bytes); series: inter-site latency, intra-site latency]
Latency comparison for small messages between intra-site and inter-site communication with the installation of the MPI communication layer.
Experiment results: Time for capturing and restoring objects
[Figure: time (microseconds, axis from 0 to 3000) versus object size (1 byte to 10M bytes); series: capturing objects, restoring objects]
Time spent in capturing and restoring objects
Experiment results: Time for capturing and restoring Java frames
[Figure: time (seconds, axis from 0 to 6) versus number of frames (1 to 300); series: capturing frames, restoring frames]
Time spent in capturing and restoring frames
Related Works
Java bindings for MPI: “mpiJava”, “JavaMPI”, “MPIJ”, etc.
Java process or thread migration:
Add additional backup code in programs [Aglets [IBM96]]
Insert backup statements in the source or bytecode; a backup object is used to store state [Wasp project [Funfrocken98]]
Extend the JVM, make state accessible from Java programs, support type recognition of the Java stack [Sara Bouchenak 2000]
Use JVMDI to capture state, insert bytecode instructions in the program body to help restoring [Torsten2001]
JESSICA (supports thread migration in JVM)
Conclusion
A new middleware for the Grid with Java-MPI communication and transparent process migration features:
write MPI-style programs in the Java language
Java process migration mechanism
Supports the development of any dynamic load balancing policy or fault tolerance mechanism
Future Plan
Develop some scientific and engineering applications on top of this middleware
Support the transfer of other I/O (including file stage-in/out)
Load balancing algorithm for the grid environment (both CPU and network load)
The End
Thanks!