Winter, 2004 CSS490 M++ 1
CSS490 M++
Textbook: No Corresponding Chapter
Instructor: Munehiro Fukuda
These slides are entirely the instructor’s original materials.
Thread Migration: Overview
Moving: CPU status, stack, (heap)
[Figure: a thread migrating between machines connected by a LAN/SAN, each with its own local memory, under distributed shared memory (in most cases). Labels: memory access; actual memory access with high latency.]
Thread Migration: Examples
System model | Examples | Remark
SMP | Active threads | Migrating threads to a different processor of an SMP for dynamic load balancing
DSM | Olden, Nomadic threads | Dispatching threads to a remote site close to the data they access
Message passing | uPVM, Ariadne, PM2 | Injecting threads into another process for dynamic load balancing
Remote execution | Nexus | Sending a function/arguments with a thread to a remote site for remote execution
M++
Mobile Agents
  Merits: inherent parallelism and multi-agent-based applications
  Demerit: performance problem because of their interpretation
Thread Migration
  Merit: fast migration
  Demerits: DSM needed and no navigational autonomy
Logical Network
  Global application-specific network
  Constructed at run time
  Node data referred to after migrating to that node
Extension of C++ Language
  Agent encapsulated by an object-oriented language
  Navigational autonomy with strong migration
Multi-Agent Approach
Simulate the interaction among agents
Advantages:
  Overcomes the limitations of mathematical techniques
  Encapsulates simulation models
  Allows open-ended simulation
Disadvantages:
  Performance
  Parallel programming
Parallelizing Multi-Agent Application with MPI
An MPI process must:
  maintain each agent as a static object
  maintain a two-dimensional array as the simulation space
  change each status periodically
  exchange them with remote processes
  synchronize with remote processes
[Figure: Process 1 and Process 2, connected by a LAN/SAN, each hold a null-terminated object queue (Obj 1 through Obj 6); objects such as Obj 7 are exchanged between the two processes.]
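The bookkeeping listed above can be sketched without any MPI calls. The following is a minimal single-process sketch, assuming a one-dimensional space split into two partitions; `Agent`, `Partition`, and `exchange_round` are hypothetical names introduced only for illustration, and in a real MPI program the hand-off inside `exchange_round` would be message passing between separate processes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical agent record; all names here are illustrative, not M++ or MPI API.
struct Agent {
    int id;
    int x;   // position along one axis of the simulation space
    int dx;  // displacement applied at every step
};

// One process's share of the simulation space plus its object queue.
struct Partition {
    int lo, hi;                 // this process owns positions lo <= x < hi
    std::vector<Agent> agents;  // agents kept as plain (static) objects

    bool owns(int x) const { return x >= lo && x < hi; }

    // Update every local agent once; return the agents that left this partition.
    std::vector<Agent> step() {
        std::vector<Agent> staying, leaving;
        for (Agent a : agents) {
            a.x += a.dx;
            if (owns(a.x)) staying.push_back(a);
            else leaving.push_back(a);
        }
        agents.swap(staying);
        return leaving;
    }
};

// One synchronized round between two partitions. Both sides finish their
// update before anything is handed over, which plays the role of the
// synchronization step. Agents that leave the whole space are dropped.
void exchange_round(Partition& p1, Partition& p2) {
    std::vector<Agent> from1 = p1.step();
    std::vector<Agent> from2 = p2.step();
    for (const Agent& a : from1) {
        if (p2.owns(a.x)) p2.agents.push_back(a);
    }
    for (const Agent& a : from2) {
        if (p1.owns(a.x)) p1.agents.push_back(a);
    }
}
```

Note how much of the code is queue maintenance and hand-off logic rather than agent behavior; this is the programming burden the mobile-agent approach aims to remove.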
Parallelizing Multi-Agent Application with Mobile Agent
We don’t need a control process!
Each simulation entity works independently as an agent.
[Figure: agents distributed over a LAN/SAN]
agent Shark {
  Shark( ) { … }
  void run( ) {
    while (alive) {
      compute( );
      go( );
    }
  }
}
Performance Problem
Mobile Agents:
  They were cognitive agents designed to perform large remote jobs.
  They are coded in Java and Tcl, and thus interpreted.
  They are large and slow.
Performance Improvement:
  Self-migrating threads
  Zero-copy migration library
M++ Execution Model
[Figure: three layers. Physical network: a LAN/SAN. Daemon network: daemons connected to one another. Logical network: nodes connected by links, with threads moving among them. Statements shown: hop(1@2); create node with 3@2.]
M++ Language Overview
class Node {                     // Defining nodes (and links) as C++ classes
public:
  int var;
};

thread Thread {                  // Defining an M++ thread
private:
  main() {                       // Executing its own behavior independently
    create node<Node> with 1;    // Creating a network node, (link, and thread)
    hop( 1 );                    // Migrating itself (and forking)
    node<Node>.var = 10;         // Accessing the current node, (and link)
  }
};
M++ Compilation
Source program
  must have an mpp extension.
  may include code for two or more agents.
The mpp compiler
  first preprocesses the source code into C++ files, each corresponding to a different agent,
  passes them to the g++ compiler, and
  generates executable files, each again corresponding to a different agent.
m++@MEDUSA[16]% ls
Car.mpp profile
m++@MEDUSA[17]% m++ Car.mpp
***** compiling GSCashier
***** compiling SportsCar
***** compiling Truck
m++@MEDUSA[18]% ls
Car.mpp GSCashier* SportsCar* Truck* profile
m++@MEDUSA[19]%

Car.mpp includes three pieces of agent code.
Compile it with the m++ compiler.
Messages are printed out from m++.
Three executable files are obtained.
M++ Profile
m++@MEDUSA[21]% cat profile
port 10145
hostnum 4
mode sthread
describe daemonid
medusa 0
mnode1 1
mnode2 2
mnode3 3
describe_end
m++@MEDUSA[22]%
port: a socket port number used for thread migration
hostnum: the number of PCs constituting the system
mode: pthread or sthread (the latter is the M++ original thread library)
IP name and daemon ID: each PC’s IP name and its corresponding daemon ID (>= 0)
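To make the profile layout concrete, here is a minimal sketch of a reader for it. The grammar is inferred only from the sample profile shown on this slide, and `Profile` and `parse_profile` are hypothetical names; the real thrd daemon may accept more keywords or apply different checks. Requiring hostnum to equal the number of listed daemons is likewise an assumption.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical in-memory form of the profile file (not part of M++ itself).
struct Profile {
    int port = 0;                           // socket port used for thread migration
    int hostnum = 0;                        // number of PCs in the system
    std::string mode;                       // "pthread" or "sthread"
    std::map<std::string, int> daemon_ids;  // IP name -> daemon ID
};

// Parses whitespace-separated tokens. Returns false on an unknown keyword,
// a malformed value, a missing describe_end, or a host-count mismatch.
bool parse_profile(const std::string& text, Profile& out) {
    std::istringstream in(text);
    std::string key;
    while (in >> key) {
        if (key == "port") {
            if (!(in >> out.port)) return false;
        } else if (key == "hostnum") {
            if (!(in >> out.hostnum)) return false;
        } else if (key == "mode") {
            if (!(in >> out.mode)) return false;
        } else if (key == "describe") {
            std::string what;
            if (!(in >> what) || what != "daemonid") return false;
            bool closed = false;
            std::string name;
            while (in >> name) {
                if (name == "describe_end") { closed = true; break; }
                int id = -1;
                if (!(in >> id) || id < 0) return false;  // daemon IDs are >= 0
                out.daemon_ids[name] = id;
            }
            if (!closed) return false;
        } else {
            return false;
        }
    }
    // Assumption: hostnum must match the number of described daemons.
    return out.hostnum == static_cast<int>(out.daemon_ids.size());
}
```

Feeding the sample profile above to `parse_profile` would yield port 10145, mode sthread, and four name-to-ID pairs.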
M++ Daemon Invocation
Run “thrd” on each PC
  Open a new window for each PC and type thrd, OR
  Remotely log in to each PC and type “nohup thrd&”
Each thrd prints out “Ready for being injected.”
You are then ready to launch new M++ threads.
m++@mnode3[4]% nohup thrd& exit
[1] 6659
logout
rlogin: connection closed.
m++@mnode2[3]% nohup thrd& exit
[1] 6678
logout
rlogin: connection closed.
m++@mnode1[3]% nohup thrd& exit
[1] 8200
logout
rlogin: connection closed.
m++@MEDUSA[10]% thrd&
[1] 4954
port : 10145
Connction established to medusa with daemon 0
Connction established to mnode1 with daemon 1
Connction established to mnode2 with daemon 2
Connction established to mnode3 with daemon 3
Total daemon(s) : 4
Mode : sthread
Stack : 65536 byte
Ready for being injected.
m++@MEDUSA[11]%
M++ Thread Launching
To launch an M++ thread, type:
  inject target_IPname executable_file #threads arguments
To stop the M++ system:
  Bring one of the thrd daemons back to the foreground and type Ctrl+C
m++@MEDUSA[11]% inject medusa ./GSCashier 1
Injected 1 ./GSCashier thread(s) to medusa
m++@MEDUSA[12]% GSCashier: Waiting for cars.

m++@MEDUSA[12]% inject medusa ./Truck 1
Injected 1 ./Truck thread(s) to medusa
m++@MEDUSA[13]% Truck: I'm at the GasStation.
GSCashier: Injected fuel to Truck
Diesel: 900 Regular: 1000
GSCashier: Waiting for cars.
Truck: I'm filled with fuel.

m++@MEDUSA[13]% fg
thrd
thrd: Interrupted.
thrd: Closed all connections.
thrd: Terminated.
m++@MEDUSA[14]%
M++ Language in Details: Thread Definitions and Deployment
A thread constructor receives arguments starting from argv[4].
A thread immediately starts its own execution in main( ).

thread Child {                          // spawned by a Parent thread
public:
  Child(int argc, const char** argv) : id( atoi(argv[5]) ) {
    strncpy(msg, argv[4], 15);
    msg[15] = '\0';                     // strncpy does not always terminate
  }
  void main() { cout << msg << id << endl; }
private:
  char msg[16];
  int id;
};

thread Parent {                         // injected first
public:
  void main( );
private:
  int i;
  char idstr[10];
};

void Parent::main( ) {
  for (i = 0; i < 10; i++) {            // spawn 10 Child threads
    sprintf(idstr, "%d", i);
    create thread<Child>("I am Thread No", idstr);
  }
}
M++ Language in Details: Network Components

Daemon class
  Description: Provides the current daemon status. It is a system-predefined, read-only class.
  Usage:
    daemon.name( ) returns the IP name.
    daemon.id( ) returns the daemon ID.
    daemon.total( ) returns the number of daemons.

DaemonObj class
  Description: Is defined by a user and shared by all threads on the same daemon. Read/write.
  Usage:
    class DaemonObj { … };
    thread T { … }   // all thread Ts share daemonObj

Nodes
  Description: Include an object of a user-defined class. Each node can include an object of a different class.
  Usage:
    class Place { … };
    thread T {
      void main( ) {
        create node<Place>( ) with(nid@did);
      }
    }

Links
  Description: Are visible from their endpoint nodes. They include an object of a user-defined class. Each link can include an object of a different class.
  Usage:
    class Path { … };
    thread T {
      void main( ) {
        create link<Path>( ) with(slid) to(nid@did) with(dlid);
      }
    }
M++ Language in Details: Logical Network Construction
Node creation and destruction
  [e]create node<NodeClassName>(argsList) with (NodeId[@DaemonId]);
  [e]destroy node(NodeId[@DaemonId]);
Link creation and destruction
  [e]create link<LinkClassName>(argsList) with (SrcLinkId) to (NodeId[@DaemonId]) with (DestLinkId);
  [e]destroy link(SrcLinkId);
Only available in thread main( )
class Place { };
class Path { };

thread Net {
public:
  void main( ) {
    create node<Place>( ) with( 1@1 );
    create link<Path>( ) with(10) to(1@1) with(20);
  }
}
[Figure: daemons 0 and 1; an Init node (0) and a Place node (1) joined by a Path link with link IDs 10 and 20.]
M++ Language in Details: Network Navigation
[e]hop( [NodeId@DaemonId] );
[e]hopalong( srcLinkId );
[e]fork( [NodeId@DaemonId] );
[e]forkalong( srcLinkId );

class Place { };
class Path { };

thread Net {
public:
  void main( ) {
    create node<Place>( ) with( 1@1 );
    create link<Path>( ) with(10) to(1@1) with(20);
    hop( 1@1 );
    hopalong( 20 );
    forkalong( 10 );
    cout << "hi!" << endl;
  }
}
[Figure: daemons 0 and 1; an Init node (0) and a Place node (1) joined by a Path link with link IDs 10 and 20; "hi!" is printed twice.]
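The navigation example can be traced with a small plain C++ model of the logical network. This is only a sketch under stated assumptions: `LogicalNetwork`, `Walker`, and `replay_navigation_slide` are hypothetical names, the thread is assumed to start on node 0 of daemon 0, and forkalong is assumed to leave the original thread in place while a copy moves along the link. None of this is the M++ runtime itself.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// (NodeId, DaemonId), written NodeId@DaemonId on the slides.
using NodeAddr = std::pair<int, int>;

struct LogicalNetwork {
    // Per node: local link ID -> (node at the other end, link ID seen from there).
    std::map<NodeAddr, std::map<int, std::pair<NodeAddr, int>>> nodes;

    bool create_node(NodeAddr n) {
        return nodes.emplace(n, std::map<int, std::pair<NodeAddr, int>>{}).second;
    }

    // Registers the link at both endpoints, each under its own link ID.
    bool create_link(NodeAddr src, int srcLinkId, NodeAddr dst, int dstLinkId) {
        auto s = nodes.find(src);
        auto d = nodes.find(dst);
        if (s == nodes.end() || d == nodes.end()) return false;
        if (s->second.count(srcLinkId) || d->second.count(dstLinkId)) return false;
        s->second[srcLinkId] = {dst, dstLinkId};
        d->second[dstLinkId] = {src, srcLinkId};
        return true;
    }
};

// Stand-in for the position of one M++ thread.
struct Walker {
    NodeAddr at;

    bool hop(const LogicalNetwork& net, NodeAddr dst) {
        if (net.nodes.count(dst) == 0) return false;
        at = dst;
        return true;
    }

    bool hopalong(const LogicalNetwork& net, int linkId) {
        auto n = net.nodes.find(at);
        if (n == net.nodes.end()) return false;
        auto l = n->second.find(linkId);
        if (l == n->second.end()) return false;
        at = l->second.first;
        return true;
    }

    // Assumption: the original stays put and a copy travels along the link.
    bool forkalong(const LogicalNetwork& net, int linkId, Walker& child) const {
        child = *this;
        return child.hopalong(net, linkId);
    }
};

// Replays the statements of thread Net and reports where both threads end up.
std::vector<NodeAddr> replay_navigation_slide() {
    LogicalNetwork net;
    const NodeAddr init{0, 0};   // assumed starting node
    const NodeAddr place{1, 1};
    net.create_node(init);
    net.create_node(place);                 // create node<Place>( ) with( 1@1 );
    net.create_link(init, 10, place, 20);   // create link<Path>( ) with(10) to(1@1) with(20);
    Walker t{init};
    t.hop(net, place);                      // hop( 1@1 );
    t.hopalong(net, 20);                    // hopalong( 20 ); leads back to the init node
    Walker child{};
    t.forkalong(net, 10, child);            // forkalong( 10 );
    return {t.at, child.at};
}
```

Under these assumptions the original thread finishes on the init node and the forked copy on the Place node, which is consistent with "hi!" being printed twice.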
M++ Language in Details: Inter-Thread Communication
Indirect communication: threads can read and write the same variable on the same node.
  Thread 1:
    hop( node3@daemon1 );
    node<Place>.var = 10;
  Thread 2:
    hop( node3@daemon1 );
    myVar = node<Place>.var;
Direct communication: a thread can access another thread directly.
  Thread 1 reads Thread 2:
    myVar = Thread[2].myVar;
  Thread 2 writes Thread 1:
    Thread[1].myVar = myVar;
[Figure: on one daemon, Thread 1 writes and Thread 2 reads var on a Place node (indirect); the threads access each other's myVar (direct).]
M++ Language in Details: Thread Synchronization
Nodes are monitors
  Unless a thread executes a network construction or a migration statement, it can continuously and exclusively run on the current node.
Synchronization statements
  sleep: suspends until “wakeup threadId” or “wakeupall”.
  sleep node: suspends until “wakeup[all] node”.
  sleep daemon: suspends until “wakeup[all] daemon”.
  wakeup threadId: resumes the thread with threadId.
  wakeup node: resumes one thread sleeping on node.
  wakeup daemon: resumes one thread sleeping on daemon.
  wakeupall node: resumes all threads sleeping on node.
  wakeupall daemon: resumes all threads sleeping on daemon.
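The difference between wakeup and wakeupall can be illustrated with a small bookkeeping model of one sleep target (a node or a daemon). This is a sketch, not the M++ scheduler: `WaitQueue` and its member names are hypothetical, threads are represented only by integer IDs, and resuming sleepers in arrival order is an assumption the slides do not state.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

// Bookkeeping for the threads sleeping on one node or one daemon.
struct WaitQueue {
    std::deque<int> sleepers;  // IDs of suspended threads (FIFO order assumed)

    // "sleep node" / "sleep daemon": the calling thread suspends here.
    void sleep(int threadId) { sleepers.push_back(threadId); }

    // "wakeup node" / "wakeup daemon": resume exactly one sleeper.
    // Returns the resumed thread ID, or -1 if nobody is sleeping.
    int wakeup() {
        if (sleepers.empty()) return -1;
        int id = sleepers.front();
        sleepers.pop_front();
        return id;
    }

    // "wakeup threadId": resume one specific sleeper if it is present.
    bool wakeup_thread(int threadId) {
        auto it = std::find(sleepers.begin(), sleepers.end(), threadId);
        if (it == sleepers.end()) return false;
        sleepers.erase(it);
        return true;
    }

    // "wakeupall node" / "wakeupall daemon": resume every sleeper.
    std::vector<int> wakeupall() {
        std::vector<int> resumed(sleepers.begin(), sleepers.end());
        sleepers.clear();
        return resumed;
    }
};
```

With threads 1, 2, and 3 asleep on the same node, one `wakeup()` resumes a single thread and leaves the other two suspended, whereas `wakeupall()` empties the queue in one statement.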
Implementation: State Capturing
[Figure: a thread is transferred from the source daemon to the destination daemon. At the source, the executing thread holds thrvar (value 200) and a pointer to its virtual function table. At the destination, a copy constructor replaces the old pointer with a new pointer, and the thread resumes with thrvar still holding 200.]
Code shown at both daemons:
UsrClass thrvar;
thread Thread {
  main() {
    :
    hop();
    :
  }
};
Implementation: Thread Scheduling
[Figure: the Pthread library compared with the Sthread library across the user level and the kernel level, showing threads Main, A, B, and C in numbered steps.]
Implementation: Zero-Copy Communication
[Figure: a sender and a receiver, each with RAM divided into a user area and a kernel area, plus a network interface. Threads 1 to 3 on the sender are gathered into packed threads, moved by zero-copy transfer through pinned-down pages, scattered on the receiver, and resumed immediately. Steps are numbered (1) to (3).]
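The gather and scatter steps can be sketched with POSIX vectored I/O. This is an illustration under stated assumptions, not the M++ migration library: `ThreadState`, `gather_send`, `scatter_recv`, and `round_trip_demo` are hypothetical names, and a pipe stands in for the network. The sketch only avoids packing thread states into a staging buffer at user level; the kernel still copies the data, whereas the slide's design relies on pinned-down pages.

```cpp
#include <sys/uio.h>   // writev, readv, iovec (POSIX)
#include <unistd.h>    // pipe, close
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the migratable state of one thread.
struct ThreadState {
    int id;
    long resume_point;
    char stack[32];
};

// Builds one iovec entry per thread state, pointing at the object in place.
static std::vector<iovec> describe(const std::vector<ThreadState*>& threads) {
    std::vector<iovec> iov(threads.size());
    for (std::size_t i = 0; i < threads.size(); ++i) {
        iov[i].iov_base = threads[i];
        iov[i].iov_len = sizeof(ThreadState);
    }
    return iov;
}

// Gather: send separately stored states with a single writev call.
ssize_t gather_send(int fd, const std::vector<ThreadState*>& threads) {
    std::vector<iovec> iov = describe(threads);
    return writev(fd, iov.data(), static_cast<int>(iov.size()));
}

// Scatter: receive straight into the final destination objects.
ssize_t scatter_recv(int fd, const std::vector<ThreadState*>& threads) {
    std::vector<iovec> iov = describe(threads);
    return readv(fd, iov.data(), static_cast<int>(iov.size()));
}

// Sends two thread states through a pipe and checks that they arrive intact.
bool round_trip_demo() {
    int fds[2];
    if (pipe(fds) != 0) return false;

    ThreadState t2{}, t3{}, r2{}, r3{};
    t2.id = 2; t2.resume_point = 200;
    t3.id = 3; t3.resume_point = 300;

    const ssize_t expected = static_cast<ssize_t>(2 * sizeof(ThreadState));
    bool ok = gather_send(fds[1], {&t2, &t3}) == expected &&
              scatter_recv(fds[0], {&r2, &r3}) == expected;
    close(fds[0]);
    close(fds[1]);
    return ok && r2.id == 2 && r2.resume_point == 200 &&
           r3.id == 3 && r3.resume_point == 300;
}
```

Because the receiver reads directly into the objects a resumed thread will use, no unpacking pass is needed before resumption, which is the property the slide labels as immediate resumption.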
Exercises (No turn-in)
1. Consider the pros and cons of MPI and M++ when used to design a multi-agent application.
2. Why and when is thread migration necessary?
3. What is zero-copy communication? Why is it so effective?
Final Project
Find a parallel application and program it in both MPI and M++.
Compare their programmability and performance.
You may have a partner. One codes an MPI version, the other an M++ version.
Exclude the following applications (already programmed):
  Antfarm
  Artificial societies
  Growing neural network
  Matrix multiplication
  Mandelbrot