Advanced Charm++ Tutorial

Embed Size (px)


Advanced Charm++ Tutorial. Charm Workshop Tutorial Gengbin Zheng 10/19/2005. Building Charm++ Advanced messaging Interface file (.ci) Groups Delegation Array multicast Advanced load balancing Threads SDAG. Charm++ on Parallel Machines. Runs on: - PowerPoint PPT Presentation

Text of Advanced Charm++ Tutorial

  • Advanced Charm++ TutorialCharm Workshop Tutorial

    Gengbin Zhengcharm.cs.uiuc.edu10/19/2005

  • Building Charm++Advanced messagingInterface file (.ci)GroupsDelegationArray multicastAdvanced load balancingThreadsSDAG

  • Charm++ on Parallel MachinesRuns on:Any machine with MPI, includingIBM SP, Blue Gene/LCray XT3Origin2000PSCs Lemieux (Quadrics Elan)Clusters with Ethernet (Udp/Tcp)Clusters with Myrinet (GM)Clusters with Amasso cardsApple clustersEven Windows!SMP-Aware (pthreads)

  • Communication ArchitectureConverseCommunication APINetMPIElanUDP(machine-eth.c)TCP(machine-tcp.c)Myrinet(machine-gm.c)Ammasso(machine-ammasso.c)BG/L

  • Compiling Charm++./buildUsage: build [charmc-options ...]

    : converse charm++ LIBS AMPI FEM bluegene pose jade msa: doc ps-doc pdf-doc html-doc:bluegenel mpi-sol net-sol-amd64elan-axp mpi-sp net-sol-x86elan-linux-ia64 ncube2 net-sunexemplar net-axp net-win32mpi-axp net-cygwin origin2000mpi-bluegenel net-hp origin-pthreadsmpi-crayx1 net-hp-ia64 paragon-redmpi-crayxt3 net-irix shmem-axpmpi-exemplar net-linux sim-linuxmpi-hp-ia64 net-linux-amd64 sp3mpi-linux net-linux-axp t3empi-linux-amd64 net-linux-ia64 uth-linuxmpi-linux-axp net-ppc-darwin uth-win32mpi-linux-ia64 net-rs6k vmi-linuxmpi-origin net-sol vmi-linux-ia64mpi-ppc-darwin

    : compiler and platform specific optionscc cc64 cxx kcc pgcc acc icc ecc gcc3 mpcc pathscalehelp smp gm tcp vmi scyld clustermatic bluegene ooc syncft papi--incdir --libdir --basedir --no-build-shared -j

    : normal compiler options e.g. -g -O -save -verbose

    To get more detailed help, run ./build --help

  • Build Options: compiler and platform specific optionscc cc64 cxx kcc pgcc acc icc ecc gcc3 mpcc pathscalehelp smp gm tcp vmi scyld clustermatic bluegene ooc syncft papi--incdir --libdir --basedir --no-build-shared -j

    For platform specific options, use help option: help platform specific help, e.g. ./build charm++ net-linux help

    Choose a compiler (only one option is allowed from this section): cc, cc64 For Sun WorkShop C++ 32/64 bit compilers cxx DIGITAL C++ compiler (DEC Alpha) kcc KAI C++ compiler pgcc Portland Group's C++ compiler acc HP aCC compiler icc Intel C/C++ compiler for Linux IA32 ecc Intel C/C++ compiler for Linux IA64 gcc3 use gcc3 - GNU GCC/G++ version 3 mpcc SUN Solaris C++ compiler for MPI pathscale use pathscale compiler suite

    Platform specific options (choose multiple if apply): smp support for SMP, multithreaded charm on each node mpt use SGI Message Passing Toolkit (only for mpi version) gm use Myrinet for communication tcp use TCP sockets for communication (ony for net version) vmi use NCSA's VMI for communication (only for mpi version) scyld compile for Scyld Beowulf cluster based on bproc clustermatic compile for Clustermatic (support version 3 and 4)

    Advanced options: bluegene compile for BigSim (Blue Gene) simulator ooc compile with out of core support syncft compile with Charm++ fault tolerance support papi compile with PAPI performance counter support (if any)

    Charm++ dynamic libraries: --build-shared build Charm++ dynamic libraries (.so) (default) --no-build-shared don't build Charm++'s shared libraries

    Miscellaneous options: --incdir=DIR specify additional include path for compiler --libdir=DIR specify additional lib path for compiler --basedir=DIR shortcut for the above two - DIR/include and DIR/lib -j[N] parallel make, N is the number of paralle make jobs --with-romio build AMPI with ROMIO library

  • Build ScriptBuild script does:./build [charmc-options ...] Creates directories and /tmp Copies src/scripts/Makefile into /tmp Does a "make OPTS=" in /tmp.That's all build does. The rest is handled by the Makefile.

  • How build worksbuild charm++ net-linux gm smp kcc bluegeneSort gm, smp and bluegeneMkdir net-linux-bluegene-gm-smp-kccCat conv-mach-[kcc|bluegene|gm|smp].h to conv-mach-opt.hCat conv-mach-[kcc|bluegene|gm|smp].sh to conv-mach-opt.shGather files from net, etc (Makefile)Make charm++ under net-linux-bluegene-gm-smp-kcc/tmp

  • How Charmrun Works?Charmruncharmrun +p4 ./pgm

  • Charmrun (batch mode)Charmruncharmrun +p4 ++batch 8

  • Debugging Charm++ ApplicationsPrintfGdbSequentially (standalone mode)gdb ./pgm +vp16Attach gdb manuallyRun debugger in xtermcharmrun +p4 pgm ++debugcharmrun +p4 pgm ++debug-no-pauseMemory paranoid-memoryParallel debugger

  • How to Become a Charm++ HackerAdvanced Charm++Advanced MessagingInterface files (ci)Writing system librariesGroupsDelegationArray multicastThreadsSDAG

  • Advanced Messaging

  • Prioritized ExecutionCharm++ schedulerDefault - FIFO (oldest message)Prioritized executionIf several messages available, Charm will process the messages in the order of their prioritiesVery useful for speculative work, ordering timestamps, etc...

  • Priority ClassesCharm++ scheduler has three queues: high, default, and lowAs signed integer priorities:-MAXINT Highest priority -- -10 Default priority1 -- +MAXINT Lowest priorityAs unsigned bitvector priorities:0x0000 Highest priority -- 0x7FFF0x8000 Default priority0x8001 -- 0xFFFF Lowest priority

  • Prioritized MessagesNumber of priority bits passed during message allocation FooMsg * msg = new (size, nbits) FooMsg; Priorities stored at the end of messages

    Signed integer priorities:*CkPriorityPtr(msg)=-1;CkSetQueueing(m, CK_QUEUEING_IFIFO);Unsigned bitvector prioritiesCkPriorityPtr(msg)[0]=0x7fffffff;CkSetQueueing(m, CK_QUEUEING_BFIFO);

  • Prioritized Marshalled MessagesPass CkEntryOptions as last parameterFor signed integer priorities:CkEntryOptions opts;opts.setPriority(-1);,y,opts);

    For bitvector priorities:CkEntryOptions opts;unsigned int prio[2]={0x7FFFFFFF,0xFFFFFFFF};opts.setPriority(64,prio);,y,opts);

  • Advanced Message FeaturesRead-only messagesEntry method agrees not to modify or delete the messageAvoids message copy for broadcasts, saving timeInline messagesDirect invocation if on local processorExpedited messagesMessage do not go through the charm++ scheduler (faster)Immediate messagesEntries are executed in an interrupt or the communication threadVery fast, but tough to get rightImmediate messages only currently work for NodeGroups and Group (non-smp)

  • Read-Only, Expedited, ImmediateAll declared in the .ci file{ entry [nokeep] void foo_readonly(Msg *); entry [inline] void foo_inl(Msg *); entry [expedited] void foo_exp(Msg *); entry [immediate] void foo_imm(Msg *); .. .. .. };

  • Interface File (ci)

  • Interface File Example mainmodule hello { include myType.h initnode void myNodeInit(); initproc void myInit();

    mainchare mymain { entry mymain(CkArgMsg *m); };

    array[1D] foo { entry foo(int problemNo); entry void bar1(int x); entry void bar2(myType x); }; };

  • Include and InitcallIncludeInclude an external header filesInitcallUser plugging code to be invoked in Charm++s startup phaseInitnodeCalled once on every nodeInitprocCalled once on every processorInitnode calls are called before Initproc calls

  • Entry AttributesThreadedFunction is invoked in a CthThreadSyncBlocking methods, can return values as a messageCaller must be a threadExclusiveFor Node GroupDo not execute while other exclusive entry methods of its node group are executing in the same node NotraceInvisible to trace projectionsentry [notrace] void recvMsg(multicastGrpMsg *m);

  • Groups/Node Groups

  • Groups and Node GroupsGroups - a collection of objects (chares)Also called branch office chares (BOC)Exactly one representative on each processorIdeally suited for system librariesSimilar to arrays:Broadcasts, reductions, indexingBut not completely like arrays:Non-migratable; one per processorNode GroupsOne per nodeSMP

  • file group mygroup { entry mygroup(); //Constructor entry void foo(foomsg *); //Entry method }; nodegroup mynodegroup { entry mynodegroup(); //Constructor entry void foo(foomsg *); //Entry method };C++ fileclass mygroup : public Group {mygroup() {} void foo(foomsg *m) { CkPrintf(Do Nothing);}};class mynodegroup : public NodeGroup {mynodegroup() {} void foo(foomsg *m) { CkPrintf(Do Nothing);}};

  • Creating and Calling GroupsCreationp = CProxy_mygroup::ckNew();Remote; //broadcastp[1].foo(msg); //, npes, pes); // list sendDirect local access mygroup *g=p.ckLocalBranch();g->foo(.); //local invocationDanger: if you migrate, the group stays behind!

  • Delegation

  • DelegationCustomized implementation of messagingEnables Charm++ proxy messages to be forwarded to a delegation manager groupDelegation managertrap calls to proxy sends and apply optimizationsDelegation manager must inherit from CkDelegateMgr classUser program must to callproxy.ckDelegate(mgrID);

  • Delegation filegroup MyDelegateMgr {entry MyDelegateMgr(); //Co