A Functional Language for Departmental Metacomputing
Frederic Gava & Frederic Loulergue
Universite Paris Val de Marne
Laboratory of Algorithms, Complexity and Logic

Concurrent Programming
Automatic Parallelisation
Structured Parallelism
• High Level Programming
• Deadlock free, deterministic
• Parallel Algorithms (BSP)

CMPPs
• CMPP 98: BSλ-calculus (Loulergue, Hains, Foisy)
  A confluent calculus of functional Bulk Synchronous Parallel programs
• CMPP 2000: Bulk Synchronous Parallel ML (Hains, Loulergue)
  A library for Objective Caml based on BSλ
• CMPP 2004: Departmental Metacomputing ML

The Caraml Project 2002-2004
French National Grid Program
3 Phases:
• Improvement of the safety of the existing parallel programming libraries
• Multiprogramming on one cluster: Parallel Compositions for BSML (ICCS, Euro-Par)
• Programming clusters (cluster of clusters)
http://www.caraml.org

Outline
Bulk Synchronous Parallel ML
  • The Bulk Synchronous Parallel model
  • BSML primitives
+ Minimally Synchronous Parallel ML
  • The Message Passing Machine model
  • MSPML primitives
= Departmental Metacomputing ML
  • DMML primitives
  • Semantics
  • Implementation

Bulk Synchronous Parallelism
• BSP machine: p processor-memory pairs + network + global synchronisation unit
• BSP execution: sequence of super-steps
  • Asynchronous computation phase
  • Communication phase
  • Global synchronisation
• BSP cost model (one super-step): $(\max_{0 \le i < p} w_i) + h \times g + l$
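
The cost formula above can be evaluated directly. The following is a minimal sketch, not part of BSMLlib: the function name `superstep_cost` and its labelled arguments are hypothetical, and it assumes the usual BSP reading in which w_i is the local work of processor i and h is the largest number of words any processor sends or receives during the super-step (the slide does not spell these out).

```ocaml
(* Hypothetical helper, not a BSMLlib function.
   Assumptions: w.(i) is the local computation time of processor i,
   h is the maximum number of words sent or received by one processor,
   g is the time to send one word, l is the synchronisation cost. *)
let superstep_cost ~g ~l ~h (w : float array) : float =
  (* max over 0 <= i < p of w_i; work times are assumed non-negative *)
  let max_w = Array.fold_left max 0.0 w in
  max_w +. h *. g +. l

(* Example: three processors, slowest one needs 7 time units. *)
let example_cost = superstep_cost ~g:2.0 ~l:10.0 ~h:5.0 [| 3.0; 7.0; 4.0 |]
```

With these illustrative numbers the cost is 7 + 5*2 + 10 = 27 time units.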

The BSMLlib Library
• For Objective Caml
• No SPMD
• Deadlock free, deterministic semantics
• Usual Objective Caml programs + operations on a parallel data structure
• Parallel vector of size p: α par (no nesting)
• Access to the BSP parameters: bsp_p, bsp_g, bsp_l

BSML Primitives (1)
• mkpar: (int -> α) -> α par
  mkpar f = <(f 0), …, (f (p-1))>
• apply: (α -> β) par -> α par -> β par
  apply <f_0, …, f_{p-1}> <v_0, …, v_{p-1}> = <(f_0 v_0), …, (f_{p-1} v_{p-1})>
• projection: at
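
To make the two equations above concrete, here is a sequential simulation that models a parallel vector as an ordinary array. This is only a sketch of the semantics, not how BSMLlib is implemented: the fixed machine size `p = 4` and the array representation are assumptions made for illustration.

```ocaml
(* Sequential model of BSML parallel vectors; an illustration only.
   Assumption: the machine has p = 4 processors. *)
let p = 4

type 'a par = 'a array   (* component i lives on processor i *)

(* mkpar f = <(f 0), ..., (f (p-1))> *)
let mkpar (f : int -> 'a) : 'a par = Array.init p f

(* apply <f_0,...,f_{p-1}> <v_0,...,v_{p-1}> = <(f_0 v_0),...,(f_{p-1} v_{p-1})> *)
let apply (fs : ('a -> 'b) par) (vs : 'a par) : 'b par =
  Array.init p (fun i -> fs.(i) vs.(i))

(* Each processor holds its own number, then squares it locally. *)
let pids = mkpar (fun i -> i)
let squares = apply (mkpar (fun _ x -> x * x)) pids
```

Neither operation moves data between components, which matches their role as purely local (asynchronous) computation.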

BSML Primitives (2)
• put: (int -> α option) par -> (int -> α option) par
  put <f_0, …, f_{p-1}> = <g_0, …, g_{p-1}>
  If   (f_0 2) = None     “0 will send no value to 2”
       (f_0 3) = Some v   “0 will send value v to 3”
  Then (g_2 0) = None     “2 received nothing from 0”
       (g_3 0) = Some v   “3 received value v from 0”
• BSMLlib 0.25: http://bsmllib.free.fr
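
The send/receive reading of put can be sketched as a sequential simulation over arrays. This is an illustration of the semantics under stated assumptions (a fixed `p = 4` and parallel vectors modelled as arrays), not the BSMLlib implementation, which performs real communication followed by a global synchronisation.

```ocaml
(* Sequential model of put; an illustration only.
   Assumption: p = 4 processors, parallel vectors modelled as arrays. *)
let p = 4

type 'a par = 'a array

let mkpar (f : int -> 'a) : 'a par = Array.init p f

(* put <f_0,...,f_{p-1}> = <g_0,...,g_{p-1}> with (g_dst src) = (f_src dst) *)
let put (fs : (int -> 'a option) par) : (int -> 'a option) par =
  (* table.(dst).(src) is the message from src to dst, if any *)
  let table =
    Array.init p (fun dst -> Array.init p (fun src -> fs.(src) dst)) in
  Array.init p (fun dst ->
      fun src -> if src >= 0 && src < p then table.(dst).(src) else None)

(* The example from the slide: processor 0 sends "v" to 3 and nothing to 2. *)
let fs = mkpar (fun i dst -> if i = 0 && dst = 3 then Some "v" else None)
let gs = put fs
```

Here `gs.(2) 0` is `None` and `gs.(3) 0` is `Some "v"`, mirroring the two cases listed above.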

The Message Passing Machine model
• machine: p processor-memory pairs + network
• execution: seen as a sequence of m-steps
  • Asynchronous computations
  • Asynchronous communications
• cost model:
  • p: number of processors
  • g: time required to send one word
  • L: network latency

MSPML Primitives
• Same primitives as BSML
• But for communications: get instead of put
  get: α par -> int par -> α par
  get <v_0, …, v_{p-1}> <i_0, …, i_{p-1}> = <v_{i_0}, …, v_{i_{p-1}}>
• communication environments
• http://mspml.free.fr
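
The equation for get says that processor k ends up with the value held by processor i_k. A minimal sequential sketch, assuming parallel vectors are modelled as arrays (this is not the MSPML implementation, which communicates asynchronously without a global barrier):

```ocaml
(* Sequential model of the MSPML get primitive; an illustration only. *)
type 'a par = 'a array

(* get <v_0,...,v_{p-1}> <i_0,...,i_{p-1}> = <v_{i_0},...,v_{i_{p-1}}> *)
let get (vs : 'a par) (is : int par) : 'a par =
  Array.map (fun i -> vs.(i)) is

(* Four processors: 0 asks for the value of 3, 1 and 2 ask for 0, 3 asks for 2. *)
let values = [| "a"; "b"; "c"; "d" |]
let wanted = [| 3; 0; 0; 2 |]
let received = get values wanted
```

In this model an out-of-range index raises `Invalid_argument`; how the real library treats such a request is not stated on the slide.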

Departmental Metacomputing ML
• machine: cluster of BSP clusters within the same organisation
• contains the whole BSML language
• based on departmental vectors: α dep
  • int dep: all processors of the same BSP unit contain the same value
  • int par dep: possibly different values

Access to Parameters
• Parameters of the BSP units:
  • (dm_bsp_p i): number of processors of BSP unit i
  • (dm_bsp_g i)
  • (dm_bsp_l i)
  • (dm_bsp_s i): processor speed of BSP unit i
• Parameters of the metacomputer:
  • dm_p(): number of BSP units
  • dm_g(): time required to exchange 1 word between two units
  • dm_l(): inter-unit network latency

Primitives
• mkdep: (int -> α) -> α dep
  Example with dm_p() = 3 and dm_bsp_p = [3;2;4]:
  mkdep (fun c -> mkpar (fun i -> c+2*i))
  = [ <0,2,4> , <1,3> , <2,4,6,8> ]
• apply: (α -> β) dep -> α dep -> β dep
• projection: atdep
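
The mkdep example can be reproduced with a sequential model in which both departmental and parallel vectors are arrays. This is a sketch under assumptions: in DMML, mkpar knows which BSP unit it runs on, whereas the hypothetical helper `mkpar_in` below takes the unit number explicitly, and `dm_bsp_p` is modelled as an array rather than a function.

```ocaml
(* Sequential model of mkdep; an illustration only, not the DMML library. *)
let dm_bsp_p = [| 3; 2; 4 |]             (* sizes of the BSP units, as on the slide *)
let dm_p () = Array.length dm_bsp_p      (* number of BSP units *)

type 'a par = 'a array   (* one component per processor of a unit *)
type 'a dep = 'a array   (* one component per BSP unit *)

(* Hypothetical stand-in for mkpar: the unit number c is passed explicitly. *)
let mkpar_in (c : int) (f : int -> 'a) : 'a par = Array.init dm_bsp_p.(c) f

(* mkdep f = [ (f 0), ..., (f (dm_p() - 1)) ] *)
let mkdep (f : int -> 'a) : 'a dep = Array.init (dm_p ()) f

(* The slide's example: [ <0,2,4> , <1,3> , <2,4,6,8> ] *)
let example = mkdep (fun c -> mkpar_in c (fun i -> c + 2 * i))
```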

Communications
• get: (int -> int -> int option) par dep ->   (* f *)
       (int -> α option) par dep ->            (* g *)
       (int -> int -> α option) par dep        (* h *)
• On cluster a at process i:
  • (f b j) = Some n: wants the n-th value of process j on cluster b
  • (h b j) = Some v: value sent by process j of cluster b
• On cluster b at process j:
  • (g n) = Some v
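
The relation between f, g and h can be sketched sequentially: (h b j) on a given process is obtained by looking up the requested index n in the g function of process j on cluster b. The code below is an illustration under assumptions (vectors modelled as arrays, requests to non-existent clusters or processes answered with `None`); it is not the DMML implementation, and the example data is invented.

```ocaml
(* Sequential model of the DMML get primitive; an illustration only. *)
type 'a par = 'a array
type 'a dep = 'a array

let get (f : (int -> int -> int option) par dep)
        (g : (int -> 'a option) par dep)
      : (int -> int -> 'a option) par dep =
  (* n-th value offered by process j of cluster b, if it exists *)
  let lookup b j n =
    if b < 0 || b >= Array.length g then None
    else if j < 0 || j >= Array.length g.(b) then None
    else g.(b).(j) n
  in
  Array.map
    (Array.map (fun request ->
         fun b j ->
           match request b j with
           | None -> None              (* nothing was requested from (b, j) *)
           | Some n -> lookup b j n))
    f

(* Invented example: two clusters with 2 and 1 processes. *)
let sizes = [| 2; 1 |]

(* Every process offers two values, numbered 0 and 1. *)
let g =
  Array.mapi (fun b size ->
      Array.init size (fun j ->
          fun n -> if n = 0 || n = 1 then Some (100 * b + 10 * j + n) else None))
    sizes

(* Only process 0 of cluster 0 asks for something: value 1 of process 0 on cluster 1. *)
let f =
  Array.mapi (fun a size ->
      Array.init size (fun i ->
          fun b j -> if a = 0 && i = 0 && b = 1 && j = 0 then Some 1 else None))
    sizes

let h = get f g
```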

Formal Semantics
• Operational semantics: mini-ML (purely functional) + DMML operations
• Confluence
• Formal costs: C: expression -> cost formula

Implementation & Experiments
• Library for Objective Caml:
  • MPI for intra BSP unit communications
  • TCP/IP for inter BSP unit communications
• Experiments … not yet: 2 months ago, no cluster was dedicated to research
• Future metacomputer:
  • 12-node PIII cluster + Fast Ethernet (classroom)
  • 6-node PIV cluster + Gigabit Ethernet
  • Virtual Private Network

Future Work (1)
• Algorithms & implementation in DMML
• Benchmark suite:
  • recently done for BSML
  • Ongoing work for MSPML & DMML
• Modular implementation (parametric modules)
  • BSMLlib 0.3 in September (MPI, PVM, PUB, TCP/IP)
  • Abstraction done for MSPML
  • Ongoing for DMML

Future Work (2)
• Type system:
  • to avoid nesting of par and dep
  • interaction with imperative features
• Certification: program extraction from Coq proofs
• Fault tolerance