
2010 11th IEEE/ACM International Conference on Grid Computing (GRID), Brussels, Belgium, October 25-28, 2010



The Design of a CCA Framework with Distribution, Parallelism, and Recursive Composition

Francisco Heron de Carvalho-Junior∗ and Ricardo Cordeiro Corrêa∗

∗Mestrado e Doutorado em Ciência da Computação

Universidade Federal do Ceará

Fortaleza, Brazil

Email: {heron, correa}@lia.ufc.br

Abstract—HPE is a platform of parallel components that complies with the # component model, whose components are intrinsically parallel. This paper describes the design of a new CCA framework based on HPE, aimed at reconciling distribution and parallelism of components. Besides exposing the essential differences between the two platforms, the new framework has a set of features that distinguish it from other CCA frameworks.

I. INTRODUCTION

In recent years, the high performance computing (HPC) community has observed the emergence of components as an alternative for dealing with the complexity of large scale applications in its domains of interest. In the first attempts, in the 1990's, existing component platforms, such as CCM (CORBA Component Model), were applied to HPC application needs. These initiatives inspired the researchers who worked on the proposal, development, and evolution of the CCA component model and its compliant frameworks, most of whom come from computational science fields. CCA is an acronym for Common Component Architecture, an HPC component model focusing on the componentization of legacy code and on the performance of connections among components [1], requirements of most HPC applications.

Faced with the challenges of addressing parallelism in a component model, the CCA designers decided to leave it to CCA frameworks to investigate how to reconcile distribution, parallelism, and multithreading without losing the simplicity of the original specification. CCAffeine [2], DCA [3], and XCAT [4] are frameworks that have investigated partial solutions, but DCA was the first one to focus on the integration of distribution and parallelism by emphasizing PRMI (Parallel Remote Method Invocation) techniques, which have been widely investigated by CCA researchers.

This paper presents the design of a new CCA framework based on HPE (Hash Programming Environment), a platform of parallel components based on the # component model we have designed and prototyped over the last five years, targeting cluster computing platforms1. The components of HPE, so-called #-components, are intrinsically parallel and subject to recursive composition by overlapping. The HPE prototype is publicly available at http://hash-programming-environment.googlecode.com.

1A parallel computing platform with a distributed architecture, composed of a set of homogeneous, distributed-memory processing nodes. A processing node may have many processors, or cores, sharing memory.

The proposed framework inherits all the HPE features, some of which distinguish it from other CCA frameworks. It approaches the full integration of distribution and parallelism, with fewer restrictions on connecting distributed parallel components. Indeed, the sets of processing nodes where parallel components are deployed may be equal, disjoint, or overlapping. Moreover, in a binding of ports, any subset of processes in the client and in the server parallel components may participate in a method call. All bindings are direct, without loss of expressiveness. In addition, the use of binding components to connect components that reside in disjoint sets of nodes promotes a clean interface between the client and the server, which do not need to know about the parallel nature of their counterparts, unlike PRMI. Performance gains are possible, since binding components can be tuned for specific parallel component couplings, including M×N ones.

Other distinguishing features of the framework are: support for recursive composition, to deal with the complexity of large scale component ensembles; a fully fledged type system, with support for advanced concepts from high-level programming languages; skeletal programming [5], for high-level and efficient parallel programming; and non-functional concerns, addressed by specific kinds of #-components.

The paper comprises four more sections. Section II outlines the research context on HPC component platforms, detailing HPE by means of an example. Section III describes the extensions to HPE for CCA compliance. Section IV discusses the main features of the framework whose design is proposed in this paper, focusing on their distinguishing features in relation to other CCA frameworks and to GCM (Grid Component Model) [6], another prominent component model for HPC. Section V presents our final considerations, outlining the future work that will be launched from the ideas presented herein.

II. HPC ORIENTED COMPONENT PLATFORMS

The first attempts to use components for dealing with the complexity of large scale applications in the HPC domain started with CORBA. At the end of the 1990's, a group of researchers, most of them computational scientists from the national laboratories of the USA DOE (Department of Energy), proposed the design of a new component model to fit the specific requirements of their HPC applications, mainly regarding the performance of component connections and support for commonly used types in the scientific domain [7]. Since then, many CCA related research initiatives have been conducted in several institutions, resulting in the design and implementation of CCA frameworks, as well as interoperability support tools, exploring different aspects of the use of components in the HPC domain.

978-1-4244-9349-4/10/$26.00 © 2010 IEEE

Simultaneously, computer scientists in the context of the CoreGrid European research consortium proposed Fractal [8], and later GCM (Grid Component Model) [6], with a stronger emphasis on advancing the state of the art of component platforms by introducing new high-level capabilities to HPC components: introspection, adaptation, and reconfiguration. In particular, GCM extends Fractal to deal with the deployment of components on grid computing platforms, also enriching the mechanisms for adaptation and reconfiguration. CCA and Fractal/GCM are the most prominent initiatives in the support of components in the HPC domain, validated by real applications. We assume that the readers have prior knowledge of their main features.

Over the last five years, based on our experience with Haskell#, a parallel extension to the Haskell functional language [9], we have proposed a model of parallel components named # (read hash), from which compliant component platforms, called # programming systems, can be instantiated [10]. The design of # programming systems rests on three premises:

• Parallel components consisting of a set of parts, called units, each one independently deployed in a processing node of a parallel computing platform; such components are called #-components;
• Recursive composition of #-components by overlapping;
• Support for one or more component kinds, grouping components that share common component, connection, and deployment models [11], as well as a common life-cycle.

HPE is our reference # programming system, whose free source code is hosted at http://hash-programming-environment.googlecode.com [12].

A. HPE - Hash Programming Environment

In order to be used for general purpose parallel programming targeting clusters, HPE now supports eight component kinds: platforms, representing parallel platforms, whose units represent their nodes, making it possible to access their features and current states; environments, representing parallelism enabling infrastructures, whose units allow access to the services of message passing libraries, grid middlewares, computational frameworks, and so on; qualifiers, denoting ad-hoc non-functional concerns, whose units may carry relevant information about such concerns; computations, representing parallel computations, whose units implement the role of a process with respect to the computation, i.e., what the unit computes; synchronizers, denoting patterns of synchronization and communication among processes that have their units as slices; data structures, representing data structures processed by computations and/or synchronizers in an abstract data type or object-oriented style; applications, which are computations that may be launched on a parallel computer to execute; and enumerators, which make it possible to express scalable configurations, with an arbitrary number of configuration elements (units and inner components).

FARM[input type = I:DATA, output type = O:DATA, job type = J:DATA, result type = R:DATA, scatter strategy = S:DISTRIBUTE[data source = I, data target = J], collect strategy = C:COLLECT[data source = R, data target = O], work = W:WORK[data input = J, data output = R]] (context signature)

Fig. 1. Configuration of the Farm Skeleton

The HPE architecture comprises the orchestration of three loosely coupled modules, which interact by means of Web Services: the FRONT-END, where programmers build configurations of #-components and control their life-cycle; the CORE, which manages a library of #-components distributed across a set of locations; and the BACK-END, which manages the deployment and execution of #-components in a cluster. The use of Web Services makes the HPE modules independent with respect to their location and their development platforms. The BACK-END of HPE has been implemented on top of the CLI/Mono platform, whereas the FRONT-END and the CORE have been implemented in Java.

The following example2 introduces the key concept of abstract component, representing sets of #-components that implement a given concern by assuming different contexts.

B. A Motivating Example of HPE Components

Algorithmic skeletons have been widely investigated as a technique for increasing abstraction in parallel programming [5]. In the context of parallel components, the work with HOCs (Higher Order Components) has shown interesting use cases of skeletons applied to grid computing [13]. The farm skeleton captures the most common way of exploiting parallelism in a distributed platform, having two kinds of processes: one manager, which distributes jobs among workers and combines their results, and many workers, which compute the results of lists of jobs. Well-known parallel programming techniques for grid computing, like bag-of-tasks and the MapReduce framework [14], are variants of farm.

2http://hash-programming-environment.googlecode.com/svn/trunk/Examples/CaseStudies/SimpleAdaptativeQuadrature.
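The manager/worker structure of the farm can be sketched in a few lines. The following is an illustrative Python rendering, not HPE code (HPE unit modules are C# classes); the names scatter, work, and collect are hypothetical stand-ins mirroring the FARM context parameters:

```python
# Farm skeleton sketch: a manager scatters jobs, workers compute them
# in parallel, and the manager collects and combines the results.
from concurrent.futures import ThreadPoolExecutor

def farm(input_data, scatter, work, collect, n_workers=4):
    jobs = scatter(input_data, n_workers)        # manager: partition input into jobs
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(work, jobs))     # workers: process jobs in parallel
    return collect(results)                      # manager: combine partial results

# Example: sum of squares of 0..99, farmed out in round-robin chunks.
scatter = lambda xs, n: [xs[i::n] for i in range(n)]
work = lambda chunk: sum(x * x for x in chunk)
collect = sum

total = farm(list(range(100)), scatter, work, collect)  # 328350
```

Bag-of-tasks and MapReduce variants differ mainly in the scatter and collect strategies, which is exactly what the FARM context parameters make pluggable.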

Figure 1 depicts the configuration of the abstract component FARM, as presented in the visual configuration environment of the FRONT-END of HPE. Below the figure, the context signature of FARM lists its formal context parameters, each one represented by id = X : C, where X is a context variable, C is its context bound, and id is a name, which can be used for referencing it in instantiations. An instantiation is defined as the application of an abstract component to a set of abstract components, representing the actual context parameters that supply the formal context parameters by obeying the subtype restrictions imposed by their context bounds.
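Context parameters of the form id = X : C behave much like bounded type parameters in mainstream languages. A hedged analogy, using Python's typing module; DATA and Interval here are invented stand-ins for a context bound and an actual parameter:

```python
# Bounded-generics analogy for context signatures: each context variable
# is bounded by an abstract component, and actual parameters must be
# subtypes of that bound.
from typing import Generic, TypeVar

class DATA: ...                      # stand-in for the DATA context bound
class Interval(DATA): ...            # actual parameter: a subtype of DATA

I = TypeVar("I", bound=DATA)         # "input type = I:DATA"
J = TypeVar("J", bound=DATA)         # "job type = J:DATA"

class Farm(Generic[I, J]):
    """Instantiating Farm[Interval, Interval] supplies the formal
    parameters with actuals that obey the DATA bound."""

farm = Farm[Interval, Interval]()    # legal: Interval subtypes DATA
```

The analogy is partial: HPE's context bounds range over abstract components, not classes, but the subtype restriction works the same way.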

The inner components of FARM, represented by ellipses, are annotated with their types. Except for mpi, simply typed by the parameterless instantiation MPI, the other inner components are typed by context variables. For instance, the inner component input, representing the input data structure to be partitioned into jobs, is typed by I, the context variable of input type, whose context bound is DATA. Analogously, the inner components job[I], where I is the enumerator of workers, representing the jobs, are typed by J, defined by the context parameter job type, also bounded by DATA. The inner component scatter, responsible for partitioning input into jobs and sending them to the workers, is typed by S, defined by the context parameter scatter strategy, bounded by DISTRIBUTE with its context parameters data source and data target supplied with the context variables I and J, respectively.

In an instantiation of FARM, by supplying S with an abstract component that specializes DISTRIBUTE, one may build a #-component implementing FARM specifically designed for a context defined by some specific strategy of partitioning and distributing jobs to the workers. Analogously, the FARM instantiation may be tuned to work with a specific strategy for collecting results (inner component gather, typed by G) and for processing the jobs (inner components work, typed by W). Finally, it is also possible to specialize FARM for the data structure of the data to be partitioned by the manager (context variable I), the data structure representing a job (J), the data structure representing the result of processing a job (R), and the data structure representing the final result (O).

The units of FARM, represented by rectangles, are manager and worker[I], for each element of the enumerator I. The configuration elements IManager.cs and IWorker.cs represent the unit modules attached to their respective units. In HPE, the unit modules of an abstract component are packages, independently deployed in assemblies, containing C# interfaces, whereas the unit modules of #-components contain C# classes. The classes in the unit modules of a #-component must implement the interfaces in the unit modules of the abstract component it instantiates. The slices of units, which will become properties of unit module interfaces, are annotated with 〈slice name〉 : 〈inner name〉.〈unit name〉, where 〈slice name〉 denotes their identifiers and 〈inner name〉.〈unit name〉 denotes the unit from which they originate.

Fig. 2. ROMBERGINTEGRATOR and INTEGRALCASE

Figure 3 presents the configuration of a #-component implementing FARM, called FarmImpl. By inspecting its actual context parameters, one may notice that it may work with any implementation of message passing library, kind of data structure, gather/scatter strategy, and work. The Back-End of HPE is responsible for applying a resolution procedure to find the best #-component that implements a given instantiation [15]. For that, it generalizes the instantiation by traversing the supertypes of the actual context parameters.
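The core idea of the resolution procedure can be sketched as a walk up the supertype chain. The registry and type hierarchy below are invented for illustration; the actual algorithm is described in [15]:

```python
# Resolution sketch: generalize an instantiation by climbing the
# supertype chain of an actual context parameter until a deployed
# #-component matching the generalized instantiation is found.
def resolve(actual, registry, supertype):
    """Return the first registered implementation found while
    generalizing `actual` through its supertypes, or None."""
    t = actual
    while t is not None:
        if t in registry:
            return registry[t]
        t = supertype.get(t)       # climb one level of the hierarchy
    return None

# Hypothetical hierarchy: DISTRIBUTEINTERVAL specializes DISTRIBUTE,
# and only a generic DISTRIBUTE implementation is deployed.
supertype = {"DISTRIBUTEINTERVAL": "DISTRIBUTE", "DISTRIBUTE": None}
registry = {"DISTRIBUTE": "GenericScatterImpl"}

impl = resolve("DISTRIBUTEINTERVAL", registry, supertype)
```

The effect is that a request for a specialized instantiation falls back to the most specific implementation actually deployed.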

To test the farm, we developed an application that computes the integral of a function over a specified interval in parallel using Romberg's method. The configuration of the abstract component ROMBERGINTEGRATOR is depicted in Figure 2. The root process partitions the interval into n subintervals, representing jobs. Then, a list of jobs is sent to each peer process, which applies Romberg's integration procedure over its subintervals, sending the results back to the root, where they are added. The configuration abstracts from the type of the integrated function by defining a context parameter function, bounded by FUNCTION, which must be supplied by its #-components, such as RombergIntegratorImpl in Figure 3.
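The numerical procedure each worker applies to its subintervals is textbook Romberg extrapolation. A sequential Python sketch of it (not HPE code), for reference:

```python
# Romberg's method: build a triangular table where column 0 holds
# composite trapezoid estimates with 2**k panels, and each further
# column applies Richardson extrapolation to the previous one.
def romberg(f, a, b, levels=10):
    R = [[0.0] * levels for _ in range(levels)]
    h = b - a
    R[0][0] = h * (f(a) + f(b)) / 2.0
    for k in range(1, levels):
        h /= 2.0
        # refine the trapezoid estimate: reuse the old sum, add midpoints
        R[k][0] = R[k - 1][0] / 2.0 + h * sum(
            f(a + (2 * i - 1) * h) for i in range(1, 2 ** (k - 1) + 1))
        for j in range(1, k + 1):
            R[k][j] = R[k][j - 1] + (R[k][j - 1] - R[k - 1][j - 1]) / (4 ** j - 1)
    return R[levels - 1][levels - 1]

approx = romberg(lambda x: x * x, 0.0, 1.0)   # integral of x^2 on [0,1]
```

In the farm, each worker would run this procedure on its own subintervals, and the SUMINTEGRALS strategy would add the partial results at the root.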

The configuration of ROMBERGINTEGRATOR involves the following specialized abstract components, with their respective #-components: INTEGRALCASE, for supplying the context parameters input type and job type, representing the type of jobs, carrying an interval and an integrating function; APPROXIMATEINTEGRAL, for supplying W, representing the approximation of the integral of a function over a given interval using Romberg's method; DISTRIBUTEINTERVAL, for supplying S, representing the strategy for partitioning the integration interval and distributing the partitions to the workers; and SUMINTEGRALS, for supplying G, representing the strategy for collecting and summing the partial integrals calculated by each worker.

Fig. 3. Configurations of FarmImpl and RombergIntegratorImpl

Fig. 4. Interpretation of CCA Ports from HPE

III. CCA COMPLIANCE FOR HPE

This section presents how to make the HPE Back-End a CCA compliant framework, distinguished from other existing CCA frameworks for the following reasons:

1) It is less restricted in its integrated support for distributed and parallel components;
2) It approaches recursive composition;
3) It provides potential support for non-functional concerns, like those supported by GCM;
4) It supports skeletal programming with components.

A. Uses and Provides Ports

The support for the Uses-Provides design pattern is a key concept behind CCA components.

In Section II-B, it was shown that a #-component is associated with an instantiation, defined as an abstract component applied to a context. The units of such an abstract component are attached to C# interfaces, which must be implemented by the C# classes attached to the units of the #-component. Such an interface includes members required by the kind of the abstract component, but the user may also include new members. The instantiation implemented by a #-component defines a provides port. Analogously, the uses ports of #-components correspond to their inner components, which are also typed by instantiations. Thus, instantiations are port types.

Figure 4 depicts the uses and provides ports of a #-component. impl, inner_1, …, inner_k are instantiations. impl is implemented by the #-component C, whereas inner_1, …, inner_k type their inner components. The arrows indicate that such inner components are required by the configuration of the abstract component of impl.

CCA components may declare many provides ports. However, the #-components of HPE have a single provides port, since #-components are fine-grained, designed for a single, well defined concern, specified by the instantiation they implement. In order to support many provides ports, #-components will be allowed to implement many instantiations, as depicted in Figure 5. Consequently, a uses port will have an implicit association with one or more provides ports.

Fig. 5. Generalizing #-Components with Multiple Provides Ports

The name of a uses port is the usual name of an inner component in a configuration. However, it is still necessary to name provides ports, which is unnecessary in usual #-components of HPE since they have a single provides port.
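The port interpretation above can be summarized in a small data model. This is an illustrative Python sketch, not the HPE or CCA API; all names are invented:

```python
# Port interpretation: the instantiation a #-component implements is its
# provides port; its inner components, each typed by an instantiation,
# are its uses ports. Instantiations are therefore the port types.
class Instantiation:
    """A port type: an abstract component applied to a context."""
    def __init__(self, name):
        self.name = name

class HashComponent:
    def __init__(self, provides, uses):
        self.provides = provides     # the instantiation it implements
        self.uses = dict(uses)       # uses ports: inner name -> port type

farm = HashComponent(
    provides=Instantiation("FARM"),
    uses={"mpi": Instantiation("MPI"),
          "scatter": Instantiation("DISTRIBUTE")})
```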

Fig. 6. Direct Binding

B. Bindings: Connections Between Uses and Provides Ports

Figure 6 depicts a binding between two CCA #-components. In (a), the #-component configuration is depicted, where u_1, u_2, …, u_k is a subset of the units of the client #-component C, and p_1, p_2, …, p_k is the set of units of the server #-component S. In (b), C and S are depicted using our notation for CCA components. The distributed nature of CCA #-components is emphasized, since each pair of units 〈u_i, p_i〉, i = 1…k, resides in a distinct processing node. In (c), the binding is implemented by direct bindings between pairs of units in the same address space.

The interfaces of the direct bindings may be distinct, since they connect distinct units. In fact, it is a concern of the implementation of the #-components to define which units will participate in a method call, a feature inherited from HPE.


Fig. 7. Rendezvous (Synchronizer) Indirect Binding

In existing CCA frameworks, indirect bindings link uses and provides ports of CCA components residing in distinct address spaces. They are implemented by distributed CCA frameworks, such as XCAT [4] and DCA [3].

The bindings between CCA #-components are always direct, since the slices of a unit are always in the same address space as the unit. If two CCA #-components are deployed in disjoint subsets of nodes of a distributed platform, or in different clusters, an intermediate CCA #-component, representing an indirect binding, is necessary. The proposed framework supports two kinds of such bindings, illustrated in Figures 7 and 9, with different usage modes and semantics:

• Rendezvous bindings, of kind Synchronizer, originally proposed to be supported by HPE on clusters. As depicted in Figure 7, a rendezvous binding must be an inner component of both the client and the server components. Thus, in CCA terms, it has two provides ports, one for the client component and another for the server component. Moreover, the binding has two public inner components, not shown in Figure 7, representing the data structure to be mapped from the client to the server through the binding. Communication occurs when both the client and the server execute the method synchronize of the binding interface (the rendezvous), required by units of Synchronizer kind, causing the data structure instance to be copied from the client side to the server side. For that, a specialized procedure, tuned according to the type and distribution of the data structure at the client and server sides, may be applied.

• RPC bindings: in order to support the semantics of remote procedure calls, we propose introducing into HPE a kind of #-components called Service. As depicted in Figure 9(a), the client #-component C has an inner component whose type (abstract component SAbs) is implemented by the binding #-component B, of Service kind. The binding also has an inner component of the same type (SAbs), which will be instantiated to the server component (S), which also implements SAbs. Figure 9(b) depicts the resulting configuration at run-time, whose CCA interpretation is depicted in Figure 9(c). The reader may notice that the remote nature of the call may be transparent to the client #-component C and the server #-component S. Therefore, if S resided in the same nodes as C, C could be connected directly to S. The binding is responsible for performing all input and output argument data transfers between the client and the server units, since it knows the data distribution at both the client and server ends. The framework may implement the remote procedure call (RPC) (or remote method invocation) semantics from its knowledge of the kind of B (Service).

Besides allowing tuned implementations of bindings for specific applications, one may develop generic bindings, as well as organize libraries of binding components for common uses, using the support for skeletal programming and the polymorphism of HTS (the Hash Type System). Rendezvous and RPC bindings may implement general M×N couplings [16], with the advantage that they encapsulate all the data exchanges between the client and the server units. A more detailed discussion of M×N couplings between CCA #-components is presented in Section IV-A.
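The rendezvous semantics can be illustrated with two threads meeting at a barrier. This is only a single-address-space sketch of the idea; in the framework, a tuned, distribution-aware copy procedure replaces the shared buffer, and all names here are hypothetical:

```python
# Rendezvous binding sketch: the transfer happens only when BOTH the
# client-side and the server-side units have called synchronize().
import threading

class RendezvousBinding:
    def __init__(self):
        self._barrier = threading.Barrier(2)   # client + server
        self._buffer = None

    def synchronize(self, role, data=None):
        if role == "client":
            self._buffer = data        # publish data before the rendezvous
        self._barrier.wait()           # blocks until both sides arrive
        if role == "server":
            return self._buffer        # safe: client wrote before the barrier

binding = RendezvousBinding()
received = []

client = threading.Thread(
    target=lambda: binding.synchronize("client", data=[1, 2, 3]))
server = threading.Thread(
    target=lambda: received.append(binding.synchronize("server")))
client.start(); server.start()
client.join(); server.join()
```

An RPC binding differs in that the client blocks for a result, and the binding itself performs the argument and result transfers on the client's behalf.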

Fig. 8. Instantiation of Components in HPE. (a) The class of each unit s_i overrides createSlices, invoking DGAC.createSlice for the units p_i and q_i of the inner components S1 and S2. (b) For CCA compliance, setServices declares provides ports via addProvidesPort and registers the uses ports p_i and q_i via registerUsesPort.

Fig. 9. Service Indirect Binding

C. Instantiation of Ports

A #-component of HPE is implemented as a set of objects, representing its units, each one residing in a distinct processing node. Figure 8 outlines the instantiation procedure of inner components in HPE, and how such a procedure can be made CCA compliant. A #-component C has two inner components, named S1 and S2. The objects that represent the units s_1, s_2, …, s_k of C must implement the method createSlices, in which the method DGAC.createSlice is invoked for each unit, belonging to some inner component, that is one of their slices, as depicted in Figure 8(a). The code of the method createSlices is automatically generated from the HCL code of the #-component. However, the programmer may instantiate an inner component programmatically, taking care to perform a correct parallel method invocation of DGAC.createSlice from all the units where there is a slice of the inner component being instantiated.

In a CCA #-component, the method setServices of the CCA specification replaces the method createSlices for declaring and binding provides ports and uses ports, the latter corresponding to the instantiation of inner components. Figure 8(b) illustrates that the declaration of provides ports is unnecessary in HPE, but relevant for CCA #-components. TypeMap properties will be used to specify the actual contexts of the instantiations that type uses and provides ports.
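The shape of such a setServices method can be sketched as follows. The Services class below is a minimal stand-in for the framework's Services interface (the real CCA calls also take a port type string and a TypeMap); FarmUnit and its port names are invented for the example:

```python
# setServices sketch: a CCA #-component unit declares its provides port
# (the instantiation it implements) and registers one uses port per
# inner component, mirroring Figure 8(b).
class Services:
    def __init__(self):
        self.provides = {}
        self.uses = {}

    def addProvidesPort(self, port, name, port_type):
        self.provides[name] = (port, port_type)

    def registerUsesPort(self, name, port_type):
        self.uses[name] = port_type

class FarmUnit:
    def setServices(self, s):
        s.addProvidesPort(self, "farm", "FARM")      # implemented instantiation
        s.registerUsesPort("scatter", "DISTRIBUTE")  # inner component
        s.registerUsesPort("collect", "COLLECT")     # inner component

svc = Services()
FarmUnit().setServices(svc)
```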

D. The GoPortThe ports of GoPort type, defined by the CCA specification,

are provided by any CCA component whose execution is

initiated by the framework, by invoking the parameterless

method go. Such components are the entry points of CCA

applications. They correspond in HPE to #-components of

Application kind, whose objects that represent their units must

implement the go method.

For CCA compliance, the units of an application abstract

component now define the interface GoPort of the CCA

specification. Thus, a CCA #-component need only provide

a port whose type is an abstract component of Application

kind in order to be initiated by the framework.
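The correspondence can be sketched as follows. GoPort and its parameterless method go come from the CCA specification; all other names and the framework stub are illustrative assumptions.

```python
# Sketch: a unit of an Application-kind #-component exposing the CCA GoPort.

class GoPort:
    """CCA-specified port type for application entry points."""
    def go(self):
        raise NotImplementedError

class UnitOfApplication(GoPort):
    """A unit of a #-component of Application kind: by implementing go(),
    it can be started by a CCA framework as an application entry point."""
    def go(self):
        # the parallel computation of this unit would start here
        return 0  # return code observed by the framework

def framework_start(component):
    # The framework initiates execution by invoking the parameterless go().
    return component.go()

assert framework_start(UnitOfApplication()) == 0
```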

E. The New Back-End Architecture

Figure 10 depicts the components of the new CCA compliant HPE Back-End. The components DGAC and DDAO implement the current services provided by HPE, for compatibility. DGAC is responsible for processing the requests that come from the HPE Back-End, including the deployment of components, specified in HCL configurations, and the execution of applications, by loading a #-component that implements an instantiation of Application kind given by the user.

In the execution of applications, DGAC is responsible for applying the resolution algorithm recursively to load the actual #-component for each instantiation, representing the application itself and all inner components in the hierarchy of components. For that, DGAC uses the BuilderService port of the framework (CCA PARALLEL FRAMEWORK) to instantiate #-components and connect their ports.

DDAO provides an access layer for the database where abstract components and CCA #-components are deployed. It provides a port to DGAC, using the same interface used by DGAC to access the database in HPE, and another port to the framework, using the ComponentRepository interface of the CCA specification. The latter port gives the framework a CCA view of the components deployed through DGAC.

The framework is a #-component itself, of a new Framework kind, providing the BuilderService and Services ports. To these ends, the framework #-component implements the environment abstract components BUILDERSERVICE and SERVICES, and has application #-components as inner components, which are loaded dynamically with a reference to the SERVICES inner component. Using this approach, a regular application #-component of HPE could act as a driver, accessing the BUILDERSERVICE port of a framework, as an environment inner component, to instantiate and connect applications.

The framework contains a unit Manager, residing in the administration node of the cluster, and a set of Worker units, residing in the processing nodes. By viewing units as CCA components, each worker may be viewed as a sequential CCA framework which loads units and directly connects their ports in the memory space of the computing node, through the


control of the manager, which provides a parallel component view of the units that belong to the same CCA compliant #-component. Moreover, it is the responsibility of the manager to set up an MPI communicator to enable communication among the units of each CCA #-component, as in the DCA framework. Notice that one could access the BuilderService and ComponentRepository ports to instantiate and connect components directly, bypassing HPE.
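A driver bypassing HPE might use the BuilderService directly, roughly as sketched below. The method names createInstance and connect follow the CCA BuilderService interface described in the text; the mock behavior, component class names, and port names are hypothetical.

```python
# Minimal mock of the CCA BuilderService illustrating direct instantiation
# and connection of components, bypassing HPE's resolution algorithm.

class BuilderService:
    def __init__(self):
        self.instances = {}
        self.connections = []

    def createInstance(self, instance_name, class_name, properties=None):
        # A real framework would load class_name from the component
        # repository; here we only record the request and return an id.
        self.instances[instance_name] = (class_name, properties or {})
        return instance_name

    def connect(self, user_id, uses_port, provider_id, provides_port):
        # Binds a uses port of one component to a provides port of another.
        conn = (user_id, uses_port, provider_id, provides_port)
        self.connections.append(conn)
        return conn

bs = BuilderService()
solver = bs.createInstance("solver", "examples.LSSolver")   # hypothetical names
mesh = bs.createInstance("mesh", "examples.MeshProvider")
bs.connect(solver, "mesh_port", mesh, "mesh_service")
print(len(bs.connections))  # 1
```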

IV. RELATED WORKS AND COMPARISON

CCAffeine [2], XCAT [4], and DCA [3] are examples of CCA frameworks that have tried to support distributed and parallel components, each focusing on different aspects.

XCAT makes it possible to deploy sequential components within a distributed architecture. It is a purely distributed framework. Thus, parallel interactions must be implemented through port bindings, which is inappropriate for parallel computing, since port bindings support only client-server relations between the parallel processes. In fact, most parallel algorithms are specified using message passing, the natural mechanism to synchronize peer processes.

CCAffeine targets parallel platforms. A parallel component is a cohort of components of the same type, each one executing in a processing node. This style of parallel component is called SCMD (Single Component Multiple Data), capturing the well-known SPMD style. Bindings between uses and provides ports are always direct, as in our framework, since the cohorts reside on all the processing nodes. The members of a cohort may communicate through message passing, e.g. MPI, outside the framework's control. Such an approach facilitates the componentization of legacy MPI programs, but opens a "backdoor" for clandestine communication between components at run-time. From the software architect's point of view, such a level of coupling between components is not desirable, reducing reuse possibilities and becoming a source of potential safety and security problems.

In the framework we propose, communication is encapsulated inside a parallel component using a local MPI communicator, as in DCA. A concern whose implementation is scattered over a set of processing nodes is addressed by a single #-component. In addition, #-components may reside in any subset of nodes of a parallel computing platform, and they may be connected by direct bindings or by binding components through overlapping composition.

Fig. 10. Back-End Architecture Overview

DCA was developed to investigate the combination of parallelism and distribution in the same framework, on top of MPI. Parallel components are defined as sets of processes, which resemble the units of the framework we propose. The processes may be placed on distinct processing nodes. Parallel components residing in disjoint sets of nodes may be connected through indirect bindings, using PRMI. There is a restriction that imposes the participation of all server processes in a remote method call. Such a restriction is caused by PRMI, which does not make it possible to share a communicator between the client and the server processes. In the framework we propose, an RPC binding component carries a local communicator, which can be used to implement synchronization among the client and server processes. The programmer decides which processes at the client and server ends will participate in a remote method call through the binding component.
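To make the contrast concrete, the following sketch (pure Python, with no real MPI; the communicator and process selection are simulated, and all names are illustrative) shows an RPC binding component that owns a local communicator and lets the programmer choose which client and server processes take part in a call:

```python
# Sketch of an RPC binding component carrying a local "communicator" that
# spans only the selected client and server ranks. A real HPE binding would
# build an MPI communicator over exactly these processes.

class RPCBinding:
    def __init__(self, client_ranks, server_ranks):
        # Only the processes listed here participate in remote calls.
        self.group = set(client_ranks) | set(server_ranks)
        self.client_ranks = set(client_ranks)

    def call(self, rank, method, *args):
        if rank not in self.group:
            raise RuntimeError(f"rank {rank} is outside the binding's communicator")
        if rank in self.client_ranks:
            # Client side: forward the invocation through the binding.
            return method(*args)
        # Server ranks service calls rather than initiating them.
        return None

# Client processes 0 and 2 may call a server method; ranks 1 and 3 stay out.
binding = RPCBinding(client_ranks=[0, 2], server_ranks=[4, 5])
result = binding.call(0, lambda x: x * 2, 21)
print(result)  # 42
```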

A. M×N Couplings

The support for generic M×N couplings is a significant criterion in evaluating the efficacy of general purpose distributed parallel component platforms [16], [17], [18]. This section compares the approaches adopted by DCA and GCM with the approach of the framework we propose.

DCA adopts a form of PRMI where input and output arguments may be regular or distributed. In the former case, an input (output) argument is broadcast from process 0 at the client (server) side to the processes at the server (client) side. In the latter case, input (output) arguments are handled like buffers in MPI collective calls. The data is assumed to be in a buffer (array of bytes), and the client (server) must supply the total number of items, the number of items to send to each server (client) process, and the displacements of the buffer elements. Thus, in DCA, the client must know about the data partition and distribution at the server end, and vice-versa.
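As a worked example of the bookkeeping this requires (analogous to the count/displacement arrays of MPI_Scatterv-style collective calls; the 10-element buffer and 3 server processes are arbitrary illustrative numbers):

```python
# A client holding a 10-element buffer must tell DCA how many items go to
# each of 3 server processes and at which displacement each block starts;
# that is, it must know the server-side data distribution.

def block_partition(total_items, num_servers):
    """Counts and displacements for a near-even block distribution."""
    base, rem = divmod(total_items, num_servers)
    counts = [base + (1 if i < rem else 0) for i in range(num_servers)]
    displs = [sum(counts[:i]) for i in range(num_servers)]
    return counts, displs

counts, displs = block_partition(10, 3)
print(counts)  # [4, 3, 3]
print(displs)  # [0, 4, 7]
```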

In our opinion, the main drawback of PRMI is the lack of transparency in the implementation of a component with respect to the parallel nature of the component bound to it. In fact, we argue that it is not possible to implement, both transparently and efficiently, a generic indirect binding between the server and the client processes, because the partition and distribution of data amongst them are application dependent. For this reason, PRMI either forces the binding concern to be scattered across the implementations of the server and client components, or requires them to supply the framework with information about the partition and distribution of the input and output arguments, using pre-defined formats.

In the framework we propose, the binding concern is isolated and encapsulated in the binding component. This makes it possible to overcome the limitations of DCA regarding the participation of client and server processes in a parallel call, since communicators need to be handled only within the binding, not at the client and server ends. Consequently, any subset of client processes may synchronize with any subset of server processes.


The designers of GCM argue that the collective nature of M×N connections must be attached to the definition of the components, not to the bindings. They argue against the idea of using an intermediate component to manage connections, the opposite of the approach taken in the framework we propose.

In order to support 1×N (broadcast) and M×1 (gathercast) bindings, standard GCM relies on the use of lists in the signatures of methods to promote automatic distribution of input and output arguments. Moreover, GCM supports introspection methods that make it possible to discover dynamically the number of client (M) and server (N) processes in a collective binding. This approach prevents programmers from working directly with advanced distributed data structures in the arguments of a parallel call, such as multidimensional arrays, trees, and graphs. In such cases, a client is forced to marshal non-linear data structures into lists of their partitions before distributing them among the server processes. However, this is only possible if the client knows the distribution of the data at the server end. Otherwise, only trivial default distribution policies could be supported. The use of binding interfaces to inform the binding about the distribution of data is not general, and complicates the solution. Some GCM implementations give better support for programming custom marshalling/unmarshalling procedures for complex data structures [19].

The support for M×N connections in GCM relies on the use of M multicast interfaces plus N gathercast interfaces to make direct connections possible. This solution seems very complex compared to the binding components we propose, since the programmer is still responsible for configuring all direct data transfers, but without the support for generic bindings and libraries of binding components provided by HPE. In [6], the efficacy of the GCM approach is demonstrated by means of an example that the authors consider representative. The use of coupling controllers, which configure the binding in a simpler way but do not free programmers from total control of direct transfers, is also discussed.

For the reasons mentioned above, we claim that binding components, using rendezvous or RPC semantics, have advantages over the PRMI and collective bindings of DCA and GCM, respectively. Indeed, a binding may be designed and tuned for a specific coupling, on top of the most appropriate message transport protocol, optimizing the data transfers necessary for the distribution and gathering of input and output arguments of any kind of data structure. Potentially, the performance level of raw communication could be achieved. One may argue that such an approach delegates much more work to the programmers than the PRMI-based and collective interfaces of DCA and GCM. However, it is still possible to implement generic bindings, or even libraries of reusable generic bindings for many purposes, using the advanced typing capabilities supported by HPE, as demonstrated in [20].

We think that, when programming for HPC, programmers must be able to work at their desired level of abstraction. This is one of the reasons why MPI became so popular despite the low-level nature of its message-passing interface. Thus, we consider it realistic to suppose that a user will be interested in implementing a binding for a specific coupling to boost the performance of a critical application, probably using generic bindings for prototyping and testing. If only generic bindings were available, this would not be possible, since programmers would need configuration capabilities to tune the generic binding for their purposes, and there is no way to ensure the expressiveness of such a configuration interface for all general requirements.

Fig. 11. Hierarchical Composition of CCA #-Components

B. Recursive (Hierarchical) Composition

Recursive composition is a feature of component platforms, such as Fractal/GCM and HPE, that makes it possible to view a component ensemble as a component itself, subject to composition with other components. It improves the ability to deal with the complexity of large scale applications. Recursive composition is not supported by CCA frameworks [21]. Without it, CCA frameworks are less suitable for working with fine-grained component ensembles than Fractal/GCM and HPE. Definitely, this is not a design flaw of CCA, but a consequence of a stronger emphasis on supporting the componentization of existing complete programs, without affecting the ability to develop new components from scratch. We argue that our framework brings a notion of hierarchical composition to a CCA framework, improving its ability to work with fine-grained component ensembles.

HPE relies on so-called overlapping composition, a form of hierarchical composition suitable for the parallel components of # programming systems (#-components). From the perspective of CCA, the relation of a #-component to its inner components is of the same nature as the relation of a CCA component to its uses ports, which will be connected to CCA components that provide the required services through their provides ports. Therefore, as depicted in Figure 11, a hierarchy of CCA #-components viewed from the HPE perspective (a) becomes a tree of CCA components connected by their bindings (b).

Using the DGAC interface (Figure 10), all the inner components in the hierarchy of components of a CCA #-component of Application kind may be instantiated automatically, freeing programmers from the painful task of finding the best tuned components that provide a compatible port to be connected to each uses port that may appear in complex hierarchies of CCA #-components. From this perspective, one may assume that application #-components have the granularity of CCA components in existing CCA frameworks, and the other


ones are automatically bound to the application by recursive composition, using the HPE resolution approach.

Fig. 12. Controller CCA #-Components

C. Non-Functional Concerns

The support for non-functional concerns is another important feature that distinguishes Fractal/GCM from CCA [21]. The GCM designers argue that "the absence of standard interfaces dedicated to non-functional concerns makes it hard to achieve certain features, for instance, dynamic reconfiguration based on observed performance or failures" [6].

Fractal/GCM equips components with a non-functional (NF) control part, called the membrane, constituting a set of controllers, each one addressing a non-functional aspect. To improve adaptive capabilities, GCM has adopted a component-oriented membrane, where controllers are components that can be reconfigured. However, the membrane components cannot be regular GCM components, since they have particular features, including a different life-cycle.

In HPE, the supported non-functional concerns are encapsulated in components of the following kinds: Qualifier, Architecture, and Environment. Component kinds make it possible for components with different characteristics to coexist, including different life-cycles, deployment models, supported connectors, recursive composition restrictions, and so on.

Qualifier components exist to qualify the #-components of a given abstract component according to the way they are implemented, affecting their non-functional properties, such as performance and security, and, less often, their functional properties. To this end, they are used as bounds of context parameters of abstract components of other kinds. For example, one may define a context parameter matrix type, bounded by the qualifier MATRIXTYPE, to specify an abstract component LSSOLVER whose #-components implement different solution strategies for sparse linear systems according to the pattern of non-zeroes of the input matrix. Then, each #-component that implements LSSOLVER must supply matrix type with an appropriate qualifier that is a subtype of MATRIXTYPE, such as ILU (resulting from an incomplete LU factorization), BLOCKTRIDIAGONAL, PENTADIAGONAL, etc.
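Qualifier-driven resolution can be sketched as follows. The names LSSOLVER, MATRIXTYPE, ILU, and PENTADIAGONAL come from the paper's example; the registry-based resolution and qualifiers-as-classes encoding are an illustrative simplification of HPE's actual type system and resolution algorithm.

```python
# Illustrative sketch: qualifiers modeled as Python classes, with the
# subtype bound checked via subclassing. A registry maps (abstract
# component, qualifier) to the #-component chosen by resolution.

class MATRIXTYPE: pass                    # qualifier bound of `matrix type`
class ILU(MATRIXTYPE): pass               # from an incomplete LU factorization
class BLOCKTRIDIAGONAL(MATRIXTYPE): pass
class PENTADIAGONAL(MATRIXTYPE): pass

# Each #-component implementing LSSOLVER supplies a qualifier subtype.
registry = {
    ("LSSOLVER", ILU): "IluSolver",
    ("LSSOLVER", PENTADIAGONAL): "PentadiagonalSolver",
}

def resolve(abstract_component, qualifier):
    # The supplied qualifier must be a subtype of the declared bound.
    assert issubclass(qualifier, MATRIXTYPE), "qualifier out of bound"
    return registry[(abstract_component, qualifier)]

print(resolve("LSSOLVER", PENTADIAGONAL))  # PentadiagonalSolver
```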

Inner #-components of Architecture kind allow #-components to collect dynamic information about the parallel computing platform in order to affect their execution, whereas inner #-components of Environment kind give #-components access to the enabling software of the parallel computing platform. For example, the framework and the MPI library are encapsulated in environment #-components.

HPE does not support a concept similar to GCM controllers, which could affect the performance of a #-component externally, by reconfiguring it guided by run-time information. Since HPE configurations are static, such a feature was not considered relevant in its initial design.

Component controllers could be supported as regular CCA #-components, by introducing a new kind of #-component called Controller. Since each kind may assume a distinct life-cycle and any other characteristics relevant to the definition of the components belonging to it, such an approach meets the GCM designers' understanding that controllers must be treated as special components.

An abstract component of Controller kind could include #-components intended to address some level of adaptation and reconfiguration control over other #-components. For example, controller abstract components named BINDINGCONTROLLER, ATTRIBUTECONTROLLER, CONTENTCONTROLLER, and LIFECYCLECONTROLLER could exist to implement the many levels of control supported by Fractal/GCM [8]. However, HPE is not yet prepared to implement the kind of control capabilities supported by Fractal/GCM. Further work will investigate which kinds of adaptation and reconfiguration facilities could be introduced for CCA #-components, but it is already possible to exercise how #-components could be connected to controller #-components. As depicted in Figure 12, a CCA #-component S can implement one or more controllers, which corresponds to declaring provides ports that can be connected to the controller #-components (C1 and C2). The interface attached to the units of a controller abstract component requires the following method:

using hpe.basic;

interface IController {
    IUnit unit { get; }
}

This property returns the unit object; its implementation can be automatically generated for any unit of a #-component that has a controller slice.

Using the unit object, the controller can access all the relevant information about the unit. At present, unit objects do not provide an interface for adaptation and reconfiguration capabilities. However, they can be extended for such a purpose, giving #-components of Controller kind full programmability for advanced reconfiguration and adaptation.
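A hypothetical sketch of this connection follows. Only IController and the idea of a unit-exposing slice come from the text (the C# property becomes a Python property here); the controlled unit, the monitoring controller, and all other names are illustrative assumptions.

```python
# Sketch: a controller #-component accessing the unit object of a
# controlled #-component through an IController-like provides port.

class Unit:
    """Stand-in for an HPE unit object (IUnit)."""
    def __init__(self, name, node):
        self.name, self.node = name, node

class ControllerSlice:
    """Auto-generated slice implementing IController for a unit that
    declares a controller provides port: it simply exposes the unit."""
    def __init__(self, unit):
        self._unit = unit

    @property
    def unit(self):
        return self._unit

class MonitoringController:
    """A #-component of the proposed Controller kind (e.g., C1 in Fig. 12)."""
    def inspect(self, controller_port):
        u = controller_port.unit
        return f"unit {u.name} on node {u.node}"

port = ControllerSlice(Unit("s1", node=3))
print(MonitoringController().inspect(port))  # unit s1 on node 3
```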

V. CONCLUSIONS AND FURTHER WORKS

This paper presented our efforts to bring the prominent features of the HPE platform to CCA frameworks, motivated by our understanding that this could be performed in a straightforward way by applying the following extensions and mappings of concepts to HPE: a #-component is interpreted as a CCA component; inner components are interpreted as uses ports; the instantiation (i.e., an abstract component with context parameters supplied) implemented by a #-component is interpreted as a provides port; to support many provides ports, a #-component can now implement many instantiations; the framework is parallel, implemented as a #-component, with a unit in each processing node of the parallel computing platform; and a CCA #-component has an inner component (uses port) typed as SERVICES, to be connected to the framework by the framework itself.

The framework we propose inherits characteristics of HPE which distinguish it from other CCA frameworks.

First of all, it supports the full integration of distribution and parallelism. Components that reside in the same processing nodes are directly connected, whereas components that reside in disjoint sets of processing nodes may communicate through explicit binding components, reused from libraries or developed specifically for the application context. Rendezvous and RPC semantics are supported. The paper has argued for the advantages of such an approach in relation to the PRMI of DCA and the collective interfaces of GCM, especially for solutions to M×N couplings.

Another important feature is the encapsulation and isolation of parallel interactions within the scope of components, avoiding the "backdoors" for clandestine communication among components that we find in some approaches to supporting parallelism on component platforms.

Also, it supports a fully fledged type system for components, supporting advanced concepts found in high level programming languages, such as universal and existential polymorphism, as well as recursive types.

Recursive composition is another important feature inherited from HPE, a way to deal with the complexity of large scale component ensembles that is not supported by CCA frameworks.

Finally, the support for non-functional concerns must be mentioned, addressed by #-components of specific kinds, bringing in and evolving ideas from the GCM designers.

The implementation depends on simple refactorings of the HPE design and implementation. Then, we need to validate the framework with real world applications. The new framework will permit our research group to investigate the design of CCA frameworks, mainly regarding the integration of distribution and parallelism, as well as support for non-functional concerns. In particular, we plan to investigate how to extend the framework to grid platforms, using some original ideas of GCM, mainly regarding adaptability and reconfiguration capabilities (e.g., controllers), but proposing simpler strategies for component deployment, relying on component kinds.

VI. ACKNOWLEDGEMENTS

This work is sponsored by CNPq3, grant 480307/2009-1.

REFERENCES

[1] R. Armstrong, G. Kumfert, L. C. McInnes, S. Parker, B. Allan, M. Sottile, T. Epperly, and T. Dahlgren, "The CCA Component Model for High-Performance Scientific Computing," Concurrency and Computation: Practice and Experience, vol. 18, no. 2, pp. 215–229, 2006.

3 National Council for Scientific and Technological Development, Brazil (http://www.cnpq.br).

[2] B. A. Allan, R. C. Armstrong, A. P. Wolfe, J. Ray, D. E. Bernholdt, and J. A. Kohl, "The CCA Core Specification in a Distributed Memory SPMD Framework," Concurrency and Computation: Practice and Experience, vol. 14, no. 5, pp. 323–345, 2002.

[3] F. Bertrand and R. Bramley, "DCA: A Distributed CCA Framework Based on MPI," in Proceedings of HIPS 2004 – 9th International Workshop on High-Level Parallel Programming Models and Supportive Environments, 2004.

[4] S. Krishnan and D. Gannon, "XCAT3: A Framework for CCA Components as OGSA Services," in Proceedings of HIPS 2004 – 9th International Workshop on High-Level Parallel Programming Models and Supportive Environments, 2004.

[5] M. Cole, "Bringing Skeletons out of the Closet: A Pragmatic Manifesto for Skeletal Parallel Programming," Parallel Computing, vol. 30, no. 3, pp. 389–406, 2004.

[6] F. Baude, D. Caromel, C. Dalmasso, M. Danelutto, V. Getov, L. Henrio, and C. Pérez, "GCM: A Grid Extension to Fractal for Autonomous Distributed Components," Annals of Telecommunications, vol. 64, no. 1, pp. 5–24, 2009.

[7] R. Armstrong, D. Gannon, A. Geist, K. Keahey, S. Kohn, L. McInnes, S. Parker, and B. Smolinski, "Towards a Common Component Architecture for High-Performance Scientific Computing," in The 8th IEEE International Symposium on High Performance Distributed Computing. IEEE, 1999.

[8] E. Bruneton, T. Coupaye, M. Leclercq, V. Quéma, and J.-B. Stefani, "The Fractal Component Model and Its Support in Java," Software: Practice and Experience, vol. 36, pp. 1257–1284, 2006.

[9] F. H. Carvalho Junior and R. D. Lins, "Haskell#: Parallel Programming Made Simple and Efficient," Journal of Universal Computer Science, vol. 9, no. 8, pp. 776–794, Aug. 2003.

[10] F. H. Carvalho Junior and R. D. Lins, "Separation of Concerns for Improving Practice of Parallel Programming," INFORMATION, An International Journal, vol. 8, no. 5, Sep. 2005.

[11] A. J. A. Wang and K. Qian, Component-Oriented Programming. Wiley-Interscience, 2005.

[12] F. H. Carvalho Junior, R. D. Lins, R. C. Corrêa, and G. A. Araujo, "Towards an Architecture for Component-Oriented Parallel Programming," Concurrency and Computation: Practice and Experience, vol. 19, no. 5, pp. 697–719, 2007.

[13] J. Dünnweber and S. Gorlatch, Higher-Order Components for Grid Programming. Springer-Verlag, 2009.

[14] J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.

[15] F. H. Carvalho Junior and R. D. Lins, "A Type System for Parallel Components," 2009.

[16] F. Bertrand, R. Bramley, A. Sussman, D. E. Bernholdt, J. A. Kohl, J. W. Larson, and K. B. Damevski, "Data Redistribution and Remote Method Invocation in Parallel Component Architectures," in 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, Apr. 2005.

[17] B. Jacob, J. Larson, and E. Ong, "M×N Communication and Parallel Interpolation in Community Climate System Model Version 3 Using the Model Coupling Toolkit," The International Journal of High Performance Computing Applications, vol. 19, no. 3, pp. 293–307, 2005.

[18] "MxN Research @ Indiana University," http://www.cs.indiana.edu/ febertra/mxn/, 2009.

[19] E. Mathias, F. Baude, and V. Cavé, "A GCM-Based Runtime Support for Parallel Grid Applications," in CBHPC '08: Proceedings of the 2008 compFrame/HPC-GECO Workshop on Component Based High Performance Computing. New York, NY, USA: ACM, 2008, pp. 1–10.

[20] F. H. Carvalho Junior, R. C. Corrêa, R. Lins, J. C. Silva, and G. A. Araujo, "On the Design of Abstract Binding Connectors for High Performance Computing Component Models," in Joint Conference on HPC Grid Programming Environments and Components (HPC-GECO) and on Components and Frameworks for High Performance Computing (4th CompFrame), 2007.

[21] M. Malawski, M. Bubak, F. Baude, D. Caromel, L. Henrio, and M. Morel, "Interoperability of Grid Component Models: GCM and CCA Case Study," in CoreGRID, T. Priol and M. Vanneschi, Eds. Springer, 2007, pp. 95–105.
