PerLa: a Language and Middleware Architecture for Data ...perlawsn.sourceforge.net/files/documentation/TSE-2010-11-0328.R1_Schreiber.pdffor Data Management and Integration in Pervasive

2

PerLa: a Language and Middleware Architecturefor Data Management and Integration in

Pervasive Information SystemsFabio A. Schreiber, Life Senior Member, IEEE, Romolo Camplani, Marco Fortunato, Marco Marelli and

Guido Rota

Abstract—A declarative SQL-like language and a middleware infrastructure are presented for collecting data from different nodes of apervasive system. Data management is performed by hiding the complexity due to the large underlying heterogeneity of devices, whichcan span from passive RFID(s) to ad-hoc sensor boards to portable computers. An important feature of the presented middleware is tomake the integration of new device types in the system easy, through the use of device self-description. Two case studies are describedfor PerLa usage, and a survey is made for comparing our approach with other projects in the area.

Index Terms—Declarative language, Device heterogeneity, Functionality Proxy, Middleware infrastructure, Pervasive System, SQL,Wireless Sensor Networks.

✦

1 INTRODUCTIONIn the last years, the number of real applications re-quiring the cooperation among several heterogeneousdevices has been rapidly increasing. Components be-longing to very different technologies and characterisedby different complexity levels, ranging from simpleRFID tags to on-board PCs, are often employed togetherwithin the same scenario. The application goals can bevarious, but they can usually be reduced to the gatheringof raw data from the environment, the processing ofcollected data, and the control of some environmentalparameters.

Home automation and monitoring of natural catas-trophic phenomena, like earthquakes and rockfalls, aretwo completely different examples of applications inter-acting with many heterogeneous devices. Although thegoals of these applications and the types of involvedsensors and actuators are completely different, the issuesto be managed are exactly the same. Firstly, data have tobe sampled from several heterogeneous nodes belongingto different technologies and having different samplingsemantics; then, gathered data have to be integratedand processed in order to extract information relativeto the monitored environment; finally, the obtained in-formation can be used to control a set of actuators, ableto modify some environmental conditions or to reportalarm messages.

As an example, consider one of the case studies ofthe ARTDECO project [1], related to the monitoringof the wine production process from the vineyard tothe table. Many hardware technologies are employed

• Dipartimento di Elettronica e Informazione, Politecnico di Milano, ViaPonzio 34/5, 20133 Milano Italia.

in order to gather all the required data: ad-hoc boardshosting temperature, humidity and pressure sensors areplaced in the vineyard; RFID tags and GPS devicesprovide tracking support during the transport phase;Personal Digital Assistants (PDAs) allow human opera-tors working on the vineyard to interact with the system.Some actuators, able to modify the temperature andthe humidity levels in the yard, are also provided inorder to allow environmental control when monitoredparameters are out of their optimal ranges.

The implementation of a system able to achieve theabove presented application goals usually implies themanagement of many ad hoc heterogeneous embeddeddevices in order to sample all the involved environ-mental variables. Each of these devices can belong todifferent technologies and can be characterised by dif-ferent features, both in terms of computational power,power supply requirements, and exposed ApplicationProgram Interfaces (APIs). Moreover, in some applica-tions presenting unusual requirements, the design ofspecific hardware platforms becomes a necessary task,increasing in this way the overall heterogeneity level. Asa consequence, the development of an ad hoc softwareinfrastructure for each device and for each applicationto be deployed is the easiest and the most commonapproach. However, it has the main disadvantage ofrequiring the recoding of large pieces of software when-ever the application requirements slightly change, ora new device has to be integrated within the existingsystem.

Splitting the whole problem in two different activitiesis certainly a better approach: first, a data gatheringphase handles sampling issues; then, a data processingtask extracts information from sampled data. Many re-search efforts are nowadays related to the second phase

3

and many of the academic projects currently underdevelopment are mostly focused on the informationextraction issues rather than on the data collection ones.More specifically, as we discuss in Section 2, there area number of projects aiming at developing softwarecomponents, offering autonomic services able to self-organise and self-adapt in order to provide adaptiveand situated communication services. These works tryto identify a fundamental and uniform abstraction, ableto characterise autonomic components at different gran-ularity levels, in which self awareness, semantic self-organisation, self-similarity and autonomic componentaware principles can converge. These projects oftenassume the existence of an underlying infrastructure,which can collect data from physical devices and sensors.Sometimes, the design and the implementation of thislow level infrastructure are not considered at all sincethey are out of project scopes. In fact, the use of devicesimulators is a valid approach to test high level softwarecomponents, postponing the issues related to data gath-ering.

However, the existence of an infrastructure allowinga systematic data collection process is fundamental bothto implement working prototypes of autonomic compo-nents (and other high level abstractions) and for engi-neering the development of pervasive applications. Themain goal of this paper is to introduce the requirementsand the issues related to the design and the implemen-tation of a solid substratum, able to provide samplingand first data processing functionalities, abstracting fromdevices’ specificities. We will show how this goal canbe achieved, both through the definition of a querylanguage and the implementation of a middleware in-frastructure.

Many different approaches have been designed tointeract with the physical devices composing a pervasivesystem. The most common interfaces are procedurallanguages, virtual machines, message oriented middle-wares, etc. In this paper a database-like abstraction ofthe pervasive system is proposed, since the similaritiesamong the ”world of data” and the ”world of sen-sors” suggest that it could be a good approach. In the”world of data”, the abstraction concept has been widelyused [2]. The main issue when managing information isthat the same data can be stored on different storageplatforms. Databases solve this issue by providing ahigh level homogeneous view of data, regardless oftheir physical representation details. In the ”world ofsensors”, a similar problem exists since many differenttechnologies and communication protocols have to bemanaged in order to retrieve the needed information.Whereas each sensor node can be thought as a sourceof raw data, the goal of a pervasive system middlewareis similar to the goal of a Database Management System(DBMS): it should provide a high level homogeneousview of the heterogeneous devices, masking how dataare generated. Moreover, DBMS research identified astandard query interface (the SQL language) to manipu-

late data and retrieve final results. A similar approachhas been used in Wireless Sensor Networks (WSNs); infact, some declarative languages have been defined tocontrol the behaviour of a sensor network. A goal of thePerLa project is to extend this approach in order to sup-port a whole pervasive system. However, the databaseabstraction in the ”world of sensors” should handlesome additional issues that do not exist in traditionaldatabases. The dynamics of the nodes must certainly beconsidered since a device can frequently join or leavethe pervasive system, for example when the status ofthe battery or of the wireless connection changes. Espe-cially for monitoring purposes, the use of a declarativelanguage is the approach that better allows to abstractthe pervasive system, hiding the technical details relatedto device access. The advantage of defining a SQL-likelanguage is the reduction of the effort needed by a useralready experienced with standard SQL to learn it.

Context-Awareness is another important feature ofPervasive Systems [3]. Most of the variables defining acontext can be given values gathered from environmen-tal sensors [4]; therefore, even if in its present devel-opment status PerLa does not envisage explicit contextmanagement functions, we are actively working towardsextending it in order to manage dynamic contexts as well(see Section 6).

The rest of this paper is organised as follows. InSection 2 the most relevant projects, dealing with thedifferent aspects of pervasive and ubiquitous systems,are introduced and compared with the goals and func-tionalities of PerLa.

Section 3 presents the description of the PerLa systemtogether with the main reasons for the introduction ofthe new architecture composed of the PerLa Language [5]and the PerLa Middleware: the first one is an SQL-likelanguage, that allows one to define how data have to begathered from the pervasive system and how collecteddata have to be processed; the second is the systemdevelopment and language run-time support.

Section 4 describes two case studies in which PerLahas been employed: a wine production process monitor-ing system and a rockfall monitoring system.

Section 5 presents an evaluation of the system, byanalyzing the middleware performances and by high-lighting the development speed improvement achievedthrough the use of PerLa.

Finally in Section 6 some concluding remarks andfuture work are presented.

2 STATE OF THE ART AND RELATED WORKSThe term “Pervasive computing” refers to a wide set ofapplications, and it encompasses many research areas,from networking and distributed computing to softwareengineering, knowledge management and databases.Many are the existing projects, both in the academic andthe industrial world, that could be considered part ofthe pervasive computing research or strictly related to

4

it. A wide and complete survey on all these works iscertainly interesting, but it is out of the scope of thispaper. A selection of the most representative projects is,instead, useful to outline the most considered researchaspects and to find out the less studied ones. Thus,the introduction of a non trivial definition of pervasivesystem allows us to operate this selection, excluding allthe projects not directly related to the research field.

The idea of ubiquitous computing, that is the keyconcept pervasive systems are based on, was firstlyintroduced by Mark Weiser in [6]. He thought of ubiqui-tous computing as a human-machine interaction model,characterised by the replacement of existing desktopPCs with a widespread diffusion of hardware and soft-ware components into everyday life objects. In this way,technology becomes almost hidden to the final user and itplays a background role. In fact, it is distributed ona widespread network of heterogeneous devices, calledpervasive system.

Afterward, Kindberg and Fox [7] pointed out two prin-ciples a system should respect in order to be consideredpervasive, accordingly to the definition given by MarkWeiser. These two characteristics are physical integrationand spontaneous interoperability. The first one refers tothe ability of hardware and software components tointegrate and hide themselves into everyday life objects,interacting with the environment in a transparent way;the second refers to the devices’ abilities to start com-municating whenever required by the context, withoutexplicit developer scheduling.

Existing projects related to pervasive systems seldomcover both the aspects defined by Kindberg and Fox:they mainly focus on a single feature. The works thattry to reach a high level of integration typically dealwith low level hardware and software, and they arerarely interested in obtaining spontaneous interoperabil-ity features. Vice versa, projects that concentrate theirefforts on spontaneous interoperability usually considerthe integration issues only at a high abstraction level.They are not interested in the details allowing deviceintegration, and they require the devices to expose highlevel interfaces in order to communicate with the system(e.g., XML over HTTP protocols).

A complete pervasive application is composed ofmany architectural layers requiring different abilities,from the low level hardware programming to the de-sign of high level abstractions. Physical integration isachieved by operating on low level layers, while spon-taneous interoperability is more related to the highestlevels. This is one of the reasons that make it difficult andunusual to achieve both goals within the same researchproject.

Several papers, available in the technical literature,provide different surveys that aim at classifying exist-ing projects related to pervasive systems; each of themproposes one or more dimensions a comparison can bebased on. As an example, Messer et al. [8] cross thefunctionality layers (e.g., sensing, context management,

execution platforms, user interaction, etc.) with the per-vasive properties (e.g., device mobility, user mobility,software mobility, power management, adaptation, etc.).Salem and Nader’s analysis [9] focuses instead on theprogramming model adopted to write the code deployedon the devices. Information about the compliance levelwith the expected pervasive system features is alsoprovided for each considered project.

The analysis of the literature highlighted two criteriato cluster the projects related to pervasive systems: thefirst one is based on the covered architectural layers,while the second one is based on the middleware goals.Addressing the second criterion, there are monitoringprojects that provide some interfaces to perform ex-plicit monitoring and user-oriented projects that hidethe technology and adapt their behaviour with respectto the context. Basically, low level projects are usuallymonitoring-oriented, while high level projects are usu-ally user-oriented.

Even if we do not want to deepen the topic here, afinal remark is in order to relate Pervasive Systems ar-chitectures to other classes of systems which are alreadyestablished or in course of development. An obviousthought goes to Distributed Databases [10], where, sincelast century’s seventies, a lot of work has been madeon data allocation and query optimisation. Much of thiswork is keeping on in the domain of Cloud Computing[11]. The same can be said about the convergence claimfor Ubiquitous computing technologies and the Grid [12];while the Grid has been traditionally mainly considereda tool for High Performance Computing (HPC), quoting[13], ”... it is now becoming a more generic platform forsharing any kind of networked resource and a GlobalGrid Forum Research Group Ubicomp-RG has beenestablished to explore the synergies between the twotypes of systems”. Moreover, the advent of wireless grids[14] makes the integration in a pervasive system of veryheterogeneous devices, such as sensor endowed cellularphones, concretely feasible.

In the following, a list of representative projects relatedto pervasive systems is considered, detailing the coveredarchitectural layers and highlighting the reasons moti-vating the development of PerLa, the system presentedin this paper. The first class of projects we cite is com-posed of all the monitoring projects ad hoc developed forspecific applications. They are usually based on ad hocdesigned boards, running low and high level softwarecomponents that are implemented from scratch andcompletely fitted on the given application. The routinesfor managing sensors, transmitting sampled data, andelaborating gathered data according to the applicationrequirements, are designed and implemented for theadopted board. Usually, these kinds of projects do notprovide high-level devices abstractions, but rather theyembed the system behaviour customising the code ofeach involved architectural layer. Some examples ofthese kinds of work can be found in [15] [16] [17] [18].There are many real applications that require the moni-

5

toring of some physical sensors in order to collect envi-ronmental data: developing an ad hoc system for eachof these applications requires low level programmingskills and makes very difficult both reusing the code andproviding support for the addition and the integrationof new devices. PerLa also aims at supporting this kindof monitoring application and to speed up the designand the development phases by providing a generalmiddleware that allows one to easily integrate differentkinds of devices and to reduce the amount of the neededlow level code.

The above presented projects hard-code the behaviourof each device, but they do not usually provide either auser interface or a network abstraction, that allow oneto modify the behaviour of the whole system. TinyDB[19] [20] is the first project that introduced the idea ofabstracting a Wireless Sensor Network as a database.In this way, the final user can define the desired datagathering by writing an SQL-like query. At the higherlevel, an interface is provided to inject queries in thesystem. Each query is sent from the user workstation tothe WSN base station; then, the query is distributed to allthe nodes, and data are collected from the sensors in theenvironment, filtered, possibly aggregated and routedout to the base station. TinyDB is defined over TinyOS[21], and it exploits a power-efficient in-network process-ing algorithm. The main constraint of TinyDB is thatonly homogeneous WSNs composed of TinyOS baseddevices can be managed. The idea behind PerLa is toextend the database-like abstraction from a single WSNto a whole pervasive system and to provide full supportfor heterogeneity, both at run-time and deployment time.

The DSN [22] [23] project is similar to TinyDB: themain difference between them is the interface the finaluser is provided with. In fact the DSN user can controlthe system through a declarative language, called Snlog,that is a dialect of Datalog. Although this is an interest-ing approach, in PerLa we decided to adopt an SQL-likelanguage since we think it could be easily learned by afinal user already experienced with databases.

Both in the academic and in the industrial world someprojects have been developed trying to overcome theheterogeneity issues in order to allow the managementof a high number of heterogeneous devices belonging todifferent technologies and exhibiting different features.These projects do not often provide an infrastructureto perform high level processing of information, butonly to collect gathered data in a centralised database.The definition of sampling details and of device accessinterfaces are clearly the goals of these works.

SWORD [24] is an example of industrial project devel-oped by Siemens; it provides support for remote controlin machine-to-machine scopes, and it allows a centralmanagement of a distributed infrastructure. SWORDis able to detect and signal alarms coming from thenodes deployed in the monitored area. All the detectedsignals are then collected in a centralised database thatis the data source for the final control and monitoring

application. The interaction between SWORD and thenodes is performed through a communication protocolbased on the interchange of XML messages over anHTTP channel: this means that each device must supportthis protocol in order to be integrated with SWORD.

The GSN [25] [26] middleware is completely devel-oped in Java, and it is executed on the computerscomposing the backbone of the acquisition network. Theinterface between the middleware and the devices isdefined by wrappers: they are software components ableto transform raw data coming from a physical source, toa GSN suitable format. GSN middleware is based on theconcept of Virtual Sensor, that is a software componentable to filter and process data coming from the wrappersor from other virtual sensors. The behaviour of thesecomponents is defined through an XML configurationfile, while the required data processing is set usingan SQL-like query. Data coming from wrappers areprocessed through a chain of virtual sensors in orderto produce output information. Unfortunately, VirtualSensors have to be developed specifically for every newkind of node.

A remarkable feature of both [24] and [25] is thesupport for heterogeneous devices, which is achievedby posing strong constraints on the supported nodes. Infact, support for Java programming at the device level isassumed, or, at least, the availability of a TCP/IP stack isrequired; this requirement can not be feasible on limitedor ad hoc devices.

An interesting project is TinyRest [27], which exploitsREST [28] principles, applying them to MOTE devices;more specifically, it defines a mapping between HTTPREST messages and TinyOS messages, in order to allowa high level sampling control. The approach of TinyRestdoes not require the device to implement a full TCP/IPstack, but rather a software layer is provided in order toenable HTTP data exchanges between the middlewareand the devices. The approach is bounded to MOTEtechnology: PerLa aims at extending and improving itin order to support heterogeneity without posing manyconstraints on the devices.

In fact, the heterogeneous network support layer pro-vided by PerLa tries to adapt the middleware itself tonodes’ features rather than forcing the devices to supportcomplex high level protocols, allowing in this way areduction of the coding effort required to a developer inorder to integrate new devices. This makes the commu-nication subsystem of PerLa quite different from bothexisting message oriented technologies (e.g., JMS [29])and tightly coupled communication middleware (e.g.,RMI [30], CORBA [31]), that all require the presence ofa JVM on each device.

All the projects presented up to now aim at handlingthe sampling operations and managing heterogeneityissues. However, there are some projects that considersupport to pervasiveness in a different light: they focuson the elaboration of gathered data, and they providesystem ubiquity features through the concept of context.

6

The main research projects belonging to this group areAURA [32] and GAIA [33]: they both provide middle-ware architectures supporting complex devices, such asdigital cameras, PDAs, shared storage spaces, overheadprojectors, etc. One of their goals is to integrate all thesekinds of devices, but the existence of device drivers (thatprovide high level APIs suitable for interacting with eachnode) is assumed.

AURA introduces the concept of personal aura, a proxythat enables the user to use his own resources indepen-dently of the physical location the system is accessedfrom. In this way pervasiveness is granted, since theuser can operate independently of his location and of thedevice used to interact with the system. GAIA introducessome ideas that are very close to AURA ones: whenthe user moves from a physical environment to another,available technologies detect his presence and tune thesystem to better support his activities. GAIA reachesthis goal by hiding the technologies in the environment(video motion, speech recogniser, etc.), achieving in thisway a high ubiquity level.

The CoolTown project [34] is quite similar to AURAand GAIA, but it was developed in the industrial worldat HP Labs. Its goal was the implementation of anubiquitous system able to adapt its behaviour accordingto the specific context. The system is completely webbased, and each device is abstracted through a webinterface that allows system-node interaction.

PerLa is designed to cover the lowest architecturallayers of a pervasive system and to act as a middlewareplatform over which application logic can be developedwithout taking care of device integration issues. Thesefeatures make PerLa a powerful tool to manage monitor-ing applications that are usually handled by systems likeGSN, SWORD, TinyDB, or DSN. Although PerLa alsosupports the management of contexts, it is not designedto replace systems like AURA, GAIA, and CoolTown,but rather to provide a solid substratum over which thiskind of application can be deployed. In fact, the useroriented features provided by these projects are out ofthe scope of the PerLa project; vice versa, the technicaldetails to interact with physical devices are out of thescope of AURA, GAIA, and CoolTown.

The maximum degree of abstraction is achieved byCASCADAS [35], which aims at designing autonomiccomponents, able to self-organise and to adapt to thesituation, in order to achieve the expected functionalities.The goal of the project is the development of the knowl-edge and the techniques needed to build an autonomouscomputing network. The resulting system should prop-erly work without requiring continuous administratorsupervision; moreover, it should be able to learn itsown optimal behaviour over time and to integrate newcomponents when needed. The CASCADAS project doesnot take care of low level issues related to networkarchitecture and sampling operations. On the contrary,the existence of a high level network service, able toproduce data streams and feed autonomic components,

is assumed.The SelfLet project [36] [37] that is under develop-

ment at Politecnico di Milano, is part of CASCADAS.It also deals with autonomic software components, butit mainly focuses on software engineering aspects, suchas the life cycle description and the definition of runtimevalidation models. For both these aspects, the SelfLetapproach highlights how the design phase and the de-ployment phase can no longer be considered as twocompletely distinct activities. The project development isbounded to high level layers. A Java based implemen-tation, currently in embryonic phase, relies on severalexisting technologies and tools (REDS [38], DROOLS[39], LIME [40], OSGi [41]) for the development ofSelfLet internal components. Therefore, each SelfLet canbe thought as a Java object, able to achieve a goalthrough the execution of a Behaviour. This Behaviourcan be accomplished by searching for services that areprovided by other SelfLets or executing an elementaryactivity, called Ability. Research interests are focusedover the Ability level. The routines performing sampling,actuation, and communication with physical devices arenot part of the SelfLet middleware, and they must beimplemented as Abilities.

The MoGATU project [42] [43] [44] aims at extendingthe functionalities of mobile devices by exploiting theirability to autonomously establish connections amongthem. The main difference with respect to the otherprojects cited, consists of the different role reservedfor mobile devices. In fact, they are not treated as thesensors and the actuators needed to solve user submittedqueries, but rather they are intended as the interfacesused by the end users to access some services offered bythe system.

This project employs powerful devices equipped witha user interface, like PDAs and laptops, where clientapplications act as data consumers and give ubiquitousaccess to the information stored in the wired network.These devices also operate as data producers, providingneighbour nodes with information retrieved from wiredservers or other nodes.

Prism-MW [45] is an extensible middleware platformdesigned to enable efficient implementation, deploy-ment and execution of distributed software systems. InPrism-MW the heterogeneity support is entirely focusedon enabling communications among different devices,equipped with different operating systems and pro-grammed with different languages. PerLa extends theheterogeneity support also to sampling operations anddata management. The main goal of both projects is toprovide a highly-decoupled component-based extensibleframework to develop distributed applications; in Sec-tion 3 we shall further compare their architectures.

Glide [12] is a Java platform developed on the topof Prism-MW. The goal of the system is to allow thesearch and the retrieval of resources (e.g., MP3 files)hosted on a set of nodes and characterized by someattributes (e.g., genre, quality, etc.), given a query in

7

terms of these attributes (e.g., retrieve all the MP3 fileswith genre equals to pop and quality equals to high).GLIDE and PerLa are very different systems, since PerLafocuses on monitoring applications and it is designedto support sampling operations and to manage datastreams while GLIDE is designed to search and man-age single resources. However some similarities can befound between GLIDE and the Registry component ofPerLa, as we shall see in Section 3.

3 THE PERLA SYSTEMAs highlighted in the earlier sections of this paper, thereare currently no projects that provide a comprehensiveand adequate solution for managing a modern pervasivesystem. The deficiencies identified in our analysis of thestate of the art always result in poor usability for the endusers and/or inadequate support for node developersand pose great limits to the scenarios of applicationwhere these middlewares can be successfully employed.PerLa aims at overcoming most of the weaknesses ofexisting management systems for sensing networks byachieving a coverage of most of the layers identified inSection 2.

The development of the PerLa System focused on thedesign and implementation of the following features:

• data-centric view of the pervasive systems;• homogeneous high level interface to heterogeneous

devices;• support for highly dynamic networks (e.g., wireless

sensor networks);• minimal coding effort for new device addition.

The first two targets are achieved using a techniquealready known in the literature (see [19] [20] [23]): man-aging the pervasive system as a database. The resultof this approach is the possibility to access all datagenerated by the sensing network via an SQL-like querylanguage, called the PerLa Language, that allows endusers and high level applications to gather and processinformation without any knowledge of the underlyingpervasive system. Every detail needed to interact withthe network nodes, such as hardware and softwareidiosyncrasies, communication paradigms and protocols,computational capabilities, etc., is completely masked bythe PerLa Middleware.

The PerLa Middleware provides great scalability bothin terms of number of nodes and types of nodes.Other middlewares for pervasive systems only supportdeployment-time network configuration (see [24] [25][26]), or provide run-time device addition capabilitiesfor a well-defined class of sensing nodes at best (e.g.,TinyDB [19] [20]).

These limitations are no longer acceptable. In modernpervasive systems, specifically wireless sensor networks,nodes can hardly be considered “static” entities. Hard-ware and software failures, device mobility or commu-nication problems can significantly impact the stabilityof a sensing network. Furthermore, the ever increasing

presence of transient devices like PDAs, smart-phones,personal biometric sensors, and mobile environmentalmonitoring appliances makes the resilience to networkchanges an essential feature for a modern pervasivesystem middleware. Support for run-time network re-configurability is therefore a necessity. Moreover, themiddleware should also be able to detect device abilitiesand delegate to them whichever computation they canperform. The tasks that nodes are unable to performshould be executed by the middleware.

PerLa fulfills these requirements by means of a Plug &Play device addition mechanism. New types of nodes areregistered in the system using an XML Device Descriptor,i.e., a document containing all the information needed tointeract with the hardware and software modules of thesensing nodes. The PerLa Middleware, upon receptionof a device descriptor, autonomously assembles everysoftware component needed to handle the correspondingsensor node. End users and node developers are notrequired to write any additional line of code. State of theart middlewares for sensing networks, like [24] [25] [42][33], require instead the creation of individual softwarewrappers for every device connected to the system.

Table 1 provides a feature-wise comparison amongPerLa and the other systems analysed in the state of theart section of this paper.

3.1 PerLa System overview

The PerLa System is composed of different softwaremodules with well defined interfaces and functions.

To provide a practical and effective framework thatwould support the implementation of middleware com-ponents, and ensure a strong decoupling among them,the first stages of the PerLa architectural design werespent developing a set of Java foundation softwareelements. The goal of these basic building blocks isto provide the common features needed by any PerLacomponent, such as initialization, event logging andcommunication functionalities. The following list con-tains a brief digest of the aforementioned elements:

• Component: the base abstract class inherited by everymiddleware block. Defines the essential methodsemployed to manage all PerLa subsystems (e.g.,start, stop, etc.);

• Pipe: this object is a one way message queue thatprovides a highly-decoupled communication systembetween different middleware components. Trans-mission of information through pipes is performedusing the primitives push and pop, which providea mechanism to respectively send and receive mes-sages. Each PerLa component can be connected toone or more input and output pipes;

• Waiter: this class allows synchronised waiting onmultiple pipes;

• Channel: the Channel is an abstract class that acts asa gateway between a pipe and a physical channel.

8

AURA GAIA MoGATU TinyDB SWORD GSN DSN Cascadas Selflet PRISM GLIDE PerLa

Support for autonomic com-ponentsSupport for contexts

Data manipulation

Centralized node integrationRun-time heterogeneous net-work supportDeployment-time heteroge-neous network supportLow level support

TABLE 1A comparison of middlewares for pervasive systems

Not surprisingly, a similar architecture had been al-ready studied and described in the literature [45]. Dif-ferent analogies can be made among the two designs;the Component class of Prism-MW core is very similarto the Component class in PerLa: the Send() method isthe equivalent of pushing a message into an output pipe,while the Handle() method is similar to the reception ofa message from a pipe using a Waiter object.

As shown in Figure 1, PerLa components are dividedin two macro-layers:

• Low level support: Low level support provides ahomogeneous API to manage and operate all thedifferent devices that compose the pervasive system.

• High level support: This layer implements data man-agement and query execution services needed tointerface end-users and applications with the per-vasive system.

The end users and the application developers interfacewith the PerLa system through the upper interface withthe High Level Support, while the “network architects”(e.g., node developers), who work near to the physicaldevices, interface with the system through the PhysicalLayer interface (Figure 1).

The main software module of the low level supportlayer is the Functionality Proxy Component (FPC). Assuggested by its name, the FPC’s primary task is to actas a proxy among sensing nodes and the rest of thePerLa System. Owing to the FPC’s homogeneous accessinterface, components of the High level support layercan communicate with the physical devices ignoringtheir functional behaviour. As a matter of fact, dueto the abstraction provided by the Functionality ProxyComponent, all nodes of the pervasive system appear toimplement a common interface.

The physical communication between sensing devicesand FPCs is completely managed by the Channel Man-ager. This component handles the creation of VirtualChannels, a software abstraction that hides all the pecu-liarities of the medium adopted to communicate with thepervasive system nodes. By means of virtual channels,FPCs and physical devices can communicate transpar-ently.

Fig. 1. PerLa Middleware architecture

PerLa does not impose any constraint on the commu-nication protocol and message format that the sensingnodes must implement. In fact PerLa allows the useof arbitrary protocols. Messages to or from the nodesare automatically converted by the FPC; this featuresignificantly reduces the effort in terms of node firmwaredevelopment and does not require a specific physicalcommunication channel. PerLa can therefore operatewith a wide range of devices, regardless of their hard-ware and software capabilities. Other approaches (e.g.,[24] [33]) impose simple and clear high level middlewareinterfaces that are difficult to implement on certain kindsof sensing nodes (e.g., an HTTP interface can be toohard to implement if the physical device is a µ-controllerconnected to the middleware via an industrial bus).

PerLa is not the only middleware to provide thistype of support for device heterogeneity. As noticedin Section 2, a similar technique is presented in GSN[26] [25]. However, in this case, the integration of anew kind of device requires node developers to codethe corresponding Wrapper. Heterogeneity support is

9

therefore limited to deployment-time.Conversely, PerLa supports the introduction of new

kinds of nodes at run-time through a Plug & Play deviceaddition mechanism. FPCs are created by means of theFPC Factory, a software module responsible for PerLa’sPlug & Play system. New nodes can join a networkmanaged by PerLa simply by forwarding their XMLdevice descriptor to the FPC Factory, which creates aFunctionality Proxy Component suitable to handling thenode. Further details will be introduced in Section 3.1.2.

The complete inventory of all FPCs active in a PerLainstallation is maintained by the FPC Registry. This com-ponent is mainly employed to select the list of devicesthat can run a particular query (see also Section 3.2.3),and to check if a node trying to connect with the PerLaSystem already has a corresponding FPC (mobile andtransient device support). The Query Parser and theQuery Analyser compose the user interface of the entirePerLa System. The PerLa Language defines two classesof queries:

• Low Level Queries (LLQ): define the behaviour ofsingle devices

• High Level Queries (HLQ): manipulate data streamsgenerated by LLQs.

A detailed explanation of the roles and the differencesbetween the two query levels is discussed in Section 3.2.

The Query Parser is PerLa’s front-end to final users.This component receives and parses textual queries andsends them to the Query Analyser. As shown in Figure1, the Query Analyser is responsible for the creation ofboth the High Level Query Executor (HLQE) and the LowLevel Query Executor (LLQE).

The High Level Query Executor is implemented as awrapper for an external Data Stream Management System(DSMS) [46]. Its main purpose is to allow users toperform data management operations on the informationgathered from the pervasive system. Each High LevelQuery submitted by the end user is executed on adedicated HLQE.

Differently from High Level Queries, Low LevelQueries required the creation of an ad hoc executionengine, called Low Level Query Executor. The executionof a single Low Level Query is performed according tothis procedure:

• the Query Analyser accesses the FPC Registry toretrieve a list of FPCs that can successfully performthe query (see Section 3.2.3, EXECUTE IF clause);

• the Query Analyser instantiates a Low Level QueryExecutor for each FPC in the list and configures itfor the execution of the statement submitted by theuser;

• each Low Level Query Executor interacts with thecorresponding FPC to gather information from thephysical devices.

All records generated by the LLQEs related to a singlequery are then combined into a single data stream, whichrepresents the output of the Low Level Query sent by theuser.

3.1.1 Functionality Proxy ComponentAn FPC is a Java object that provides a functionalabstraction of a given device or a group of them. TheFPC is used by other PerLa middleware components tointeract with the sensing node itself. A hardware devicecan host an FPC only if:

• it runs a Java Virtual Machine (JVM)• it is connected to the PerLa Middleware via TCP/IP

Although some physical devices are powerful enoughto host their own FPCs, many others are not, eitherbecause they cannot run a JVM or because they are notprovided with a TCP/IP interface.

In the deployment presented in Section 4.1, all FPCsare hosted in the central network node, which is power-ful enough to support a complete JVM. Conversely, dueto strict hardware limitations on both the nodes and thegateway, the deployment shown in Section 4.2 requiresall FPCs to be run on a remote server. In any case, thelocation of the FPC should be as near as possible to theabstracted device.

The FPC actually implements a Logical Object, i.e., anabstraction of the physical node that is used by the LowLevel Query Executor to access the information gatheredfrom the sensing nodes. By virtue of this abstraction,the PerLa Language is completely independent from anyhardware or software feature exhibited by the physicaldevices (see Section 3.2.3 for further details).

The information generated by the physical sensingnodes are abstracted as FPC Attributes. Attributes canbe used to retrieve the node state (e.g., battery status,memory occupation, etc.), to access sampled data (e.g.,temperature, pressure, etc.), or to change some parame-ters on the device (sampling frequency, node parameters,etc.). FPC Attributes can be classified in three categories,depending on the type of value they abstract:

• Static attributes: represent constant values that de-scribe immutable characteristics of the sensing node.They are commonly used to define properties likedevice name, maximum sampling rate, location ofstationary nodes, etc. Attributes of this type are setusing the XML device descriptor, and do not changetheir value for the entire FPC life cycle.

• Probing dynamic attributes: when this kind of at-tribute is read, the FPC must refer to the physicaldevice in order to produce the requested value.Similarly, when a write operation is performed (ifsupported), the FPC has to interact with the deviceto set the new attribute value. Probing attributes canbe used to represent a real sensor (e.g., temperature,pressure, etc.), a real actuator (e.g., the position of astepper motor) or a memory area (e.g., informationstored in a device RAM, ROM, flash, etc.). Readand write operations on a probing dynamic attributealways result in an interaction with a physical node.Should the node be momentarily unavailable, theFPC caches and defers the operation to a more con-venient time (i.e., writes and reads are performed

10

when the node is back on-line or when the networkconnection allows to communicate with the device).Even if cached, every single user operation on anFPC’s probing dynamic attribute is relayed to thedevices.

• Non probing dynamic attributes: when a non prob-ing dynamic attribute is read, the FPC returns thevalue stored in its local cache. No actual readingis performed on the physical device. The cachedvalue is periodically refreshed in accordance withthe guidelines set in the XML device descriptor.Two policies are available: pull (the FPC periodicallyfetches the new value from the device, regardlessof user requests for that attribute) and push (thedevice sends a message to the FPC whenever anew value is available). Writing operations are notsupported on non probing dynamic attributes bydesign, since the semantics associated with themwould imply the possibility to lose messages if notsent to the node before another write is requestedby the user. This behaviour is not desirable when,for example, a series of command are to be writtenon the device, since the PerLa system would not beable to guarantee the delivery of all of them.

It is important to note that all three types of FPCattributes are accessed using the same paradigm, andthat different access semantics can be added in the futureif necessary.

Events and notifications fired from a physical device(e.g., when an RFID reader senses an RFID tag or when adevice identifies a certain pattern in the signal producedby a sensor) are intercepted by the FPC and relayed tothe other middleware components. This feature allowsPerLa to support both push and pull data retrievingsemantics. Push nodes (like those used in Sections 4.2and 4.1) spontaneously send the data messages to PerLa.Conversely, pull nodes send information only on request.The FPC is therefore required to periodically query thedevice if more than one sample is needed.

As said before, an FPC can be defined to abstract eithera single node or a group of them: in the latter option, thefinal user controls all the nodes as a unique entity. Themachine hosting an FPC that controls multiple devicesmust be able to sustain the workload required to managenumerous data sources.

Table 2 shows a real example of the mechanism thatallows to map a node message and the Java object usedby FPC. In particular Table 2.a presents the messagesent by a node (used in the deployment in Section4.1), which consists of 5 attributes (i.e., timestamp,panelVoltage, etc.) with different byte length andsign. Moreover, the µ-controller of this node adopts “bigendian” encoding. In Table 2.c, the corresponding Javaclass is presented. The information about fields length,sign and endianess are stored in special Java annotation(setter getter methods and constructors are omitted inthe example). By means of these annotations, the FPCcan unmarshal the byte-stream sent by the device into

Fig. 2. FPC: Functionality Proxy Component

the corresponding Java class.

More complex data structures (e.g., used to storemicroseismic events presented in Section 4.1) are un-marshaled in a hierarchical fashion. The resulting classreflects the nested C-structure used in the node message.

Figure 2 provides an overview of the Functional-ity Proxy Component, and it represents the interac-tions between this component and the device. It alsodistinguishes the software modules belonging to themiddleware from the software modules written by thedeveloper who is integrating the device with the PerLamiddleware.

The upper interface of an FPC is composedof two classes of methods: the first one(getAttributeByName()) allows one to retrievethe data coming from the device, while the secondone (setAttributeByName()) allows one to setparameters on the device. In the lower interface it isshown how the node sends messages (of the previousexample) and the FPC (in particular the unmarshaler)decodes and stores them into an internal object(DataMessage class). However, this class is createdat runtime; hence, a suitable way to access its fieldconsists in the use of the Java Reflection mechanism.The proposed solution is general and does not dependon the given class (e.g., DataMessage of the example).

In the same way, setAttributeByName() allowsthe device parameter change by manipulating the cor-responding class (e.g., ParamMsg). The marshaller usesthis class to create a message for the node; in turn thenode sets the corresponding parameters after receivingthe message.

11

C-like message structure XML description Java Class

typedef struct{uint64_t timestamp;int16_t panelVoltage;int16_t panelCurrent;int16_t batteryVoltage;uint8_t flags;

} DataMessage;

<parameterStructure name="DataMessage"><endianess>BigEndian</endianess><parameterElement name="timestamp">

<length>8</length><type nameType="long"><sign>unsigned</sign>

</type><attributeType>probing</attributeType><permission>r</permission><continuousValue />

</parameterElement>

<parameterElement name="panelVoltage"><length>2</length><type nameType="long"><sign>signed</sign>

</type><attributeType>probing</attributeType><permission>r</permission><continuousValue />

</parameterElement>...

</parameterStructure>

@StructInfo(endianness = Endianness.BIG_ENDIAN)public class DataMessage{

...@SimpleField(size = 8, sign = Sign.UNSIGNED)private long timestamp;

@SimpleField(size = 2, sign = Sign.SIGNED)private int panelVoltage;

@SimpleField(size = 2, sign = Sign.SIGNED)private int panelCurrent;

@SimpleField(size = 2, sign = Sign.SIGNED)private int batteryVoltage;

@SimpleField(size = 1, sign = Sign.UNSIGNED)private int flags;...

a) b) c)

TABLE 2Physical to logical attributes mapping

3.1.2 FPC Factory and RegistryAs presented above, an FPC allows a simple interactionwith a physical device, hiding the communication detailsand masking low level programming issues. To reach areal Plug & Play behaviour, the middleware should beable to create FPCs at runtime, to instantiate them on aJava and TCP/IP enabled machine and to set up a VirtualChannel (see Section 3.1.3). A middleware component,called FPC Factory, has been designed to achieve thisgoal. An instance of this object is deployed on eachmachine charged to host the FPCs; given the informationcontained in an XML device descriptor file, the FPCcode is automatically generated. This process is feasiblebecause the FPC can be implemented by merging acertain number of modules, each of them managing adifferent aspect of the interaction with the physical node.The XML descriptor is used to decide which are theright modules to be selected from a library in orderto provide a proper device abstraction. To provide fullPlug & Play support, a physical device that is joiningthe system must be able to send its XML descriptor tothe nearest factory and to start communicating with thegenerated FPC without any human interaction. A specialVirtual Channel is reserved on each physical channel tocommunicate with the FPC Factory. Thus, the new devicesends its XML descriptor on the reserved channel and, ifno error occurs, the binding of the generated FPC witha new Virtual Channel is notified. All the subsequentcommunications between the physical device and theFPC are performed on the new Virtual Channel.

All the relevant information, needed to dynamicallybuild the FPC and to establish a data connection betweenthe physical device and the generated FPC, must beincluded in the file. More specifically, the developer has

to specify the details needed to contact the available FPCFactories during the set up phase. They include the ini-tialisation parameters of the physical channels requiredto establish a communication with the FPC Factory; thelist of available sensors, and the physical measures thatcan be sampled by the device. For each attribute, thename, the data type at language level, the data typeat physical level (e.g., the encoding and the length ofthe value generated by the Analog to Digital Converter(ADC)), the conversion function, and the constraints onthe sampling rate should be specified.

A working example of an XML device descriptor isgiven in Table 2.b. As mentioned before, the FPC (byusing the generated class presented in Table 2.c) de-serialises the encoded messages sent from the node (seeTable 2.a).

Another important middleware component strictlyrelated to the FPC Factory is the FPC Registry; it isbasically a Main Memory Database (MMDB) in which areference to any existing FPC is maintained. Its main goalis to provide support for the evaluation of the EXECUTEIF clause of Low Level Queries. After a new FPC isgenerated, the factory registers it; the device metadata(supported attributes and events, maximum samplingfrequencies, etc.), the attribute values and the remotereference to the FPC object are stored in the FPC Registry.

Assuming FPCs as the resources and properly translat-ing the EXECUTE IF clause of PerLa queries (see Section3.2.2) into GLIDE queries, the Registry can probablybe thought as a set of ProfileServer and ProductServercomponents [12].

We can summarise the steps performed by the Factoryto create a FPC:

1) the XML description is received by the Factory;

12

2) the Factory using the JAXB [47] library transformsthe XML entity into a Java object (within annota-tions, constructors, setter and getter methods);

3) the Factory creates a new FPC derived class, en-capsulating the object generated in step 2;

4) the Factory invokes the Java compiler and producesthe .class files;

5) the Factory invokes the Java loader to load the FPCat runtime;

6) the Factory registers the new FPC in the registry;7) the Factory binds the FPC and the physical device

through the virtual channel.In a real pervasive system, a physical device is often

connected to different nodes that can host its FPC, andmore than one physical channel could exist betweenthese nodes and the device. Moreover, some of the Javaenabled nodes and some of the physical channels canbe temporarily unavailable due to power managementpolicies or mobility problems. Thus, a mechanism toallow an already initialised device to reconnect to themiddleware using a different physical channel has beenimplemented. Basically, the device executes the initial-isation protocol again, but the FPC Factory, simply byperforming a search in the registry, recognises that anFPC for that device already exists in the middleware.To restore the communication between the device andits FPC, the found FPC is reinstated and a new VirtualChannel is established.

3.1.3 Channel ManagerNetwork heterogeneity is an intrinsic characteristic ofpervasive systems. Many different physical medium andprotocols are often required, even in small deployments,to provide a communication backbone for the differenttechnologies employed in a sensing network.

The PerLa Middleware deals with this problem using aspecific component, the Channel Manager. This softwaremodule is responsible for the creation of Virtual Chan-nels. Virtual channels use the existing communicationfacilities of the pervasive system to provide a bidirec-tional, point-to-point logical communication layer. Bymeans of this software abstraction, the FPC and thecorresponding sensing device can communicate amongthem as if they were connected through a dedicatedlink. The Channel Manager handles every routing andmultiplexing operation required to conceal the physicalcommunication difficulties (e.g., a single physical linkshared among many sensing nodes).

Consider, as an example (given in Section 4.2), a radiochannel that operates in duty cycle mode to reducepower consumption; in this case the Channel Managermust support buffering in order to mask the real be-haviour of the physical link. Consider, as a secondexample (given in Section 4.1), a device that is connectedto the nearest Java machine through two cascaded links(for instance, a slave device can be attached to a masterCAN-bus [48] board, which, in turn, communicates witha Java machine through a radio link). In this case,

the channel manager should also take into account therouting required to transport the messages from a Vir-tual Channel side to the other one; more generally, thechannel manager must deal with heterogeneous physicalnetworks.

Note that the same physical link can be shared bymany couples FPC-device: consider as an example asituation in which a certain number of nodes are cabledon a shared broadcast bus (e.g., CAN-bus), that mustbe used to communicate with the nearest Java machine.Thus, the channel virtualisation module must also per-form multiplexing on the physical channel in order tomanage similar situations. Another important featureof the Channel Manager is the ability of automaticallysetting up the virtual channel when a new device joinsthe system: only some configuration files are required tobe installed on the physical device in order to inform themiddleware code about the supported protocols.

The Channel Manager is implemented using two dif-ferent programming languages: Java and C. While theJava version is deployed on the FPC side, the portableC version has to be executed on the physical nodes. Thelatter variant of the Channel Manager allows a greatreduction in terms of code that needs to be written bynode developers since all the libraries required to handlePerLa’s Virtual Channels are distributed along with thePerLa middleware. Both Java and C implementations ofthe channel manager now support the most used businterfaces (i.e., serial bus, TCP/IP sockets, CAN-bus),but a developer can integrate the middleware with newmodules in order to support other protocols.

3.2 PerLa Language descriptionAs previously said, the PerLa Language aims at provid-ing a database like abstraction of the whole pervasivesystem in order to hide the high complexity of lowlevel programming and allow users to retrieve data fromthe system in a fast and easy way. The main issuewe dealt with during the language design phase wasthe definition of a small set of clauses having a largesemantic expressiveness. In fact, our main goal was toobtain a language marked by an easy syntax (in orderto be easily learned), but able to manage many kindsof heterogeneous devices and to support most of thedifferent existing sampling modes (periodic, on event,etc.). Another goal of the language was to avoid theintroduction of a specific clause for each non functionalfeature, but rather to provide a uniform mechanismto access both device metadata (sampling frequency,battery status, etc.) and the result of a sampling op-eration. This generic approach allows maintaining thelanguage simplicity and simplifying the introduction ofnew non functional characteristics that can be considerednow or that will possibly emerge in the future. In thissection, the PerLa Language is briefly introduced. Firstly,existing sampling modes are explained; then, the typesof supported statements are reported; finally, Low Level

13

Queries are presented in detail. An in-depth analysis andthe full EBNF formal definition of the PerLa Languagecan be found in [5] and [49].

3.2.1 Physical devices and sampling operationThe sampling operation has a key role in the definitionof a language that should provide an easy interactionwith a pervasive system. In fact, the main goal of afinal user is to retrieve data from the network, andthis can be done by querying a given subset of nodesand processing the data gathered from them. The widevariety of devices that can be part of a pervasive systemsuggests that many different kinds of sampling exist.Moreover, although the concept of sampling is extremelyclear for many technologies, its definition is not trivialfor other ones. Thus, identifying the most importantsampling modes and defining a common abstraction isthe main challenge in order to provide a simple andpowerful language.

Many nodes that are usually employed in commonmonitoring systems are composed of an electronic boardhosting one or more physical sensors, each of them ableto probe a specific physical measure. In this case, asampling operation is simply the reading of the sensoroutput (e.g., the value of an ADC connected to the signalconditioning stage of the sensor). If a node also has astorage device on board (e.g., RAM, ROM, flash, etc.),the sampling operation can be extended to the readingof a section of that memory. Note that, from the languagepoint of view, there are no differences between a sampledvalue obtained from probing a sensor and a value readfrom memory. In fact, in both cases, the user requests asampling and the device returns a value; in the followingwe will refer to this behaviour as time based sampling.

Pervasive systems often deal with other kinds of de-vices for which the definition of the sampling operationis more difficult. Consider, as an example, the RFIDtechnology: two issues must be considered in order toprovide a clear semantics of the sampling operation.Firstly, the subject and the object of the sampling mustbe identified. The RFID reader can be abstracted asthe sensor, while the tag ID can be considered as thesampled datum. However, a dual approach is possi-ble: each tag can be abstracted as a sensor, while theRFID reader identifier can be considered as the sampleddatum. The first approach is the most common one,but the second is more fitted to the requirements of amonitoring application. In fact, RFID tags are usuallyattached to the items the user wants to monitor. Thus,the user is often interested in discovering the sequenceof readers “seen by a tag” rather than knowing thelist of tags entering the range of a given reader. Thesecond issue to be considered when dealing with RFIDsis inherently related to the features of this technology. Infact, a High Frequency RFID tag is activated only whenit crosses the limited operating range of a reader. Whena sampling is required there are high probabilities thatno tag is active. There are two approaches to deal with

this problem. The first one requires the introduction of acached value: when a sampling is required the returnedvalue is the cached one, i.e., the ID of the last readerthat sensed a given tag (or the ID of the last tag thatwas seen by a given reader). In this way the samplingsemantics is kept very similar to that of classical sensorsampling. On the contrary, the second approach requiresa radical change in the sampling abstraction; in fact,the interaction model between the device and the uppersoftware layers becomes asynchronous. The idea is toabstract the sampling operation as an event: when atag is sensed by a reader, an event flag is raised and aparameter of this event reports the ID of the reader (or ofthe tag). Note that when the latter approach is used, thesampling operation is not invoked by the user; there isnot a concept of sampling rate, and the execution of thenext sampling operation is logically delegated to the tagdevice. In the following we will refer to this behaviour asevent based sampling. Even if other sampling modes couldbe implemented, in the first implementation of PerLawe only support time- and event-based sampling, sincethese modes were required by the applications we havebeen involved in.

The remainder of this section introduces the differentquery typologies that compose the PerLa Language.The following language description is defined aroundthe concept of Logical Object. As already explained inSection 3.1.1, the logical object is an abstract sensing oractuation device that exposes a common and uniform setof functionalities required to interact with the physicalsensing node. The most important property of a logicalobject is the ability to expose a list of Attributes, i.e.,the features that can be queried or modified on thephysical device. Attributes are used to provide a highlevel abstraction of a physical transducer, an actuator,or of a variable in the node’s memory. In the nextsubsections the concepts of node, device and logicalobject are used interchangeably.

3.2.2 Supported PerLa queriesAfter the analysis of pervasive systems peculiarities, wedecided to support three kinds of statements in PerLa.

Low Level Queries (LLQ) allow one to precisely de-fine the behaviour of a single device. The main role of alow level statement is to define the sampling operations,but also to allow the application of SQL operators (suchas: grouping, aggregation, filtering) on sampled data.When a LLQ is injected in the system, the set of LogicalObjects that meet all the requirements to execute it iscomputed. Then, an instance of the query is deployed oneach selected logical object; data produced by all theseinstances are collected in a stream that can be furthermanipulated by High Level Queries or sent as output tothe final user.

As said before, PerLa is completely declarative, andit is characterised by an SQL-like syntax. However,LLQs present some relevant syntactical differences with

14

respect to standard SQL in order to provide the neededexpressiveness for managing the interaction with logicalobjects (e.g., definition of the sampling parameters).

High Level Queries (HLQ) allow one to definedata manipulation operations on streams generated byLow Level Queries or other High Level Queries. Thesekinds of queries do not directly deal with the logicalobject abstraction since they operate only on streamsindependently from their sources. As a consequence noad hoc clauses are needed: the syntax and the semanticsof these statements are similar to those of streamingDSMSs. Detailed examples of High Level Queries canbe found in [49]

Actuation Queries (AQ) add the expressiveness tomodify the state of a device, enabling in this way acomplete interaction with the environment. This kind ofstatement is often employed to control physical actuatorsor to set some parameters used by low level algorithmsdirectly executed on the node (e.g., thresholds to gen-erate events). An example of Actuation Query can befound in Section 4.1. The complete semantics of thisstatement is detailed in [49].

3.2.3 Low Level Queries in detailIn this subsection, LLQs are explained in detail sincethey are the most interesting and original part of thelanguage. Moreover, knowing the middleware designwill be useful to understand some choices in semanticsof these statements. After introducing LLQ statementsfrom a syntactical point of view, two simple examples arereported in order to better clarify the language features.

Low Level Queries are introduced by the clauseAS LOW:, preceded by the structure of the streamin which the output records have to be inserted.Every LLQ is composed of four syntactical blocks, eachof them defining a specific aspect of the query semantics.

The Sampling Section specifies how and whenthe sampling operation should be performed. Theabove described logical object abstraction allows aformalisation of this operation that is completelyindependent from the low level details of the physicaldevice: in fact, sampling is simply defined as thegeneration of a record whose fields are filled withthe current values of the logical object attributes. Assaid in Section 3.2.1, event based and time based arethe supported sampling types; the first one forces asampling whenever an event is raised by the logicalobject, while the second one forces the sampling to beperiodically executed with a certain frequency (that canbe parametrically specified). To clarify the distinctionbetween event based and time based sampling modes,consider the following example. Suppose we want toretrieve the list of RFID readers that sense a specificRFID tag. The logical object that wraps the RFID tag hasat least one static attribute (the tag ID), one dynamic non

probing attribute (the last RFID reader that sensed thetag) and one event (that signals the occurred sensing). Inthis situation, a reasonable sampling mode is the eventbased one: a sampled record is generated wheneverthe event is raised. The opposite situation is the caseof a node having a temperature sensor on board, thatshould be sampled every 10 minutes. In this case,the logical object that wraps the node has a dynamicprobing attribute that returns the current temperature.To obtain the required sampling rate, a time basedsampling mode must be used. It should be noted that alogical object abstracting a device, characterised by anintrinsic type of sampling, can contain the logic neededto support other sampling types. As an example, apassive RFID tag is obviously inherently event based,but some meaningful queries with time based samplingcan be performed too (see the example in Table 3.b).

The Data Management Section, introduced by theSELECT clause, has the role of managing sampled dataand computing query results. The interface betweenthis section and the previous one is an ideally infinitebuffer (in the following referred as the Local Buffer),where all the sampled records are appended. The datamanagement section allows end users to specify whichoperations are to be performed on local buffer records tocreate the actual query output. The syntax of this querysection strongly resembles standard SQL. However,important differences exist due to the need of managingthe infinite data stream through the definition of recordwindows in the local buffer. Moreover, other differenceshave been introduced due to our decision of focusingespecially on aggregates. In fact, we think that in manyreal situations the data of interest is an aggregation ofsome sampled values, rather than the list of all sampledrecords. This led to the development of a customsyntax for aggregate operations, different from standardSQL, that eases the extraction of a record windowfrom the local buffer. Consider the following example:SELECT AVG(temperature, 10 SAMPLES). Thisquery excerpt instructs the LLQE to extract the last 10temperature samples from the local buffer, compute theaverage, and output the result. A similar syntax allowsthe selection of a record window using time intervals(e.g., SELECT AVG(temperature, 1 HOUR)).

The Execution Conditions Section defines the rulesto establish if a certain logical object should partic-ipate in the query. The EXECUTE IF clause intro-duces the conditions that must be satisfied by anFPC in order to be admitted for query execution;these conditions can refer both to the current valueof an attribute and to the existence of attributes andevents (e.g., EXECUTE IF temperature > 15 ANDEXISTS (temperature) forces the execution of aquery on all the nodes that have a temperature sensorand that are currently sensing more than 15 degrees). Themiddleware component responsible for the evaluation

15

of the termination condition is the FPC Registry (seeSection 3.1.2). This section of the query is optional andif not specified all the Logical Objects in the systemwill be involved in query execution. For this reason, aprecise semantics for null values management has beenintroduced in order to deal with situations in which thevalue of an attribute is required and the logical objectdoes not expose it. The EXECUTE IF can be optionallycomplemented with a REFRESH clause that specifieswhen the execution condition has to be re-evaluated toupdate the list of nodes involved in the evaluation of aquery.

The PILOT JOIN is another clause designed to deter-mine the set of Logical Objects on which a Low LevelQuery has to be executed. Differently from the EXECUTEIF, the PILOT JOIN allows the definition of an executioncondition based on records produced by a second inde-pendent query. This clause adds the possibility to designcomplex execution conditions and enables the user toemploy the output of one or more unrelated sensors totrigger the execution of a query (refer to Section 4.2 foran example of the PILOT JOIN). By means of the PILOTJOIN it is also possible to implement a basic form ofcontext-aware data tailoring [4] [3].

After the evaluation of the execution condition section,a logical object is possibly charged with executing acertain Low Level Query. After the query is started,only the sampling and the data management sectionsremain active until the termination condition becomestrue or an explicit termination of the query is required.Thus, a running query can be conceptually thought ofas a couple of threads that respectively act as a producer(that inserts data into the local buffer according to thesampling section) and as a consumer (that producesoutput records manipulating the local buffer content,according to the data management section). As saidbefore, both the previous threads can be activatedperiodically or when an event happens.

The Termination Conditions Section is the last blockcomposing a low level statement and is introducedby the TERMINATE AFTER clause. Although queriessubmitted to a pervasive system have often to beexecuted in a continuous fashion and are manuallystopped, PerLa supports a mechanism to optionallyset the execution lifetime in the query itself, eitherin terms of number of selections (TERMINATE AFTER3 SELECTIONS) or run time (TERMINATE AFTER 5HOURS). This can be useful when a one shot query hasto be executed (e.g., to monitor the current state of alogical object or to sample a single value from a specificsensor) or when the monitoring period is known a priori.

Consider a system in which the nodes are some adhoc boards placed in a building and provided withtemperature or pressure sensors. Suppose that a user isinterested in retrieving, every ten minutes, the maximumtemperature reached in a given room, by sampling each

sensor once a minute. Table 3.a shows an example LLQthat has exactly this behaviour.

a) CREATE OUTPUT STREAM Table (Temperature FLOAT)AS LOW:

EVERY 10 minSELECT MAX(temp, 10 min)SAMPLING

EVERY 1 minEXECUTE IF EXISTS(temp) AND EXISTS(room)

AND room = 3

b) CREATE OUTPUT STREAM Table (rfid STRING, counter INTEGER)AS LOW:

EVERY 10 minSELECT lastReaderId, COUNT(*, 10 min)SAMPLING

ON EVENT lastReaderChangedEXECUTE IF ID=[tag]TERMINATE AFTER 1 SELECTIONS

TABLE 3Low level queries examples

The execution conditions section of the query speci-fies that the statement must be executed only on theLogical Objects that expose a temp attribute, i.e., onthe devices able to sense temperature values. Moreover,we restrict the selection to only those devices that areplaced in room 3 (we assume the existence of a roomstatic attribute, set at deployment time on every nodeof the sensing network as mentioned in Section 3.1.1).The sampling section requires the execution of a timebased sampling, with a period of one minute. Alsothe data management section is activated with a timebased semantics, and the computation of the maximumtemperature value sensed during the last ten minutes isrequired. This is a continuous query since no terminationcondition is specified, thus an explicit stop command isrequired to halt its execution. Note that the analysis ofthe query allows one to bound the ideally infinite bufferthat acts as the interface between the sampling and thedata management sections; for example, a history of tenminutes is enough to produce the correct results in theconsidered query.

Table 3.b shows an example of event-driven query. Itsgoal is to report, after ten minutes, how many times agiven RFID tag was sensed by each reader. Wheneverthe lastReaderChanged event of the considered tag israised, a sampling operation is performed and a recordreporting the RFID reader identifier is appended to thelocal buffer. The data management section is activatedonly once (due to the termination condition) with a time-based semantics, and it aggregates and counts the datacontained in the local buffer.

4 CASE STUDIESIn this section, two case studies in which the PerLa sys-tem has been employed are presented. The first test bedapplication is related to the monitoring of a mountainthat is subject to rockfall phenomena. This example is

16

Fig. 3. HW architecture for rockfall monitoring

presented focusing on the middleware rather than on thelanguage: the hardware architecture and the definition ofthe FPCs are discussed. The second test bed applicationis related to the monitoring of the wine productionprocess, and it is presented especially focusing on PerLalanguage features: the application context, the relevantphysical measures, the logical object interface, and someexamples of typical queries are reported.

4.1 Rockfall monitoring

In this section we briefly introduce a real geophysicalmonitoring application, which has been chosen for thefirst deployment of the middleware described in thispaper. The project consists in the monitoring of MonteSan Martino mountain, a mountain located near Lecco(Italy), in order to predict rockfall events (that have al-ready happened on that mountain in the past). The maingoal of the project is to set up a model, based both onhistorical data and real-time measurements, able to earlypredict rockfall events. Thus, a certain set of sensors hasbeen deployed on San Martino face in order to mea-sure some physical parameters, like accelerations, noisesand inclinations. The main constraint of this applicationis related to power management: when deployed, thenodes cannot be easily reached anymore, because theyare placed on dangerous points. As a consequence, thenetwork nodes must be “low-power” and equipped withsolar panels. The case study discussed in this subsectionis a particular embedded system in which the physicaldevices were ad hoc designed and created by our staff.

The Monte San Martino monitoring system is workingsince the end of April 2010, with a duty cycle of 24hours/day. PerLa managed more than 2500000 measure-ment records from 30 sensors, of 6 different types, in 6months with no interruption. A large saving of codingeffort was obtained thanks to the flexibility of PerLAin adding new types of devices which had not beenforeseen at the outset.

In the following, the employed hardware and softwarearchitecture is described.

Fig. 4. Microacoustic burst signal acquired by DSPics

4.1.1 Hardware architectureFigure 3 shows the whole architecture that has beendesigned for this application. The Acquisition and Elab-oration Units (AEU) are DSPic33f [50] based boards,equipped with an accelerometer or with a geophone.Each of these units continuously monitors physical vi-brations, and pre-processes the sampled signals in or-der to detect some events that could be related to themonitored rockfall phenomenon, i.e., the formation andorganisation of fractures inside rocks. Figure 4 presentsa typical microacoustic signals acquired by the DSPicnodes. In particular, the three diagrams show the accel-erations over the X, Y and Z axes.

Many of the sensed events are false positives; infact, an event is useful only when it is simultaneouslydetected by more than one node. For this reason, acertain number of sensing units are connected to a CAN-bus channel, since a wireless network does not allow toreach the required level of synchronism. Data sampledby the sensing units are collected by a local coordinator,connected to the same CAN-bus. This coordinator is a“low-power” Linux based embedded board that is notpowerful enough to host a Java virtual machine. Thefunction of the local coordinator is to act as a masteron the CAN-bus and to route collected data toward thegateway through a ZigBee [51] network. The gateway isa powerful Linux based embedded board and it is theonly device that will be installed in a safe area of themountain face. It is connected to a remote control centrethrough a TCP/IP radio bridge.

Finally, the system also manages data from solar cellspower sources. These devices are “black boxes” as totheir internal structure, and therefore PerLa creates awrapper to manage the monitoring data.

The goal of the application is to manage collected dataand to control involved devices entirely from the controlcentre. The application level software that is provided tothe users exploits PerLa features to collect required datafrom the system.

17

Fig. 5. Actuation query example

4.1.2 Software architectureThe AEUs are the physical nodes for which an FPCis generated. This FPC exposes: the ID, the latitudeand longitude of the node (static attributes); the burstdetection notification (event); the parameters of the lastdetected event such as initial time, duration, maximumamplitude, first FFT (Fast Fourier Transform) coefficients(non probing dynamic attributes); and the current accel-eration values on the X, Y and Z axes (probing dynamicattributes).

Note that the defined FPCs can be queried using boththe event based and the time based sampling mode. Inthe first case the list of detected events can be retrieved,while in the second case the inclination of the node canbe periodically monitored (relying on current accelera-tion values). The FPC components and the FPC Factoryare placed on the gateway; in fact, it is the nearest deviceable to run a JVM and connected to the control centrevia TCP/IP (through the radio channel).

AEU boards run the FreeRTos operating system. Theprogram deployed on these devices performs a realtime digital filtering on the 2KHz MEMS signal andexecutes the event detection algorithm. Note that thiscomputation must be executed at this level, becausethe transmission of the whole sampled signal as a datastream is too expensive in terms of bandwidth andpower consumption. The AEU program implements asimple communication protocol on the CAN-bus channelin order to modify some parameters (e.g., thresholds ofthe event detection algorithm) and to return sampleddata. It should be noted that the software deployed onthe AEU can be thought of, from the middleware pointof view, as the low level API provided by the devicevendor.

When a user requires the current value of the incli-nation attribute, the FPC builds up a request messagethat is sent on the virtual channel (i.e., the ZigBee link).The sampled data, returned by the sampling routine,is then sent back to the FPC over the virtual channel.Finally, the FPC performs the conversion and the nor-malisation of the sampled inclination value, based onthe information contained in the self configuration file.Similarly, when an actuation query is submitted on theFPC, the parameters and the values to be set are sent onthe virtual channel by the middleware code. The onlyroutine that must be written by the device vendor is theC code to send a proper message on the CAN-bus inorder to update the parameters of the algorithm runningon the AEU.

All the devices (DSPic, Local Coordinator and Gate-way) can be fine-tuned by modifying the correspondingparameters. In particular for the local coordinator andgateway, the duty-cycle can be changed at runtime.Moreover, the acquisition, the filtering and the detectionalgorithm of the DSPic are parametrized to allow anin-situ deep investigation of the micro-acoustic burst.PerLa offers the actuation queries to achieve this goal.A simplified version of an actuation query for DSPicsis presented in Figure 5; the MemsXGain, MemsYGain,MemsZGain are the parameters used to change ampli-fier gains of the three MEMS channels; FilterTapsXrepresents the coefficients of the digital filter used onthe MEMS X signal channel (note that the filter is a 128-coefficient FIR filter).

4.2 Wine production process monitoringThe test bed application discussed in this subsectionrefers to the automation of a large wine productionfarm, from the vineyard to the table. This is one ofthe case studies of ARTDECO, a large project fundedby the Italian University Ministry. Some parameters ofthe production cycle must be kept under strict control,both to guarantee the quality and the integrity of theproduct and to ensure a complete traceability of all thewine lots, until the final labeling phase. The first moni-toring is performed through humidity, temperature andchemicals sensors, placed in the vineyard and connectedthrough a wireless network. Moreover, the men whowork in the vineyard are equipped with PDAs, runninga worker information system: these PDAs can also beused to sense temperature and position.

The PerLa Language can handle all these nodes by in-teracting with their Logical Objects that basically exposethe node unique identifier, the device type, the currentpowerLevel, a locationX and a locationY attributes,a temperature value, and, optionally, a humidity value.Note that the attributes revealing the current position arestatic for the wireless nodes (since they are placed in aknown point during the system set up), while they aredynamic for the PDAs.

Suppose that the monitoring of the environment pa-rameters (temperature and, if available, humidity) in thevineyard is required. The low level statement shown inTable 4.a can be used to continuously sample, once perminute, the temperature and the humidity in the wholevineyard and to insert the results into a Monitoringstream. The EXECUTE IF clause requires the executionof the query on all the nodes currently placed in theyard and having a temperature sensor on board, but itdoes not restrict the query execution to wireless nodesonly. Therefore, suppose that a worker is in the yardwith his PDA; that device will execute the query exactlyas the wireless nodes do (with the only difference thatNULL values will be produced for the humidity, sincethe corresponding sensor is unavailable). The REFRESHclause forces the EXECUTE IF condition to be evaluated

18

a) CREATE OUTPUT STREAM Monitoring(nodeId ID, temperature FLOAT, humidity FLOAT,locationX FLOAT, locationY FLOAT)

AS LOW:EVERY ONESELECT ID, temperature, humidity, locationX, locationYSAMPLINGEVERY 1 m

EXECUTE IF EXISTS (temperature) ANDis_in_Vineyard(locationX, locationY)

REFRESH EVERY 10 m

d) CREATE SNAPSHOT TrucksPositions (linkedBaseStationID ID)WITH DURATION 1hAS LOW:

SELECT linkedBaseStationIDSAMPLINGEVERY 1 hWHERE is_in_CriticalZone (locationX, locationY)EXECUTE IF deviceType = "GPS"

b) CREATE OUTPUT STREAM FrostAlarm (nodeId ID, ts TIMESTAMP)AS LOW:

EVERY ONESELECT ID, TIMESTAMPHAVING (AVG (temperature, 10 m) < 5) AND

(AVG (humidity, 10 m) > 0.75)SAMPLING

EVERY 1 mEXECUTE IF EXISTS (temp) AND EXIST (humidity) ANDis_in_Vineyard(locationX, locationY)

e) CREATE OUTPUT STREAM OutOfTemperatureRangePallets(palletID ID)

AS LOW:EVERY 10 m

SELECT IDSAMPLING EVERY 10 m

WHERE temperature > 18PILOT JOIN TrucksPositions ON baseStationID =TrucksPositions.linkedBaseStationID

c) CREATE OUTPUT STREAM LowPoweredDevices (sensorID ID)AS LOW:

EVERY ONESELECT IDSAMPLING

EVERY 24 hWHERE powerLevel < 0.15EXECUTE IF deviceType = "WirelessNode"

f) CREATE OUTPUT STREAM BottlesEnteredInCellar(bottleId ID, ts TIMESTAMP)

AS LOW:EVERY ONESELECT ID, TIMESTAMPSAMPLING

ON EVENT lastReaderChangedWHERE lastReaderId = "CellarX_ReaderId"

TABLE 4Wine production process monitoring - Queries

again every 10 minutes in order to update the list ofnodes running the query. Suppose now that the useris not interested in continuously retrieving the sam-pled data, but only in receiving some alarms when themonitored parameters exceed established thresholds. Asan example, consider the frost phenomena that mustbe certainly avoided in a vineyard. It is known that atemperature under 5 degrees combined with a humidityover 75% is an environmental condition that can causea vineyard to freeze; the query shown in Table 4.b canbe used to signal a suitable alarm. Note that the frostcondition is evaluated on the average of the last 10minutes samples in order to remove incidental noise.

An important feature of the PerLa Language is itsability of querying the network state as “ordinary” data.Suppose that, in the previous example, we want tomonitor the power state of the wireless nodes in order todetect low powered devices and to allow the substitutionof the batteries before the nodes become inactive. Table4.c shows a statement that can be used exactly for thispurpose.

The transport is another important and critical phasein the wine life cycle that should be controlled in order toavoid sudden changes of temperature of the transportedbottles. To support the monitoring of this phase, everytruck is equipped with a GPS and with a base station.Moreover, each pallet has a temperature sensor used tosense the temperature of the contained bottles. The queryreported in Table 4.e produces the list of pallets whosetemperature exceeded a certain threshold while the truck

was traveling through a zone considered particularlycritical. In that query, the PILOT JOIN operation is usedto activate the temperature sampling only on the palletscontained in the trucks that are driving into the criticalzone (see Table 4.d). Note that the interface of the LogicalObjects wrapping GPS devices must have an attribute(linkedBaseStationId) that retrieves the ID of thebase station reinstated on the same truck (this is a staticattribute, whose value is defined at network deploymenttime).

The last example that is presented is relative to themanagement of the bottles stored in the cellars. They areall equipped with a low frequency RFID tag that allowsone to completely trace the history of the bottle. An RFIDreader is mounted on the door of each cellar; thus, it ispossible to exactly know when a bottle entered and leftthe cellar. Moreover, temperature and humidity sensorsare employed to constantly monitor the cellar tempera-ture in a similar way as explained for the vineyard. Thus,the temperature and the humidity fluctuations a bottlewas subject to can be exactly traced. Table 4.f shows anLLQ that can be used to monitor the bottles entering ina given cellar.

5 SYSTEM EVALUATION

This section provides a twofold evaluation of the PerLasystem: first, an assessment of the performance of themiddleware; then an analysis of the benefits in terms ofdevelopment speed achieved through the use of PerLa.

19

5.1 PerformanceFigure 6 summarises the resources required by the PerLasystem to convert and process messages coming from thedevices described in Section 4.1.

This information was gathered using a custom-madebenchmark, which mimics a homogeneous network ofDSPic Acquisition and Elaboration Units. The simulationwas performed on a PC equipped with a 2.40GHz IntelCore 2 Quad CPU Q6600 and 4GB of RAM, runninga 64bit Linux Kernel version 2.6.32. The JVM used isthe 64bit Sun Java virtual machine version 1.6.0 20. Thenumber of DSPic nodes in the simulated network rangedfrom 1 to 100, in increments of 10 units.

While running the benchmark, each node of the sim-ulated network sends a continuous stream of 1000 mes-sages. Every message has a fixed size of 7450 bytes, andcontains a 1 sec waveform that represents the 3 axessignal sample recorded during a microseismic activity,along with some additional information (e.g., timestamp,peak value, noise ratio, etc.).

Both execution time and memory occupation wererecorded to evaluate the scalability of the system. Di-agrams a, b, c and d of Figure 6 present the resultsobserved from the same simulation, executed with andwithout Just In Time (JIT) Java compilation optimisation.

The experiments a and b of Figure 6 shows thatthe average message processing time scales linearly aswe increase the number of nodes. This behaviour isdetermined by the intrinsic multithreaded architecture ofPerLa. In fact, all FPCs of the system run concurrently(each of them is executed in a separate thread). More-over, all Java classes shared among different FPCs weredesigned to foster parallel execution. In Figure 6.a, theruntime JIT optimisation greatly reduces the average ex-ecution time, but a “cache”-like effect concurs in increas-ing the observed standard deviation. This phenomenonis absent in Figure 6.b, which displays the results of asimulation performed without JIT optimisation.

Figures 6.c and 6.d summarise the memory occu-pation statistics collected during the execution of oursimulation. Both charts show that the total memoryoccupation increases as the benchmark adds new nodes,and reaches about 30% of the total memory available tothe JVM (850MB). The amount of variance experiencedduring these experiments is due to the effects of theJava Garbage Collector, which periodically redeems thememory held by unused objects. The reader should alsonote that the maximum memory occupation is reachedmore slowly when JIT is disabled, due to the reducedperformances of the JVM under these conditions. Infact, the execution speed does not affect the FPCs alone,but also the threads that generate DSPic messages aswell (memory occupation is greatly influenced by thenumber of data messages queued in the system duringthe benchmark).

It is worth noting that this simulation recreates a“stress-case” scenario, designed to evaluate the perfor-mance of the system under heavy workload. In the

deployment described in Section 4.1, the total numberof sensors is 30, and each of them sends one messageevery 5 minutes. The message throughput simulated inthe benchmark application, due to bandwidth limits im-posed by the different communication channels utilised,would not even be achievable in the real life case study.Moreover, the particular 7450 bytes message sent byDSPic nodes in this rockfall monitoring sensor networkis unusual for WSNs applications. For comparison, inZigBee applications, data messages are usually limitedto 80 bytes.

5.2 BenefitsAs said in Section 3, PerLa is a tool for managing sensingnetworks, designed to support and expedite both lowlevel and high level development processes.

The declarative, SQL-like PerLa language enables ap-plication developers to retrieve data from the sensingnetwork with ease and simplicity. No knowledge re-garding the specific hardware and software modulesequipped on the sensing nodes is needed to write aPerLa query. This feature allows for a stronger separationamong high level applications and the acquisition net-work. Moreover, the PerLa language supports and pro-motes low level heterogeneity, since differences existingamong the devices employed in the underlying sensingnetwork are concealed by the middleware.

As mentioned earlier, PerLa provides different toolsto support developers and users working on the lowerlevels of a sensing network. The integration of newtypologies of devices in the PerLa middleware does notnecessitate the explicit development of software driversor java classes. Node developers are only required to de-scribe the capabilities and features of their new hardwarein a XML device description file; all the code neededto communicate and interface with the sensing nodesis generated at run-time by the middleware itself (seeSection 3.1.2 for further details).

A brief evaluation of the effectiveness of this featurecan be made by comparing the lines of code generatedby PerLa versus the length of a device descriptor used inthe Monte San Martino monitoring system. About 2.000lines of procedural code were automatically assembledby the middleware to communicate with the Acquisitionand Elaboration Units described in Section 4.1.1. Thisfigure is a good approximation of the code needed tointegrate the same device on other systems (e.g., to builda GSN wrapper), and comprises all software componentsneeded to marshal and unmarshal data messages, man-age the communication with the physical sensing devicesand assemble the output records needed by the queryexecutors. The length of the corresponding XML devicedescriptor, in comparison, was only 57 lines long.

PerLa proved to be of invaluable help during thedevelopment of both case studies of Section 4. Sub-stantial reductions in development time were achievedthrough the extensive use of the aforementioned mid-dleware features instead of more customary approaches,

20

Fig. 6. Data processing time and memory usage in San Martino rockfall monitoring application

which would have required the development of an adhoc managing application. An assessment of these timesavings can be obtained with the analysis of a particularsituation, occurred during the development of the MonteSan Martino monitoring system: due to a refinement inthe sampling procedure, many of the data structures em-ployed by the acquisition nodes to communicate with thePerLa system changed. PerLa allowed device designersto quickly mirror these changes to the middleware bysimply modifying the XML device descriptor, a task thatrequired only a fraction of an hour. No query had tobe rewritten due to this alteration, and no code had tobe adapted by hand to align the middleware with thenew message format. As noticed in Section 4.1, largesavings of coding effort were also obtained thanks tothe flexibility in adding new device types which had notbeen envisaged at the beginning of the project.

Of the 20 man months required for the completedevelopment of the monitoring system installed on theSan Martino mountain, only a couple of weeks werespent in activities required to customise the PerLa systemfor the specific application (i.e., writing queries, XML

descriptors and, occasionally, components needed tocommunicate with new physical media and protocols).

6 CONCLUSIONS AND FUTURE WORKIn this paper we presented the design and the implemen-tation of PerLa, a language and infrastructure for datamanagement in pervasive systems, mainly oriented tomonitoring applications, but also suitable as a supportsubstratum for the deployment of autonomic systems.The paper is mainly focused on the issues related tothe heterogeneity of the different devices composinga pervasive system: this aspect is investigated both atdata management and at physical integration levels. Thepresented solution aims at handling these issues using anon trivial approach.

The PerLa architecture is composed of two main el-ements. A declarative SQL-like language has been de-signed to provide the final user with an overall ho-mogeneous view on the whole pervasive system. Theprovided interface is simple and flexible, and it allowsusers to completely control each physical node, maskingthe heterogeneity at the data level.

21

A middleware has been implemented in order tosupport the execution of PerLa queries, and it is mainlycharged to manage the heterogeneity at the physical inte-gration level. The key component is the FPC (Functional-ity Proxy Component), which is a Java object able to rep-resent a physical device in the middleware and to takethe device’s place whenever unsupported operations arerequired. In this paper we have not investigated someoptimisations that can improve the performance of themiddleware and the expressiveness of the language. Ourfuture work will be focused on integrating intelligent in-network processing protocols [52] in the middleware andon extending the language semantics with compressionand data mining operators.

We also are going to extend both the language andthe middleware to support context-based applications.The basic idea is to change the behaviour of the sur-rounding network (e.g., by injecting an actuation queryin the nodes) depending on the actual context. From thelanguage point of view, although the pilot join operationis a first attempt to support context-based queries, weshall extend PerLa with specific context managementoperators. Building upon existing data retrieval capa-bilities of PerLa, these new language features will beable to support context management and context-awareapplications (see [3] and [4]).

From the middleware point of view, a context canbe seen like an agent which, depending on the currentvalidity condition, automatically injects different sets ofqueries into the system. The condition is periodicallychecked by a new component, the Context Manager,which also spawns or disables context agents.

Finally, we plan on integrating some energy-awaredata reduction facilities into the language, in order toreduce the network traffic [53].

ACKNOWLEDGMENTSThis research has been partially funded by the follow-ing research projects: Politecnico di Milano PROME-TEO P2ICT Lab, Italian MIUR-FIRB Art-Deco, EuropeanCommission Programme IDEAS-ERC Project 227077-SMScom.

REFERENCES[1] ARTDECO - project web page - http://artdeco.elet.polimi.it.[2] C. Batini, S. Ceri, and S. Navathe, Conceptual Database Design: An

Entity-Relationship Approach. Benjamin Cummings, 1992.[3] C. Bolchini, C. A. Curino, G. Orsi, E. Quintarelli, R. Rossato, F. A.

Schreiber, and L. Tanca, “And what can context do for data?”Communications of ACM, vol. 52, no. 11, pp. 136–140, November2009.

[4] C. Bolchini, F. A. Schreiber, and L. Tanca, Chapter 6: Data Manage-ment. In Mobile Information Systems, Pernici B. (Ed.), SpringerVerlag, Berlin, 2006, pp. 155–175.

[5] F. A. Schreiber, R. Camplani, M. Fortunato, M. Marelli, andF. Pacifici, “PerLa: A data language for pervasive systems,” inProc. PerCom, 2008, pp. 282–287.

[6] M. Weiser, “The computer for the 21st century,” Scientific Ameri-can, vol. 265, no. 3, pp. 66–75, September 1991.

[7] T. Kindberg and A. Fox, “System software for ubiquitouscomputing,” IEEE Pervasive Computing, vol. 1, no. 1, pp. 70–81,January 2002. [Online]. Available: http://dx.doi.org/10.1109/MPRV.2002.993146

[8] A. Messer, H. Song, D. Cheng, and S. Gibbs,“A classification of pervasive system software,”http://magic.ubc.ca/cmppc/positions/messer.pdf.

[9] H. Salem and M. Nader, “Middleware: Middleware challengesand approaches for wireless sensor networks,” IEEE DistributedSystems Online, vol. 7, no. 3, 2006.

[10] M. T. Ozsu and P. Valduriez, Priciples of Distributed DatabaseSystems. Prentice Hall, 1999.

[11] D. J. Abadi, “Data management in the cloud: Limitationsand opportunities,” IEEE Data Engineering Bulletin, 32(1), pp.3–12, March 2009. [Online]. Available: http://sites.computer.org/debull/A09mar/abadi.pdf

[12] C. A. Mattmann, S. Malek, N. Beckman, M. Mikic-Rakic, N. Med-vidovic, and D. J. Crichton, “Glide: A grid-based light-weightinfrastructure for data-intensive environments,” in Advances inGrid Computing - EGC 2005, ser. Lecture Notes in ComputerScience, P. M. A. Sloot, A. G. Hoekstra, T. Priol, A. Reinefeld,and M. Bubak, Eds. Springer, 2005, vol. 3470, pp. 68–77.

[13] N. Davies, A. Friday, and O. Storz, “Exploring the grid’s potentialfor ubiquitous computing,” IEEE Pervasive Computing, vol. 3, no. 2,pp. 74–75, 2004.

[14] L. W. McKnight, J. Howison, and S. Bradner, “Wireless grids: Dis-tributed resource sharing by mobile, nomadic, and fixed devices,”IEEE Internet Computing, vol. July/August, pp. 24–31, 2004.

[15] C. Hartung, R. Han, C. Seielstad, and S. Holbrook, “Firewxnet:a multi-tiered portable wireless system for monitoring weatherconditions in wildland fire environments,” in MobiSys, 2006, pp.28–41.

[16] A. Mainwaring, J. Polastre, R. Szewczyk, D. Culler, and J. Ander-son, “Wireless sensor networks for habitat monitoring,” in ACMInternational Workshop on Wireless Sensor Networks and Applications,September 2002.

[17] G. W. Allen, J. Johnson, M. Ruiz, J. Lees, and M. Welsh, “Monitor-ing volcanic eruptions with a wireless sensor network,” in Proc.Second European Workshop on Wireless Sensor Networks (EWSN’05),January 2005.

[18] P. Juang, H. Oki, Y. Wang, M. Martonosi, L. S. Peh, and D. Ruben-stein, “Energy-efficient computing for wildlife tracking: designtradeoffs and early experiences with zebranet,” in ASPLOS-X:Proc. 10th int. conf. on Architectural support for programming lan-guages and operating systems, vol. 37, no. 10. New York, NY, USA:ACM Press, October 2002, pp. 96–107.

[19] S. R. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong,“Tinydb: an acquisitional query processing system for sensornetworks,” ACM Trans. Database Syst., vol. 30, no. 1, pp. 122–173,2005.

[20] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong, “Tag:a tiny aggregation service for ad-hoc sensor networks,” SIGOPSOper. Syst. Rev., vol. 36, no. SI, pp. 131–146, 2002. [Online].Available: http://dx.doi.org/10.1145/844128.844142

[21] P. Levis, S. Madden, J. Polastre, R. Szewczyk, K. Whitehouse,A. Woo, D. Gay, J. Hill, M. Welsh, E. Brewer et al., “Tinyos: Anoperating system for sensor networks,” Ambient Intelligence, pp.115–148, 2005.

[22] D. Chu, L. Popa, A. Tavakoli, J. M. Hellerstein, P. Levis, S. Shenker,and I. Stoica, “The design and implementation of a declarativesensor network system,” in SenSys ’07: Proceedings of the 5thinternational conference on Embedded networked sensor systems. NewYork, NY, USA: ACM, 2007, pp. 175–188.

[23] D. Chu, A. Tavakoli, L. Popa, and J. Hellerstein, “Entirely declara-tive sensor network systems,” in VLDB ’06: Proceedings of the 32ndinternational conference on Very large data bases. VLDB Endowment,2006, pp. 1203–1206.

[24] “Sword - business service,” Siemens -http://webdoc.siemens.it/CP/SIS/Press/SWORD.htm, internalCommunication.

[25] K. Aberer, M. Hauswirth, and A. Salehi, “The Global SensorNetworks middleware for efficient and flexible deployment andinterconnection of sensor networks,” Tech. Rep., 2006, sumitted toACM/IFIP/USENIX 7th International Middleware Conference.

[26] ——, “A middleware for fast and flexible sensor network deploy-

22

ment,” in VLDB ’06: Proceedings of the 32nd international conferenceon Very large data bases. VLDB Endowment, 2006, pp. 1199–1202.

[27] T. Luckenbach, P. Gober, S. Arbanowski, A. Kotsopoulos, andK. Kim, “TinyREST: A protocol for integrating sensor networksinto the internet,” in Proc. of REALWSN. Stockholm, Sweden,,2005.

[28] R. Fielding, “Architectural styles and the design of network-basedsoftware architectures,” Ph.D. dissertation, Citeseer, 2000.

[29] “JMS - java message service,” Oracle -http://java.sun.com/products/jms/.

[30] “RMI - remote method invocation,” Oracle -http://java.sun.com/javase/technologies/core/basic/rmi/index.jsp.

[31] “CORBA, the common object request broker architecture,” OMG- http://corba-directory.omg.org/.

[32] J. P. Sousa and D. Garlan, “Aura: An architectural frameworkfor user mobility in ubiquitous computing environments,” in InProceedings of the 3rd Working IEEE/IFIP Conference on SoftwareArchitecture. Kluwer Academic Publishers, 2002, pp. 29–43.

[33] M. Roman, C. K. Hess, R. Cerqueira, A. Ranganathan, R. H.Campbell, and K. Nahrstedt, “Gaia: A Middleware Infrastructureto Enable Active Spaces,” IEEE Pervasive Computing, pp. 74–83,Oct 2002.

[34] T. Kindberg and J. J. Barton, “Towards a real-world wide web,”in ACM SIGOPS European Workshop, 2000, pp. 195–200.

[35] http://www.cascadas-project.org, cASCADAS project web page.[36] S. Bindelli, E. D. Nitto, R. Mirandola, and R. Tedesco, “Building

autonomic components: The selflets approach,” in ASE Workshops,2008, pp. 17–24.

[37] D. Devescovi, E. Di Nitto, D. Dubois, and R. Mirandola, “Self-organization algorithms for autonomic systems in the selfletapproach,” in Autonomics ’07: Proceedings of the 1st internationalconference on Autonomic computing and communication systems, ICST,Brussels, Belgium, 2007, pp. 1–10.

[38] G. Cugola and G. Picco, “REDS: A reconfigurable dispatchingsystem,” in Proceedings of the 6th international workshop on Softwareengineering and middleware. ACM, 2006, p. 16.

[39] “Drools - the business logic integration platform,” JBoss Commu-nity - http://jboss.org/drools.

[40] A. Murphy, G. Picco, and G. Roman, “Lime: A middleware forphysical and logical mobility,” in icdcs. Published by the IEEEComputer Society, 2001, p. 0524.

[41] “OSGi service platform core specification,” O. Alliance -http://www.osgi.org/Download/Release4V41.

[42] D. Chakraborty, F. Perich, A. Joshi, and Y. Yesha, “Middlewarefor Mobile Information Access,” in 5th International Workshop onMobility in Databases and Distributed Systems (MDDS 2002), Aix enProvence, France, September 2002.

[43] F. Perich, S. Avancha, D. Chakraborty, A. Joshi, and Y. Yesha,“Profile Driven Data Management for Pervasive Environments,”in 13th International Conference on Database and Expert SystemsApplications (DEXA 2002), Aix en Provence, France, September2002.

[44] S. Avancha, D. Chakraborty, H. Chen, L. Kagal, F. Perich, andA. Joshi, “Issues in Data Management for Pervasive Environ-ments,” in NSF Workshop on Context-Aware Mobile Database Man-agement, January 2002.

[45] S. Malek, M. Mikic-Rakic, and N. Medvidovic, “A style-awarearchitectural middleware for resource-constrained, distributedsystems,” IEEE Transactions on Software Engineering, vol. 31, no. 3,pp. 256–272, 2005.

[46] L. Golab and M. Ozsu, “Issues in data stream management,” ACMSigmod Record, vol. 32, no. 2, pp. 5–14, 2003.

[47] “JAXB, java architecture for xml binding,” Oracle -http://java.sun.com/developer/technicalArticles/WebServices/jaxb/.

[48] “CAN bus specifications,” ISO Standard 11898 1 2003,http://www.iso.org.

[49] F. A. Schreiber, R. Camplani, M. Fortunato, M. Marelli, andG. Rota, “Design of PerLa, a declarative language and a mid-dleware architecture for pervasive systems,” ARTDECO R.A.11b,DEI Tech. Report 2010.9, pp. 1–141, 2010.

[50] “DSPic, high-performance 16-bit dig-ital signal controllers,” MicroChip -http://ww1.microchip.com/downloads/en/devicedoc/70155c.pdf.

[51] Z. Alliance, “Zigbee specification,” ZigBee Document 053474r06,Version, vol. 1, 2005.

[52] G. Cugola and M. Migliavacca, “A context and content-basedrouting protocol for mobile sensor networks,” in EWSN, 2009,pp. 69–85.

[53] C. Cappiello and F. A. Schreiber, “Quality- and energy-aware datacompression by aggregation in wsn data streams,” in PERCOM’09: Proceedings of the 2009 IEEE International Conference on Perva-sive Computing and Communications. Washington, DC, USA: IEEEComputer Society, 2009, pp. 634–639.

Fabio A. Schreiber is full professor ofDatabases and of Technologies for InformationSystems at the Department of Electronic andInformation Engineering of the Politecnico di Mi-lano. His current research interests include: datamanagement in pervasive systems, DatabaseSystems for context-aware applications, VerySmall and Mobile Database design methodolo-gies and applications. On these and other topicshe authored a hundred papers published in in-ternational journals and conferences.

Romolo Camplani received the Dr.Ing. degreein electronic engineering and the Ph.D. degreein electronic and computer science from Univer-sita’ degli Studi di Cagliari, Cagliari, Italy, in 2003and 2008, respectively, and the M.Eng. degreein embedded system design from the Universita’della Svizzera Italiana, Lugano, Switzerland. Heis currently a Postdoctoral Researcher with Po-litecnico di Milano, Milan, Italy. His research in-terests include wireless sensors networks, adap-tive routing algorithms, and power-aware em-

bedded applications.

Marco Fortunato received the Dr.Eng. degreein computer science engineering from the Po-litecnico di Milano, Milano, Italy, in July 2008.He is a software developer and he is currentlycollaborating with the Dipartimento di Elettronicae Informazione on the development of PerLaproject since 2007. His research interests in-clude databases and data management in per-vasive systems.

Marco Marelli received the Dr.Eng. degree incomputer science engineering from the Politec-nico di Milano, Milano, Italy, in December 2007.He is a software developer and he is currentlycollaborating with the Dipartimento di Elettronicae Informazione on the development of PerLaproject since 2007. His research interests in-clude databases and data management in per-vasive systems.

Guido Rota is a graduate student at Politecnicodi Milano, where he got his Bachelor degree witha thesis on ”A distributed support system fortransportation collision management”. He alsoworks as an Informatic Consultant and SoftwareDeveloper.

Documents

PerLa: a Language and Middleware Architecture for Data ...perlawsn.sourceforge.net/files/documentation/TSE-2010-11-0328.R1_Schreiber.pdffor Data Management and Integration in Pervasive