
Engineering Dynamic Real-Time Distributed Systems: Architecture, System Description ...pdfs.semanticscholar.org/7d61/f261b3618267d5bf57ab36603c... · 2015. 7. 28. · Description



Engineering Dynamic Real-Time Distributed Systems: Architecture, System Description Language, and Middleware

Binoy Ravindran

The Bradley Department of Electrical and Computer Engineering Virginia Polytechnic Institute and State University

Blacksburg, VA 24061 Phone: 540-231-3777, Fax: 540-231-3362, Email: [email protected]

Abstract This paper presents an architectural framework and algorithms for engineering dynamic real-time distributed systems using commercial off-the-shelf technologies. In the proposed architecture, a real-time system application is developed in a general-purpose programming language. Further, the architectural-level description of the system such as composition and interconnections of application software and hardware, and the operational requirements of the system such as timeliness and survivability are specified in a system description language. The specification of the system is automatically translated into an intermediate representation (IR) that models the system in a platform-independent manner. The IR is augmented with dynamic measurements of the system by a language run-time system to produce a dynamic system model. The dynamic model is used by resource management middleware strategies to perform resource management that achieves the timeliness and survivability requirements. The middleware techniques achieve the timeliness and survivability requirements through run-time monitoring and failure detection, diagnosis, and dynamic resource allocation. We present two classes of algorithms—predictive and availability-based—for performing resource allocation. To validate the viability of the approach, we use a real-time benchmark application that functionally approximates dynamic real-time command and control systems. The benchmark is specified in the system description language and the effectiveness of the architecture in achieving its design goals is examined through a set of experiments. The experimental characterizations illustrate that the middleware is able to achieve the desired timeliness requirements during a number of load situations. Furthermore, the results indicate that availability-based allocation algorithms perform resource allocation less frequently, whereas the predictive algorithms give a better steady state performance for the application. 
Keywords: process control systems, command and control, real-time systems, real-time resource management, timeliness, survivability, scalability, quality of service, middleware, system software

1. Introduction

Real-time, military, computer-based, command and control (C2) systems are “dynamic” in the sense that processing and communication latencies do not have known upper bounds and event and task arrivals have non-deterministic distributions. Examples of dynamic real-time systems include the emerging generation of surface combatant systems of the U.S. Navy, which must process sensor reports (radar tracks) and respond to event (threat) arrivals that have neither known upper bounds nor deterministic distributions. Such real-time C2 systems require decentralization because of the physical distribution of application resources and for achieving survivability in the sense of continued availability of application functionality. Because of their physical dispersal, most real-time C2 distributed computing systems are “loosely” coupled using communication paradigms that employ links and routers, resulting in additional uncertainties beyond those caused by the non-deterministic load characteristics of the application. Most past efforts on real-time scheduling and resource management have focused on real-time systems that are static (in both of the above senses) and perform device-level, sampled-data monitoring and regulatory control that is usually centralized [Bak91, LRT92, LSS87, RCF97, RTL93, SB96, SKG91, SLS88, SSL89, TLS96, XP90], but occasionally distributed [CSR86, HS92, Kao95, KDK89, RSZ89, Shi91, SR91, SRC85, Ver95, WSM95]. These techniques cannot be practically employed or adapted for systems that are dynamic [Jen99, Koob96, Sta96, SK97]. Dynamic real-time computer systems and their applications have workload characteristics that are inherently a posteriori. Thus, such systems require adaptive resource management techniques that adapt the application to workload changes, counter uncertainties, and achieve the real-time and survivability requirements. We argue that the conventional "hard"/"soft" dichotomy is too coarse, and that timeliness is better treated as a quality of service (QoS) for activities that occur in such systems.
The paucity of concepts, methodologies, and techniques that allow engineering of, and reasoning about, dynamic real-time computing systems—and the consequent absence of appropriate commercial real-time operating system and system software (e.g., middleware) products—have forced system engineers to “home-brew” and construct ad hoc, special-purpose, and consequently expensive, technologies and solutions.1 Furthermore, such systems usually have life cycles that span decades. During such extended life cycles, new and more challenging scenarios—in target identification, reaction time, command and control, and tracking and reaction accuracy—emerge. Since system requirements such as timeliness and survivability are addressed in an application-specific and platform-specific manner and with custom-built computing resources, it is difficult to meet the new requirements without significant changes to the application and the underlying computing platform. To meet the altered requirements in an economically viable manner, the system should be easily evolvable to integrate new types of resources and upgradeable to more efficient resource management techniques without introducing undue change to the application. Thus, engineers of dynamic real-time systems are confronted with the problem of building the system such that (1) the operational requirements of the system, such as timeliness and survivability, can be achieved and (2) the evolution of the system can be facilitated in a cost-effective manner as the system undergoes changes to its requirements and technology foundations. To address these problems, in this paper we present a resource management architecture and algorithms for engineering dynamic real-time distributed systems. In the proposed architecture, a system description language is used to describe the structural composition and interconnections of the system (application software and hardware), the timeliness and survivability requirements of the system as desired QoS, and characteristics of the underlying hardware platform. An abstract model of the system is automatically constructed from the language specifications and is augmented with dynamically measured performance characteristics of the system by a language run-time system. The dynamic system abstractions characterize the state of the system, such as utilization levels of resources, and are used by resource management middleware strategies to reason about available resources, discover resources, and perform resource allocation at run-time. The system description language facilitates the evolvability of the system: it separates application and platform specifics from the resource management strategies. The abstract model constructed from the language specifications is an intermediate representation (IR) that is independent of the programming language used to develop the application as well as of features specific to the underlying hardware platform.

The goal of resource management is to adapt the application to workload changes so that acceptable levels of QoS can be achieved. We present resource management techniques that achieve the timeliness and survivability requirements by monitoring the application at run-time for its adherence to the desired requirements. Furthermore, the techniques detect situations in which the application exhibits failures, or trends toward failures, in meeting the requirements; perform diagnosis to identify the causes of the failures or deteriorating situations; determine recovery actions and resources for actions that will improve or recover from the situations; and enact the actions.

1 A good example is the U.S. Navy’s surface war-fighting resource—the Aegis fleet—which has stringent demands for real-time performance, interfaces with a multitude of sensor and weapon systems, and uses military standard computer technology [WRH+96].
The resource management techniques employ adaptation strategies such as replication of application programs for load sharing and migration of programs from overloaded resources to recover from failures or improve from deteriorating situations. We present two classes of algorithms—predictive and availability-based—for performing resource allocation. We study the effectiveness of the approach through implementation and an experimental performance evaluation using a real-time benchmark that functionally approximates dynamic real-time C2 systems. The resource management techniques are implemented as part of a middleware infrastructure that uses commercial off-the-shelf (COTS) components — hardware, operating system, programming language, and networking technology. The benchmark is specified in the system description language and the effectiveness of the architecture in achieving its design goals is examined through a set of experiments. The experimental performance characterizations illustrate that the middleware is able to achieve the desired timeliness and survivability requirements during a number of load situations. Furthermore, the results indicate that availability-based allocation algorithms perform resource allocation less frequently, whereas the predictive algorithms give a better steady state performance for the application. Thus, the contributions of the paper are as follows:

1. A resource management architecture for engineering dynamic real-time distributed systems.

2. A system description language that allows an architectural-level description of dynamic real-time distributed systems and their operational requirements, such as timeliness and survivability, as desired QoS.


3. Resource management middleware algorithms that detect, diagnose, and recover from deteriorating timeliness and survivability QoS situations through run-time reallocation of resources.

4. Illustration of the relative merits of two different classes of algorithms—predictive algorithms and availability-based algorithms—for performing timeliness QoS management.

2. Scope of the Work

The work presented in this paper is part of a prototyping effort in producing solutions for engineering the future surface combatants of the U.S. Navy [HiPerD, Quo97]. Therefore, we are strongly motivated by the characteristics of Navy combatant systems in our effort. We summarize the characteristics of the application and the assumptions that we have made in the design and construction of the resource management architecture as follows:

1. Replication of application program components is employed to achieve application-level scalability. Programs are made scalable by sharing load among replicas. Individual replicas self-schedule their own portion of the load by considering the total number of replicas that are currently in the system. Replication of program components is automatically performed by the middleware.

2. The states of the replicas and their consistency are not addressed in this work, as we assume that the programs process data objects that are “continuous” in the sense that their values are obtained directly from a sensor in the application environment, or computed from the values of other such objects. The replicas are thus assumed to be temporally consistent (i.e., sufficiently up-to-date) without applying every change in value, due to the continuity of the physical phenomena.

3. Migration of application program components between a set of host machines is also employed to adapt the application to workload changes. When an application program is migrated between hosts, the program components that communicate with the migrated program automatically adapt to the situation by “pausing” until the migrant program can receive and send messages. Migration of program components is automatically performed by the middleware.

4. Survivability is achieved by maintaining replicas of application program components on host machines that belong to different zones. Zones are clusters of computing system resources that exhibit different failure properties such as low probabilities for the simultaneous failure of multiple zones. When a replica terminates abnormally, a new replica is automatically started by the middleware.

5. The hardware platform used by the middleware consists of a set of COTS host machines (Sun Ultra workstations running the Solaris OS) that are interconnected by an Ethernet network. The clocks of the machines are synchronized using the Network Time Protocol (NTP) Version 3 [Mill95]. Further, the platform is dedicated to the use of real-time applications.
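Assumption 1 above implies that each replica can determine its share of the load locally, from only its own index and the current replica count. The following is a minimal sketch of such self-scheduling, assuming a hypothetical modulo partitioning of sequence-numbered data items; the function names are ours, not part of the middleware API:

```c
#include <assert.h>

/* Hypothetical sketch of replica self-scheduling: with n_replicas
 * replicas currently registered with the middleware, replica
 * my_index (0-based) claims every data item whose sequence number
 * maps to it modulo the replica count.  When the middleware adds or
 * removes a replica, the stream is redistributed simply by updating
 * n_replicas at every replica. */
static int replica_owns(unsigned seq_no, int my_index, int n_replicas)
{
    return (int)(seq_no % (unsigned)n_replicas) == my_index;
}

/* Number of the first `total` items that this replica processes. */
static int my_share(int my_index, int n_replicas, unsigned total)
{
    int count = 0;
    for (unsigned s = 0; s < total; s++)
        if (replica_owns(s, my_index, n_replicas))
            count++;
    return count;
}
```

With four replicas, each replica claims a quarter of the stream without any coordination beyond agreeing on the current replica count.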

Thus, the application is constructed with features that enable it to adapt to workload changes or recover from abnormal failures to achieve its timeliness and survivability requirements. Further, the application is built to utilize the resource management services of the middleware, which automatically replicate, migrate, and restart application components, keep the components synchronized, and seamlessly allow operation to continue, thereby achieving the requirements. Furthermore, the application is built to utilize the communication service of the middleware for its inter-application program communication needs. The communication libraries of the middleware encapsulate some simple semantics that must be adhered to by the application (Section 7.2 discusses them).

3. Organization of the Paper

The rest of the paper is organized as follows: Section 4 presents a generic real-time C2 system. The generic system is used to reason about dynamic real-time systems and their load characteristics. We present the resource management architecture in Section 5. Section 6 presents the resource management middleware and Section 7 illustrates the system description language. The resource management model is presented in Section 8, and algorithms for each stage of the resource management process are described in Section 9. An experimental characterization of the resource management algorithms is presented in Section 10. We compare and contrast our work with related efforts in Section 11. Finally, the paper concludes with a summary of the work, its contributions, and current limitations in Section 12.

4. A General Real-Time C2 System

Figure 1 shows a generic real-time C2 system. The control system consists of tasks that perform assessment of the environment, initiation of actions, and monitoring and guidance of the actions to their successful completion. The inter-relationship of the tasks with the environment and the intra-relationship of the tasks among themselves are illustrated in Figure 1.

Figure 1. A Generic Real-Time C2 System (figure: within the real-time system, sensors deliver data and data streams from the operating environment to the assessment task (periodic) and the guidance task (transient-periodic); the assessment task raises an event that activates the initiation task (transient), which commands actuators to perform an action; the guidance task exchanges events with the initiation task and controls the action to completion)

The assessment task periodically collects data from the environment using hardware sensors. The data is filtered, correlated, classified, and then used to determine the necessity of an action by the system. When an action is necessary, the task generates an event that activates the initiation task.
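The assessment cycle described above amounts to a periodic filter-classify-decide step. The following sketch assumes a simple threshold classifier as a stand-in; the paper does not detail the real filtering, correlation, and classification logic:

```c
#include <stddef.h>

/* Hypothetical sketch of one assessment cycle: filter the raw sensor
 * reports, classify them, and decide whether to generate the event
 * that activates the initiation task.  The threshold classifier is
 * our stand-in for the real correlation and classification logic. */
#define THREAT_THRESHOLD 80

/* Returns 1 if any valid report classifies as a threat, i.e., if the
 * initiation task must be activated this cycle; 0 otherwise. */
static int assess_cycle(const int *reports, size_t n_reports)
{
    size_t i;
    for (i = 0; i < n_reports; i++) {
        if (reports[i] < 0)                 /* filter: drop invalid reports */
            continue;
        if (reports[i] >= THREAT_THRESHOLD) /* classify as threat */
            return 1;                       /* raise activation event */
    }
    return 0;
}
```

The periodic driver (omitted) would invoke `assess_cycle` once per sensor scan; the number of reports per cycle, i.e., the data stream size, is exactly the unbounded quantity discussed below.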


The initiation task determines the action that needs to be taken and causes actuators to perform the action. Since the task executes in response to an event that can occur at any time, the initiation task has a transient behavior. Upon initiation of the action by the actuators, the guidance task is notified. The guidance task repeatedly uses sensors to collect data, to monitor the actions that were initiated, and to guide the actuators to successful completion of the actions. Note that the activation of the guidance task begins and terminates aperiodically, and once active, it executes periodically. Thus, the guidance task has a transient-periodic behavior. The real-time requirements of the tasks include deadlines for the completion of each periodic or transient instance of task execution. Furthermore, guidance tasks have a de-activation time, i.e., an activation deadline.

After a careful study of the AAW (anti-air warfare) real-time C2 system, we have observed that the resource needs of the system—the tasks—are significantly influenced by the sizes of the data and event streams [WRH+96]. The size of the data stream refers to the number of data items (sensor reports) that the assessment and guidance tasks have to process during a single cycle, and the size of the event stream refers to the arrival rate of events that trigger the execution of the initiation and guidance tasks. For systems such as the AAW system, data stream sizes (radar tracks) and event (threat) arrivals have neither known upper bounds nor deterministic distributions. Thus, dynamic real-time systems are real-time systems that have (1) unknown upper bounds on the size of data streams processed by periodic and transient-periodic tasks during a single execution instance and (2) non-deterministic distributions for the arrival rates of events that trigger the execution of transient and transient-periodic tasks. Observe that the generic model of real-time C2 systems that we have presented here does not capture the distributed nature of such systems.
The objective of presenting the model is only to reason about dynamic real-time systems. However, real-time C2 systems are often distributed due to the physical dispersal of application resources (e.g., radars and weapon launchers in a combat system) and for achieving survivability.

5. The Resource Management Architecture

Figure 2 presents the resource management architecture. In the proposed architecture, the real-time application is developed in a general-purpose programming language (e.g., C, C++, Ada) and a system description language is used to specify the architectural-level description of the application and its non-functional requirements. The system description language provides concrete abstractions to describe the architectural-level properties such as composition and interconnections of application software and hardware resources, and its timeliness and survivability requirements as desired QoS. An abstract, static model of the system—an intermediate representation or IR—is automatically constructed from the language specifications by a compiler of the language. The static IR is augmented with dynamically measured application performance characteristics by the language run-time system. The dynamic IR characterizes the state of the application and is used by a resource management middleware. The middleware uses the dynamic IR for delivering the desired QoS to the application.
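One way to picture the split between the static IR and the dynamic IR is as a record whose configuration fields come from the specification compiler and whose measurement fields are overwritten by the run-time system. The field names below are illustrative only, not the paper's actual IR schema:

```c
/* Illustrative sketch of one node of the dynamic IR: the static
 * fields are produced by the specification compiler, and the dynamic
 * fields are overwritten at run-time by the language run-time
 * system.  Field names are ours, not the actual IR schema. */
struct ir_program {
    /* static IR (from the system description language) */
    const char *name;
    double      deadline_ms;          /* desired timeliness QoS      */
    /* dynamic IR (from run-time measurement) */
    double      measured_latency_ms;  /* observed end-to-end latency */
    int         n_replicas;           /* replicas currently running  */
};

/* A program violates its timeliness QoS when its measured
 * end-to-end latency exceeds its specified deadline. */
static int violates_deadline(const struct ir_program *p)
{
    return p->measured_latency_ms > p->deadline_ms;
}
```

Because the middleware reads only such platform-independent records, its allocation strategies need not change when the application language or hardware platform changes.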


Figure 2. The Resource Management Architecture (figure: off-line, the application source code is compiled and linked with system libraries and the middleware communication and GUI libraries into the application executable, while the system specification is translated by the specification compiler into the static IR; on-line, the language run-time system augments the static IR into the dynamic IR, which feeds the resource management middleware in the application environment)

The application interfaces with the system description language run-time system and the middleware through application program interfaces (APIs). API calls that measure the current time from local clocks and transmit time-stamped messages to the language run-time system are inserted into the application code. Furthermore, APIs that provide communication services and produce GUI displays are also inserted into the application code. The application is compiled and linked with the APIs to produce the final executable. The system description language promotes evolvability of the system, as the specifications can be changed when changes occur to the operating environments, requirements, and computing resources of the application. The specification can be modified and the IRs can be automatically regenerated from the specification when application characteristics change. Furthermore, the resource abstractions constructed and maintained by the language run-time system model the characteristics of the underlying resources in a platform-independent manner. This decouples the resource management techniques that deliver the desired QoS from the specifics of the application and the underlying platform, and facilitates the construction of resource management technology that is “portable” across platforms. The language run-time system and its interface with the resource management middleware are shown in Figure 3. The core components of the run-time system include a parser, a system data broker, software monitors, and hardware monitors. The functionality of each of the components is summarized as follows:

Figure 3. The System Description Language Run-time System

The system data broker is responsible for collecting and maintaining all application information. The parser is the front-end to the system data broker. It reads a description of the application and its requirements that is expressed using the abstractions of the language and builds the data structures that model the system. Dynamically measured QoS metrics are collected and maintained by the software monitors. The system data broker obtains measurements of the dynamic attributes of the application program components from the software monitors. Performance metrics of hardware resources are collected and maintained by the hardware monitors. The metrics are transmitted to the data broker on demand as well as periodically. The hardware monitors consist of a set of host monitor daemons, a hardware broker program, and a hardware analyzer program. There is one host monitor daemon per host machine. Host monitor daemons collect host-level metrics, such as CPU idle time, CPU ready-queue length, and free available memory for each host in the system, and network-level metrics, such as the number of communication packets that are sent out through, and received at, the network interfaces of hosts in a network. The daemons periodically send the low-level metrics of hardware resources to the hardware broker. The hardware broker thus becomes a repository of “raw” hardware performance information. The broker periodically sends the raw metrics to the hardware analyzer. The hardware analyzer computes higher-level metrics, such as exponential moving averages, trend values, and aggregate metrics, from the low-level metrics. The metrics are computed for both host and network hardware resources. The hardware analyzer provides this data to the resource management middleware.

The software monitors consist of a set of QoS manager programs. The QoS managers monitor application-level (end-to-end) QoS metrics and alert the resource management middleware to degrading QoS situations. There is one QoS manager per task that has an end-to-end timeliness QoS requirement. Each QoS manager receives time-stamped event tags from application programs, transforms them into application-level QoS metrics, and evaluates the metrics for degrading QoS. When a task receives low timeliness QoS, the QoS manager of the task detects the situation and performs diagnosis to determine the components of the task that are causing the low QoS. The QoS manager notifies the resource management middleware of the low QoS situation and the result of the diagnosis.

6. The Resource Management Middleware

We present the overall resource management process and the architecture of the middleware in Section 6.1. Section 6.2 discusses the implementation of the middleware and the language components.

6.1 The Resource Management Process and the Middleware Architecture

The resource management process is shown in Figure 4 and is summarized as follows: detection of low QoS situations is performed by the language run-time system. The run-time system performs diagnosis to determine the causes of the low QoS and notifies the middleware of the low QoS situation and the result of the diagnosis. The middleware analyzes the result of the diagnosis to identify possible recovery actions that can improve the QoS to acceptable levels. Furthermore, resource allocation is performed to allocate hardware resources to execute the selected recovery actions. Finally, the middleware enacts the resource reallocation decision on the application.
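The four stages can be sketched as a single decision pass. In this sketch, detection is reduced to a latency/deadline comparison and analysis to a hypothetical host-utilization threshold; neither the threshold nor the names below are prescribed by the paper:

```c
/* Sketch of one pass through the four-stage resource management
 * process for a single timeliness QoS requirement.  Stages 1-2
 * (monitor, diagnose) run in the language run-time system; stages
 * 3-4 (analyze, allocate) run in the middleware. */
enum rm_action { NO_ACTION, REPLICATE_PROGRAM, MIGRATE_PROGRAM };

/* Returns the reallocation decision (stage 4) that the caller then
 * enacts on a target host. */
static enum rm_action manage_once(double latency_ms, double deadline_ms,
                                  double host_utilization)
{
    /* Stage 1 (monitor): no deadline miss, nothing to do. */
    if (latency_ms <= deadline_ms)
        return NO_ACTION;
    /* Stage 2 (diagnose): the deadline was missed.
     * Stage 3 (analyze): if the host itself is saturated, move the
     * program off it; otherwise the program is the bottleneck, so
     * replicate it for load sharing. */
    if (host_utilization > 0.9)
        return MIGRATE_PROGRAM;
    return REPLICATE_PROGRAM;
}
```

A real pass would also select the target hosts for the chosen action, which is where the predictive and availability-based allocation algorithms of Section 9 differ.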

Figure 4. The Resource Management Process

(Figure 4 shows the four stages as a cycle: 1. Monitor detects a low QoS situation; 2. Diagnose determines the causes of the low QoS; 3. Analyze selects recovery actions; 4. Allocate produces the resource reallocation decision.)

The architecture of the middleware is shown in Figure 5. The core component of the middleware is the resource manager. It is activated when tasks miss their deadlines and when system components such as hosts and application programs fail abnormally. In response to these events, it reallocates resources to improve the QoS delivered to the application. Typical actions taken by the resource manager include replicating the “bottleneck” application programs of an “overloaded” task, migrating application programs from heavily loaded resources to less loaded resources, restarting an application program that has failed, and restarting the collection of application programs that were executing on a failed host. The resource manager uses the application state information provided by the language run-time system to determine its resource allocation decisions.
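One of the resource manager's recovery decisions, restarting a failed replica on a host in a different zone (per the survivability assumption in Section 2), can be sketched as follows; the host/zone representation is ours, not the middleware's:

```c
#include <stddef.h>

/* Sketch: when a replica fails, the resource manager restarts it,
 * preferring a host in a different zone than the surviving replicas,
 * per the zone-based survivability assumption of Section 2. */
struct host { int id; int zone; };

/* Pick a restart host whose zone differs from avoid_zone; fall back
 * to any host if every available host is in that zone.  Returns the
 * chosen host id, or -1 if no hosts are available. */
static int pick_restart_host(const struct host *hosts, size_t n,
                             int avoid_zone)
{
    if (n == 0)
        return -1;
    for (size_t i = 0; i < n; i++)
        if (hosts[i].zone != avoid_zone)
            return hosts[i].id;     /* different zone: preferred */
    return hosts[0].id;             /* fallback: same zone only  */
}
```

A production policy would additionally weigh host load, which is what distinguishes the availability-based allocation algorithms evaluated later in the paper.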

Figure 5. The Middleware Architecture (figure: the resource manager receives timeliness and survivability QoS violation notifications and system performance metrics from the system description language run-time system; sends resource reallocation decisions, i.e., requests to replicate, migrate, or restart application programs, to the program control component; and exchanges application status and user commands with a human-computer interface)

The program control component consists of a central control program and a set of startup daemons. When the resource manager needs to start an application program on a host, it informs the control program, which then notifies the startup daemon on the host. Each host contains a startup daemon, which starts and terminates programs on the host at the request of the control program. The startup daemons are also responsible for notifying the control program when failures of application programs occur. In the event of an application program failure, the startup daemon on the host of the failed program notifies the control program, and the control program, in turn, alerts the resource manager.

6.2 Implementation of the Middleware and the System Description Language We implemented the components of the system description language and the middleware in C. The concrete syntax of the language is described using a context-free grammar. We built a parser for the language using the Lex and YACC compiler-writing tools.

[Figure 5 depicts the middleware components and their interactions: the real-time control system exchanges time-stamped messages and application software and hardware performance data with the system description language run-time system, which obtains global time from the Network Time Protocol and reports system performance metrics and timeliness and survivability QoS violation notifications to the resource manager; the resource manager issues resource reallocation decisions (requests to replicate, migrate, or restart application programs) to the program control component, which sends control messages for application program control; a human computer interface displays application status (program-host mapping, performance data) and accepts user commands.]


The language run-time system and the middleware components communicate using the communication protocol stack (TCP/IP) of the underlying operating system. Furthermore, the middleware provides a simple communication service that is used by the application program components. The service is provided to the application as a set of initialize, send, and receive API library functions. The functions abstract the communication semantics from the application. The core communication semantics for an application program component include: (1) registering with the middleware before starting execution, (2) obtaining from the middleware the names and addresses of the host machines of the sender and receiver application programs of the component before transmitting messages between them, and (3) having the middleware transparently change the sender and receiver host machines when the sender and receiver application programs of the component are replicated, migrated, or restarted to different host machines at run-time as a result of resource management decisions. These semantics are built into the API functions provided to the application. Application program components also collect time-stamps and send them to the middleware through middleware APIs. The APIs record the current time from the local clock of the host machine through operating system functions.

7. The System Description Language We now present the abstractions of the system description language. For brevity, we only illustrate the specification of a subsystem and its QoS objectives.2 We focus on the core language abstractions that describe the application software within a single subsystem.3 The abstractions are illustrated by presenting the concrete syntax and the corresponding example specifications in a depth-first manner.
Figure 6 illustrates the grammar of a subsystem and Figure 7 illustrates a corresponding example specification (without the rules and example specification for the non-terminal <path-defn>). Specification of a subsystem includes describing its name, relative priority, and the set of application programs, devices (sensors and actuators), and end-to-end tasks that constitute the subsystem. The specification of an application program includes describing its name and attributes such as boolean properties that indicate whether the application program (1) can be replicated for survivability ("Survivable"), in which case the replicas are maintained in different zones, and (2) can be replicated for scalability ("Scalable") to adapt the application program to increases in workload. An application program may combine its input stream ("Combining"), which may be received from different predecessor application programs and devices, and split its output stream so that it may be distributed to different successor application programs and devices ("Splitting"). These properties are also specified. 2 A real-time C2 system may consist of a collection of subsystems, where subsystems may differ in their functional objectives [Wel97]. 3 Abstractions for describing hardware can be found in [Rav98].


The performance properties of the application program that are obtained by static profiling are also described ("PerformanceProperties"). The property that is considered here is the execution latency of the application program.4 The property is described as a function that accepts arguments that include any combination of (1) the data stream size that the program will process, (2) the CPU utilization of the processor on which the program will be executed, and (3) the memory utilization of the processor on which the program will be executed. The function returns the execution latency of the program for the corresponding input parameters.5 The function is described as the name of a dynamically linked library procedure ("PerFunction") that must be called with the specified arguments ("Args") and will return the specified value ("ReturnValue").

<sw-sub-system>:: Subsystem ID "{"
    Priority INT_LITERAL ";"
    {<appln-defn>}+
    <device-defn>
    {<path-defn>}*
    <connectivity-descr> "}"

<appln-defn>:: Application ID "{"
    Survivable BOOL_STR ";"
    Scalable BOOL_STR ";"
    Combining BOOL_STR ";"
    Splitting [NONE | EQUAL | STR] ";"
    <perf-properties>
    {<startup-block>}+
    {<shutdown-block>}+ "}"

<perf-properties>:: PerformanceProperties {<perf-func>}+ | λ
<perf-func>:: PerFunctionType <perf-func-type> "{"
    PerFunction STR ";"
    Args {<perf-func-arg-type> ","}* <perf-func-arg-type> ";"
    ReturnValue {<perf-func-return-val>} ";" "}"
<perf-func-type>:: EXECLAT
<perf-func-arg-type>:: DS | CPUTLZ | MEMUTLZ
<perf-func-return-val>:: EL
<device-defn>:: Device {ID ","}* ID ";" | λ
<connectivity-descr>:: Connectivity "{" {<graph-defn>}+ "}"
<graph-defn>:: <pair-wise-descr> | <complete-graph-descr> | ID
<pair-wise-descr>:: "(" ID "," ID ")"
<complete-graph-descr>:: "[" {ID}+ "]"

Figure 6. Grammar for Describing Software Subsystems
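The <complete-graph-descr> shorthand in the grammar can be expanded mechanically into the equivalent pairwise form. A minimal C sketch, assuming nodes are numbered 0..n−1 and that the shorthand denotes all ordered pairs of distinct members (the struct and function names are illustrative, not part of the toolset):

```c
#include <assert.h>

struct edge { int from; int to; };

/* Hypothetical expansion of "[v0 v1 ...]" into pairwise edges (vi, vj),
 * i != j. Returns the number of edges written, or -1 if the output
 * buffer is too small. */
int expand_complete_graph(int n_nodes, struct edge *out, int cap)
{
    int count = 0;
    for (int i = 0; i < n_nodes; i++)
        for (int j = 0; j < n_nodes; j++) {
            if (i == j) continue;
            if (count >= cap) return -1;
            out[count].from = i;
            out[count].to = j;
            count++;
        }
    return count; /* n_nodes * (n_nodes - 1) ordered pairs */
}
```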

Application program attributes also include all information that is necessary to start and terminate the execution of application programs. These are not elaborated here. The connectivity of the subsystem describes the flow of data and events between the application programs and devices of the subsystem and is described as a sequence of ordered application program and device name pairs within parentheses. Alternatively, one may use the square-bracket set notation, which is a shorthand for a complete graph. Figure 8 shows the rules for the non-terminal <path-defn>. The non-terminal <path-defn> allows for describing the non-functional requirements of real-time C2 systems in terms of end-to-end "paths" through application programs. A path is a convenient functional abstraction for

4 However, this can be extended to include properties such as CPU utilization and memory utilization.
5 We describe how this function is determined in Section 9.


expressing end-to-end requirements such as timeliness and survivability as desired QoS in real-time C2 systems [Wel97]. The specification of a path includes describing its name, set of constituent application programs, attributes, timeliness and survivability requirements, and data and event stream definitions. The rules for the non-terminal <stream-defns> are not shown in Figure 8.

Subsystem AAW {
    Priority 3;

    Application FilterManager {
        Survivable YES;
        Scalable NO;
        Combining NO;
        Splitting EQUAL;
        PerformanceProperties {
            PerFunctionType EXECLAT {
                PerFunction "FilterManagerExecLat";
                Args DS, CPUTLZ;   // DS: # of data items, CPUTLZ: PERCENT
                ReturnValue EL;    // EL: in milliseconds
            }
        }
        Startup {..};
        Shutdown {..};
    }

    Device Sensor, Actuator;
    ..........

    Connectivity {
        (Sensor, FilterManager)
        (FilterManager, Filter)
        (Filter, EDManager)
        (EDManager, ED)
    };

Figure 7. An Example (Partial) Specification of a Software Subsystem

The attributes of a path include its priority, type, and importance. The path type, which defines the execution behavior of the path, is periodic, transient, or transient-periodic. The importance attribute (a string) is interpreted as the name of a dynamically linked library procedure that must be passed arguments that include the priority and the current time, and that returns an integer value representing the importance of the path. Rules for specifying timeliness and survivability requirements as QoS requirements are also shown in Figure 8. We specify timeliness requirements as desired QoS by describing the requirements and the minimum and maximum slack values that must always be maintained on them. For a given requirement, the minimum slack value defines the maximum increase that is allowed for the observed performance measure against the requirement; the maximum slack value defines the maximum decrease that is allowed. Thus, the slack values define a range of acceptable values for the performance measure, as opposed to a single discrete value that is considered acceptable. In addition to describing slack values on the requirement, the QoS specification also includes abstractions for monitoring the requirement and for detecting low QoS situations. Thus, for each requirement, we also describe the size of a moving "window" of performance measurements and the maximum number of acceptable violations of the slack values within that window. A timeliness QoS specification (rule <Real-TimeQoS>), therefore, has two parts: (1) specification of timing constraints (rule <RTQoS-metric>) and (2) specification of the corresponding slack and monitoring values (rule <threshold>). Timing constraints include simple deadlines, inter-processing times, throughputs, and super-period deadlines. A simple deadline is defined as the maximum end-to-end path latency during one cycle of a periodic or transient-periodic path, or during an activation of a transient path. Inter-processing time is defined as the maximum time that is allowed between processing of a data element in the data stream of a periodic or transient-periodic path in successive cycles. The throughput requirement is defined as the minimum number of data items that a periodic or transient-periodic path must process during a unit period of time. A super-period deadline is defined as the maximum allowed latency for all cycles of a transient-periodic path during a single activation. A super-period deadline is specified as the name of a dynamically linked library procedure that is dynamically invoked to determine the estimated super-period deadline.

<path-defn>:: Path ID "{"
    <appn-set> <path-attribs> <Real-TimeQoS> <SurvivabilityQoS> <stream-defns> "}"
<appn-set>:: Contains "{" {<app-name> ","}* <app-name> "}"
<app-name>:: ID
<path-attribs>:: Priority INT_LITERAL ";" Type <path-type> ";" Importance STR ";"
<path-type>:: Periodic | Transient | Transient-Periodic
<Real-TimeQoS>:: Real-TimeQoS "{" {<RTQoS-metric>}+ "}"
<RTQoS-metric>:: SimpleDeadline FLOAT_LITERAL ";" <threshold>
    | Inter-ProcessingTime FLOAT_LITERAL ";" <threshold>
    | Throughput INT_LITERAL ";" <threshold>
    | Super-PeriodDeadline STR ";" <threshold>
<threshold>:: MaxSlack INT_LITERAL ";" MinSlack INT_LITERAL ";"
    MonitorWindowSize INT_LITERAL ";" Violations INT_LITERAL ";"
<SurvivabilityQoS>:: SurvivabilityQoS "{" Survivable BOOL_STR ";" MinCopies INT_LITERAL "}"

Figure 8. Grammar for Describing Paths

We specify survivability requirements as desired QoS (rule <SurvivabilityQoS>) by describing whether the path should be managed to ensure survivability using a boolean variable and the minimum required level of redundancy. Replicating a path entails replicating all of the application programs that constitute the path. The replicas of the application programs of the path must be maintained in different zones to achieve survivability. An example path specification without stream definition and slack and monitoring values for inter-processing time and throughput requirements is given in Figure 9. The “Sensing” path is periodic and has a priority of two. It has a simple deadline of four seconds, which means that each review cycle must not have a latency that exceeds this amount. The “MaxSlack” and “MinSlack” definitions further constrain this QoS requirement to the interval [0.8 seconds, 3.2 seconds]. If fifteen out of the most recent twenty cycles violate this requirement, then corrective actions through resource management techniques are required. If the upper bound is exceeded, the corrective action would be to allocate more resources for the path. In the case where the lower


bound is exceeded, the action would be to de-allocate some of the resources of the path. The path inter-processing time is seven seconds, which requires that no more than this amount of time elapse between reviews of a data stream element in successive cycles of the path. The path throughput must be at least two hundred data stream elements per second. The path also requires the existence of at least two copies of each of its member application programs at any given time.

Path Sensing {
    Contains { Sensor, FilterManager, Filter, EvalDecideManager, EvalDecide }
    Priority 2;
    Type Periodic;
    Importance "SensingImport";
    Real-TimeQoS {
        SimpleDeadline 4;            // secs
        MaxSlack 80; MinSlack 20;    // PERCENTAGE
        MonitorWindowSize 20;
        Violations 15;
        Inter-ProcessingTime 7;      // seconds
        ....
        Throughput 200;              // data elements per second
        ....
    }
    SurvivabilityQoS { Survivable YES; MinCopies 2 }
    ....
}

Figure 9. An Example (Partial) Path Specification
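The slack arithmetic behind the [0.8 second, 3.2 second] interval can be made concrete. The following minimal C sketch (function names are illustrative, not part of the middleware) computes the acceptable-latency interval implied by a deadline and the slack percentages:

```c
#include <assert.h>

/* Acceptable-latency interval implied by the slack specification:
 * slack = (deadline - observed) / deadline * 100 must lie in
 * [MinSlack, MaxSlack], so the observed latency must lie in
 * [deadline*(1 - MaxSlack/100), deadline*(1 - MinSlack/100)].
 * Function names are illustrative. */
double latency_lower_bound(double deadline, double max_slack_pct)
{
    return deadline * (1.0 - max_slack_pct / 100.0);
}

double latency_upper_bound(double deadline, double min_slack_pct)
{
    return deadline * (1.0 - min_slack_pct / 100.0);
}
```

For the Sensing path above (deadline 4 seconds, MaxSlack 80, MinSlack 20), this yields the interval [0.8 seconds, 3.2 seconds].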

<stream-defns>:: <DataStream-defn> | <EventStream-defn> | <DataStream-defn> <EventStream-defn>
<DataStream-defn>:: DataStream "{" Type <stream-type> ";" <env> "}"
<EventStream-defn>:: EventStream "{" Type <stream-type> ";" <env> "}"
<env>:: <stream-attrib> <env-descr> | λ
<stream-type>:: Deterministic | Stochastic | Dynamic
<stream-attrib>:: Size | Rate
<env-descr>:: INT_LITERAL | "(" INT_LITERAL "," INT_LITERAL ")" | <pdf-descr>
<pdf-descr>:: STR | FILENAME

Figure 10. Grammar for Describing Stream Properties

Figure 10 shows the grammar for describing stream properties. A corresponding specification is illustrated in Figure 11. The stream type can be deterministic, stochastic, or dynamic. The data stream size or event arrival rate of a deterministic stream is a constant (scalar or an interval). A stochastic stream has a data stream size or an event arrival rate that is characterized by a probability distribution function. The distribution is described as a string containing the name of a distribution and its parameters, or is described as the name of a data file containing a data set that characterizes the behavior of the stream. The data stream size or event arrival rate of a dynamic stream is not described in the specification, since it must be determined at run-time.

DataStream { Type Deterministic; Size 40 }
DataStream { Type Stochastic; Size "/home/red/MGDataStream.data" }
EventStream { Type Dynamic; }
EventStream { Type Stochastic; Rate "Exponential 0.5" }

Figure 11. Example Stream Specifications


8. The Resource Management Model We now present the resource management model. The model is mathematically formalized to (1) reason about the system, (2) give a formal semantics to the abstractions of the system description language, (3) facilitate the presentation of resource management problems,6 and (4) facilitate the presentation of resource management algorithms. We discuss the static and dynamic aspects of the resource management model in the subsections that follow.

8.1 The Static Model The static aspects of the system include architectural-level properties of the application software such as composition and interconnections, stream properties such as statically known characterizations of sizes and rates, QoS requirements, and architectural-level characteristics of the hardware system such as composition, interconnections, and partitions. For brevity, we omit a detailed description of the hardware system.

8.1.1 Software System A software subsystem SS consists of a set of application programs SS.A = {a_1, a_2, ...}, a set of devices (sensors and actuators) SS.D = {d_1, d_2, ...}, a communication graph of application programs and devices Γ(SS) ∈ ℘((SS.D ∪ SS.A) × (SS.D ∪ SS.A)), and a set of paths SS.P = {P_1, P_2, ...}, where ℘ denotes the power set.

Each path P_i has a type τ(P_i) ∈ {p, t, tp}, where p, t, and tp denote periodic, transient, and transient-periodic, respectively. A path P_i consists of a set of application programs P_i.A = {a_{i,1}, a_{i,2}, ...}, where P_i.A ⊆ SS.A, a set of devices P_i.D = {d_{i,1}, d_{i,2}, ...}, where P_i.D ⊆ SS.D, a data stream P_i.DS, and an event stream P_i.ES. For a path P_i, the data stream P_i.DS is defined only if τ(P_i) ∈ {p, tp}, and the event stream P_i.ES is defined only if τ(P_i) ∈ {t, tp}. The connectivity of a path P_i is the graph of application programs and devices that belong to the path. It is defined as γ(P_i) ∈ ℘((P_i.D ∪ P_i.A) × (P_i.D ∪ P_i.A)).

A path always has a root node (i.e., the beginning of the path) and a sink node (i.e., the end of the path). The root node of the path is the only node in the path that does not have an incoming edge from any other application program or device that belongs to the path. The sink node of the path is the only node in the path that does not have an outgoing edge to any other application program or device that belongs to the path. These are denoted Root(P_i) and Sink(P_i), respectively, where

Root(P_i) = v_j • v_j ∈ (P_i.A ∪ P_i.D) ⇔ ¬∃(a_k : a_k ∈ P_i.A) ∧ (d_k : d_k ∈ P_i.D) : ((a_k, v_j) ∪ (d_k, v_j)) ∈ γ(P_i), and

Sink(P_i) = v_j • v_j ∈ (P_i.A ∪ P_i.D) ⇔ ¬∃(a_k : a_k ∈ P_i.A) ∧ (d_k : d_k ∈ P_i.D) : ((v_j, a_k) ∪ (v_j, d_k)) ∈ γ(P_i).

6 For brevity, we do not present a formal description of the resource management problems. A formal description of the problems can be found in [Rav98]. Section 10 presents the resource management algorithms.
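Under the assumption that a path's connectivity graph is held as an edge list over integer node ids, the root (and, symmetrically, the sink) can be located by a simple scan; a minimal illustrative sketch, not taken from the implementation:

```c
#include <assert.h>

struct edge { int from; int to; };

/* Return the first node in [0, n_nodes) with no incoming edge in the
 * connectivity graph, i.e., Root(Pi); -1 if none exists. A symmetric
 * scan over outgoing edges would yield Sink(Pi). Illustrative sketch. */
int find_root(const struct edge *edges, int n_edges, int n_nodes)
{
    for (int v = 0; v < n_nodes; v++) {
        int has_incoming = 0;
        for (int e = 0; e < n_edges; e++)
            if (edges[e].to == v) { has_incoming = 1; break; }
        if (!has_incoming)
            return v;
    }
    return -1;
}
```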


For convenience, we define the "route" between a pair of nodes and the set of "ancestor" nodes of a node in the connectivity graph of a path. A route of length n from node v_x to node v_y of a path P_i is defined as a sequence of nodes Route(v_x, v_y, P_i, n) = ⟨v_0, v_1, v_2, ..., v_n⟩ such that (v_{k−1}, v_k) ∈ γ(P_i), ∀k : 1 ≤ k ≤ n, and v_0 = v_x, v_n = v_y. The ancestor nodes of a node v_i of path P_i are defined as the set Ancestors(v_i, P_i) = {v_j • v_j ∈ (P_i.A ∪ P_i.D) ∧ ∃ Route(v_j, v_i, P_i, n), for n ≥ 1}.

A path also has a "settling time." The settling time of a path P_i, denoted ξ(P_i), is a real value that indicates the amount of time that must elapse between successive triggers of resource management actions for the path.

Application program features include attributes such as Scale(a_{i,j}), a boolean flag that indicates whether application program a_{i,j} can be replicated to adapt to changes in data and event stream sizes; Comb(a_{i,j}), a boolean flag that defines whether application program a_{i,j} combines the inputs from its predecessors before forwarding; and Split(a_{i,j}) ∈ {none, equal, non-equal}, which describes whether application program a_{i,j} splits its outputs among its successors, and if so, whether it splits them equally. Also, the set of hosts on which an application program a_i is eligible to run (i.e., the set of hosts for which a_i has been compiled for execution) is denoted EligibleHosts(a_i).

8.1.2 Data and Event Streams The characteristics of the data and event streams of paths are also modeled. The type of the data stream of path P_i is defined as τ(P_i.DS) ∈ {de, st, dy}, where de denotes deterministic, st denotes stochastic, and dy denotes dynamic. The size characteristics of the data stream of path P_i are denoted σ(P_i.DS), which is a scalar or interval constant if τ(P_i.DS) = de, a distribution if τ(P_i.DS) = st, and undefined if τ(P_i.DS) = dy. Similarly, for the event stream of a path P_i, τ(P_i.ES) ∈ {de, st, dy} defines its type and ρ(P_i.ES) defines its event arrival rate. The kth event in the event stream of a path P_i is denoted P_i.ES(k), and the expected time of deactivation of the kth event is denoted φ(P_i.ES(k)).

8.1.3 QoS Each path may have timeliness requirements such as a simple deadline of λ_REQ(P_i) seconds, a required throughput of θ_REQ(P_i) data stream elements per unit time, and a required data inter-processing time of δ_REQ(P_i) seconds. To mask momentary "QoS violations" (an unacceptable degradation in the delivered QoS) during QoS monitoring, we define a sampling window and a maximum acceptable number of QoS violations within the window. These are defined similarly for each QoS parameter. For example, ω(λ_REQ(P_i)) is the sampling window size for latency and υ(λ_REQ(P_i)) is the maximum allowable number of QoS violations within ω(λ_REQ(P_i)). The desired slack interval for each QoS parameter is also defined. In the case of the simple deadline requirement, the minimum and maximum slack requirements of a path P_i, denoted ϕ_MIN(λ_REQ(P_i)) and ϕ_MAX(λ_REQ(P_i)) percentages, respectively, are defined such that

ϕ_MIN(λ_REQ(P_i)) ≤ ((λ_REQ(P_i) − λ_OBS(P_i, c)) / λ_REQ(P_i)) × 100 ≤ ϕ_MAX(λ_REQ(P_i))

holds for all path cycles c. We define the observed path cycle latency λ_OBS(P_i, c) in Section 8.2.3.

8.1.4 Hardware System A hardware subsystem HS consists of a set of host machines HS.H = {h_1, h_2, ...}, a set of interconnecting devices HS.I = {i_1, i_2, ...}, and a set of networks HS.N = {N_1, N_2, ...}. A host is the basic unit of computing in the model. An interconnecting device is a piece of hardware that serves as a "routing" device between hosts. Examples of interconnecting devices include bridges and routers. A host can also serve as a routing device. A network is a hardware communication medium that is used by a set of hosts for inter-host communication. A host communicates with another host via a route of networks, interconnecting devices, and hosts. We define a route as a sequence of "hops," where each hop is a triple (sh, N, rh) of a sending host or interconnecting device sh ∈ (HS.H ∪ HS.I), a network N ∈ HS.N, and a receiving host or interconnecting device rh ∈ (HS.H ∪ HS.I). Formally, a route Route(h_i, h_j) from host h_i to host h_j is a sequence ⟨(sh_1, N_1, rh_1), ..., (sh_n, N_n, rh_n)⟩ of n hops for which (1) sh_1 = h_i and rh_n = h_j, where sh_1 ∈ HS.H and rh_n ∈ HS.H, (2) ∀k : 1 ≤ k < n : rh_k = sh_{k+1} ∧ rh_k ∈ (HS.H ∪ HS.I), and (3) ∀i : N_i ∈ HS.N.
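Condition (2) of this definition, that consecutive hops chain through a common host or interconnecting device, can be checked mechanically. A minimal sketch with illustrative struct and function names; conditions (1) and (3) require the HS.H and HS.N sets and are omitted:

```c
#include <assert.h>

struct hop { int sender; int network; int receiver; };

/* Check condition (2) of the route definition: the receiver of hop k
 * must be the sender of hop k+1. Returns 1 if every adjacent pair of
 * hops chains, 0 otherwise. Illustrative sketch only. */
int route_hops_chain(const struct hop *route, int n_hops)
{
    for (int k = 0; k + 1 < n_hops; k++)
        if (route[k].receiver != route[k + 1].sender)
            return 0;
    return 1;
}
```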

A network defines a collection of hosts. The set of hosts that belong to a network N_i is the set N_i.H = {h_j • h_j ∈ HS.H ∧ (h_j = sh_x ∨ h_j = rh_y), for some hop (sh_x, N_i, rh_y) ∈ Route(h_p, h_q), for some p, q : h_p ∈ HS.H, h_q ∈ HS.H}. The set of networks that belong to a host h_i is denoted h_i.N.

A zone consists of a set of hosts and interconnecting devices. For brevity, we omit a formal definition of zones; it can be found in [Rav98]. The set of zones in the subsystem HS is denoted HS.Z = {Z_1, Z_2, ...}. The zone of a host h_i is denoted ZoneOfHost(h_i) and the set of hosts that belong to zone Z_j is denoted Z_j.H.

8.2 The Dynamic Model The dynamic aspects of the system that are modeled include characteristics of the application software such as replicas of application programs, mapping of application programs onto host machines, stream properties of paths, QoS metrics, and performance measures of the hardware


system. Again, for brevity, we omit a description of the performance measures of the hardware system. In modeling the dynamic properties of the system, we use execution cycles of paths and discrete instants in time as reference points.

8.2.1 Software System The set of replicas of application program a_{i,j} of path P_i during cycle c is denoted Replicas(a_{i,j}, c) = {a_{i,j,1}, a_{i,j,2}, ...}. The host to which application program a_{i,j,k} is assigned during cycle c of path P_i is denoted Host(a_{i,j,k}, c, P_i).

8.2.2 Data and Event Streams The set P_i.DS(c) = {P_i.DS(c)_1, P_i.DS(c)_2, ...} represents the set of data elements in the data stream P_i.DS of path P_i during cycle c. The set P_i.DS(c, a_{i,j}) = {P_i.DS(c, a_{i,j})_1, P_i.DS(c, a_{i,j})_2, ...} denotes the set of elements in the data stream processed by application program a_{i,j} during cycle c of path P_i.

The data load of a path during an execution cycle is defined as the total number of data items processed by the path during the cycle. Hence, |P_i.DS(c)| denotes the data load of a path P_i during cycle c. The data load of an application program during a path cycle is defined as the total number of data items processed by the application program during the cycle. Hence, |P_i.DS(c, a_{i,j})| denotes the data load of an application program a_{i,j} during cycle c of path P_i.

The processing of elements of a data stream may be divided among the replicas of an application program of a path. Further, the replicas may be executed on different host machines to exploit concurrency as a means of decreasing the execution latency of the path. In successive stages of a path that has non-combining application programs (i.e., application programs that, after processing the data, simply divide the data among their successors), data will arrive in batches at application programs. Hence, each application program may process several batches of data during a single cycle. The set P_i.DS(c, a_{i,j}, k) = {P_i.DS(c, a_{i,j}, k)_1, P_i.DS(c, a_{i,j}, k)_2, ...} defines the set of data stream elements contained in the kth data stream batch processed by application program a_{i,j} during cycle c of path P_i. The total number of data batches processed by an application program a_i of path P_i during cycle c is given by:

TotalBatches(a_i, P_i, c) = ∏_{∀k : a_k ∈ Ancestors(a_i, P_i)} Max({|Replicas(a_k, c)|} ∪ {1})
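A minimal sketch of the TotalBatches product, assuming the replica counts |Replicas(a_k, c)| of the ancestors are supplied in an array (names are illustrative, not from the implementation):

```c
#include <assert.h>

/* TotalBatches(ai, Pi, c) = product over ancestors ak of
 * max(|Replicas(ak, c)|, 1). replica_counts[k] holds |Replicas(ak, c)|
 * for each ancestor; a count of 0 or 1 contributes a factor of 1.
 * Illustrative sketch of the formula above. */
long total_batches(const int *replica_counts, int n_ancestors)
{
    long product = 1;
    for (int k = 0; k < n_ancestors; k++) {
        int r = replica_counts[k];
        product *= (r > 1) ? r : 1;  /* max(|Replicas|, 1) */
    }
    return product;
}
```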

The start and end times of processing data elements and data batches are represented as follows: s(P_i.DS(c, a_{i,j})_k) and e(P_i.DS(c, a_{i,j})_k) denote the start time and end time of processing the kth data stream element by application program a_{i,j} during path cycle c. The notations s(P_i.DS(c, a_{i,j}, k)) and e(P_i.DS(c, a_{i,j}, k)) denote the start time and end time of processing the kth data stream batch by application program a_{i,j} during path cycle c.

8.2.3 QoS Since a path processes several batches of data during a single execution cycle, the path incurs a set of latencies, one for each data batch. We define the observed cycle latency of a path during a cycle as the maximum of the set of latencies that are incurred in processing each of the data batches during the cycle. For a path P_i, where τ(P_i) ∈ {p, tp}, the execution latency during a cycle c is given by:

λ_OBS(P_i, c) = MAX{t_j • t_j ∈ ℝ+ ∧ t_j = e(P_i.DS(c, a_{i,m}, j)) − s(P_i.DS(c, a_{i,x}, 1)), ∀j : 1 ≤ j ≤ TotalBatches(a_{i,m}, P_i, c)}, where a_{i,m} = Sink(P_i) and a_{i,x} = Root(P_i).

For convenience, we define the cycle latencies of application programs and inter-application communication links of a path. The cycle latency of an application program of a path is defined as the total latency incurred by the application program in processing all the data batches during the cycle. For an application program a_i of path P_i, the total latency incurred in processing all data batches during a cycle c is given by:

λ_OBS(a_i, P_i, c) = Σ_{k=1}^{TotalBatches(a_i, P_i, c)} (e(P_i.DS(c, a_i, k)) − s(P_i.DS(c, a_i, k)))

The cycle latency of an application program pair of a path is defined as the total latency incurred by the pair in transmitting all the data batches during the cycle. For an application program pair (a_{i,j}, a_{i,k}) of path P_i, the total latency incurred in transmitting all data batches during a cycle c is given by:

λ_OBS(a_{i,j}, a_{i,k}, P_i, c) = Σ_{l=1}^{TotalBatches(a_{i,k}, P_i, c)} (s(P_i.DS(c, a_{i,k}, l)) − e(P_i.DS(c, a_{i,j}, l)))

The observed data inter-processing time of a path is determined from the data inter-processing times of the application programs of the path. The observed data inter-processing time of an application program a_{i,j} of path P_i for processing the kth data stream element in its data stream during a path cycle c is given by δ_OBS(P_i.DS(c, a_{i,j})_k) = s(P_i.DS(c, a_{i,j})_k) − s(P_i.DS(c−1, a_{i,j})_k), where c ∈ ℕ+. From this, we can determine the observed data inter-processing time of a path. We define the observed data inter-processing time of a path during a cycle as the maximum of all observed data inter-processing times for processing data stream elements by application programs of the path during the cycle. That is, the observed data inter-processing time of a path P_i during a cycle c ∈ ℕ+ is defined as δ_OBS(P_i.DS(c)) = MAX(δ_OBS(P_i.DS(c, a_{i,j})_k)), ∀k, ∀a_{i,j} ∈ P_i.A.

The observed throughput of a path P_i during a cycle c, where τ(P_i) ∈ {p, tp}, is defined as:

θ_OBS(P_i, c) = |P_i.DS(c)| / λ_OBS(P_i, c)

The function is undefined for all paths P_i where τ(P_i) = t.
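A minimal sketch of the throughput definition (names are illustrative; a negative return value marks the undefined case for transient paths):

```c
#include <assert.h>

/* Observed throughput of a periodic or transient-periodic path during
 * cycle c: the data load of the cycle divided by the observed cycle
 * latency. Returns -1.0 for transient paths (or a non-positive
 * latency), for which the metric is undefined. Illustrative sketch. */
double observed_throughput(long data_load, double cycle_latency,
                           int is_transient)
{
    if (is_transient || cycle_latency <= 0.0)
        return -1.0; /* undefined */
    return (double)data_load / cycle_latency;
}
```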


8.2.4 Hardware System The load index of a host h_i at an instant in time t is denoted LI(h_i, t), and the load index of a network N_i at an instant in time t is denoted LI(N_i, t). We use a number of different index functions to characterize the load of hosts and networks. Definitions of the load index functions can be found in [WSRK99]. Furthermore, we denote the CPU utilization and memory utilization of a host h_i at an instant in time t as CPU(h_i, t) and Mem(h_i, t), respectively.

9. Adaptive Resource Management Techniques

We now present algorithms for each stage of the resource management process. The resource management steps include QoS monitoring (Section 9.1), QoS diagnosis and analysis (Section 9.2), and resource allocation (Section 9.3). Since we do not perform diagnosis for low survivability QoS situations, Section 9.2 only discusses QoS diagnosis and analysis for low timeliness QoS situations.

9.1 QoS Monitoring

9.1.1 Timeliness QoS Monitoring

Monitoring of timeliness QoS involves collection of time-stamped events that are sent from application program components and synthesis of the events into path-level QoS metrics. Analysis of a time series of the QoS metrics enables detection of low QoS situations. In this paper, we focus on the simple deadline requirement and the minimum slack value that must always be maintained on the requirement to achieve the desired timeliness QoS. We present a low timeliness QoS detection algorithm that continuously receives time-stamps from the application programs of the path, computes the path cycle latency, and checks for an "overload" of the path. Informally, an overload for a path is said to occur if its cycle latency frequently violates the minimum slack value on its simple deadline. The algorithm identifies a low timeliness QoS for the path if the overload is detected after the elapse of a time interval that is larger than the settling time of the path. We now formally present the algorithm.

An overload of a path P_i is said to occur in a cycle c when the observed cycle latency of the path leaves less than the minimum slack value that must be maintained on the simple deadline requirement of the path in at least υ(λ_REQ(P_i)) of the previous ω(λ_REQ(P_i)) path cycles.

That is,

PathOverload(P_i, c) ≡ |{ d : 0 ≤ c − d < ω(λ_REQ(P_i)) ∧ λ_REQ(P_i) − λ_OBS(P_i, d) < φ_MIN(λ_REQ(P_i)) }| ≥ υ(λ_REQ(P_i)).
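As a sketch (with illustrative argument names, not the middleware's API), the overload predicate can be implemented over the recorded window of observed cycle latencies:

```python
def path_overload(observed_latencies, required_latency, min_slack, window,
                  min_violations):
    """PathOverload(P_i, c): true when the slack (required minus observed
    cycle latency) fell below min_slack in at least min_violations of the
    last `window` cycles."""
    recent = observed_latencies[-window:]
    violations = sum(1 for latency in recent
                     if required_latency - latency < min_slack)
    return violations >= min_violations
```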

Figure 12 shows the low timeliness QoS detection algorithm. The algorithm performs timeliness QoS monitoring and detection of low timeliness QoS for a single path. The algorithm continuously receives time-stamped messages from the application programs of the path. Each application program of the path sends two time stamps for every batch of data that it processes during a single path cycle. Recall that due to the presence of replicas of application programs in a path, several batches of data may have to be processed during a single execution cycle of the path. The time stamps sent by the application program include "start" and "end" time stamps that are recorded by the application program at the beginning and at the end of processing of the data batch, respectively. Upon the receipt of an application program message, the algorithm decodes the message and updates its data structures that store the values of the time stamps.

1. Let P_i be the candidate path that is to be monitored for low timeliness QoS.
2. Set λ_OBS(P_i, c) = ∅; Batch(a_{i,j,n}, m) = 0, ∀j : a_{i,j} ∈ P_i.A, and for a large n ∈ ℕ, m ∈ ℕ;
3. PathSettlingTime = FALSE;
4. while (true) do
   4.1 Message = ReceiveMessageFrom(P_i);
   4.2 Let Message.Appn = a_{i,j,n};
   4.3 Let c = Message.Cycle;
   4.4 Batch(a_{i,j,n}, c) = Batch(a_{i,j,n}, c) + 1;
   4.5 Let k = Batch(a_{i,j,n}, c);
   4.6 if Message.Type = "start" then
       4.6.1 s(P_i.DS(c, a_{i,j,n}, k)) = Message.Value;
   4.7 else if Message.Type = "end" then
       4.7.1 e(P_i.DS(c, a_{i,j,n}, k)) = Message.Value;
   4.8 if (Message.Appn = Sink(P_i) and Message.Type = "end") then
       4.8.1 Let t_k = e(P_i.DS(c, a_{i,j,n}, k)) − s(P_i.DS(c, Root(P_i), 1));
       4.8.2 λ_OBS(P_i, c) = λ_OBS(P_i, c) ∪ {t_k};
       4.8.3 if (PathOverload(P_i, c) and PathSettlingTime = FALSE) then
             4.8.3.1 PathSettlingTime = TRUE;
             4.8.3.2 StartAlarm(ξ(P_i));
             4.8.3.3 SendMessage(P_i, c);

Figure 12. Low Timeliness QoS Detection Algorithm

When a message is received from a "sink" application program of the path, the algorithm determines the cycle latency of the path. This is computed as the difference between the end time of processing the current data batch of the sink application program and the start time of processing the first data batch of the "root" application program for the current path cycle. Once the path cycle latency is determined, the algorithm checks for an overload of the path. If an overload for the path is detected, the algorithm then checks whether a time interval that is larger than the settling time of the path has elapsed since the last resource management action. This is determined by checking the value of a boolean flag variable called PathSettlingTime that is initialized with a false value. A low timeliness QoS for the path is identified only if an overload for the path is detected and the PathSettlingTime flag has a false value. When these two conditions are satisfied, the algorithm sets the PathSettlingTime flag to a true value and starts an alarm to asynchronously notify it after the elapse of the path settling time. When the alarm notifies the algorithm, the PathSettlingTime flag is reset to a false value. The resetting of the flag is performed asynchronously. Once a low timeliness QoS for the path is identified, the algorithm notifies the QoS diagnosis procedure.

9.1.2 Survivability QoS Monitoring

Monitoring of survivability QoS involves detection of abnormal failures of application programs, hosts, and zones. We present a "master-slave" algorithm that detects failures of application programs, hosts, and zones, and reports the failures to the QoS diagnosis procedure. The algorithm is master-slave in the sense that there is a single master algorithm that asynchronously interacts with a set of slave algorithms. The slave algorithms execute (as part of the startup daemon program components of the middleware) on each host machine in the system. The slave algorithms receive notifications from the master algorithm on application program components that must be started on their host machines. The slave algorithms start the execution of the application programs, monitor their executions, and detect the abnormal failure of application programs by catching run-time exceptions. The failures are immediately reported to the master algorithm.

1. Let h_i be the host that is monitored for program failures. Set FailedApplications = ∅;
2. while (true) do
   2.1 Message = ReceiveStartupMessages();
   2.2 RaiseProcessAbnormalTermination(Message.Appln);
   2.3 StartProcess(Message.Appln);
   2.4 catch exception RaiseProcessAbnormalTermination(Message.Appln)
       2.4.1 FailedApplications = FailedApplications ∪ {Message.Appln};
       2.4.2 Let t_k = GetTimeofDay(); /* time of application failure */
       2.4.3 SendMessage(Message.Appln, t_k);
       2.4.4 continue; /* Execution continues from the point where the exception was raised. */

Figure 13. The "Slave" Algorithm for Detecting Application Program Failures
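A minimal sketch of the slave's behavior is below. It polls child-process exit status rather than using the paper's asynchronous exception mechanism, and all names are illustrative:

```python
import subprocess
import time

def slave_monitor(commands, send_message):
    """Start each application program named in `commands` (a mapping from
    program name to command line) and report abnormal terminations
    (non-zero exit status) to the master via send_message, together with
    the time of failure."""
    failed_applications = set()
    processes = {name: subprocess.Popen(cmd) for name, cmd in commands.items()}
    for name, process in processes.items():
        if process.wait() != 0:  # abnormal process termination
            failed_applications.add(name)
            send_message(name, time.time())  # time of application failure
    return failed_applications
```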

Figure 13 shows the slave algorithm that executes on each host in the system. The algorithm receives messages from the master algorithm to start application programs on its host. Before starting the execution of each application program on the host, the slave algorithm invokes a system call that generates a run-time exception upon an abnormal process failure. The exception system call is non-blocking and the exception signal is generated in an asynchronous manner. Execution control is transferred to the exception handler of the algorithm upon a process failure. The master algorithm detects the failure of hosts in the system by catching exceptions that are raised when the communication channel between the slave and master hosts fails. Further, the master algorithm keeps track of the failed hosts in the system and triggers a zone failure when it detects the failure of all hosts in the zone.

Figure 14 shows the master algorithm. The algorithm starts the execution of the slave algorithms on each host in the system. Before starting the execution of the slave algorithm on a remote host, the master algorithm invokes a system call that generates a run-time exception upon the failure of the communication channel between the master and slave hosts. Subsequently, the algorithm starts the execution of the slave algorithm on the host and establishes a communication channel with the host. The exception system call is non-blocking and the exception signal is generated in an asynchronous manner. Execution control is transferred to the exception handler of the algorithm upon the failure of the communication channel. This happens when the host fails, resulting in a failure of the slave algorithm itself.

1. Set FailedHosts = ∅; FailedZones = ∅;
2. while (true) do
   2.1 for each host h_i ∈ HS.H do
       2.1.1 RaiseCommunicationChannelFailure(h_i);
       2.1.2 StartProcessFailureDetectionAlgorithm(h_i); /* start slave algorithm on h_i */
       2.1.3 EstablishCommunicationChannel(h_i);
   2.2 catch exception RaiseCommunicationChannelFailure(h_i)
       2.2.1 FailedHosts = FailedHosts ∪ {h_i}; /* host h_i failure */
       2.2.2 Let t_k = GetTimeofDay(); /* time of host failure */
       2.2.3 SendMessage(h_i, t_k);
       2.2.4 Let Z_i = ZoneOfHost(h_i);
       2.2.5 ZoneFailed = TRUE;
       2.2.6 for each host h_j ∈ Z_i do
             2.2.6.1 if h_j ∉ FailedHosts then
                     2.2.6.1.1 ZoneFailed = FALSE;
       2.2.7 if ZoneFailed = TRUE then
             2.2.7.1 FailedZones = FailedZones ∪ {Z_i};
             2.2.7.2 Let t_k = GetTimeofDay();
             2.2.7.3 SendMessage(Z_i, t_k);
       2.2.8 continue;

Figure 14. The "Master" Algorithm for Detecting Host and Zone Failures

The algorithm detects the failure of a zone by checking whether all hosts of the zone have failed.
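The master's host and zone bookkeeping can be sketched as follows (the zone-to-hosts mapping structure is an assumption made for illustration):

```python
def record_host_failure(host, zones, failed_hosts, failed_zones):
    """Handle a communication-channel failure to `host`: mark the host as
    failed, and mark its zone as failed once every host of the zone is in
    failed_hosts. `zones` maps a zone name to the set of its hosts."""
    failed_hosts.add(host)
    for zone, members in zones.items():
        if host in members and members <= failed_hosts:
            failed_zones.add(zone)
```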

9.2 Diagnosis and Analysis

When a low timeliness QoS situation is detected, diagnosis functions are performed to determine: (1) the causes of low QoS and (2) the possible actions that may improve the QoS and recover from the low QoS situation. These two steps of timeliness QoS diagnosis are discussed in Sections 9.2.1 and 9.2.2, respectively.

9.2.1 Identifying Causes of Low Timeliness QoS

The first step in diagnosing a low timeliness QoS of a path is to determine the cause of an overload for the path. Recall that an overload for a path takes place due to the increase in execution latencies of some application programs of the path or due to the increase in communication latencies of some inter-application communication links of the path. The increase in latency has to be substantial enough to cause the path latency to violate its minimum slack requirement. Further, the violation of the minimum slack requirement has to be observed for a certain number of path cycles that is larger than the maximum that is allowed in a window of recent cycles.

1. Let P_i be a path such that PathOverload(P_i, c) is true.
2. Set UnhealthyApplications(P_i) = ∅ and UnhealthyCommunicationLinks(P_i) = ∅;
3. for each application a_i ∈ P_i.A do
   3.1 For convenience, let α(k) denote λ_OBS(a_i, P_i, k) and let W denote ω(λ_REQ(P_i)). Determine the execution latency trend of a_i in the former and latter halves of W:

       ExecLatencyTrendInFormerHalf(a_i, P_i, c) = [ (W/2) · Σ_{k=c−W+1}^{c−W/2} k·α(k) − (Σ_{k=c−W+1}^{c−W/2} k) · (Σ_{k=c−W+1}^{c−W/2} α(k)) ] / [ (W/2) · Σ_{k=c−W+1}^{c−W/2} k² − (Σ_{k=c−W+1}^{c−W/2} k)² ]

       ExecLatencyTrendInLatterHalf(a_i, P_i, c) = [ (W/2) · Σ_{k=c−W/2+1}^{c} k·α(k) − (Σ_{k=c−W/2+1}^{c} k) · (Σ_{k=c−W/2+1}^{c} α(k)) ] / [ (W/2) · Σ_{k=c−W/2+1}^{c} k² − (Σ_{k=c−W/2+1}^{c} k)² ]

   3.2 if ExecLatencyTrendInFormerHalf(a_i, P_i, c) < ExecLatencyTrendInLatterHalf(a_i, P_i, c) then
       3.2.1 UnhealthyApplications(P_i) = UnhealthyApplications(P_i) ∪ {a_i};
4. for each application pair (a_{i,j}, a_{i,k}) ∈ γ(P_i) do
   4.1 Determine the communication latency trend of (a_{i,j}, a_{i,k}) in the former and latter halves of ω(λ_REQ(P_i)) similarly as in statement 3.1 (i.e., replace λ_OBS(a_i, P_i, k) in statement 3.1 with λ_OBS(a_{i,j}, a_{i,k}, P_i, k)). Let the communication latency trend of (a_{i,j}, a_{i,k}) in the two halves be denoted by CommLatencyTrendInFormerHalf(a_{i,j}, a_{i,k}, P_i, c) and CommLatencyTrendInLatterHalf(a_{i,j}, a_{i,k}, P_i, c).
   4.2 if CommLatencyTrendInFormerHalf(a_{i,j}, a_{i,k}, P_i, c) < CommLatencyTrendInLatterHalf(a_{i,j}, a_{i,k}, P_i, c) then
       4.2.1 UnhealthyCommunicationLinks(P_i) = UnhealthyCommunicationLinks(P_i) ∪ {(a_{i,j}, a_{i,k})};

Figure 16. Timeliness QoS Diagnosis Algorithm that Identifies Application Components Causing Low QoS


Timeliness QoS diagnosis algorithms identify application components of the path that are experiencing significant "slowdown." The diagnosis may be performed "locally" in the sense that only the application components of the path that is receiving low QoS are considered. On the other hand, "global" diagnosis techniques consider application components of other paths that are using the same hardware resources as the path that is receiving low QoS. Local diagnosis can be performed quickly, whereas global diagnosis requires greater state information and, in the worst case, consideration of all paths in the subsystem. In this paper, we focus on a local diagnosis technique that computes trends of execution latencies of application programs and communication latencies of inter-application communication links of a path that is receiving low QoS. The technique uses the latency trends to identify application programs and communication links that are causing a low QoS of the path.

[Figure omitted: two panels plot (a) λ_OBS(a_i, P_i, c) and (b) λ_OBS(a_{i,j}, a_{i,k}, P_i, c) against the path cycle over the window c − ω(λ_REQ(P_i)) + 1, ..., c, with estimated regression lines that approximate the application and inter-application communication latencies at different path cycles during the former and latter halves of the window.]

Figure 15. Regression Lines that Plot (a) Application Program Latencies and (b) Communication Link Latencies as a Function of Path Cycles

Observe that the latency of a path is monitored for a certain number of previous path cycles to detect an overload of the path. Therefore, the set of previous path cycles during which the path is monitored constitutes a "window" of latency measurements. The diagnosis technique presented here uses the intuition that if a path frequently violates the minimum slack requirement on its simple deadline during a window of past path cycles (and hence has been detected for an overload), then some application programs or inter-application communication links of the path are experiencing an increase in their execution or communication latencies during that window as well. Therefore, application programs or communication links of the path whose latencies exhibit an increasing trend during the window of past path cycles, large enough that the latency trend in the latter half of the window exceeds that in the former half, must contribute to the low QoS of the path. We identify the application programs and inter-application communication links that cause a low QoS to the path by comparing the execution and communication latency trends during the two halves of the window of past path cycles. We use the slope of regression lines that are estimated from the set of latency values measured during the window of past path cycles to determine the latency trends. The slope of a regression line y = a + bx, estimated from the sample {(x_i, y_i); i = 1, 2, ..., n}, is given by:

b = [ n·Σ_{i=1}^{n} x_i y_i − (Σ_{i=1}^{n} x_i)(Σ_{i=1}^{n} y_i) ] / [ n·Σ_{i=1}^{n} x_i² − (Σ_{i=1}^{n} x_i)² ]

The technique is illustrated in Figures 15(a) and 15(b). We now present a timeliness QoS diagnosis algorithm that identifies application programs and communication links that are causing a low QoS of the path. The algorithm is shown in Figure 16. The algorithm uses trends of application component latencies to identify application programs and inter-application communication links that cause the low QoS of the path.
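The slope formula and the half-window comparison used by the diagnosis algorithm can be sketched directly (a simple illustration, with x taken as 1..n within each half of the window):

```python
def regression_slope(ys):
    """Slope b of the least-squares line fitted to points (i, ys[i-1]),
    i = 1..n, using b = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)."""
    n = len(ys)
    xs = range(1, n + 1)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)

def exhibits_increasing_trend(latency_window):
    """True when the latency trend in the latter half of the window exceeds
    the trend in the former half (the "unhealthy" test of Figure 16)."""
    half = len(latency_window) // 2
    return regression_slope(latency_window[:half]) < regression_slope(latency_window[half:])
```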

9.2.2 Identifying Recovery Actions for Improving Timeliness QoS

Once the set of application components that are causing the low timeliness QoS of the path is diagnosed, further analysis is performed. The objective of this analysis is to identify possible recovery actions for the application components that are diagnosed as "unhealthy" that will improve the QoS of the path. In this paper, we focus on identifying recovery actions for application programs that are diagnosed as unhealthy.[7] We present an algorithm that identifies a set of candidate recovery actions for an unhealthy application program that will improve the QoS of the path. The first step of the algorithm is to determine whether the "unhealthiness" of the application program is due to increased data load or due to increased contention for resources. This information is used to determine the appropriate recovery action for the application program that will improve its "health" and the QoS of the path.

To determine whether the data load of an unhealthy application program has significantly increased in the recent past, we compute the trend of the data load of the application program during a window of past path cycles. We conclude that the data load has significantly increased in the recent past if the data load exhibits an increasing trend during the window of past path cycles that is large enough that the trend in the latter half of the window is larger than that in the former half. As before, we use the slopes of regression lines that are estimated from the set of data load values measured during the window of past path cycles to determine the data load trends.

[7] It is possible to identify recovery actions for unhealthy inter-application communication links, such as determining a set of alternate routes for the links. However, the run-time enactment of such actions requires additional operating system and hardware support and is beyond the scope of this paper.

1. Let P_i be a path such that PathOverload(P_i, c) is true and let the set of applications of P_i that are identified as "unhealthy" be UnhealthyApplications(P_i). Let the function θ : ℕ → ℕ map path cycles to absolute time.
2. for each application a_{i,j} ∈ UnhealthyApplications(P_i) do
   2.1 For convenience, let π(k) denote P_i.DS(k, a_{i,j}) and let W denote ω(λ_REQ(P_i)). Determine the data load trend of a_{i,j} in the former and latter halves of W:

       DataLoadTrendInFormerHalf(a_{i,j}, P_i, c) = [ (W/2) · Σ_{k=c−W+1}^{c−W/2} k·π(k) − (Σ_{k=c−W+1}^{c−W/2} k) · (Σ_{k=c−W+1}^{c−W/2} π(k)) ] / [ (W/2) · Σ_{k=c−W+1}^{c−W/2} k² − (Σ_{k=c−W+1}^{c−W/2} k)² ]

       DataLoadTrendInLatterHalf(a_{i,j}, P_i, c) = [ (W/2) · Σ_{k=c−W/2+1}^{c} k·π(k) − (Σ_{k=c−W/2+1}^{c} k) · (Σ_{k=c−W/2+1}^{c} π(k)) ] / [ (W/2) · Σ_{k=c−W/2+1}^{c} k² − (Σ_{k=c−W/2+1}^{c} k)² ]

   2.2 Determine the host load trend of a_{i,j} in the former and latter halves of ω(λ_REQ(P_i)) similarly as in statement 2.1 (i.e., replace P_i.DS(k, a_{i,j}) in statement 2.1 with LI(HOST(a_{i,j}, k, P_i), θ(k))). Let the host load trend of a_{i,j} in the two halves be denoted by HostLoadTrendInFormerHalf(a_{i,j}, P_i, c) and HostLoadTrendInLatterHalf(a_{i,j}, P_i, c).
   2.3 if DataLoadTrendInFormerHalf(a_{i,j}, P_i, c) < DataLoadTrendInLatterHalf(a_{i,j}, P_i, c) and HostLoadTrendInFormerHalf(a_{i,j}, P_i, c) ≥ HostLoadTrendInLatterHalf(a_{i,j}, P_i, c) then
       2.3.1 Set Act(a_{i,j}, P_i, c) = "Replicate";
   2.4 else if DataLoadTrendInFormerHalf(a_{i,j}, P_i, c) ≥ DataLoadTrendInLatterHalf(a_{i,j}, P_i, c) and HostLoadTrendInFormerHalf(a_{i,j}, P_i, c) < HostLoadTrendInLatterHalf(a_{i,j}, P_i, c) then
       2.4.1 Set Act(a_{i,j}, P_i, c) = "Migrate";

Figure 17. Timeliness QoS Diagnosis Algorithm that Identifies Recovery Actions

To determine whether the unhealthy application program is experiencing significant contention for computing resources in the recent past, we analyze the load of the host machine where the application program is executing. We compute the trend of the load of the host during a window of past path cycles. We conclude that the host load has significantly increased in the recent past if the host load exhibits an increasing trend during the window of past path cycles that is large enough that the trend in the latter half of the window is larger than that in the former half. We use the slopes of regression lines that are estimated from the set of host load values measured during the window of past path cycles to determine the host load trends. The analysis of changes in data load and host load is used to determine the appropriate recovery action for an unhealthy application program. This is performed as follows: If the data load of the application program has increased significantly in the recent past and the load of the host of the application program has not increased by a large factor, then replicating the application program and executing the replicas on different host machines can improve the path latency. The replicas of the application program can share the additional data load, process it concurrently, and thereby reduce the path latency.

If the data load of the application program has not increased significantly in the recent past and the load of the host of the application program has increased by a large factor, then migrating the application program can improve the path latency. The application program may be residing on a host resource that is heavily loaded and may be subjected to increased resource contention. By migrating the application program to a less loaded host, the resource contention can be reduced. This can reduce the path latency.
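The decision rule above reduces to a comparison of the half-window regression slopes; a sketch (each trend argument is an illustrative (former, latter) slope pair):

```python
def recovery_action(data_trend, host_trend):
    """Choose a recovery action from the (former, latter) half-window slopes
    of the program's data load and of its host's load index. Rising data
    load with non-rising host load suggests replication; non-rising data
    load with rising host load suggests migration; otherwise no action is
    determined."""
    data_former, data_latter = data_trend
    host_former, host_latter = host_trend
    if data_former < data_latter and host_former >= host_latter:
        return "Replicate"
    if data_former >= data_latter and host_former < host_latter:
        return "Migrate"
    return None
```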

We now present an algorithm that determines the recovery actions for unhealthy application programs. The algorithm is shown in Figure 17. The algorithm uses the set of application programs of the path that are diagnosed as unhealthy (by the diagnosis algorithm shown in Figure 16) as its input. For each unhealthy application program, the trend values of data load and host load are used to determine the recovery action.

9.3 Resource Allocation

Once the unhealthy application programs are identified and recovery actions are determined, the next step is to allocate hardware resources to the actions so that the actions can be enacted. We consider two classes of algorithms for allocating resources to the recovery actions: (1) predictive algorithms and (2) resource availability-based algorithms. These are discussed in the subsections that follow.

9.3.1 Predictive Algorithms

We consider two predictive resource allocation algorithms. The algorithms are predictive in the sense that they extrapolate the performance of application programs for each possible allocation and select the allocation that yields the optimal performance. The algorithms use performance profiles of application programs that are obtained for a set of resource utilization conditions and external load situations for extrapolation. The algorithms extrapolate the application performance from the profile data using regression theory. The two algorithms use external load and resource utilization parameters for predicting program latencies. We use the size of the data stream processed by periodic paths during a single cycle (period) as the external load parameter. Resource utilization is characterized by parameters such as CPU utilization and memory utilization.

[Figure omitted: "Latency Profile for Filter Program," plotting the execution latency of the Filter program (in ms, roughly 0 to 300) against data stream size (300 to 1000 data items) at four utilization levels: CPU = 9%, Memory = 94.23%; CPU = 15%, Memory = 95.44%; CPU = 27%, Memory = 97.1%; CPU = 39%, Memory = 97.46%.]

Figure 18. Execution Latencies of Filter at Different Data Stream Sizes and CPU Utilizations

Our first algorithm, called DCP, uses both data stream size and CPU utilization for latency prediction. The latency of an application program is measured at different levels of CPU utilization and data stream sizes. Plots of a sample set of measurements of an application program of the real-time benchmark described in [WS99], called Filter, are shown in Figure 18. Memory utilizations corresponding to the CPU utilizations are also shown in the figure. The latency values shown in the figure are averages of latency measurements. We derive a multiple linear regression equation from the measurements to define the regression line. The regression equation thus obtained for an application program a_{i,j}, based on data stream size and CPU utilization, is given by:

EstExecLat_{DS+CPU}(a_{i,j}, DS, CPU) = f(DS, CPU) = a_1 + a_2 × DS + a_3 × CPU,   (1)

where the execution latency is obtained in milliseconds, DS is the data size in number of data items, CPU is the CPU utilization in percentage, and the a_i's are constants that are dependent upon the application program. The second algorithm, called DMP, combines data stream size and memory utilization for predicting program latency. We derive a multiple linear regression equation from the measurements to define the regression line. The regression equation thus obtained for an application program a_{i,j}, based on data stream size and memory utilization, is given by:

EstExecLat_{DS+Mem}(a_{i,j}, DS, MEM) = f(DS, MEM) = b_1 + b_2 × DS + b_3 × MEM,   (2)

where the execution latency is obtained in milliseconds, DS is the data size in number of data items, MEM is the memory utilization in percentage, and the b_i's are constants that are dependent upon the application program.
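As a sketch of how such coefficients could be fit from profile measurements (the paper derives them offline and stores them via "PerformanceProperties"; this solves the 3x3 normal equations with Cramer's rule, and all names are illustrative):

```python
def fit_latency_model(samples):
    """Fit EstExecLat = c1 + c2*DS + c3*U to profile measurements
    `samples` = [(DS, U, latency_ms), ...], where U is CPU (Equation 1)
    or memory (Equation 2) utilization."""
    rows = [(1.0, ds, u) for ds, u, _ in samples]
    ys = [lat for _, _, lat in samples]
    # A = X^T X and b = X^T y for the design matrix X = [1, DS, U].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(a)
    # Cramer's rule: replace one column of A with b for each coefficient.
    coeffs = []
    for col in range(3):
        m = [[b[i] if j == col else a[i][j] for j in range(3)] for i in range(3)]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)

def est_exec_lat(coeffs, ds, u):
    """Evaluate the fitted regression equation at data size ds, utilization u."""
    c1, c2, c3 = coeffs
    return c1 + c2 * ds + c3 * u
```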


1. Let P_i be a path such that PathOverload(P_i, c) is true.
2. Let the set of applications of path P_i that are identified as "unhealthy" be denoted as UnhealthyApplications(P_i) and let Act(a_{i,j}, P_i, c) define the recovery action of application program a_{i,j}. Let CandidateHost(a_{i,j}) denote the candidate host for application a_{i,j}. Set CandidateHost(a_{i,j}) = ∅, ∀a_{i,j} ∈ UnhealthyApplications(P_i).
3. for each application a_{i,j} ∈ UnhealthyApplications(P_i) do
   3.1 Set Fitness(h_m) to some large value, for some m : h_m ∈ EligibleHosts(a_{i,j});
   3.2 for each host h_k ∈ EligibleHosts(a_{i,j}) do
       3.2.1 if h_k ≠ Host(a_{i,j}) then
             3.2.1.1 if Act(a_{i,j}, P_i, c) = "Replicate" then
                     3.2.1.1.1 Set DataLoad(a_{i,j}) = P_i.DS(c, a_{i,j}) / 2;
             3.2.1.2 else Set DataLoad(a_{i,j}) = P_i.DS(c, a_{i,j});
             3.2.1.3 Let the function θ : ℕ → ℕ map path cycles to absolute time. Determine Fitness(h_k) = EstExecLat_{DS+CPU}(a_{i,j}, DataLoad(a_{i,j}), CPU(h_k, θ(c)));
             3.2.1.4 if Fitness(h_k) < Fitness(h_m) then
                     3.2.1.4.1 Set h_m = h_k;
   3.3 Set CandidateHost(a_{i,j}) = h_m;

Figure 19. The DCP Predictive Resource Allocation Algorithm for Determining Host Resources

Note that the regression equations that determines the execution latency of the program as a function of data size and CPU utilization and as a function of data size and memory utilization are described in the system specification using the abstraction “PerformanceProperties” of the system description language (Section 7). The DCP and DMP algorithms predict the execution latency of a program for a given number of data items and CPU utilization, and for a given number of data items and memory utilization, respectively. The algorithms use the set of application programs of the path that is diagnosed as unhealthy (by the algorithm in Figure 16) and their recovery actions (by the algorithm in Figure 17) as its input. For each unhealthy application program, the algorithm computes a “fitness” function for each of the hosts on which the application program is eligible to execute. The current host of the application is excluded from the eligible host list. The algorithms select the host that yields the minimum value for the fitness function as the “best” candidate host. The two algorithms differ in the way they calculate the fitness function. The DCP algorithm uses Equation (1) to determine the fitness function, where the equation estimates the execution latency of the program on the host at the current CPU utilization of the host and at a certain data size load. The DMP algorithm on the other hand, uses Equation (2) to determine the fitness function,

Page 32: Engineering Dynamic Real-Time Distributed Systems: Architecture, System Description ...pdfs.semanticscholar.org/7d61/f261b3618267d5bf57ab36603c... · 2015. 7. 28. · Description

32

where the equation estimates the execution latency of the program on the host at the current memory utilization of the host and at a certain data size load. However, both algorithms determine the data size load parameter used in the respective equations based upon the recovery action identified for the application program. If the recovery action is to replicate, then the algorithms estimate the execution latency of the program (i.e., the replica) on the candidate host at half the data load that is currently processed by the program, since the addition of a replica will reduce the data load of each of the application programs by half. If the recovery action is to migrate, then the algorithms estimate the execution latency of the program on the candidate host at the full data load that is currently processed by the program, since the migrant program will process the entire data load. We show the DCP algorithm in Figure 19. The DMP algorithm differs from this only in line 3.2.1.3, where the algorithm will use the equation EstExecLat_DS,Mem(a_i,j, DataLoad(a_i,j), Mem(h_k, θ(c))).
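As an illustration, the candidate-host selection performed by the predictive algorithms can be sketched as follows. This is a minimal sketch, not the paper's implementation: the function and field names are hypothetical, and the linear form assumed for Equation (1) is only a stand-in for the regression defined in the PerformanceProperties specification.

```python
# Sketch of DCP candidate-host selection (hypothetical names).
# est_exec_lat stands in for Equation (1); the true regression form
# is defined by the system's PerformanceProperties specification.

def est_exec_lat(coeffs, data_size, cpu_util):
    """Assumed linear stand-in for Equation (1)."""
    a1, a2, a3 = coeffs
    return a1 + a2 * data_size + a3 * cpu_util

def dcp_candidate_host(program, hosts, current_host, recovery_action, coeffs):
    # Replication halves the per-program data load; migration keeps it whole.
    load = program["data_load"] / 2.0 if recovery_action == "replicate" \
        else program["data_load"]
    best_host, best_fitness = None, float("inf")
    for host in hosts:
        if host["name"] == current_host:
            continue  # the current host is excluded from the eligible list
        fitness = est_exec_lat(coeffs, load, host["cpu_util"])
        if fitness < best_fitness:
            best_host, best_fitness = host, fitness
    return best_host

coeffs = (184.812393, 0.00771, 0.029041)  # Table 1, Equation (1)
hosts = [{"name": "h1", "cpu_util": 80.0},
         {"name": "h2", "cpu_util": 20.0}]
best = dcp_candidate_host({"data_load": 400}, hosts, "h0", "replicate", coeffs)
print(best["name"])  # h2: lower CPU utilization yields the lower predicted latency
```

DMP would follow the same loop but call the memory-based predictor of Equation (2) instead.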

9.3.2 Resource Availability-based Algorithms

Besides the predictive algorithms, we also consider two resource availability-based algorithms, which select host resources based on their availability. We use the utilization of CPU and memory to characterize resource availability. The two resource availability-based algorithms are: (1) a CPU utilization-based algorithm called CUA and (2) a memory utilization-based algorithm called MUA.

1. Let P_i be a path such that PathOverload(P_i, c) is true.

2. Let the set of applications of P_i that are identified as “unhealthy” be denoted UnhealthyApplications(P_i), and let Act(a_i,j, P_i, c) define the recovery action of application program a_i,j. Let CandidateHost(a_i,j) denote the candidate host for application a_i,j. Set CandidateHost(a_i,j) = ∅, ∀ a_i,j ∈ UnhealthyApplications(P_i).

3. for each application a_i,j ∈ UnhealthyApplications(P_i) do
   3.1 Set Fitness(h_m) to some large value, for some m: h_m ∈ EligibleHosts(a_i,j);
   3.2 for each host h_k ∈ EligibleHosts(a_i,j) do
       3.2.1 if h_k ≠ Host(a_i,j) then
           3.2.1.1 Let the function θ: N → N map path cycles to absolute time. Determine Fitness(h_k) = CPU(h_k, θ(c));
           3.2.1.2 if Fitness(h_k) < Fitness(h_m) then
               3.2.1.2.1 Set h_m = h_k;
   3.3 Set CandidateHost(a_i,j) = h_m;

Figure 20. The CUA Availability-based Resource Allocation Algorithm for Determining Host Resources


The CUA and MUA algorithms take as their input the set of application programs of the path that is diagnosed as unhealthy (by the diagnosis algorithm shown in Figure 16) and their recovery actions (determined by the algorithm shown in Figure 17). For each unhealthy application program, the algorithms compute a “fitness” function for each of the hosts on which the application program is eligible to execute. Like the predictive algorithms, CUA and MUA exclude the current host of the application from the eligible host list. The algorithms select the host that yields the minimum value for the fitness function as the “best” candidate host. The two algorithms differ in the way they calculate the fitness function: the CUA algorithm uses the current CPU utilization of the host as the value of the fitness function, whereas the MUA algorithm uses the current memory utilization of the host. We show the CUA algorithm in Figure 20. The MUA algorithm differs from this only in line 3.2.1.1, where the algorithm will use Mem(h_k, θ(c)).
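The selection loop of Figure 20 can be sketched in a few lines. This is an illustrative sketch with hypothetical field names, not the middleware's code; MUA is obtained by switching the utilization field that is read.

```python
# Sketch of the CUA selection loop of Figure 20 (hypothetical names).
# MUA is identical except that it reads memory utilization (line 3.2.1.1).

def availability_candidate_host(hosts, current_host, metric="cpu_util"):
    best_host, best_fitness = None, float("inf")
    for host in hosts:
        if host["name"] == current_host:
            continue  # exclude the program's current host
        fitness = host[metric]  # the current utilization IS the fitness
        if fitness < best_fitness:
            best_host, best_fitness = host, fitness
    return best_host

hosts = [{"name": "h1", "cpu_util": 35.0, "mem_util": 90.0},
         {"name": "h2", "cpu_util": 60.0, "mem_util": 10.0}]
print(availability_candidate_host(hosts, "h0")["name"])                     # h1 (CUA)
print(availability_candidate_host(hosts, "h0", metric="mem_util")["name"])  # h2 (MUA)
```

Note that CUA and MUA can pick different hosts in the same situation, which bears on their relative behavior examined in Section 10.3.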

10. Experimental Characterizations

We performed a set of experiments to quantify the performance of the middleware services that allow a recovery from low timeliness and survivability QoS situations. Experiments were also performed to measure the intrusiveness of the middleware on system resources. In this paper, we discuss the experiments that determine the relative merits of the predictive and availability-based timeliness QoS management algorithms. Furthermore, we focus on replication as the recovery action. Details of experiments that validate the effectiveness of the real-time and survivability services (by replication and migration) and measure the middleware intrusiveness can be found in [RWS99, WRSB98]. Section 10.1 summarizes the experimental environment. We describe the application workload and the metrics of interest in Section 10.2. Section 10.3 presents the results of the experiments and their analysis.

10.1 The Experimental Environment

Our experimental environment consists of the real-time middleware containing the resource management algorithms and a real-time benchmark called DynBench that functionally approximates dynamic real-time C2 systems. Details of DynBench can be found in [WS99]. For completeness, we summarize the paths of the benchmark that have timeliness and survivability QoS requirements as follows:

1. A periodic assessment path called Sensing that consists of the sensor (a simulator program called Sensor), a filter manager program (called FilterManager), one or more replicas of the filter program (called Filter), an evaluate and decide manager (called EDManager), and one or more replicas of an evaluate and decide program (called ED).

2. A transient path called Engagement that consists of three programs: an action manager program (called ActionManager), an action program (called Action) that is replicated, and an actuator simulator program (called Actuator).


3. A transient-periodic monitor and guidance path called MonitorGuide that includes all components of the assessment path, a monitor and guide manager program (called MGManager), and a monitor and guide program (called MG) that is replicated.

Figure 21. The Software Architecture of DynBench

The software architecture of the benchmark application is shown in Figure 21. Here, we focus on the periodic Sensing path. The hardware platform of the experimental environment consists of a network of Sun Ultra workstations connected by Ethernet.

10.2 Application Workload and Performance Metrics

The objective of the experiments is to measure the performance of the four resource management algorithms that deliver the desired timeliness QoS by allocating resources: DCP, DMP, CUA, and MUA. Our goal is to measure the performance of the algorithms in terms of the timeliness QoS delivered to the periodic Sensing path, during situations of high external load, when program replication is the recovery action. We measure the performance of the algorithms using the following three metrics:

1. Average Time Between Successive Resource Allocations: This is the average of the time between successive resource allocations that are performed during an experiment. The parameter indicates how frequently resource allocations have to be made to achieve acceptable timeliness QoS during an experimental period.

2. Average Steady State Latency Slack: This is the average of the slack values of steady state path latencies after resource allocations during an experiment. The steady state latency slack reflects the “margin” of path latency after a resource allocation and is simply the difference between the path deadline and the steady state latency after resource allocation.

3. Percentage Tardy Time: This is the fraction of the total time of an experiment during which the latency of the path exceeded the deadline.
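To make the three metrics concrete, they can be computed from an experiment trace as follows. The trace format (allocation timestamps and duration/latency samples) is a hypothetical assumption made for illustration only.

```python
# Computing the three metrics from a hypothetical experiment trace.
# alloc_times: timestamps (seconds) of successive resource allocations.
# samples: (duration, observed path latency) pairs over the run.

def avg_time_between_allocations(alloc_times):
    gaps = [b - a for a, b in zip(alloc_times, alloc_times[1:])]
    return sum(gaps) / len(gaps)

def avg_steady_state_slack(deadline, steady_state_latencies):
    # Slack is the margin between the deadline and the post-allocation latency.
    slacks = [deadline - lat for lat in steady_state_latencies]
    return sum(slacks) / len(slacks)

def percentage_tardy_time(samples, deadline):
    # Fraction of total experiment time spent with latency past the deadline.
    total = sum(d for d, _ in samples)
    tardy = sum(d for d, lat in samples if lat > deadline)
    return 100.0 * tardy / total

print(avg_time_between_allocations([0.0, 40.0, 85.0, 125.0]))          # ~41.67
print(avg_steady_state_slack(100.0, [70.0, 75.0, 72.0]))                # ~27.67
print(percentage_tardy_time([(10, 95), (10, 105), (20, 110)], 100.0))   # 75.0
```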

We performed a set of experiments to determine the above metrics. Each experiment used a scenario for increasing the size of the data stream processed by the Sensing path. The same



scenario is used for all the algorithms and is shown in Figure 22. As seen in the figure, the data stream size is continuously increased. Furthermore, the data stream is generated (by the Sensor program) such that it contains a significant amount of noise, so that the workload of the Filter program will increase significantly with respect to that of ED.8 As the data stream size increases, the path latency increases until the latency slack falls below the minimum slack requirement on the deadline, resulting in a low timeliness QoS for the path. Whenever this situation occurs, it triggers the timeliness QoS diagnosis and resource allocation algorithms of the middleware. Furthermore, the middleware always identifies Filter as the “unhealthy” application program and replication as the recovery action. The “best” host for executing the replica of Filter is then determined by one of the four algorithms.
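The trigger condition described above (the latency slack falling below the minimum slack requirement on the deadline) can be sketched as a simple predicate. The names and the 10% threshold below are hypothetical illustrations, not values from the paper.

```python
# Sketch of the timeliness-QoS trigger (hypothetical names and threshold):
# when the path's latency slack drops below the minimum slack requirement
# on the deadline, diagnosis and resource allocation are invoked.

def low_timeliness_qos(path_latency, deadline, min_slack_fraction):
    """True when the remaining slack falls below the required minimum."""
    slack = deadline - path_latency
    return slack < min_slack_fraction * deadline

# With a 1000 ms deadline and an assumed 10% minimum-slack requirement:
print(low_timeliness_qos(850.0, 1000.0, 0.10))  # False: 150 ms slack >= 100 ms
print(low_timeliness_qos(950.0, 1000.0, 0.10))  # True: 50 ms slack < 100 ms
```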

Cycle Number:           0    25    50   100   200   300   400
Number of Data Items: 150   450   550   600   650   700   750

Figure 22. Data Stream Size Scenario for the Experiments

To eliminate the effects of noise and random factors and to ensure that the results of the experiments are statistically significant, we repeated the experiment 20 times for each algorithm. Thus in total, 80 experiments were performed. For each algorithm, the metrics of interest were measured and average values over the 20 samples (from the 20 experiments) were taken.

Table 1. Coefficients of Regression Equations (1) and (2) for the Filter Program

Equation (1): a1 = 184.812393, a2 = 0.00771, a3 = 0.029041
Equation (2): b1 = 48.023, b2 = 0.1422, b3 = 0.08

Recall that the regression equations (1) and (2) use a set of coefficients to determine the program execution latency, given the CPU and memory utilizations and data stream sizes. The values of the coefficients for the Filter program, obtained by measurement, are shown in Table 1.

8 This is done so that the middleware will replicate Filter and not ED. Note that Filter and ED are the only two replicable programs of the Sensing path.
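For illustration, the Table 1 coefficients can be plugged into an assumed linear form of Equations (1) and (2). The actual functional form of the regressions is defined earlier in the paper and may differ, so this sketch only shows how the measured coefficients would be used.

```python
# Evaluating the regression predictors with the Table 1 coefficients for
# Filter, under an ASSUMED linear form lat = c1 + c2*DS + c3*util.
# The exact functional form is given by the paper's Equations (1) and (2).

A = (184.812393, 0.00771, 0.029041)  # Equation (1): data size and CPU utilization
B = (48.023, 0.1422, 0.08)           # Equation (2): data size and memory utilization

def predict(coeffs, data_size, util):
    c1, c2, c3 = coeffs
    return c1 + c2 * data_size + c3 * util

# Predicted latency at a few data stream sizes, with 50% utilization:
for ds in (150, 400, 750):
    print(ds, round(predict(A, ds, 50.0), 3), round(predict(B, ds, 50.0), 3))
```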


10.3 Experimental Results and Analysis

Figure 23 shows the average time between successive resource allocations performed by the algorithms during the experiments. Observe that the availability-based algorithms perform resource allocations less frequently than the predictive algorithms. This is because the availability-based algorithms always select the least utilized host machines, and host machines that exhibit low utilizations are able to satisfy higher data stream sizes for longer periods of time. The predictive algorithms, on the other hand, select host machines that yield the least value for the predicted latency, where the prediction is based on both data stream sizes and host utilizations. Therefore, the host machines that are selected by these algorithms need not necessarily have low utilizations.

Average time between successive reallocations (seconds): DCP 41.3, DMP 40.11, CUA 42.89, MUA 42.94.

Figure 23. Comparison of Average Time between Successive Resource Allocations

Furthermore, the regression equations used by the algorithms for prediction are biased more heavily toward data stream sizes than toward CPU or memory utilizations. As shown in Equation (1), Figure 22, and Table 1, the data stream size parameter DS in Equation (1), which takes values in the range [0, 400] for the experiment, dominates the CPU utilization parameter CPU, which takes values in the range [0, 100]. This is true even though the coefficient of DS is smaller than that of CPU. Thus, the predictive algorithms tend to select host machines that exhibit relatively higher CPU utilizations, since the influence of the CPU utilization parameter is smaller. Therefore, at high data stream sizes, the algorithms find it difficult to satisfy the data stream size loads for longer periods of time and have to perform resource allocation more frequently to achieve acceptable timeliness. On the other hand, the availability-based algorithms select hosts that have low utilizations. This satisfies high data stream sizes for longer periods of time, and hence the algorithms perform resource allocation less frequently. The other observation we make from Figure 23 is that the CUA and MUA availability-based algorithms have almost the same performance. This implies that both CPU utilization-based allocation and memory utilization-based allocation can satisfy high data stream sizes for longer


periods of time. This is rather surprising because, for a given resource allocation situation, CUA may select a host machine that has high memory utilization (but the least CPU utilization), and MUA may select a host machine that has high CPU utilization (but the least memory utilization). However, the lack of availability in one resource is made up for by the high availability of the other; over longer periods of time, the low availability of one resource is compensated by the high availability of the other, and the two algorithms tend to have almost the same performance.
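The dominance of the data stream size term noted above can be checked numerically with the Table 1 coefficients, again under the assumed linear form (an assumption made for illustration):

```python
# Why DS dominates despite a smaller coefficient (assumed linear form):
# the DS range [0, 400] is four times wider than the CPU range [0, 100].

a2, a3 = 0.00771, 0.029041   # Table 1, Equation (1)

ds_term_max  = a2 * 400      # largest DS contribution in the experiments
cpu_term_max = a3 * 100      # largest possible CPU contribution

print(ds_term_max, cpu_term_max)   # 3.084 vs ~2.904
print(ds_term_max > cpu_term_max)  # True: the DS term dominates at the top of the range
```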

Average prediction error (%): DCP 22.422, DMP 44.5795.

Figure 24. Comparison of Prediction Errors for Predictive Algorithms

Note that this does not mean that CPU and memory are equally important parameters for the application programs. In fact, Figure 23 also shows that DCP performs resource allocation less frequently than DMP. From this, we hypothesize that DCP is able to make more accurate predictions of the latencies than DMP, since the programs could be more CPU-intensive than memory-intensive. To confirm this intuition, we measured the error in the predicted latency values of the Filter program for the two predictive algorithms. The average of the percentage error values is shown in Figure 24. As the figure illustrates, the average percentage error of DCP is roughly half that of DMP, illustrating that DCP predicts the program latency more accurately. Hence, the algorithm is able to satisfy the (high) data stream sizes for longer periods of time than the DMP algorithm.

Figure 25 shows the average steady state latency slack of the Sensing path after resource allocations during the experiments for the four algorithms. Interestingly, DCP gives the best results here, followed by CUA; the memory-based algorithms perform worse. DCP performs the best here because of its accuracy in predicting the program latency: recall that it predicts the program latency more accurately than DMP. Therefore, for a given resource allocation situation, when the algorithm is required to select a best host to improve the timeliness of the path, it selects a host that satisfies the data stream size of the situation most accurately. This results in very good steady state latency, as well as steady state latency slack, for the given situation. On the other hand, CUA


selects the host that has the least CPU utilization. The resulting program latency may be reasonably good for satisfying the data stream size of the given situation, but need not be the best. Since the average steady state latency slack metric characterizes the performance of the algorithms for each individual resource allocation situation, DCP outperforms CUA on this metric.

Steady state latency slack (%): DCP 29.11, DMP 25.47, CUA 28.02, MUA 22.79.

Figure 25. Comparison of Steady State Latency Slack for the Algorithms

Observe that though CUA may not produce the best resource allocation for a given situation, the allocation produced by the algorithm will remain good for longer periods of time. As discussed earlier, this is because the algorithm selects the host that has the least CPU utilization, which can potentially satisfy the higher number of data items that are expected as time progresses (see the data stream size scenario in Figure 22). Thus, CUA outperforms DCP for metrics such as the average time between successive resource allocations. The memory-based algorithms perform worse here since they do not consider CPU utilization in the host selection process. Recall our earlier observation that CPU utilization is an important factor affecting the latencies of the application programs, as the programs are more CPU-intensive than memory-intensive. Therefore, during resource allocations, the memory-based algorithms may select host machines that have high CPU utilizations, resulting in poor steady state latency slack values. We also observe that of the two memory-based algorithms, DMP gives a better steady state latency slack than MUA. Again, this is attributed to the ability of the predictive algorithm to select a host that more accurately satisfies the data stream size of the given situation. MUA does not consider the data stream size parameter and may select a host that has high CPU utilization (but low memory utilization); the high CPU utilization of the host results in the poor steady state performance of the algorithm. Figure 26 shows the percentage tardy time for the four algorithms during the experiments. Recall that the percentage tardy time metric measures the overall performance of the application


during an experimental period (as opposed to the steady state slack metric that characterizes the performance after individual resource allocations). Hence, the performance of the algorithms for the percentage tardy time metric should be consistent with the performance that we had earlier observed for the average time between successive resource allocations metric since the metric also characterizes the overall performance of the application during an experimental period.

Percentage tardy time (%): DCP 87.59, DMP 89.556, CUA 83.96, MUA 84.49.

Figure 26. Comparison of Percentage Tardy Times for the Algorithms

From Figure 26, as expected, we observe that the availability-based algorithms perform better than the predictive algorithms. The availability-based algorithms always select the least utilized host machines, and host machines that exhibit low utilizations are able to satisfy higher data stream sizes for longer periods of time. Hence, the application spends most of its time with acceptable timeliness and is therefore less tardy. The predictive algorithms, on the other hand, select host machines that yield the least value for the predicted latencies, where the prediction is based on both data stream sizes and host utilizations. Due to the bias of the regression equations used by the predictive algorithms toward the data stream size parameter, the algorithms tend to select host machines that exhibit relatively higher utilizations (since the influence of the utilization parameter is smaller). Therefore, at high data stream sizes, the algorithms find it difficult to satisfy the data stream size loads for longer periods of time, causing the application to be more tardy. We also observe from Figure 26 that CUA and MUA have almost the same performance; we attribute this to the “compensating factor” discussed earlier. Figure 26 also shows that DCP performs better than DMP. As before, this is because DCP makes more accurate predictions of the latencies than DMP, since the programs are more CPU-intensive than memory-intensive; we confirmed this intuition earlier by observing the error in the predicted latency values of the two predictive algorithms (Figure 24).


Finally, we note the consistency in the relative performance of the predictive and availability-based algorithms across the two symmetrical metrics, average time between successive resource allocations and percentage tardy time; this strongly validates our intuitions.

11. Related Work

In this section, we compare the work presented in this paper with related efforts. Since there are two fundamental aspects to the work, system description language mechanisms and resource management strategies, we compare and contrast the related efforts on both aspects in the subsections that follow.

11.1 System Description Language

The system description language presented in this paper is significantly different from other real-time languages, which can be primarily divided into two groups: application development languages, and specification languages (or formalisms) that are used to describe timing constraints at the application level, or even at the system level. Examples of application programming languages are Tomal [KH76], Pearl [Mar78], Real-Time Euclid [KS86], RTC++ [ITM92], Real-Time Concurrent C [GR91], Dicon [LG85], Chaos [BG92], Flex [LN88], TCEL [GH93], Ada95 [ANS95], MPL [NTA90], and CaRT-Spec [WSM95]. These languages include a wide variety of features that allow the compiler (and possibly the run-time system) to check assertions or even to transform code to ensure adherence to timing constraints. Specification languages or “meta-languages,” such as ACSR [CLX95], GCSR [ALC95], [Sha89], and RTL [JM86], formalize the expression of different types of timing constraints and in some cases allow proofs of program properties based on these constraints. In some cases, such as RTL [JM86], these features have been folded into an application development language [GKS95].

The language presented here is a specification meta-language that is independent of any particular application language. Rather than providing real-time support within a particular application language, it provides support for expressing timing constraints for families of application programs, written in a wide variety of programming languages. Unlike previous work in which timing constraints are described at a relatively small granularity, such as at the individual program level, we allow timing constraints to be expressed at a larger granularity, i.e., to span multiple programs.

Another way of characterizing real-time languages is in the way they permit characterization of a system's behavior when interacting with the physical environment.
Prior work has typically assumed that the effects of the environment on the system can be modeled deterministically. Our work expands this to include systems that interact with environments that are deterministic, stochastic, and dynamic. This is accomplished by modeling interactions of the system with the environment as data and event streams that may have stochastic or dynamic properties. In order to handle dynamic environments, it is useful if the language includes features that can be related to dynamic mechanisms (such as those described in [FJR94] [HSNL97] [RSYJ97] [SH94]) for monitoring, diagnosis and recovery. Language support for run-time monitoring of real-time programs has been addressed in [GKS95], [KL91], Real-Time Euclid [KS86], and Ada95 [ANS95]. However, this prior work provides limited support for diagnosis and recovery actions. We extend the language features pertaining to diagnosis of timing problems, and to the


migration or replication of application program components to handle higher data stream or event stream loads. Previous real-time languages allow the description of behaviors that are purely periodic or transient (aperiodic). We extend language support to describe hybrid behaviors such as the transient-periodic behavior described in [SP96]. We also allow for dynamic multi-dimensional timing constraints, i.e., deadlines that are expressed on an aggregation of execution cycles of tasks, which to our knowledge cannot be described in any existing real-time language.

The software engineering literature proposes architecture description languages (ADLs) such as C2 [TM+96], Wright [AG97], and Darwin [MD+95], among others, that support formal specification of system components and their interconnections. For example, C2 provides a hierarchical architectural style for constructing systems where components are built with knowledge of only those components that are “above” them in the hierarchy and without any knowledge of those that are “below” them, thus promoting reuse of components and flexibility in their composition. Wright, on the other hand, provides a formal basis for specifying the interactions between system components and analyzing those interactions to determine whether they are behaviorally correct and deadlock-free. Darwin provides abstractions to describe components of distributed systems and their interactions, and a formalism to determine the correctness of programs that are constructed from the specifications. We note that these ADLs may be used to describe the software architecture of dynamic real-time applications for promoting reuse, achieving flexibility in constructing the system, verifying the absence of deadlocks, and verifying the behavioral correctness of the system. However, they lack abstractions that allow descriptions of task properties that are pertinent to resource management.
Such properties include (1) behavioral types of tasks, such as periodic, transient, and transient-periodic behaviors, (2) timeliness requirements, such as simple deadlines, and (3) environment-dependent attributes, such as data and event stream characteristics. One of the goals of our system description language, on the other hand, is precisely this: to provide abstractions for describing the properties that enable resource management, so that the language can be used as an interface to resource management techniques.

11.2 Scheduling and Resource Management

To compare and contrast our resource management mechanisms with related efforts, we present a simple scheme for classifying real-time scheduling and resource management algorithms. The scheme is based on a characterization of the size of the data streams processed by periodic tasks. The size of the data stream of periodic tasks can be defined as either deterministic or stochastic, depending upon whether the data stream size can be characterized completely a-priori using discrete values or using probability distribution functions, respectively. The data stream size can also be defined as dynamic, if an a-priori characterization of the stream size is impossible. We now examine well-known real-time scheduling and resource management algorithms and classify them on this basis.


Table 2 shows the classification of real-time scheduling and resource management algorithms based on the size of the data streams processed by periodic tasks. Here, the size of the data stream refers to the number of data items (or sensor reports) that periodic assessment tasks and transient-periodic guidance tasks of real-time C2 systems must process during a single execution cycle. The number of sensor reports that must be processed in a single execution cycle significantly impacts the execution times of the tasks. Thus, we classify the real-time scheduling and resource management algorithms as deterministic, stochastic, or dynamic depending upon how the algorithms model the cycle execution times of periodic and transient-periodic tasks.

Table 2. Classification of Real-Time Scheduling and Resource Management Algorithms Based on Data Stream Sizes

In a number of real-time scheduling algorithms, the execution time of a “job” is considered to be known completely a-priori. Typically, execution time is assumed to be an integer “worst-case” execution time, as in [Bak91, Cla90, LL73, RSZ89, SKG91, Ver95, WSM95, XP90]. We therefore classify these scheduling algorithms as deterministic, since they model the number of data items processed by periodic tasks as discrete, integer values that are known a-priori. Paradigms that generalize execution times have also been developed: execution time is modeled as a set of discrete values in [KM97], as an interval in [SL96], and as a probability distribution in [AB98, Leh96, Loc86, SK97, TD+95, Kao95]. These algorithms are characterized as stochastic in the classification, since they model data stream sizes using distribution functions. The resource management models presented in [BN+98, HSNL97, RLLS97, RSYJ97] allow the execution time of tasks to have unknown a-priori characterizations; hence, they are classified as dynamic. Our work also belongs to this category, so we compare and contrast our work with these efforts. Our work fundamentally differs from [BN+98, RLLS97, HSNL97] in the application model. [BN+98, RLLS97, HSNL97] present an application model that consists of multiple applications, where each application can operate at multiple discrete “levels.” A level is a strategy for doing the application's work and is characterized by a benefit and a resource usage (e.g., CPU utilization). The benefit of an application at a given level is defined as a function of the resource usage of the application at that level and is explicitly specified by the user. The authors present resource allocation algorithms that dynamically select or negotiate the levels of applications, or enable the applications to select their levels, such that the total benefit is maximized and each individual application operates without missing its deadline.
In contrast, our application model consists of applications where each application operates at a single level of benefit (i.e., satisfaction of its deadline) and an implicit resource usage corresponding to that benefit. Further, we present resource management algorithms that dynamically allocate resources to an application when the application exhibits trends toward timing failures due to increases in workload. The resource

Data stream size:
Deterministic — [Bak91] [Cla90] [LL73] [RSZ89] [SKG91] [Ver95] [XP90] [WSM95]
Stochastic — [AB98] [KM97] [Leh96] [Loc86] [SK97] [SL96] [TD+95] [Kao95]
Dynamic — [BN+98] [RLLS97] [HSNL97] [RSYJ97]


management algorithms automatically allocate the appropriate resources by determining the “right” programs of the application to replicate or migrate and the “right” processor resources for the replica or migrant programs, such that the application timing constraint is satisfied. The approach described in [RSYJ97] is similar to ours in (1) the application model and (2) the way resource management is performed, i.e., QoS monitoring followed by QoS diagnosis and resource allocation by replication or migration to improve QoS. However, there are fundamental differences. The goal of [RSYJ97] is to discover the factors that influence the effectiveness of run-time resource management that achieves the timeliness requirements in dynamic real-time systems. Through an experimental study, the authors identify several factors that contribute to the effectiveness of run-time resource management, such as early detection, the overhead of enacting the reallocation decision, and application state-driven incremental decision heuristics. Our work fundamentally differs from [RSYJ97] in its objective. Our goal is to discover resource management techniques (monitoring and detection, diagnosis, and resource allocation strategies) that will achieve the timeliness requirements during high external load situations. Therefore, we are interested in discovering algorithms that answer questions such as: (1) how can a low timeliness situation be detected; (2) how can we recover from a low timeliness situation, and what recovery actions are needed; (3) what resources must be allocated to the recovery actions so that they will improve timeliness; and (4) what are the relative merits of algorithms that allocate resources based on availability versus those that allocate resources by forecasting application timeliness. This paper presents algorithms that answer these questions, and thus the work differs from [RSYJ97] in its objectives.
However, the two objectives are complementary and, viewed collectively, throw light on important considerations for engineering dynamic real-time systems.

12. Conclusions

This paper presents a resource management architecture for engineering dynamic real-time systems. In the proposed architecture, a real-time system application is developed in a general-purpose programming language, and a system description language is used to specify the architectural-level description of the system and its timeliness and survivability requirements as desired quality of service (QoS). An abstract model constructed from the language specifications is dynamically augmented by the (system description) language run-time system to produce a dynamic intermediate representation (IR). The dynamic IR characterizes the state of the system and is used by a resource management middleware to perform resource management and deliver the desired application QoS. To validate the viability of the approach, we use a real-time benchmark application that functionally approximates dynamic real-time command and control systems. The benchmark is specified in the system description language, and the effectiveness of the architecture in achieving its design goals is examined through a set of experiments. The experimental characterizations illustrate that the middleware is able to achieve the desired timeliness requirements during a number of load situations. Furthermore, the results indicate that availability-based allocation algorithms perform resource allocation less frequently, whereas the predictive algorithms give better steady-state performance for the application.
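The difference between the two allocation classes can be illustrated with a toy host-selection sketch. The host records, the linear idle-capacity forecast, and the latency model below are illustrative assumptions, not the paper's actual algorithms.

```python
# Hypothetical contrast between availability-based and predictive allocation.
# Availability-based allocation looks only at current idle capacity;
# predictive allocation forecasts how each host's capacity will evolve and
# picks the host with the lowest forecast latency for the program.

def availability_based_choice(hosts):
    """Pick the host with the most idle CPU right now."""
    return max(hosts, key=lambda h: h["idle_cpu"])

def predictive_choice(hosts, program_demand, window):
    """Pick the host whose forecast latency for the program is lowest.

    Forecast latency = demand / projected idle capacity, where idle capacity
    is linearly extrapolated over `window` from the recent trend (a
    deliberately simple forecasting model for this sketch).
    """
    def forecast_latency(h):
        projected_idle = max(h["idle_cpu"] + h["idle_trend"] * window, 1e-6)
        return program_demand / projected_idle
    return min(hosts, key=forecast_latency)

# hostA is the most idle now but its load is rising; hostB is freeing up.
hosts = [
    {"name": "hostA", "idle_cpu": 0.60, "idle_trend": -0.10},
    {"name": "hostB", "idle_cpu": 0.45, "idle_trend": +0.05},
]
print(availability_based_choice(hosts)["name"])                       # hostA
print(predictive_choice(hosts, program_demand=0.3, window=5)["name"]) # hostB
```

The sketch shows why the two classes can behave differently: the availability-based rule reacts to the present state, while the predictive rule anticipates load trends, which is consistent with the predictive algorithms' better steady-state performance at the cost of forecasting effort.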


The contributions of the paper are:

1. A resource management architecture for engineering dynamic real-time distributed systems;

2. A system description language that allows an architectural-level description of dynamic real-time distributed systems and their operational requirements, such as timeliness and survivability, as desired QoS;

3. Resource management middleware algorithms that detect, diagnose, and recover from deteriorating timeliness and survivability QoS situations; and

4. An illustration of the relative merits of two different classes of algorithms—predictive and availability-based—for performing timeliness QoS management.

We acknowledge that the resource management techniques presented here may be specific to dynamic real-time C2 applications that exhibit the properties outlined in Section 2. Also, the paper does not address transient and transient-periodic tasks in dynamic real-time distributed systems; resource management techniques for such tasks are currently being developed. Furthermore, the work presented here assumes that only the application is subject to failures. Survivability of the middleware will be addressed in future work.

Acknowledgements

We would like to thank Lonnie Welch, Behrooz Shirazi, Carl Bruggeman, and members of the DeSiDeRaTa research group for their guidance, feedback, and help in developing and implementing the ideas presented in this paper. We also thank the anonymous referees for their many insightful comments that have significantly improved the paper.

This paper is a significantly revised version of the papers "Resource Management Middleware for Dynamic, Dependable Real-Time Systems," scheduled to appear in the Journal of Real-Time Systems, and "Specification and Modeling of Dynamic, Distributed Real-Time Systems," which appeared in Proceedings of The 19th IEEE Real-Time Systems Symposium, pages 72--81, December 1998. Those papers focus on illustrating the general effectiveness of the real-time and survivability middleware services with availability-based allocation algorithms.

References

[AB98] A. Atlas and A. Bestavros, “Statistical Rate Monotonic Scheduling,” Proceedings of The IEEE Real-Time Systems Symposium, pages 123-132, December 1998.

[AG97] R. Allen and D. Garlan, “A Formal Basis for Architectural Connection,” ACM Transactions on Software Engineering and Methodology, Volume 6, Number 3, pages 213-249, July 1997.

[ALC95] H. B-Abdallah, I. Lee and J-Y. Choi, “A Graphical Language with Formal Semantics for the Specification and Analysis of Real-Time Systems,” Proceedings of The IEEE Real-Time Systems Symposium, pages 276--286, December 1995.

[ANS95] International Standard ANSI/ISO/IEC-8652:1995, Ada 95 Reference Manual, Intermetrics, Inc., January 1995.

[Bak91] T. P. Baker, "Stack-Based Scheduling of Real-Time Processes," Journal of Real-Time Systems, 3(1):67--99, March 1991.


[BG92] T. Bihari and P. Gopinath, “Object-Oriented Real-Time Systems,” IEEE Computer, 25(12):25--32, December 1992.

[BN+98] S. Brandt, G. Nutt, et al., "A Dynamic Quality of Service Middleware Agent for Mediating Application Resource Usage," Proceedings of The IEEE Real-Time Systems Symposium, pages 307--317, December 1998.

[Cla90] R. K. Clark, “Scheduling Dependent Real-Time Activities,” CMU-CS-90-155 (Ph.D. Thesis), Department of Computer Science, Carnegie Mellon University, 1990.

[CLX95] J-Y. Choi, I. Lee, H-L. Xie, “The Specification and Schedulability Analysis of Real-Time Systems using ACSR,” Proceedings of The IEEE Real-Time Systems Symposium, pages 266--275, December 1995.

[CSR86] S. Cheng, J. Stankovic, and K. Ramamritham, “Dynamic Scheduling of Groups of Tasks with Precedence Constraints in Distributed Hard Real-Time Systems,” Proceedings of The IEEE Real-Time Systems Symposium, 1986.

[FJR94] R. Rajkumar, F. Jahanian and S. Raju, “Run-time monitoring of timing constraints in distributed real-time systems,” Journal of Real-Time Systems, 1994.

[GH93] R. Gerber and S. Hong, “Semantics-Based Compiler Transformations for Enhanced Schedulability,” Proceedings of The IEEE Real-Time Systems Symposium, pages 232--242, December 1993.

[GKS95] M. Gergeleit, J. Kaiser, and H. Streich, “Checking Timing Constraints in Distributed Object-Oriented Programs,” Proceedings of The Object-Oriented Real-Time Systems Workshop, October 1995.

[GR91] N. Gehani and K. Ramamritham, "Real-Time Concurrent C: A Language for Programming Dynamic Real-Time Systems," Journal of Real-Time Systems, 3(4):377--405, December 1991.

[HiPerD] The U.S. Naval Surface Warfare Center, “High Performance Distributed Computing,” Available at http://www.nswc.navy.mil/hiperd/index.shtml

[HS92] C-H. Hou and K. Shin, “Allocation of Periodic Task Modules with Precedence and Deadline Constraints in Distributed Real-Time Systems,” Proceedings of The IEEE Real-Time Systems Symposium, December 1992.

[HSNL97] D. Hull, A. Shankar, K. Nahrstedt and J. W. S. Liu, “An End-to-End QoS Model and Management Architecture,” Proceedings of The IEEE Workshop on Middleware for Distributed Real-Time Systems and Services, pages 82-89, 1997.

[ITM92] Y. Ishikawa, H. Tokuda, and C. M. Mercer, “An Object-Oriented Real-Time Programming Language,” IEEE Computer, 25(10):66--73, October 1992.

[Jen99] E. D. Jensen, “Adaptive Real-Time Distributed Computer Systems,” Available at: http://www.realtime-os.com/.

[JM86] F. Jahanian and A. K.-L. Mok, “Safety Analysis of Timing Properties in Real-Time Systems,” IEEE Transactions on Software Engineering, 12(9):890--904, 1986.

[Kao95] B. C. Kao, “Scheduling in Distributed Soft Real-Time Systems With Autonomous Components,” PhD Thesis, Princeton University, November 1995.

[Koob96] G. Koob, “Quorum,” Proceedings of The Darpa ITO General PI Meeting, pages A-59-A-87, October 1996.

[KDK89] H. Kopetz, A. Damm, C. Koza, M. Mulazzani, W. Schwabl, C. Senft, and R. Zainlinger, “Distributed Fault-Tolerant Real-Time Systems: The Mars Approach,” IEEE Micro, 9(1), pages 25-40, February 1989.

[KH76] R. B. Kieburtz and J. L. Hennessy, “Tomal - A High-Level Programming Language for Microprocessor Process Control Applications,” ACM SIGPLAN Notices, pages 127--134, April 1976.


[KL91] K. B. Kenny and K. J. Lin, “Building Flexible Real-Time Systems Using the Flex Language,” IEEE Computer, pages 70--78, May 1991.

[KM97] T.-W. Kuo and A. K. Mok, “Incremental Reconfiguration and Load Adjustment in Adaptive Real-Time Systems,” IEEE Transactions on Computers, Volume 46, Number 12, pages 1313-1324, December, 1997.

[KS86] E. Kligerman and A. D. Stoyenko, “Real-Time Euclid: A Language for Reliable Real-Time Systems,” IEEE Transactions on Software Engineering, 12(9):941--949, September 1986.

[Leh96] J. P. Lehoczky, “Real-time Queueing Theory,” Proceedings of The IEEE Real-Time Systems Symposium, pages 186--195, December 1996.

[LG85] I. Lee and V. Gehlot, “Language Constructs for Distributed Real-Time Systems,” Proceedings of The IEEE Real-Time Systems Symposium, December 1985.

[LL73] C. L. Liu and J. W. Layland, “Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment,” Journal of the ACM, Volume 20, Number 1, pages 46--61, 1973.

[LN88] K. J. Lin and S. Natarajan, “Expressing and Maintaining Timing Constraints in Flex,” Proceedings of The 9th IEEE Real-Time Systems Symposium, pages 96--105, December 1988.

[Loc86] C. D. Locke, “Best-Effort Decision Making for Real-Time Scheduling,” CMU-CS-86-134 (Ph.D. Thesis), Department of Computer Science, Carnegie Mellon University, 1986.

[LRT92] J. P. Lehoczky and S. Ramos-Thuel, “An Optimal Algorithm for Scheduling Soft-Aperiodic Tasks in Fixed-Priority Preemptive Systems,” Proceedings of The IEEE Real-Time Systems Symposium, pages 110--123, 1992.

[LSS87] J. P. Lehoczky, L. Sha, and J. K. Strosnider, “Enhanced Aperiodic Responsiveness in Hard Real-time Environments,” Proceedings of The IEEE Real-Time Systems Symposium, 1987.

[Mar78] T. Martin, “Real-Time Programming Language Pearl - Concept and Characteristics,” Proceedings of The IEEE Computer Society Second International Computer Software and Applications Conference, pages 301--306, 1978.

[MD+95] J. Magee, N. Dulay, S. Eisenbach, and J. Kramer, “Specifying Distributed Software Architectures,” Proceedings of The Fifth European Software Engineering Conference, September 1995.

[Mill95] D. L. Mills, "Improved Algorithms for Synchronizing Computer Network Clocks," IEEE/ACM Transactions on Networking, pages 245--254, June 1995.

[NTA90] V. M. Nirkhe, S. K. Tripathi, A. K. Agrawala, “Language Support for the Maruti Real-Time System,” Proceedings of The 11th IEEE Real-Time Systems Symposium, pages 257--266, 1990.

[Quo97] DARPA ITO, “Quorum,” Available at http://www.ito.darpa.mil/research/quorum/projlist.html, August 1997.

[Rav98] B. Ravindran, “Modeling and Analysis of Complex, Dynamic Real-Time Systems,” PhD Thesis, Department of Computer Science and Engineering, The University of Texas at Arlington, August 1998.

[RCF97] I. Ripoll, A. Crespo, and A. G. Fornes, “An Optimal Algorithm for Scheduling Soft Aperiodic Tasks in Dynamic Priority Preemptive Systems,” IEEE Transactions on Software Engineering, 23(6):388--400, June 1997.

[RLLS97] R. Rajkumar, C. Lee, J. Lehoczky and D. Siewiorek, “A Resource Allocation Model for QoS Management,” Proceedings of The 18th IEEE Real-Time Systems Symposium, pages 298--307, 1997.


[RSYJ97] D. Rosu, K. Schwan, S. Yalamanchili and R. Jha, “On Adaptive Resource Allocation for Complex Real-Time Applications,” Proceedings of The 18th IEEE Real-Time Systems Symposium, pages 320--329, December 1997.

[RSZ89] K. Ramamritham, J. A. Stankovic, and W. Zhao, “Distributed scheduling of tasks with deadlines and resource requirements,” IEEE Transactions on Computers, 38(8):1110--1123, August 1989.

[RTL93] S. Ramos-Thuel and J. P. Lehoczky, “On-line Scheduling of Hard Deadline Aperiodic Tasks in Fixed-Priority Systems,” Proceedings of The IEEE Real-Time Systems Symposium, pages 160-171, 1993.

[RWS99] B. Ravindran, L. R. Welch, and B. Shirazi, “Resource Management Middleware for Dynamic, Dependable Real-Time Systems,” Journal of Real-Time Systems, Accepted for publication, To appear.

[SB96] M. Spuri and G. Buttazzo, "Scheduling Aperiodic Tasks in Dynamic Priority Systems," Journal of Real-Time Systems, 10:179--210, March 1996.

[SH94] K. G. Shin and C.-J. Hou, "Design and Evaluation of Effective Load Sharing in Distributed Real-Time Systems," IEEE Transactions on Parallel and Distributed Systems, 1994.

[Sha89] A. Shaw, “Reasoning About Time in Higher-Level Language Software,” IEEE Transactions on Software Engineering, 15(7), pages 875--889, July 1989.

[Shi91] K.G. Shin, “Harts: A Distributed Real-Time Architecture,” IEEE Computer, 24(5):25--35, May 1991.

[SK97] D.B. Stewart and P.K. Khosla, “Mechanisms for Detecting and Handling Timing Errors,” Communications of the ACM, 40(1):87--93, January 1997.

[SKG91] L. Sha, M. H. Klein, and J. B. Goodenough, “Rate Monotonic Analysis for Real-Time Systems,” In A. M. van Tilborg and G. M. Koob (Editors), Scheduling and Resource Management, pages 129--156. 1991.

[SL96] J. Sun and J.W.S. Liu, “Bounding Completion Times of Jobs With Arbitrary Release Times And Variable Execution Times,” Proceedings of The IEEE Real-Time Systems Symposium, 1996.

[SLS88] B. Sprunt, J. Lehoczky, and L. Sha, “Exploiting Unused Periodic Time For Aperiodic Service Using the Extended Priority Exchange Algorithm,” Proceedings of The IEEE Real-Time Systems Symposium, pages 251-258, 1988.

[SP96] S. Sommer and J. Potter, “Operating System Extensions for Dynamic Real-Time Applications,” Proceedings of The IEEE Real-Time Systems Symposium, pages 45--50, December 1996.

[SR91] J. A. Stankovic and K. Ramamritham, "The Spring Kernel: A New Paradigm for Real-Time Systems," IEEE Software, 8(3):62--72, May 1991.

[SRC85] J. A. Stankovic, K. Ramamritham, and S. Cheng, "Evaluation of a Flexible Task Scheduling Algorithm for Distributed Hard Real-Time Systems," IEEE Transactions on Computers, C-34(12):1130--1141, December 1985.

[SSL89] B. Sprunt, L. Sha, and J. Lehoczky, "Aperiodic Task Scheduling in Hard Real-Time Systems," Journal of Real-Time Systems, 1(1):27--60, 1989.

[Sta96] J. A. Stankovic et al., "Strategic Directions in Real-Time and Embedded Systems," ACM Computing Surveys, 28(4):751--763, December 1996.

[TD+95] T. S. Tia, Z. Deng, et al., "Probabilistic Performance Guarantee for Real-Time Tasks with Varying Computation Times," Proceedings of The IEEE Real-Time Technology and Applications Symposium, pages 164--173, 1995.


[TLS96] T-S. Tia, J.W.-S. Liu, and M.Shankar, “Algorithms and Optimality of Scheduling Soft Aperiodic Requests in Fixed-Priority Preemptive Systems,” Journal of Real-Time Systems, 10:23--43, January 1996.

[TM+96] R. N. Taylor, N. Medvidovic, K. M. Anderson, E. J. Whitehead Jr., J. E. Robbins, K. A. Nies, P. Oreizy, and D. L. Dubrow, "A Component- and Message-Based Architectural Style for GUI Software," IEEE Transactions on Software Engineering, Volume 22, Number 6, pages 390--406, June 1996.

[Ver95] J. P. C. Verhoosel, “Pre-Run-Time Scheduling of Distributed Real-Time Systems: Models and Algorithms,” PhD Thesis, Eindhoven University of Technology, The Netherlands, January 1995.

[Wel97] L. R. Welch, “Large Grain, Dynamic Control System Architectures,” Proceedings of the Joint Workshop on Parallel and Distributed Real-Time Systems, IEEE CS Press, pages 22 – 26, April 1997.

[WRH+96] L. R. Welch, B. Ravindran, R. D. Harrison, L. Madden, M. W.Masters, and W. Mills, “Challenges in Engineering Distributed Shipboard Control Systems,” Proceedings of The Work-In-Progress Session, The Seventeenth IEEE Real-Time Systems Symposium, pp: 19 -- 22, December 1996.

[WRSB98] L. R. Welch, B. Ravindran, B. A. Shirazi, and C. Bruggeman, “Specification and Modeling of Dynamic, Distributed Real-Time Systems,” Proceedings of The 19th IEEE Real-Time Systems Symposium, pages 72 - 81, December 1998.

[WS99] L. R. Welch and B. A. Shirazi, "A Dynamic Real-Time Benchmark for Assessment of QoS and Resource Management Technology," Proceedings of The Fifth IEEE Real-Time Technology and Applications Symposium, pages 36--45, June 1999.

[WSM95] L. R. Welch, A. D. Stoyenko, and T. J. Marlowe, “Response Time Prediction for Distributed Periodic Processes Specified in CaRT-Spec,” Control Engineering Practice, 3(5), May 1995, pp. 651-664.

[WSRK99] L. R. Welch, B. Shirazi, B. Ravindran and F. Kamangar, “Instrumentation Support for Dynamic Resource Management of Real-time, Distributed Control Systems,” International Journal of Parallel and Distributed Systems and Networks, Volume 2, Number 3, pages 105-117, 1999.

[XP90] J. Xu and D. L. Parnas, “Scheduling Processes with Release Times, Deadlines, Precedence, and Exclusion Relations,” IEEE Transactions on Software Engineering, 16(3), pages 360-369, March 1990.