
Proceedings of the 33rd Hawaii International Conference on System Sciences - 2000

Detecting Anomalies in Constructed Complex Systems

Christopher Landauer
Aerospace Integration Science Center

The Aerospace Corporation, Mail Stop M6/214
P. O. Box 92957, Los Angeles, California 90009-2957, USA

E-mail: [email protected], Phone: +1 (310) 336-1361

Kirstie L. Bellman
Principal Director, Aerospace Integration Science Center

The Aerospace Corporation, Mail Stop M6/214
P. O. Box 92957, Los Angeles, California 90009-2957, USA

E-mail: [email protected], Phone: +1 (310) 336-2191

Abstract

It is well-known that complex systems are difficult to design, implement, and analyze. Component-level verification has improved to the point that we can expect to produce formal or nearly formal verification analyses of all components of a complex system. What remains are the system-level verifications, which we believe can be improved by our approach to system development.

In earlier papers, we defined the wrapping integration infrastructure, which shows how a little bit of knowledge about the uses of the different resources goes a very long way towards using them properly, identifying anomalies, and even monitoring their behavior.

In this paper, we first describe our Knowledge-Based integration infrastructure, called "wrappings", then we describe some anomaly detection algorithms originally developed for verification and validation of Knowledge-Based systems, and finally, we show how systems organized using wrappings lend themselves to evaluation studies, both offline and online.

1. Introduction

Our main interest is the study of coordination among components, agents, and even humans within very large computing systems, without imposing too many design decisions in advance [20], and especially, without precluding any particular architectural style [19]. We have (1) emphasized the system's infrastructure as a way to keep large numbers of disparate components coordinated, (2) shown how the infrastructure can be based on explicit meta-knowledge and active integration processes that use the meta-knowledge, (3) introduced a different interpretation of interaction with computers, based on a comparison of what we view as the important aspects of many popular programming paradigms, and (4) shown how these notions can be combined to build a framework for heterogeneous system integration.

All of these notions derive from our dynamic infrastructure for what we have called Constructed Complex Systems, which are complex heterogeneous systems that are mediated or integrated by computer programs, and in particular from our "Wrapping" technique, which is based on two notions: (1) explicit, machine-interpretable descriptions of all software, hardware, and other computational resources in a system, and (2) active integration processes that select, adapt, and combine these resources to apply to particular problems. Wrappings are a Knowledge-Based dynamic integration infrastructure for constructing complex heterogeneous software systems. Because the wrappings are intended to define everything that the system can do, they can be used to guide system specification, and because the wrappings define everything that the system can do, they can be used to guide testing.

It is well-known that these systems are hard to design, implement, and analyze, because there is no one model, modeling notation, or even modeling style that can be adequate to define all aspects of the system [6] [22].

Formal methods have been developed to make the specification and analysis of systems more reliable, but they have not yet reached their expected utility, in part because the intellectual overhead for using them is high, and in part because the mathematical methods available are not powerful enough for complex system models [13]. Component-level verification has improved to the point that we can expect to produce formal or nearly formal verification analyses of all components of a complex system, but it is still difficult to produce system-level verifications. (For us, verification is comparison of a system definition to a system specification, and validation is comparison of the specification to the intended behavior.)

In this paper, we describe the Wrapping approach very briefly, and discuss how it supports detailed analyses of system behavior, using empirical methods.

2. Wrapping

The wrapping approach is a computationally reflective, Knowledge-Based integration infrastructure that uses explicit, machine-processable information about resource uses, and active processes that use the information to organize resources to apply to posed problems. In this section, we give an overview of the wrapping approach; many more details are elsewhere [9] [10] [12] [14] [15] (and references therein).

The wrapping theory has four essential features:

1. ALL parts of a system architecture are resources, including programs, data, user interfaces, architecture and interconnection models, and everything else.

2. ALL activities in the system are problem study (i.e., all activities apply a resource to a posed problem), including user interactions, information requests and announcements within the system, service or processing requests, and all other processing behavior. We therefore specifically separate the problem to be studied from the resources that might study it.

3. Wrapping Knowledge Bases contain wrappings, which are explicit machine-processable descriptions of all of the resources and how they can be applied to problems to support what we have called the Intelligent User Support (IUS) functions [2]:

- Selection (which resources to apply to a problem),

- Assembly (how to let them work together),

- Integration (when and why they should work together),

- Adaptation (how to adjust them to work on the problem), and

- Explanation (why certain resources were or will be used).

Wrappings contain much more than "how" to use a resource. They also help decide "when" it is appropriate, "why" you might want to use it, and "whether" it can be used in this current problem and context.

4. Problem Managers (PMs), including the Study Managers (SMs) and the Coordination Manager (CM), are algorithms that use the wrapping descriptions to collect and select resources to apply to problems. They use implicit invocation, both context and problem dependent, to choose and organize resources. The PMs are also resources, and they are also wrapped.

The wrapping information and processes form expert interfaces to all of the different ways to use resources in a heterogeneous system that are known to the system [3] [16].

The most important conceptual simplifications that the wrapping approach brings to integration are the uniformities of the first two features: the uniformity of treating everything in the system as resources, and the uniformity of treating everything that happens in the system as problem study. The most important algorithmic simplification is the reflection provided by treating the PMs as resources themselves: we explicitly make the entire system reflective by considering these programs that process the wrappings to be resources also, and wrapping them, so that all of our integration support processes apply to themselves, too. The entire system is therefore Computationally Reflective [17] [7] [9]. It is this ability of the system to analyze its own behavior that provides some of the power and flexibility of resource use. These ideas have proven to be useful, even when implemented and applied in informal and ad hoc ways [18] [5].

Since the process steps that interact with the Wrapping Knowledge Base are themselves posed problems, we can use completely different syntax and semantics for different parts of the WKB in different contexts, and select the appropriate processing algorithms according to context. In particular, whether one writes about one WKB or several is a matter of taste and viewpoint. Even though the WKBs have a variable syntax, there are some common features in the semantics, which we describe next.

Each wrapping is a list of "problem interpretation" entries, each of which describes one way in which this resource can be used to deal with a problem. There may be several problem interpretation entries for the same problem if the resource has many different ways of dealing with it. Each of these problem interpretation entries has lists of context conditions that must hold for the resource to be considered or applied. All of these conditions are checked using the current local context. These sets of conditions are important at different times; one at resource consideration or planning time, and the other at resource application time. These act as pre-conditions for the application of the resource. The corresponding post-condition is the "product" list of context component assignments, which describes what information or services this resource makes available when it is applied. This means that ANY Knowledge Representation mechanism that: (1) can express the above semantic features, (2) can be queried by problem (in order to match resources to problems), and (3) can be filtered by problem and context information (in order to resolve resource selections), can be used for Wrappings.

Many more details can be found in [14] [15], including thorough descriptions of the Wrapping Knowledge Bases and the Problem Managers that are used in any wrapping-based system.
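To make the structure of these entries concrete, here is a minimal sketch in Python of what a Wrapping Knowledge Base entry might look like; all class, field, and function names are our own illustrative choices, not notation from the papers, and the condition and product representations are deliberately simplified.

from dataclasses import dataclass, field
from typing import Callable, Dict, List

Context = Dict[str, object]   # current local context: name -> value

@dataclass
class ProblemInterpretation:
    """One way a resource can be used to deal with one posed problem."""
    problem: str                                                     # name of the posed problem
    planning_conditions: List[Callable[[Context], bool]] = field(default_factory=list)
    application_conditions: List[Callable[[Context], bool]] = field(default_factory=list)
    products: Dict[str, object] = field(default_factory=dict)        # post-condition assignments

    def considerable(self, ctx: Context) -> bool:
        # pre-conditions checked at resource consideration (planning) time
        return all(cond(ctx) for cond in self.planning_conditions)

    def applicable(self, ctx: Context) -> bool:
        # pre-conditions checked at resource application time
        return all(cond(ctx) for cond in self.application_conditions)

@dataclass
class Wrapping:
    """A wrapping is a list of problem interpretation entries for one resource."""
    resource: str
    entries: List[ProblemInterpretation] = field(default_factory=list)

def select_resources(wkb: List[Wrapping], problem: str, ctx: Context) -> List[str]:
    """Very rough sketch of the Selection step: which resources could study this problem now."""
    return [w.resource
            for w in wkb
            for e in w.entries
            if e.problem == problem and e.considerable(ctx)]

A Problem Manager would then filter the selected entries by their application conditions and, after applying a resource, merge its products into the context; that further machinery is omitted here.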

3. Verification and Validation

Once we have changed our approach to system development by using the models and the meta-knowledge of wrappings [11], we can apply the usual mathematically formal methods to their analysis. This part of the process is well-known; using wrappings gives us a complementary set of descriptions that we can also analyze. Regardless of how the wrapping knowledge is obtained, it consists of a number of rules for the use and adaptation of resources. The question then arises, "How do we check the correctness of this meta-knowledge?", and even more so, "How do we check the consistency of this knowledge with the knowledge in the other wrappings being processed in this system?". We have developed mathematical methods for verification and validation (V&V) of KBSs that apply to the wrappings [4]. The theoretical foundation of this work is the very important tenet that rulebases are models, and development and analysis of them should follow sound modeling principles, since building a rulebase is building a model. Some of the genesis and history of our approach, and a placement of it in the context of other V&V research, can be found in earlier publications [1] [8]. In this context, validation refers to the evaluation of system specifications or behavior according to external desiderata of performance ("is it the system we intended?"), and verification refers to the more mathematically formal process of evaluating a system according to explicit specifications of behavior ("is the system correct?").

Rulebases whose problems have sufficiently rich descriptions carry internal redundancies and other relationships that allow some kinds of errors to be caught automatically, simply by a careful examination of the rulebase itself, even in the absence of a complete system specification. Our approach is to use empirical analyses, collecting frequency and distribution information about the terms that appear in the expert system, and attempting to identify the strange behavior by looking for anomalies in the distributions. Using this approach, combining this notion with those about global errors and principled methodology [1], we defined a new set of acceptability principles for rulebases [8]. These principles go beyond concerns over mathematical correctness to considerations of the distribution and simplicity conditions that can signal errors or inefficiencies in the rules. The principles we have introduced are called Consistency, Completeness, Irredundancy, Connectivity, and Distribution. This is part of our overall program to develop new principles of correctness, develop criteria illustrating the principles, develop algorithms implementing the criteria, and finally, to develop methods that help correct the errors. The principles are associated with criteria for acceptability in rulebases, and analysis algorithms that can check them. They treat a rulebase as a formal mathematical structure (usually a graph or an incidence matrix), and the analyses show that the mathematical conditions can be checked effectively, if not necessarily quickly [8]. We have shown these analyses to be very helpful in the early design stages of a system development [4].

This section describes some criteria for acceptability in rulebases, and analysis algorithms that can check them. The criteria are organized according to the principles listed above. They treat a rulebase as a formal mathematical structure (to be defined in the following sections), and the analyses show that the mathematical conditions can be checked effectively, if not necessarily quickly.

3.1. Knowledge-based Analysis Tools

In order to describe our approach to analyzing rulebases, we start with a careful description of the kind of rulebases we consider. It is a restriction made for the purposes of this discussion, and not a necessary limitation on the techniques. We have applied these techniques to other knowledge-based systems, including sets of equations.


A rulebase is a finite set R of "if-then" rules of the form

- IF hypothesis, THEN conclusion

For example, a rulebase for anomaly detection in spacecraft attitude control might contain a rule such as

- rule10: IF thruster = on AND thruster-command = off, THEN signal anomaly

A clause is either an atomic formula or its negation (for example, a predicate expression such as "thruster = on" above).
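To ground the matrix constructions that follow, here is one possible (purely illustrative) Python encoding of such a rulebase; treating the conclusion "signal anomaly" as assigning the value "anomaly" to a variable "signal" is our reading, not something prescribed by the text.

from dataclasses import dataclass
from typing import List, Tuple

Clause = Tuple[str, str, str]   # (variable, comparison, value), e.g. ("thruster", "=", "on")

@dataclass
class Rule:
    name: str
    hypothesis: List[Clause]
    conclusion: List[Clause]

# the example rule from the text
rule10 = Rule(
    name="rule10",
    hypothesis=[("thruster", "=", "on"), ("thruster-command", "=", "off")],
    conclusion=[("signal", "=", "anomaly")],
)

rulebase: List[Rule] = [rule10]

def variables(rules: List[Rule]) -> List[str]:
    """The finite set V of variables occurring in the rulebase, in a fixed order."""
    seen = []
    for r in rules:
        for (v, _, _) in r.hypothesis + r.conclusion:
            if v not in seen:
                seen.append(v)
    return seen

print(variables(rulebase))   # ['thruster', 'thruster-command', 'signal']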

The set V of variables in a rulebase R is finite. Variables are a little hard to describe, so for now we just note that "thruster" and "thruster-command" above are variables. We will define them more precisely shortly.

We also make a distinction between the syntactic restriction of having each variable value in the appropriate domain (without regard to the rules) and the semantic restriction that all rules are to be satisfied. A prospective situation is an instantiation of (i.e., an assignment of valid values to) all of the variables, so that each value is in the appropriate domain. This represents a kind of separate syntactic condition for the variable values. A situation is a prospective situation with the further restriction that all the rules are true for all of the situations. This represents a kind of combined semantic condition for the variable values, if we consider the rules to determine the semantics of the variables. Every variable is considered to be a feature of the situation, with a possibly unknown value in the appropriate domain. The rest of this section explains what this restriction means. It is assumed that there are no undefined values.

Each variable v is considered to be a function applied to situations, so for a situation s, the expression v(s) denotes the value of the variable v in situation s. More generally, for any expression e over a set W of variables contained in V, the expression e(s) denotes the value of the expression e in situation s. Rules are implicitly universally quantified over situations. A variable v in the rulebase is a fixed component selection function v applied to a logical variable representing the situation s. There are no explicit quantifiers, so all situation variables are free in the expressions.

The set of situations is therefore a subset of the prospective situations, which is the Cartesian product of all of the variable domains; however, the particular subset is not precisely known, since it is limited by the rules in the rulebase to only those elements of the Cartesian product that satisfy the rules (because the rules define the situations). The rules may, for example, define connections between variables that allow some of the variables to be computed from others. The Cartesian product will occasionally be called the Problem Space, to distinguish it from the set of situations, which is called the Solution Space or Situation Space. Therefore, the syntactic restriction of having each variable value in the appropriate domain defines the prospective situations, and the semantic restriction that all rules are to be satisfied defines those prospective situations that are situations.

A rulebase is applied to a situation to compute some variable values (not to set the values, but to find out what the values are), so that a situation has both provided variable values ("input" variables) and derived variable values, some of which are internal ("intermediate" or "hidden" variables) and some of which are displayed ("output" variables). It is further assumed that the variable values not specified by the input are defined but unknown, and that the rulebase is expected to compute them (or at least to compute the output variable values). We can model other styles of expert system this way also, in terms of access to parameters and processing steps. We have concentrated on rulebases of this form for convenience in description.

3.2. Incidence Matrices and Graphs

We are most interested here in the case in which we have no models or other external criteria for correctness in the rulebase, since this is the most common case, and since it is the hardest case. We cannot in that case hope to prove very much, since we quickly run into undecidability problems, but we can identify places that are "strange" (for various definitions of "strange"). Then the rulebase designer can decide whether those anomalous conditions belong in the rulebase or not. We are interested therefore in examining the occurrence patterns of various elements in a rulebase, because they provide a very gross model of the rules and rule interactions, and because they lead to the use of some well-known mathematical techniques.

We start with a tool that allows us to record precisely what we mean by an occurrence and an occurrence pattern. It is a common tool in combinatorial analyses called an incidence matrix [23]. It also lets us define a graph that allows a different formal analysis of the occurrence patterns. There is a standard equivalence between square incidence matrices and graphs, given by making a vertex for each row or column, and an edge for each nonzero entry in the matrix (sometimes the edge is labeled with the nonzero entry). Conversely, the matrix has a row and column for each vertex, a zero entry for each non-edge, and a nonzero entry for each edge. The nonzero entry is usually 1 for unlabelled graphs, and it is usually the label for edge-labelled graphs.

The incidence matrix is a good numerical representation of the occurrences of elements in the rulebase (we still haven't said what those elements are), and allows natural numerical summaries and analyses (correlations, etc.). The graph is a good topological representation of the occurrences of elements in the rulebase, and allows natural combinatorial and topological analyses (connectivity, etc.). This paper will not discuss in detail the different kinds of graphs that can be formed from a rulebase. More details can be found in [8].

The simplest incidence matrix of a rulebase counts the number of occurrences of variables in rules. The counting incidence matrix RV is a matrix indexed by R × V, with

RV(r, v) = number of instances of variable v in rule r.

For example, the rule listed above would have a row for "rule10", and columns for "thruster" and "thruster-command", with both entries 1. There are also incidence matrices RC for clause occurrences in rules and CV for variable occurrences in clauses.

Many combinations of incidence matrices have significance for analyzing a rulebase. From this matrix RV, higher products can be computed, for example, giving the number of pairs of variables that appear in the same rule, or the number of pairs of rules that contain the same variable. For example,

- (RV RV^tr)(q, r) = number of variables in common between rule q and rule r,

- (RV^tr RV)(v, w) = number of rules containing both variable v and w.

Two simple extensions of RV that are used in the example analysis separate the rules into hypotheses and conclusions. The matrix VH has a row for each variable and a column for each rule, with entries that count the number of occurrences of the variable in the rule hypotheses, and the matrix VC has a row for each variable and a column for each rule, with entries that count the number of occurrences of the variable in the rule conclusions. In particular, we have

RV^tr = VC + VH,

where we write M^tr for the transpose of a matrix M (interchange rows and columns).
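Continuing the illustrative encoding from Section 3.1, the counting incidence matrices and their products can be computed directly with numpy, and the identity RV^tr = VC + VH can be checked numerically. This is a sketch under our own representation, not the authors' implementation.

import numpy as np

def counting_matrices(rules, vars_):
    """RV(r, v): occurrences of variable v in rule r; VH and VC split by hypothesis/conclusion."""
    vidx = {v: j for j, v in enumerate(vars_)}
    RV = np.zeros((len(rules), len(vars_)), dtype=int)
    VH = np.zeros((len(vars_), len(rules)), dtype=int)   # variable occurrences in hypotheses
    VC = np.zeros((len(vars_), len(rules)), dtype=int)   # variable occurrences in conclusions
    for i, r in enumerate(rules):
        for (v, _, _) in r.hypothesis:
            RV[i, vidx[v]] += 1
            VH[vidx[v], i] += 1
        for (v, _, _) in r.conclusion:
            RV[i, vidx[v]] += 1
            VC[vidx[v], i] += 1
    return RV, VH, VC

V = variables(rulebase)
RV, VH, VC = counting_matrices(rulebase, V)

assert (RV.T == VC + VH).all()      # RV^tr = VC + VH
rules_in_common = RV @ RV.T         # (q, r): shared-variable count between rules q and r
vars_in_common = RV.T @ RV          # (v, w): count of rules containing both v and w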

Many other combinations of incidence matrices have significance for analyzing a rulebase. Furthermore, square matrices are easier to consider as graphs, since we can identify rows and columns. For example,

- (VH^tr VH)(q, r) = number of variables in common between rule q and rule r hypotheses,

- (VC VC^tr)(v, w) = number of rules containing both variable v and w in their conclusions,

- (VC^tr VH)(q, r) = number of variables in common between rule q conclusion and rule r hypothesis.

Many other variations exist also. Among these products and others, the matrix (VC^tr VH) is particularly relevant for studying the inference system.

Several more detailed incidence matrices are also useful for rule analyses [8] [4], such as matrices that separate the roles of variables into reading and writing, and those that consider clauses. There are formal definitions of reading and writing, but they will not be needed here. The intuitive notions of using a value and setting a variable will suffice.

A variable v is read by a rule r if v occurs in r, either in hyp(r) or in conc(r), in such a way that its value is required. Not every occurrence in conc(r) is a read access, but every occurrence in hyp(r) is (except when the hypothesis expression can always be evaluated without using the value of v). A variable v is written by a rule r if v occurs in conc(r) in a way that allows the value to change. For example, a variable used only as the source of a value is not written, but a predicate that compares one variable to another, without specifying which one changes or how they change, is assumed to write both. It should be noted here that "write" only means that a new value becomes known to the system. There is no change in the value, only in what the system knows about it.

The variable read matrix Rd is indexed by R × V, as is the variable write matrix Wr. For a rule r and variable v, the boolean access matrices are defined by

Rd(r, v) = 1 if rule r reads variable v, 0 otherwise,

and

Wr(r, v) = 1 if rule r writes variable v, 0 otherwise

(so Wr need only consider conc(r), but Rd needs both hyp(r) for condition testing and conc(r) for value sources), and the counting access matrices use the number of read and write instances of v in r instead of using only their existence.

For two variables v and w, (Rd^tr Wr)(v, w) is nonzero iff there is a rule r which reads v and writes w. For two rules q and r, (Wr Rd^tr)(q, r) is nonzero iff there is a variable v which is written by q and read by r (if the counting access matrices are used, then the products also count how many such rules or variables there are). These two product matrices are clearly related to a kind of required ordering between variables and between rules. Since we assume that the variable values do not change, they must be written before they are read, unless they are provided as input.

We can certainly prove some things about the original rulebase using the matrices, such as the non-determinability of a variable value (since no rule sets it), or some kinds of deadlock in the rules (since some rules that set a value can only occur after rules that use it), but since in general the matrix is a very loose model, the results are fairly sparse and weak. Instead, we usually have to fall back on empirical methods, with a corresponding retreat from proofs.
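A rough sketch of the boolean access matrices and the simplest such check, again under the illustrative encoding above: a variable that no rule writes and that is not provided as input cannot be determined by the rulebase. The read/write assignment here is deliberately crude (every hypothesis occurrence counts as a read, every conclusion occurrence as a write), which only approximates the formal definitions cited above.

import numpy as np

def access_matrices(rules, vars_):
    """Boolean Rd(r, v) and Wr(r, v), using a crude read/write approximation."""
    vidx = {v: j for j, v in enumerate(vars_)}
    Rd = np.zeros((len(rules), len(vars_)), dtype=int)
    Wr = np.zeros((len(rules), len(vars_)), dtype=int)
    for i, r in enumerate(rules):
        for (v, _, _) in r.hypothesis:
            Rd[i, vidx[v]] = 1      # every hypothesis occurrence requires the value
        for (v, _, _) in r.conclusion:
            Wr[i, vidx[v]] = 1      # assume every conclusion occurrence sets the value
    return Rd, Wr

def undeterminable_variables(rules, vars_, inputs):
    """Variables that no rule writes and that are not provided as input."""
    Rd, Wr = access_matrices(rules, vars_)
    never_written = Wr.sum(axis=0) == 0
    return [v for j, v in enumerate(vars_) if never_written[j] and v not in inputs]

# e.g. 'thruster' and 'thruster-command' must be inputs, or they can never be determined
print(undeterminable_variables(rulebase, V, inputs={"thruster", "thruster-command"}))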

3.3. Association Matrices

An association matrix is a covariance matrix computed from occurrence patterns across a set of possible locations. The counting incidence matrix product, defined by

(RV RV^tr)(q, r) = commonality of q and r,

which is the number of variables v common to rules q and r, counts variables in common to rules, measuring the occurrence pattern of a rule according to the variables it contains. Then the correlations can be computed from the covariances, in the usual way:

Cor(q, r) = Cv(q, r) / (Sd(q) · Sd(r)),

Sd(q) = sqrt(Cv(q, q)),

Cv(q, r) = (RV RV^tr)(q, r) / |V| - Av(q) · Av(r),

Av(q) = (1 / |V|) · (sum over v in V of RV(q, v)).

Here, the q row of the counting incidence matrix RV is the occurrence pattern for rule q, so Av(q) is the average number of occurrences of each variable in rule q, and Sd(q) is the standard deviation of those numbers. There is no random variable here, so there is no point in using the "sample standard deviation". The correlation is a measure of similarity between rules, as measured by the variables in them. The correlation value is 1 if and only if the two rules use exactly the same variables with the same frequency of occurrence of each variable. It will be negative, for example, when the two rules use disjoint sets of variables, and -1 in rare cases only (not likely in a rulebase).
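These formulas translate directly into a few lines of numpy; the following sketch computes Av, Cv, Sd, and Cor from the counting incidence matrix RV of the earlier illustration (rules whose standard deviation is zero produce undefined correlations, which appear as NaN here).

import numpy as np

def rule_correlations(RV):
    """Correlations between rules, measured by the variables they contain (Section 3.3)."""
    n_vars = RV.shape[1]
    Av = RV.sum(axis=1) / n_vars                     # Av(q): average occurrence count per variable
    Cv = (RV @ RV.T) / n_vars - np.outer(Av, Av)     # Cv(q, r): covariance of occurrence patterns
    Sd = np.sqrt(np.diag(Cv))                        # Sd(q): standard deviation of row q
    with np.errstate(divide="ignore", invalid="ignore"):
        Cor = Cv / np.outer(Sd, Sd)                  # Cor(q, r) in [-1, 1]; NaN where Sd is 0
    return Cor

Cor = rule_correlations(RV.astype(float))
# Cor[q, r] == 1.0 exactly when rules q and r use the same variables with the same frequencies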

Similarly, the matrix product (RV^tr RV) counts rules that occur in common to two variables for each pair of variables, measuring the occurrence pattern of a variable by the set of rules containing it. Correlations are computed as before. Other incidence matrices for variables in clauses and clauses in rules can also be used in this way.

The use of correlations is in detecting unusual ones. If clause b almost always occurs with c, then something should be noted when they do not occur together. If variable v always occurs with w, then there may be a good reason for combining the variables. There should also be sufficient justification for unusual correlations or distinctions.

These uses of association and correlation matrices come down to one question (expressed here only for variables, but equally applicable to clauses or rules or other constructions):

- If v and w are highly correlated, then why are they different?

Since each covariance matrix above is symmetric and positive semi-definite (as are the corresponding correlation matrices), one can consider the matrix to be a "similarity matrix" that indicates which elements are similarly distributed. Then the similarity measurements contained in the correlation matrix can be used in a cluster analysis [21]. Clustering methods are simple, and can give useful information, since the clusters of rows in RV (for example) are sets of rules that use (roughly) the same variables. More information on this approach can be found in [8].
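As a simple illustration of this use of the similarity matrix, the following sketch groups rules by thresholding the correlation matrix and taking connected components; this is an ad hoc stand-in for a proper single-link clustering such as SLINK [21], and the threshold value is arbitrary.

import numpy as np

def similarity_clusters(Cor, threshold=0.9):
    """Group rules whose pairwise correlation exceeds the threshold (connected components)."""
    n = Cor.shape[0]
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    def union(i, j):
        parent[find(i)] = find(j)
    for i in range(n):
        for j in range(i + 1, n):
            if Cor[i, j] > threshold:
                union(i, j)
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

# each cluster is a list of rule indices with highly similar variable-occurrence patterns
print(similarity_clusters(np.nan_to_num(Cor), threshold=0.9))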

3.4. Criteria for Rulebase Correctness

This subsection describes some principles of rulebase correctness, and ways to test them for a particular rulebase. There is no description of how to determine whether or not to test the principles, since that decision is rulebase dependent. A principle of rulebase correctness is a condition on a set R of rules that is required for the rulebase to be reasonable in some incompletely defined sense. This notion is not the same as a principle of modeling a process or a system by rules (that step is hard [1]). Rather, it is a notion of how rules fit together into a rulebase.

In order to understand the range of possible criteria for rulebases, different uses of the rulebase must be considered. The major uses considered thus far are as a specification, i.e., a definition of classes of situations, or as an evaluation, i.e., a determination of the proper class for a given situation.

These styles of application require two kinds of consistency. The specification aspect of a rulebase requires static (or non-procedural) consistency of the rules as logical definitions. The evaluation aspect of a rulebase requires dynamic (or procedural) consistency of the rules as algorithm definitions (remember, the inference engine is considered to be part of the rulebase).

Static analysis refers to the examination of the rules as separate symbolic expressions, without stringing them together; dynamic analysis refers to the interactions of the rules during inference; logical analysis refers to the form of the rules as mostly uninterpreted expressions; and statistical analysis refers to the values of the variables.

The five principles are as follows:

- Consistency (no conflict),

- Completeness (no oversight),

- Irredundancy (no superfluity),

- Connectivity (no isolation), and

- Distribution (no unevenness).

These principles are implemented by many criteria for rulebase correctness. The criteria are separated into classes, according to the principles they implement.

Analyzing a rulebase for these principles involves implementing mathematical and computational criteria that check for the application of each principle. The Consistency criteria, which address the logical consistency of the rules, can rightly be considered as "correctness" criteria. The Completeness and Irredundancy criteria, which preclude oversights in specifications and redundancy in the rules, are more like "reasonableness" criteria for the terms in the rules. The Connectivity criteria, which concern the inference system defined by the rules, are like completeness and irredundancy criteria for the inference system. Finally, the Distribution criteria are "esthetic" criteria for simplicity of the rules and the distinctions they cause, as well as the distribution of the rules.

The first three principles, Consistency, Completeness, and Irredundancy, are not discussed in detail in this paper, since they are relatively easy to explain and have appeared in other rulebase V&V research. The Connectivity criteria were discussed earlier [8], so they too will be described only briefly here. The rest of this section discusses the simplest of the corresponding criteria, without much detail. The Distribution and Simplicity principles are discussed in detail in [8].

The Consistency principle leads to criteria that involve some kind of lack of conflict among rules. The situations should be well defined, as should all the interesting variable values. The criteria will not be listed here, as they correspond to easy syntactic checks. For example, if a rule also checked "thruster = on" and "thruster-command = off", but set "signal ok" instead of "signal anomaly", then that rule is inconsistent with "rule10".

The Completeness principle leads to criteria that involve some kind of universal applicability of the rulebase. Defaults are usually used to guarantee certain kinds of completeness. All detectable places where defaults will be used should be signaled, since some of these places may only indicate undesired incompleteness in a rulebase, instead of places that are expected to be fixed by the use of defaults. These criteria will also not be listed here.

The Irredundancy principle leads to criteria that insist that everything in the rulebase is there for some good reason. The variables make a difference, the rules make a difference, and nothing is extraneous. There are good reasons to note redundancies as more dangerous than mere inefficiency; they are often the result of incomplete models or modeling criteria [1].
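Returning to the Consistency example above, the corresponding syntactic check can be sketched as follows under the illustrative rule encoding from Section 3.1: flag pairs of rules whose hypotheses are identical but whose conclusions assign different values to the same variable. The conflicting rule is named rule11 here purely for illustration.

from itertools import combinations

def conflicting_rule_pairs(rules):
    """Pairs of rules with the same hypothesis but contradictory conclusions."""
    conflicts = []
    for r1, r2 in combinations(rules, 2):
        if set(r1.hypothesis) != set(r2.hypothesis):
            continue
        assigns1 = {v: val for (v, op, val) in r1.conclusion if op == "="}
        assigns2 = {v: val for (v, op, val) in r2.conclusion if op == "="}
        for v in assigns1.keys() & assigns2.keys():
            if assigns1[v] != assigns2[v]:
                conflicts.append((r1.name, r2.name, v))
    return conflicts

# the inconsistent rule from the example: same hypothesis as rule10, but "signal ok"
bad_rule = Rule(
    name="rule11",
    hypothesis=[("thruster", "=", "on"), ("thruster-command", "=", "off")],
    conclusion=[("signal", "=", "ok")],
)
print(conflicting_rule_pairs([rule10, bad_rule]))   # [('rule10', 'rule11', 'signal')]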

4. Discussion

The analyses we describe above can clearly be applied to Wrapping Knowledge Bases [4], by translating the wrappings into "if-then" type rules. In that way, we can learn many things about the overall shape of the system: whether certain resources can ever be used, what classes of resources use what other classes, etc. Remember, we are most interested in systems that are so large, heterogeneous, and dynamic, that we do not necessarily know all of the components in advance, so these analyses cannot be complete decision procedures for the corresponding questions.

Of course, it is hard to analyze a system in which we do not know all of the components, but we can imagine that there might be a planner whose problem decompositions depend on being able to find certain computational services, or a component whose task it is to find certain services. Then, since we have criteria for what these components are looking for, we can at least perform some rudimentary analyses.

There are a few kinds of problems with constructed complex systems that we have not discussed: mainly communication issues (are the right bytes getting to the right places?) and timing issues (are there any important timing anomalies?). We can study both of these using a wrapping infrastructure also, but it requires some more careful modeling and meta-modeling. We can imagine "shadow" resources that are applied only in a "simulation" mode, but with exactly the same selection and application criteria. Their purpose is to simulate the components they shadow, performing only some of the calculations, but taking, as accurately as possible, the same amount of time. Then simulation mode can be used to try to find timing anomalies and danger spots. The system can also be driven using a simulation clock in the usual discrete event simulation manner.

Similarly, at run time, the complete mediation of the wrapping infrastructure means that various kinds of correctness or partial correctness criteria can be applied as part of the resource selection process. This means that we can insert instrumentation, data diversion, security policy validation, or any other kind of inter-component marking procedure between any component invocations. This ability gives us a great flexibility in studying the behavior of complex distributed systems.
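As a hint of what such run-time mediation might look like, the following sketch interposes pre-condition, post-condition, and timing instrumentation around a resource application; the function names, the timing product, and the checking policy are all our own illustrative assumptions, not part of the wrapping infrastructure as published.

import time
from typing import Callable, Dict

Context = Dict[str, object]

def mediated(apply_resource: Callable[[Context], Context],
             pre_check: Callable[[Context], bool],
             post_check: Callable[[Context], bool]) -> Callable[[Context], Context]:
    """Wrap a resource application with pre/post condition checks and timing instrumentation."""
    def wrapped(ctx: Context) -> Context:
        if not pre_check(ctx):
            raise ValueError("pre-condition violated before resource application")
        start = time.perf_counter()
        new_ctx = apply_resource(ctx)
        elapsed = time.perf_counter() - start
        new_ctx = {**new_ctx, "last_elapsed_seconds": elapsed}   # instrumentation product
        if not post_check(new_ctx):
            raise ValueError("post-condition violated after resource application")
        return new_ctx
    return wrapped

# example: any resource invocation can now be timed and policed uniformly
checked = mediated(lambda ctx: {**ctx, "signal": "anomaly"},
                   pre_check=lambda ctx: ctx.get("thruster") == "on",
                   post_check=lambda ctx: "signal" in ctx)
print(checked({"thruster": "on", "thruster-command": "off"}))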

We believe that using a wrapping-based architecture will help to make system-level specification more transparent and analyzable, and system-level verification and testing more reliable.

References

[1] Kirstie L. Bellman, "The Modelling Issues Inherent in Testing and Evaluating Knowledge-based Systems", pp. 199-215 in Chris Culbert (ed.), Special Issue: Verification and Validation of Knowledge Based Systems, Expert Systems With Applications Journal, Volume 1, No. 3 (1990)

[2] Kirstie L. Bellman, "An Approach to Integrating and Creating Flexible Software Environments Supporting the Design of Complex Systems", pp. 1101-1105 in Proceedings of WSC '91: The 1991 Winter Simulation Conference, 8-11 December 1991, Phoenix, Arizona (1991); revised version in Kirstie L. Bellman, Christopher Landauer, "Flexible Software Environments Supporting the Design of Complex Systems", Proceedings of the Artificial Intelligence in Logistics Meeting, 8-10 March 1993, Williamsburg, Va., American Defense Preparedness Association (1993)

[3] Kirstie L. Bellman, April Gillam, Christopher Landauer, "Challenges for Conceptual Design Environments: The VEHICLES Experience", Revue Internationale de CFAO et d'Infographie, Hermes, Paris (September 1993)

[4] Kirstie L. Bellman, Christopher Landauer, "Designing Testable, Heterogeneous Software Environments", pp. 199-217 in Robert Plant (ed.), Special Issue: Software Quality in Knowledge-Based Systems, Journal of Systems and Software, Volume 29, No. 3 (June 1995)

[5] Kirstie L. Bellman, Captain Al Reinhardt, USAF, "Debris Analysis Workstation: A Modelling Environment for Studies on Space Debris", Proceedings of the First European Conference on Space Debris, 5-7 April 1993, Darmstadt, Germany (1993)

[6] Richard Bellman, P. Brock, "On the concepts of a problem and problem-solving", American Mathematical Monthly, Volume 67, pp. 119-134 (1960)

[7] Gregor Kiczales, Jim des Rivieres, Daniel G. Bobrow, The Art of the Meta-Object Protocol, MIT Press (1991)

[8] Christopher Landauer, "Correctness Principles for Rule-Based Expert Systems", pp. 291-316 in Chris Culbert (ed.), Special Issue: Verification and Validation of Knowledge Based Systems, Expert Systems With Applications Journal, Volume 1, No. 3 (1990)

[9] Christopher Landauer, Kirstie L. Bellman, "Knowledge-Based Integration Infrastructure for Complex Systems", International Journal of Intelligent Control and Systems, Volume 1, No. 1, pp. 133-153 (1996)

[10] Christopher Landauer, Kirstie L. Bellman, "Integration Systems and Interaction Spaces", pp. 161-178 in Proceedings of FroCoS'96: The First International Workshop on Frontiers of Combining Systems, 26-29 March 1995, Munich, Germany (March 1996)

[11] Christopher Landauer, Kirstie L. Bellman, "Model-Based Simulation Design with Wrappings", pp. 169-174 in Proceedings of OOS'97: Object Oriented Simulation Conference, WMC'97: 1997 SCS Western Multi-Conference, 12-15 January, Phoenix, SCS International (1997)

[12] Christopher Landauer, Kirstie L. Bellman, "Wrappings for Software Development", pp. 420-429 in 31st Hawaii Conference on System Sciences, Volume III: Emerging Technologies, 6-9 January 1998, Kona, Hawaii (1998)

[13] Christopher Landauer, Kirstie L. Bellman, "Where's the Math? The Need for New Mathematical Foundations for Constructed Complex Systems", to appear in Proceedings ICC'98: the 15th International Congress on Cybernetics, 23-27 August 1998, Namur, Belgium (1998)

[14] Christopher Landauer, Kirstie L. Bellman, "Generic Programming, Partial Evaluation, and a New Programming Paradigm", Paper etspi02 in 32nd Hawaii Conference on System Sciences, Track III: Emerging Technologies, Software Process Improvement Mini-Track, 5-8 January 1999, Maui, Hawaii (1999); revised and extended version in Christopher Landauer, Kirstie L. Bellman, "Generic Programming, Partial Evaluation, and a New Programming Paradigm", Chapter 8, pp. 108-154 in Gene McGuire (ed.), Software Process Improvement, Idea Group Publishing (1999)

[15] Christopher Landauer, Kirstie L. Bellman, "Problem Posing Interpretation of Programming Languages", Paper etecc07 in 32nd Hawaii Conference on System Sciences, Track III: Emerging Technologies, Engineering Complex Computing Systems Mini-Track, 5-8 January 1999, Maui, Hawaii (1999)

[16] Christopher Landauer, Kirstie L. Bellman, April Gillam, "Software Infrastructure for System Engineering Support", Proceedings AAAI'93 Workshop on Artificial Intelligence for Software Engineering, 12 July 1993, Washington, D.C. (1993)

[17] Pattie Maes, D. Nardi (eds.), Meta-Level Architectures and Reflection, Proceedings of the Workshop on Meta-Level Architectures and Reflection, 27-30 October 1986, Alghero, Italy, North-Holland (1988)

[18] Lawrence H. Miller, Alex Quilici, "A Knowledge-Based Approach to Encouraging Reuse of Simulation and Modeling Programs", in Proceedings of SEKE'92: The Fourth International Conference on Software Engineering and Knowledge Engineering, IEEE Press (June 1992)

[19] Mary Shaw, David Garlan, Software Architecture: Perspectives on an Emerging Discipline, Prentice-Hall (1996)

[20] Mary Shaw, William A. Wulf, "Tyrannical Languages still Preempt System Design", pp. 200-211 in Proceedings of ICCL'92: The 1992 International Conference on Computer Languages, 20-23 April 1992, Oakland, California (1992); includes and comments on Mary Shaw, William A. Wulf, "Toward Relaxing Assumptions in Languages and their Implementations", ACM SIGPLAN Notices, Volume 15, No. 3, pp. 45-51 (March 1980)

[21] R. Sibson, "SLINK: An Optimally Efficient Algorithm for the Single-Link Cluster Method", The Computer Journal, Volume 16, pp. 30-34 (1973)

[22] Donald O. Walter, Kirstie L. Bellman, "Some Issues in Model Integration", pp. 249-254 in Proceedings of the SCS Eastern MultiConference, 23-26 April 1990, Nashville, Tennessee, Simulation Series, Volume 22, No. 3, SCS (1990)

[23] Robin J. Wilson, Introduction to Graph Theory, Longman, London (1972)
