Int. J. Human–Computer Studies (1996) 44, 689–721

Roles of design knowledge in knowledge-based systems

MICHEL BENAROCH

School of Management, Syracuse University, Syracuse, NY 13244, USA. email: mbenaroc@mailbox.syr.edu

(Received 1 February 1995 and accepted in revised form 28 November 1995)

Recent research suggests that the abilities of a knowledge-based system (KBS) depend in part on the amount of explicit knowledge it has about the way it is designed. This knowledge is often called design knowledge because it reflects design decisions that a KBS developer makes regarding what ontologies to embody in the system, what solution strategies to apply, what system architecture to use, etc. This paper examines one type of design knowledge pertaining to the structure underlying the solutions a KBS produces. (For example, in medical diagnosis, the output might be just a disease name, but the solution is actually a causal argument that the system implicitly constructs to find out how the disease came about.) We define this type of design knowledge, show how it can be represented, and explain how it can be used in problem solving to make the structure underlying solutions explicit. We then present and illustrate new avenues that the availability and use of this design knowledge open for building KBSs that possess strong explanation capabilities, are easier to maintain, support knowledge reuse, and offer more robustness in problem solving. © 1996 Academic Press Limited

1. Introduction

Knowledge-based system (KBS) designers are usually concerned with producing systems that meet certain design goals. These goals include providing KBSs with the ability to generate good explanations, to be robust (i.e. avoid brittleness and exhibit novelty) in problem solving, and to capture knowledge in a way that simplifies its maintenance and facilitates its reuse.

Recent research suggests that the more a KBS knows about the way it is designed, the better it can meet these design goals. For instance, Falkenhainer & Forbus (1991) show how a KBS can address novel problems when it knows what assumptions underlie the domain models it uses and how the knowledge base (KB) organizes these models in relation to the assumptions. Likewise, Swartout, Paris & Moore (1991) illustrate the way a KBS can generate explanation dialogues that address follow-up questions of the user, provided that the system knows what its explanation capabilities are designed to say and how.

Knowledge pertaining to the way a system is designed is generally referred to as design knowledge. Chandrasekaran & Swartout (1991) explain the intuition behind design knowledge as follows. In the KBS design process, a designer brings to bear substantial amounts of knowledge about the subject matter of the application task, the nature of the task, how its parts work together to accomplish its goal, the range of solution strategies applicable, plausible KBS architectures, etc. KBS design is thus viewed as the process of making specific design decisions that produce the KBS sought (e.g. how to represent needed knowledge and what solution strategies to use), and design knowledge is simply a "documentation" of these decisions in the form of diagrams, narrative descriptions, and the like. These authors recognize that design knowledge is open-ended and cannot all be represented explicitly.

1071-5819/96/050689 + 33 $18.00/0 © 1996 Academic Press Limited

We propose in this paper that a good starting point for attempts to capture design knowledge is Clancey's (1992) observation that the output of a KBS is normally a rational argument that explains its solution, not just a solution. For example, in diagnosis the output is typically a causal argument having the structure of a proof, and in design it is a logical argument having the structure of a plan. In this respect, Clancey observed two things. First, the structure of the explanatory arguments a KBS implicitly or explicitly constructs depends on the application task being addressed. Second, whether a KBS makes these explanatory arguments and their underlying structure explicit depends on the way the KBS is designed. In light of these observations, it is intuitively appealing to try to document design decisions pertaining to a specific KBS in relation to the structure of the explanatory arguments that the KBS seeks to construct. In other words, if the general goal of a KBS is to construct explanatory arguments having a specific structure, the KBS design endeavor can be viewed as a goal-driven process involving design decisions concerning how to represent the explanatory arguments sought, how to represent the domain knowledge needed to construct them, what strategies to use to control their construction process, and so on.

This paper hence focuses on the representation of design knowledge pertaining to the structure underlying the explanatory arguments KBSs construct, and on the roles that this design knowledge can play with respect to the ability of KBSs to meet the above design goals. Most of the examples we use are from the domain of medical diagnosis, and in particular NEOMYCIN (Clancey, 1988), simply because this domain has been studied extensively in the KBS literature. The paper proceeds as follows. Section 2 identifies specific design decisions that determine the structure underlying the explanatory arguments a KBS seeks to construct for its task. It also explains how design knowledge reflecting these decisions can be represented. Section 3 presents a task-independent KBS architecture that uses this design knowledge to drive the construction of explicit explanatory arguments. Section 4 presents avenues that the availability and use of this design knowledge open with respect to the aforementioned design goals. It illustrates these avenues by looking at how several recently built KBSs capture and utilize design knowledge. Section 5 discusses the way our work relates to previous work. Section 6 provides some concluding remarks and discusses several future research questions.

2. Design knowledge and situation-specific models

This section explains how design knowledge relating to the structure underlying the explanatory arguments KBSs construct can be represented. It starts with an example that illustrates what such explanatory arguments typically look like. Upon examining the structure underlying these arguments, it identifies specific design decisions that determine this structure. Finally, it explains how design knowledge reflecting these decisions can be represented.

2.1. SITUATION-SPECIFIC MODELS

For any given problem situation, most KBSs implicitly or explicitly create a rational argument which explains their solution for that situation. Since this explanatory argument is a model that captures what a KBS knows about the specific situation addressed, it is often referred to as a situation-specific model (SSM). Thus, while the KB of a system contains knowledge that applies to all problem situations the system is set to address, an SSM contains knowledge that was derived based on the KB and applies only to a specific problem situation.

For example, consider a medical diagnosis task that seeks to identify the disease causing certain patient symptoms as well as find out how this disease came about. Figure 1 shows part of the SSM that NEOMYCIN constructs for one specific diagnosis situation, some of the domain knowledge it uses for this purpose, and a trace of how this SSM is constructed. This sample SSM is a directed graph linking diseases, symptoms, pathological bodily structures, etc. Once complete, this SSM would essentially constitute a causal argument which shows that a certain organism (e.g. E. coli) has entered the body (e.g. during some surgical procedure), migrated to a specific organ system (e.g. the meninges of the brain) where it proliferated because the normal immune system is suppressed, inducing a disease (e.g. Bacterial-Meningitis) that causes the pathological bodily structures and symptoms (e.g. long-lasting headaches) observed in the patient.

Believing that a KBS such as NEOMYCIN uses a rational reasoning process to construct SSMs like the one in Figure 1 raises questions regarding the form of these SSMs. Specifically: Are all these SSMs directed graphs? If so, do they have a common underlying structure? If so, how does this structure relate to the nature of the domain knowledge and/or the application task? Is this structure determined by design decisions that a system developer makes in the KBS design process? And, if so, what are these design decisions and how can we represent them? The rest of this section answers these questions in relation to the type of design knowledge discussed in this paper.

2.2. DESIGN DECISIONS AND THE STRUCTURE OF SITUATION-SPECIFIC MODELS

KBS builders make various design decisions during the KBS design process. Some of these decisions are closely related to conceptual aspects having to do with the way the universe of discourse is modeled, while others are closely related to technical aspects of the prospective KBS (e.g. what knowledge representation formalism to use). The former design decisions are known to be more critical to the abilities that the KBS will possess; they usually ought to precede the more technically oriented design decisions (Newell, 1982).

In relation to the structure underlying SSMs, we focus on conceptual design decisions pertaining to choices that a KBS designer makes with respect to the four conceptual levels presented in Figure 2: epistemology, ontology, perspective and instance (Brachman, 1979; Davis, Shrobe & Szolovits, 1993). Epistemology choices relate to the way knowledge pertaining to the phenomena of interest is expressed, e.g. using relational networks or neural networks. As can be seen from the domain knowledge presented in Figure 1, NEOMYCIN's designers chose to express knowledge as taxonomic relational networks that classify and link agents, diseases, symptoms, etc.

[Figure 1 graphic. Panel "Situation-Specific Model": a directed graph whose nodes are agents (e.g. organisms) inducing a disease, diseases, pathological bodily structures and findings/symptoms, joined by induce, subsumption, subtype and causal links; the links are numbered to match the steps of the trace. Node labels include E.coli, Acute-Bacterial-Meningitis, Acute-Meningitis, Meningitis, Infectious Disease, Intracranial-Mass-Lesion, Intracranial-Tumor, Increased-Intracranial-Pressure, Seizure, high grade fever, fever, CNS (headache) duration, 12 hours headaches and stiff neck on flexion. Panel "Domain Knowledge": taxonomies of surgery (labels include Neurosurgery, Cardiosurgery, Recent-Neurosurgery, Ventricular-Ureteral-Shunt), bacteria (Gram-Neg-Rod, Gram-Pos-Rod, E.coli, Klebsiella-Pneumoniae), diseases (congenital, infectious, Meningitis, acute, chronic, bacterial, viral) and findings/symptoms (fever, high grade, headaches, CNS duration), connected by subsumes, subtype, induces and causes relations.]

Partial Trace of the Construction Process

1 The "12 hours headaches" initial patient symptom is mapped to the more abstract "CNS headache duration" finding.

2 The "12 hours headaches" suggests that the disorder might be Meningitis.

3 Meningitis can be hypothesized if the patient experiences a "stiff neck on flexion". This triggers the question: "Does the patient have a stiff neck on flexion?" The user's answer is YES.

4 Refinement, or specialization, of the Meningitis hypothesis generates the more specific hypothesis Acute-Meningitis.

5 Similar refinement of Acute-Meningitis generates the more refined hypothesis Acute-Bacterial-Meningitis.

6 Support for Acute-Bacterial-Meningitis using known symptoms is found as the "CNS duration" finding.

7 An attempt to further support Acute-Bacterial-Meningitis triggers the question: "Does the patient have a high-grade fever?" The answer is 105.8 Fahrenheit, which is categorized as the more abstract "high-grade fever" finding.

8 A different attempt is made to support Meningitis by looking for categorical evidence for a more general hypothesis. The attempt identifies Infectious-Disease as a plausible hypothesis.

9 Infectious-Disease implies the presence of certain findings, which may be already known. The "high-grade fever" is thus found to be subsumed by "fever", providing support to the presence of the needed "fever" symptom.

10 Infectious-Disease is hypothesized, calling for the search for additional findings that may be consistent with the diagnosis so far.

11 ...

FIGURE 1. Part of a situation-specific model (SSM) constructed by NEOMYCIN, some of the domain knowledge used to construct that SSM, and a partial trace of the construction process.


[Figure 2 graphic: a tree of ways to express knowledge about the "world", annotated with the four levels of design choices. Epistemology: relational networks (taxonomic, compositional, transitional), neural networks, symbolic networks, natural language. Ontology: labels include process structure, function structure, causal, discourse-state, normal process, abnormal process (e.g. disease), chronological process (e.g. staged failure) and Abnormal Interactive-Historic process. Perspective: actor (e.g. organism), spatial (e.g. organ system), temporal, malfunction (e.g. disease), manifestation (e.g. symptom). Instance: labels include overload (e.g. psychogenic), infectious (e.g. bacteria), invalid input or environment (e.g. toxic), developmental (e.g. congenital), infection, cancer, Meningitis, viral, bacterial, acute and chronic.]

FIGURE 2. Design choices (epistemic, ontological, perspective and instance choices) that the designer of a KBS makes.

Ontology choices determine the specific ontological entities and relations used to model the phenomena of interest. In NEOMYCIN, the ontology of choice is labeled "Abnormal Interactive-Historic process" in Figure 2. Using this ontology, diseases are modeled as malfunctions, which are the result of abnormal processes involving certain generic ontological entities (actor, location, manifestation, etc.) that follow particular causal scripts and produce specific interaction histories with the body. Depending on the application task, these generic entities might represent different things. For example, in medical diagnosis an actor may be an organism, while in the diagnosis of electronic devices it may be an electric surge. Hence, for reasons of clarity, a KBS designer might choose to associate generic ontological entities with task-specific synonyms and descriptors. For example, in NEOMYCIN, "etiological agent" is used as a task-specific synonym of "actor", and "common-unlikely" is used as a descriptor corresponding to an attribute of agents that distinguishes between agents that are common inducers and agents that are unlikely inducers of diseases (e.g. a Gram-Positive-Rods bacterium is unlikely to induce a Pelvic Abscess). Figure 3 shows other synonyms and descriptors commonly used in the context of medical diagnosis.

Perspective choices determine the facets from which relevant phenomena are examined. Granted that these phenomena can be modeled using the various ontological entities implied by ontological choices, a KBS designer might choose to focus only on those phenomena facets that can be modeled using a specific subset of these entities. In NEOMYCIN, one of the perspectives of choice is that which looks at diseases in terms of the actors (agents) inducing them, and one of the perspectives ignored is that which focuses on the location of diseases. The latter observation is apparent from the fact that, unlike in INTERNIST II (Pople, 1982), the domain knowledge NEOMYCIN uses does not include a taxonomy of organ systems and the diseases they can involve.

[Figure 3 graphic: the ontology "Abnormal Interactive-Historic process" involves five generic ontological entities (perspectives), which have task-specific synonyms and are associated with task-specific descriptors.
Actor (inducer): etiological agent.
Location (spatial): entry point; residence site (organ system).
Time (temporal): entry time; duration of residence.
Malfunction (inducee): disease.
Manifestation (causal): pathological bodily structure; finding (symptom, test result, etc.).
Task-specific descriptors shown: common inducer vs. unlikely inducer; recent vs. non-recent; treatable vs. untreatable; enabling cause (i.e. a step in a disease process) vs. circumstantial cause (for a disease process); normal vs. abnormal.]

FIGURE 3. Sample task-specific synonyms and descriptors (i.e. attributes) of generic ontological entities implied by the "Abnormal Interactive-Historic process" ontology applied to a medical diagnosis task.

Instance choices determine the instances of ontological entities that the KB knows about. In Figure 2, branches below the identified ontological entities correspond to specific relational networks, and choices at this level determine the instances contained in these networks. In NEOMYCIN, one of the design choices corresponding to this level excludes from the disease taxonomy the various types of Viral-Meningitis (since these are all treated similarly). This choice is apparent in Figure 2 through the fact that the node labeled "viral" has no children.

When these conceptual design choices are made for a specific KBS, they imply much of the structure underlying the SSMs that the KBS would construct. More specifically, this structure is determined by epistemic, ontological and perspective choices as well as by the modality of the application task. For example, Figure 4 shows part of the SSM structure implied by the ontological choice labeled "Abnormal Interactive-Historic process" in Figure 2, for the medical diagnosis task addressed by NEOMYCIN. This "generic" SSM is, and any of its instances would be, a directed graph, because its underlying epistemology is that of relational networks. Further, its nodes represent generic entities specific to its underlying ontology and perspectives. Finally, most of its links represent generic relations specific to its underlying ontology, and some of its links (ones with labels in parentheses) represent relations implied by the task-specific organization of instances (e.g. the "subtype" link coming out of the "malfunction" node is implied by the fact that diseases can be conceptually organized in a specialization taxonomy). Relative to this generic SSM, the SSM instance shown in Figure 1 does not contain certain types of nodes because it is incomplete and because NEOMYCIN's designers chose to disregard some perspectives (e.g. organ system).


[Figure 4 graphic: a generic SSM drawn as a directed graph. Nodes: Location (entry point), Location (organ system), Actor (agent), Malfunction (disease), Manifestation (pathologic structure) and Manifestation (finding/symptom). Non-vertical links are labeled Move, Induce and Cause; vertical links are labeled (Subsume) and (Sub-type).]

FIGURE 4. Part of the generic SSM implied by the "Abnormal Interactive-Historic process" ontology, in the case of a medical diagnosis task like the one addressed in NEOMYCIN. Nodes represent generic ontological entities, whose task-specific synonyms are shown in parentheses. Non-vertical links between nodes depict ontological relations indicating that diseases are modeled as malfunctions that follow a certain generic causal script: an agent enters the body, moves to an organ system where it proliferates, and induces a disease which, in turn, causes pathological bodily structures and symptoms. Vertical links depict task-specific relations indicating that instances of certain entities bear specialization and subsumption relations among them.

Because most conventional KBSs capture knowledge using relational networks, we will focus in the rest of the paper on the relational networks epistemology.

2.3. CAPTURING CONCEPTUAL DESIGN DECISIONS AS DESIGN KNOWLEDGE

A generic SSM can be described in terms of the axioms (assertions, constraints, assumptions) with which its instances must comply. Some of these axioms are implied by ontology choices, while others are implied by the task's modality. For example, a few of the axioms needed to describe the generic SSM form shown in Figure 4 are as follows.

(a) Every terminal node (i.e. symptom) in the SSM is an abnormal finding for which there must be a causal link to a supporting disease or pathological bodily structure that explains it.

(b) A disease node is always the father of a finding node (i.e. causality is assumed to be uni-directional).

(c) The root of a sub-graph in an SSM must be a treatable disease (to enable the prescription of drugs).

(d) The root of a sub-graph in an SSM must be the most specific disease possible (so that the most correct drugs can be prescribed).

(e) The SSM must have a single root (to minimize the number of drugs prescribed), and that root must "contain" every known abnormal finding.

The first two axioms are ontological axioms in that they describe the general nature of the entities and relations involved. The last three axioms are modality axioms that apply in the case of a task whose goal is to diagnose and prescribe a cure; they describe the nature of ontological entities with respect to task-specific descriptors (e.g. axiom (c) with respect to treatability) and relations (e.g. axiom (d) with respect to specialization relations between diseases).

The axioms needed to describe a generic SSM can be expressed using predicate calculus as quantified logical sentences of the form:

\[ \bigl( (\mathit{quantifier}\ o_1, \ldots, o_n\ (\pi\ o_1, \ldots, o_n)\ [[\wedge, \vee, \neg](\pi \cdots)]) \rightarrow (\mathit{quantifier}\ o_1, \ldots, o_n\ (\pi\ o_1, \ldots, o_n)\ [[\wedge, \vee, \neg](\pi \cdots)]) \bigr), \]

where \(o_1, \ldots, o_n\) are ontological entities, \(\pi\) is an \(n\)-ary function constant or relational constant, and [ ] is an optional sentence. For example, the axioms labeled (a) and (d) above can be expressed as:

\[ (\mathrm{a}')\quad ((\forall F\ (\mathit{Present}\ F) \wedge (\mathit{abnormal}\ F)) \rightarrow (\exists D\ (\mathit{Cause}\ D, F))), \]

\[ (\mathrm{d}')\quad ((\forall D\ (\mathit{Present}\ D) \wedge (\mathit{Root}\ D)) \rightarrow (\neg \exists D_1\ (\mathit{Subtype}\ D, D_1))), \]

where \(F\) is a finding, \(D\) and \(D_1\) are diseases, abnormal is a relational constant whose argument is an abnormal finding, Present and Root are function constants that check whether their argument satisfies certain conditions (with respect to an SSM instance), and Cause and Subtype are relational constants.
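As an illustration only, the two formal axioms can be read as checks over an SSM instance. The sketch below is a minimal Python rendering under assumptions that are not in the paper: the SSM is held as plain sets, the specialization taxonomy is passed in as (parent, child) pairs, and the names `SSM`, `violations_a` and `violations_d` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SSM:
    """Hypothetical SSM instance: present nodes plus Cause links."""
    findings: set = field(default_factory=set)   # present findings
    abnormal: set = field(default_factory=set)   # findings known to be abnormal
    roots: set = field(default_factory=set)      # present diseases that are roots
    cause: set = field(default_factory=set)      # (disease, finding) pairs


def violations_a(ssm):
    """Axiom (a'): each present abnormal finding F needs some D with (Cause D, F)."""
    explained = {finding for (_, finding) in ssm.cause}
    return (ssm.findings & ssm.abnormal) - explained


def violations_d(ssm, subtype):
    """Axiom (d'): a present root D must have no D1 with (Subtype D, D1)."""
    refinable = {parent for (parent, _) in subtype}
    return ssm.roots & refinable


# Toy data; the taxonomy fragment is an assumption made for the example.
subtype = {("Meningitis", "Acute-Meningitis")}
ssm = SSM(findings={"High-Grade-Fever"}, abnormal={"High-Grade-Fever"},
          roots={"Meningitis"})
before = (violations_a(ssm), violations_d(ssm, subtype))
ssm.cause.add(("Meningitis", "High-Grade-Fever"))
after = violations_a(ssm)
```

Here `before` reports the unexplained finding and the root that could still be specialized, and `after` is empty once a Cause link is added. Treating the Subtype relation as a lookup in the domain taxonomy is a design choice of this sketch, not something the text prescribes.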

A KBS that captures the underlying structure of the SSMs it constructs as a set of such axioms can be said to also capture (implicitly) the design decisions that imply this structure. The axioms used to describe a generic SSM, say the one in Figure 4, reflect epistemic, ontological and perspective choices in three ways. First, as axioms (a') and (d') show, the vocabulary used to express axioms reflects the epistemology chosen through reference to properties of relational networks (e.g. root, father). Second, the ontology chosen is visible because the same vocabulary also involves ontology-specific entities and relations (e.g. finding, disease, cause), and because of the ontology-specific nature of the phenomena the axioms define using these entities and relations. Finally, perspective choices are reflected by the subset of ontological entities involved in the specific set of axioms defining the generic SSM applicable. Because each ontological entity is identified with a specific perspective, the lack of axioms that refer to a certain entity would indicate the exclusion of the perspective corresponding to that entity. (Recall that choices relating to the instance level are apparent only from the content of the KB itself.)

Following this observation, design knowledge, denoted D, that reflects the conceptual design decisions discussed is defined to include two things. One is the set of axioms used to describe the generic SSM applicable, denoted \(D_A\). The other is the vocabulary used to express these axioms, denoted \(D_V\).

Given that \(D_A\) can be expressed using a vocabulary that pertains to the epistemology and ontology levels, what vocabulary must be used to express domain knowledge, denoted K, and strategic knowledge that controls the way K is applied in the SSM construction process, denoted S? The answer to this question has implications for the ability of a system to use D for any meaningful purpose. For example, in MYCIN (Shortliffe, 1976), rules comprising K are expressed using a vocabulary that pertains only to the instance level (i.e. rules do not refer to the type of ontological entity each instance stands for), and the inference engine embedding S (rule-chaining and conflict resolution strategy) is built to reason with knowledge that is expressed using this vocabulary. Hence, even if the relevant D were available to MYCIN, it would not be usable because it cannot make "contact" with K and S.

If D is to be usable, K and S have to be expressed using a vocabulary that also pertains to the epistemology and ontology levels. Given that under the relational networks epistemology K is essentially a set of directed graphs, such a vocabulary could be as follows. A node in a directed graph could be represented as the triplet \((o\ i\ [d \cdots])\), where \(o\) is an ontological entity, \(i\) is an instance of \(o\), and \([d \cdots]\) are optional descriptors of \(o\). For example, the node (F High-Grade-Fever abnormal) represents the abnormal finding High-Grade-Fever. A link could be represented as a sentence \((\pi\ node_1 \cdots node_n)\), where \(\pi\) is a relational constant over \(n\) nodes, and as such it can be said to coincide with a node-chain, denoted \(node_1 \cdots node_i = node_{i+1} \cdots node_n\). For example, the link (Cause (D Meningitis) (F High-Grade-Fever abnormal)) represents a binary relation, and it coincides with the node-chain (D Meningitis) = (F High-Grade-Fever abnormal). Following this convention, \(o_1 \cdots = o_i = o_{i+1} \cdots o_n\) will denote the ontological entities in a node-chain (e.g. D = F in the above example). As to the operators that apply K in the SSM construction process and the S used to control this process, we will see later how these are expressed using the same vocabulary.
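The node and link vocabulary just described maps naturally onto small record types. The following Python sketch is one possible encoding, offered as an assumption-laden illustration: the class names `Node` and `Link` and the method `entity_chain` are hypothetical and do not come from the paper.

```python
from typing import NamedTuple, Tuple


class Node(NamedTuple):
    entity: str                         # ontological entity o, e.g. "F" or "D"
    instance: str                       # instance i of that entity
    descriptors: Tuple[str, ...] = ()   # optional descriptors [d ...]


class Link(NamedTuple):
    relation: str                       # relational constant, e.g. "Cause"
    nodes: Tuple[Node, ...]             # node_1 ... node_n

    def entity_chain(self) -> Tuple[str, ...]:
        """Ontological entities of the node-chain this link coincides with."""
        return tuple(node.entity for node in self.nodes)


# The example from the text: (Cause (D Meningitis) (F High-Grade-Fever abnormal)).
fever = Node("F", "High-Grade-Fever", ("abnormal",))
meningitis = Node("D", "Meningitis")
link = Link("Cause", (meningitis, fever))
chain = link.entity_chain()
```

Because both types carry the ontological entity alongside the instance, axioms phrased at the ontology level can be matched directly against such links, which is the "contact" the text says an instance-only vocabulary lacks.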

At this point the question is this: how can a KBS use D, and why should it use D? The first part of this question is answered in the next section. The second part will be addressed in Section 4, once we understand the role that D can play in problem solving.

3. Using design knowledge in problem-solving

This section presents a KBS architecture that uses D to direct the construction of explicit SSMs. The idea behind the architecture can be explained using the problem-space metaphor (Rich, 1983; p. 25). Since the goal is to create an SSM instance for a specific problem, we view \(D_A\) as implicitly defining the goal knowledge state about the problem, and the evolving SSM instance being created as defining the current problem-space state. As Figure 5 shows, whatever the current problem-space state is, state difference operators are first used to identify gaps between this state and the goal state (i.e. axioms in \(D_A\) that the SSM violates). Next, when gaps are detected, search control operators are used to choose which gap will be pursued first. Then, search operators that apply K (instantiate elements in K, derive new elements based on K, acquire input suggested by K, etc.) are used to derive the knowledge needed to update the SSM and eliminate the gap pursued. Since the updated SSM defines a new problem-space state for which new gaps might be detected, this iterative process is repeated until all gaps are eliminated (i.e. the SSM violates no axioms in \(D_A\)). This idea can be illustrated using the medical diagnosis example depicted in Figure 1. Starting with an SSM containing only one node for the symptom the system accepts as input, one of the violated axioms is: "a finding must be linked to a causing disease" (axiom (a) in Section 2.3). Supposing that this were the only violated axiom, the system would search K to find a disease that might explain the symptom currently present in the SSM. In turn, the search result would be used to update the SSM, and the process would be repeated.
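The detect-gap, choose, search and update cycle described above can be sketched as a short loop. This is a toy Python illustration under stated assumptions, not the paper's implementation: the SSM is a set of tuples, gaps are tuples with `None` in the unknown position, and every name (`construct_ssm`, `finding_needs_cause`, `find_disease`, the dictionary `K`, the `max_iterations` guard) is hypothetical.

```python
def construct_ssm(ssm, i_operators, choose, k_operators, max_iterations=100):
    """Detect gaps, pick one, search K, update the SSM; stop when no gap remains."""
    ssm = set(ssm)
    for _ in range(max_iterations):
        # Gap detection: collect every axiom violation in the current state.
        gaps = [gap for op in i_operators for gap in op(ssm)]
        if not gaps:
            return ssm                      # the SSM violates no axiom
        gap = choose(gaps)                  # search control: which gap first
        results = (k_op(gap) for k_op in k_operators)
        update = next((r for r in results if r is not None), None)
        if update is None:
            raise LookupError(f"K cannot close gap {gap!r}")
        ssm.add(update)                     # new problem-space state
    raise RuntimeError("gaps remain after max_iterations")


# Toy domain knowledge (an assumption): finding -> a disease that may explain it.
K = {"CNS-headache-duration": "Meningitis"}


def finding_needs_cause(ssm):
    """Gap detector for 'a finding must be linked to a causing disease'."""
    findings = {t[1] for t in ssm if t[0] == "Finding"}
    explained = {t[2] for t in ssm if t[0] == "Cause"}
    return [("Cause", None, f) for f in sorted(findings - explained)]


def find_disease(gap):
    """Search operator: bind the unknown disease using K, or give up."""
    relation, _, finding = gap
    disease = K.get(finding)
    return (relation, disease, finding) if disease else None


result = construct_ssm({("Finding", "CNS-headache-duration")},
                       [finding_needs_cause], lambda gaps: gaps[0], [find_disease])
```

Starting from a single finding, the loop adds one Cause link and then stops because no axiom is violated. The iteration cap and the `LookupError` are conveniences of the sketch; the text does not say what happens when K cannot close a gap.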

The architecture we present involves an iterative inference procedure that views the knowledge in a KBS as though it spans the three planes shown in Figure 5: the domain knowledge plane (K-plane), the inference plane (I-plane), and the strategic-planning plane (S-plane). The K-plane contains K and "search" operators that apply K. The I-plane contains \(D_A\), the SSM instance being constructed, and operators for analysing and updating this SSM. The S-plane contains S, operators that use S to control the triggering order of search operators, and another type of SSM that models the KBS's ongoing behavior for the case tackled. To avoid confusion between the SSMs constructed in the I-plane and S-plane, we will refer to them as I-SSM and S-SSM, respectively.† We next elaborate on the activities taking place in these planes, following the order of steps in the inference procedure depicted in Figure 5.

[Figure 5 graphic: three stacked planes exchanging messages.
Inference Plane (I-Plane), holding the current I-SSM, state difference operators and I-SSM update operators: (1) analyze the current I-SSM to identify violations of the generic I-SSM form (stop when no violations are found); (5) update the current I-SSM.
Strategic Planning Plane (S-Plane), holding S, the S-SSM, search control operators and S-SSM update operators: (2) identify "search" operators that can apply K to derive knowledge elements necessary to resolve reported violations; (3) use S to select which of these operators is to be triggered.
Domain Knowledge Plane (K-Plane), holding K and the "search" operators: (4) the triggered search operator applies K (instantiate knowledge elements in K, derive new elements based on K, acquire input suggested by K, etc.).
Arrows between planes: report violations; trigger chosen "search" operator; report "search" results.]

FIGURE 5. A KBS architecture for reasoning with design knowledge. The underlying inference procedure is a five-step iterative process that views the knowledge involved as though it resides in three different planes.

3.1. THE I-PLANE

Since an I-SSM is essentially a directed graph, it is represented using predicate calculus as a set of sentences of the form \((\pi\ node_1 \cdots node_n)\), where \(\pi\) is an \(n\)-ary relational constant over nodes.

Identifying specific ways in which the current I-SSM violates axioms in \(D_A\) requires analysing this I-SSM using state difference operators, called I-operators.

† Several existing KBSs build the I-SSM and S-SSM implicitly, "on top" of K. For example, in INTERNIST II (Pople, 1982), an I-SSM of the patient being diagnosed is dynamically formed by adding to K "constrictor" and "spanning" links that tie together instantiated entities, and something similar to an S-SSM is created at run-time by adding to K "planning" links that help focus the system's problem-solving activities.


Each I-operator stands for a single axiom in \(D_A\). For example, for the axiom

\[ ((\forall F\ (\mathit{Present}\ F) \wedge (\mathit{abnormal}\ F)) \rightarrow (\exists D\ (\mathit{Cause}\ D, F))), \]

a corresponding I-operator would be represented as:

    (I-OP :: DISEASE-CAUSE-FINDING
       IF   (AND (Present (F $var1 abnormal))
                 (NOT (Present (Cause (D $var2) (F $var1)))))
       THEN post the unsatisfied proposition (Cause ?(D $var2) (F $var1))),

where Present is a function that checks if its argument is present in the I-SSM, and abnormal is a logical expression requiring $var1 to be bound to an abnormal finding in K. When the IF part of this operator holds (i.e. the axiom is violated), the THEN part posts the unsatisfied proposition (Cause ?(D $var2) (F $var1)). This proposition coincides with the node-chain ?(D $var2) = (F $var1), where ?(D $var2) is a node labeled as "unknown".

More generally, an I-operator is a rule that posts the right-hand-side of an axiom in D A as an unsatisfied proposition, provided that the axiom is violated (its left-hand-side holds in the I-SSM while its right-hand-side does not). An unsatisfied proposition coincides with a specific node-chain, in which nodes corresponding to the ontological entities bound by quantifiers in the right-hand-side of the violated axiom are labeled as unknown.

In each iteration of the inference procedure, after applying I-operators on the current I-SSM, some I-SSM nodes might end up having associated with them one or more unsatisfied propositions. Each proposition is an n-ary relational constant, (π node_1 ... node_n), in which some nodes are labeled ‘‘unknown’’. As Figure 5 indicates, these propositions are sent to the S-plane.
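The behavior of the sample I-operator can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, the representation of I-SSM links as `(relation, source, target)` tuples, and the function name `disease_cause_finding` are hypothetical and are not part of the architecture described in the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    entity: str               # ontological entity, e.g. "D" or "F"
    value: Optional[str]      # bound instance, or None when labeled "unknown"
    abnormal: bool = False    # descriptor used by the sample axiom


def disease_cause_finding(nodes, links):
    """Hypothetical I-operator for the axiom: every present abnormal F is caused by some D.

    Returns the unsatisfied propositions (Cause ?D F) to be sent to the S-plane.
    """
    posted = []
    for f in nodes:
        if f.entity != "F" or not f.abnormal:
            continue  # the IF part only concerns present abnormal findings
        has_cause = any(rel == "Cause" and dst == f for rel, _src, dst in links)
        if not has_cause:
            # post the proposition with the disease node labeled unknown
            posted.append(("Cause", Node("D", None), f))
    return posted


headache = Node("F", "12 hours headaches", abnormal=True)
fever = Node("F", "high grade fever", abnormal=True)
meningitis = Node("D", "Meningitis")

issm_nodes = [headache, fever, meningitis]
issm_links = [("Cause", meningitis, fever)]   # illustrative link only
unsatisfied = disease_cause_finding(issm_nodes, issm_links)
```

Here only the headache finding lacks a cause, so a single proposition with an unknown disease node is posted.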

3.2. THE S-PLANE

Unsatisfied propositions are processed as follows . For each proposition : (1) identify specific search operators that can apply K to possibly derive the knowledge needed to satisfy that proposition , and (2) select from the identified search operators the one to be triggered .

Before elaborating on these activities , let us first see what search operators are all about . These operators reside in the K-plane , and we will refer to them as K - operators . A K-operator uses K to derive a value to which it can bind the instance variable(s) marked unknown in the specific unsatisfied proposition it aims to address . To illustrate , a K-operator for addressing the unsatisfied proposition ( Cause ?( D $var2) ( F $var1)) would have the form :

(K_OP::FIND_DISEASE_CAUSING_FINDING
  (argument: ?(D $var2) = (F $var1))
  (body: ⟨step 1⟩ ... ⟨step n⟩)
  (return: (D $var2) = (F $var1))),

where ‘‘body’’ is code that searches K for knowledge based on which it can bind $var2 to a disease instance that causes the finding instance already bound to $var1 . Hence , a K-operator is generally expressed in terms of the node-chain argument coinciding with the proposition it aims to satisfy , and code that applies K and returns the node-chain argument with the unknown nodes bound to proper instances .

Returning to activities in the S-plane, K-operators identified for the unsatisfied propositions are added to the S-SSM (the SSM modeling the system's behavior). The S-SSM is a directed graph consisting of entries of the form: (operator-name node-chain). Figure 6 shows how the S-SSM evolves during construction of the sample I-SSM shown in Figure 1. As with the I-SSM, there can be constraints on the form of the S-SSM. For example, with respect to the S-SSM in the bottom of Figure 6, the constraint ‘‘no cycles are allowed in the S-SSM’’ prevents an attempt to generalize Acute-Bacterial-Meningitis into Acute-Meningitis, because the latter was already specialized into the former and added to the S-SSM. After enforcing all such constraints, the S-SSM would contain some branches with terminal nodes marked as unknown. Each such branch corresponds to an applicable K-operator.

FIGURE 6. Evolution of the S-SSM during construction of the sample I-SSM shown in Figure 1. [Panels: Iteration 1, Iteration 2, Iteration 3, ..., Iteration N. Legend: ‘‘resolved’’ node-chain; nonresolved node-chain relating to an applicable K-operator; inapplicable node-chain violating constraints on S-SSM form (e.g. don't specialize a D that generalizes another D).]
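The ‘‘no cycles’’ constraint on the S-SSM can be illustrated with a small Python sketch. It assumes, for illustration only, that each S-SSM entry is stored as an (operator-name, node-chain) pair whose node-chain is a (source, target) tuple; the operator names used below are hypothetical labels, not the names used in Figure 6.

```python
def would_create_cycle(edges, src, dst):
    """Return True if adding src -> dst closes a cycle, i.e. dst already reaches src."""
    stack, seen = [dst], set()
    while stack:
        node = stack.pop()
        if node == src:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(d for s, d in edges if s == node)
    return False


def add_entry(sssm, operator_name, src, dst):
    """Append an (operator-name, node-chain) entry unless it violates the constraint."""
    edges = [chain for _name, chain in sssm]
    if would_create_cycle(edges, src, dst):
        return False  # inapplicable node-chain: it is not added to the S-SSM
    sssm.append((operator_name, (src, dst)))
    return True


sssm = []
# Acute-Meningitis was specialized into Acute-Bacterial-Meningitis ...
first = add_entry(sssm, "SPECIALIZE", "Acute-Meningitis", "Acute-Bacterial-Meningitis")
# ... so generalizing back would form a cycle and is rejected.
second = add_entry(sssm, "GENERALIZE", "Acute-Bacterial-Meningitis", "Acute-Meningitis")
```

A depth-first reachability check is one simple way to enforce the constraint; other form constraints on the S-SSM would need their own checks.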


Choosing which one of the applicable K-operators will be triggered means making control decisions about which sub-goal (or unsatisfied proposition) posted in the I-SSM to pursue. For example, in NEOMYCIN's case, one such decision involves choosing which D on the differential (i.e. the set of most specific hypothesized diseases) will be the ‘‘focus’’ of the next problem-solving activity. To make such control decisions, the S-plane employs search control operators, called S-operators, which use S to analyse the current I-SSM on several levels. To understand the different levels of I-SSM analysis, it is useful to look at how Figure 7 relates them to the way NEOMYCIN makes the above sample control decision. Granted that the I-SSM is a directed graph:

• global I-SSM analysis chooses a sub-graph containing Ds that each could be a good candidate focus;

• intermediate I-SSM analysis next chooses a focus D within that sub-graph; and

• local I-SSM analysis finally chooses which of the K-operators corresponding to node-chains that represent immediate children of that focus D (i.e. D=A, D=F, or D=D) is to be triggered.

Switching from a local to a global mode of analysis takes place only when special events occur, e.g. when a new hypothesized D was added to the I-SSM that is not subsumed by any existing I-SSM sub-graph (i.e. a ‘‘wider differential’’ event). The occurrence of such events is checked for while updating the S-SSM.

S-operators capture strategic principles in S hierarchically, the same way the NEOMYCIN hierarchy of subtasks depicted in Figure 7 captures and organizes these principles. However, the difference compared to NEOMYCIN is that we express these principles using the same graph vocabulary used to express the I-SSM. High level S-operators capture principles for global and intermediate I-SSM analysis, e.g. ‘‘anchor the diagnostic reasoning to the I-SSM sub-graph whose root (Disease) contains the largest number of abnormal manifestation (Finding) nodes’’. The lowest level S-operators capture principles for local I-SSM analysis. These principles are essentially preferential constraints on node-chains. They are therefore expressed as ordered sets of ontological node-chains. For example, the principle ‘‘test a hypothesized malfunction (Disease) before refining it’’ would be expressed as:

(D=?D/F : D={F} > D={D}),

where D/F means either D or F, ? denotes the unknown node in a node-chain, and { } denotes the multiplicity of the surrounded entity. The way this preferential constraint works in local I-SSM analysis can be illustrated in terms of how NEOMYCIN applies the sub-task labeled Pursue-Hypothesis in Figure 7. This sub-task induces a breadth-first search of the disease taxonomy by always invoking sub-task Test-Hypothesis, whose goal is to look for additional Findings to support the focus Disease, before invoking sub-task Refine-Hypothesis, whose goal is to specialize the focus Disease based on the accumulated findings. Correspondingly, a local I-SSM analysis involving the preferential constraint (D=?D/F : D={F} > D={D}) would induce the same search pattern. Because Test-Hypothesis and Refine-Hypothesis are basically K-operators that take the ontological node-chains D=?{F} and D=?{D}, respectively, Test-Hypothesis will always be triggered earlier.

FIGURE 7. Control decisions made through a hierarchical analysis of the I-SSM using S-operators. The example illustrates the global, intermediate and local I-SSM analysis performed corresponding to the way specific NEOMYCIN sub-tasks control the diagnostic reasoning process.

[Figure content: the NEOMYCIN sub-task hierarchy (Establish-Hypothesis-Space, Group & Differentiate, Explore-and-Refine, Pursue-Hypothesis, Test-Hypothesis, Refine-Hypothesis, ...) drawn over an I-SSM with satisfied and unsatisfied propositions posted in it, and annotated as follows.]

S-operator for global I-SSM analysis. Subtask Establish-Hypothesis-Space iteratively applies the following strategic principles to choose a subset of the differential* (or a subgraph in the I-SSM) on which to focus.
1. If there are ancestors of hypotheses on the differential not yet tested by Test-Hypothesis, then perform Group & Differentiate on them.
2. If there are hypotheses on the differential not yet pursued by Pursue-Hypothesis, then perform Explore-and-Refine on them.
3. ...

S-operator for intermediate I-SSM analysis. Subtask Explore-and-Refine iteratively applies the following strategic principles to choose a disease on which to focus (from the chosen I-SSM subgraph). The subtask aborts when a ‘‘wider-differential’’† end condition is encountered.
1. If the current focus is now less likely than another hypothesis on the differential, then perform Pursue-Hypothesis on the stronger hypothesis.
2. If there is a child of the current focus that has not been pursued, then perform Pursue-Hypothesis on the child of the current focus. (This is true only after the current focus was just refined and removed from differential.)
3. If there is a sibling of the current focus that has not been pursued, then perform Pursue-Hypothesis on the sibling of the current focus.
4. If there is any other hypothesis on the differential that has not been pursued, then pursue it (e.g., perform Pursue-Hypothesis on the strongest hypothesis not yet pursued).

S-operator for local I-SSM analysis. Subtask Pursue-Hypothesis applies the following strategic principle to choose the next activity to carry out with regard to the focus (or the I-SSM node-chain that is the child of the focus to be addressed next).
1. Test the focus before refining it: perform Test-Hypothesis on the focus, and mark the focus as pursued.

K-operator. (Primitive) subtask Refine-Hypothesis adds taxonomic children of the focus to the differential.

* The differential is the set of most specific disease hypotheses the solver is considering.
† A wider-differential means that a new hypothesis was added to the I-SSM that is not subsumed by any existing subgraph.

Note that preferential constraints could refer to ontological entities as well as task-specific descriptors, e.g. ‘‘pursue a common inducer of a hypothesized malfunction (Disease) before an unlikely one’’. This sample preferential constraint would be expressed as:

(?(A common-unlikely)=D : (A common-inducer)=D > (A unlikely-inducer)=D),

where A stands for an Agent, and common-inducer and unlikely-inducer are the values that descriptor common-unlikely can assume.
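One way to read such preferential constraints is as ordered lists that rank the node-chains of the applicable K-operators. The following Python sketch makes that reading concrete. The tuple encoding of node-chain patterns, the constant names and `pick_operator` are assumptions introduced for this example; they are not the paper's representation.

```python
# Ordered sets of node-chain patterns, most preferred first (hypothetical encoding).
TEST_BEFORE_REFINE = [("D", "F"), ("D", "D")]
COMMON_BEFORE_UNLIKELY = [("A common-inducer", "D"), ("A unlikely-inducer", "D")]


def pick_operator(applicable, constraint):
    """Return the applicable (operator-name, pattern) pair that ranks first.

    Operators whose pattern is not mentioned by the constraint are ignored.
    """
    ranked = [op for op in applicable if op[1] in constraint]
    return min(ranked, key=lambda op: constraint.index(op[1]), default=None)


applicable = [
    ("Refine-Hypothesis", ("D", "D")),
    ("Test-Hypothesis", ("D", "F")),
]
chosen = pick_operator(applicable, TEST_BEFORE_REFINE)
```

With both operators applicable to the focus, the constraint selects Test-Hypothesis, which mirrors the test-before-refine behavior described for Pursue-Hypothesis.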

The control scheme we have just described is used to identify which applicable K-operator is to be triggered. Once a K-operator has been chosen and triggered by the S-plane, control is transferred to the K-plane. (We did not elaborate here on the reasons we construct the S-SSM; these reasons are discussed in Section 4.)
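The three levels of I-SSM analysis can be chained into one selection step, as the following Python sketch suggests. It is a simplification under explicit assumptions: sub-graphs and diseases are plain dictionaries, the likelihood values and the ‘‘Other-Disease’’ sub-graph are made up for illustration, and the intermediate rule (‘‘most likely disease not yet pursued’’) stands in for the richer principles listed in Figure 7.

```python
def choose_subgraph(subgraphs):
    """Global analysis: sub-graph whose root has the most abnormal findings."""
    return max(subgraphs, key=lambda g: g["abnormal_findings"])


def choose_focus(subgraph):
    """Intermediate analysis (assumed rule): most likely disease not yet pursued."""
    candidates = [d for d in subgraph["diseases"] if not d["pursued"]]
    return max(candidates, key=lambda d: d["likelihood"]) if candidates else None


def choose_operator(focus, preference):
    """Local analysis: first preferred node-chain still open on the focus."""
    for pattern in preference:
        if pattern in focus["open_chains"]:
            return focus["open_chains"][pattern]
    return None


def next_operator(subgraphs, preference):
    focus = choose_focus(choose_subgraph(subgraphs))
    if focus is None:
        return None
    return (focus["name"], choose_operator(focus, preference))


issm = [
    {"root": "Infectious-Disease", "abnormal_findings": 3, "diseases": [
        {"name": "Meningitis", "likelihood": 0.7, "pursued": False,
         "open_chains": {("D", "F"): "Test-Hypothesis", ("D", "D"): "Refine-Hypothesis"}},
        {"name": "Encephalitis", "likelihood": 0.2, "pursued": False,
         "open_chains": {("D", "D"): "Refine-Hypothesis"}},
    ]},
    {"root": "Other-Disease", "abnormal_findings": 1, "diseases": []},
]
decision = next_operator(issm, [("D", "F"), ("D", "D")])
```

The sketch omits the switch back to global analysis on a ‘‘wider differential’’ event, which the text ties to updates of the S-SSM.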

3.3. THE K-PLANE

A triggered K-operator uses K to derive a value to which it can bind the nodes marked as unknown in its node-chain argument. For example, the K-operator labeled K_OP::FIND_DISEASE_CAUSING_FINDING in Section 3.2 receives the node-chain argument ?(D $var2) = (F $var1), and it seeks to bind $var2 to an appropriate disease instance in K.

Under the relational networks epistemology , K is a set of relational networks . Each network captures a certain relation between instances of specific ontological entities . It can take the form :

((π node-chain [⟨CF⟩]) (node-chain [⟨CF⟩]) ... (node-chain [⟨CF⟩])),

where π is a relational constant and [⟨CF⟩] is an optional certainty factor. For example, a disease taxonomy and a network linking agents with diseases they induce would look as follows, respectively:

((Subtype (D)=(D))
 ((D Infectious-Disease)=(D Meningitis))
 ((D Meningitis)=(D Acute-Meningitis))
 (...)),

((Induce (and (A) [{(F)}])=(D) ⟨CF⟩)
 ((and (A E. coli) (F Ventricular-Ureteral-Shunt))=(D Bacterial-Meningitis) ⟨0.3⟩)
 ((and (A Klebsiella-Pneumoniae) (F Ventricular-Ureteral-Shunt))=(D Bacterial-Meningitis) ⟨0.3⟩)
 (...)).

In the latter network, the first entry can be viewed as saying: if the hypothesized disease is Bacterial-Meningitis and the patient has had a Ventricular-Ureteral-Shunt procedure, the inducing agent is E. coli with a certainty of ⟨0.3⟩. Relational networks involving descriptors of ontological entities can be represented in a similar fashion. To illustrate, a network that distinguishes between agents that are common or unlikely inducers of diseases would look as follows:

((common-unlikely-agent (A ⟨common-inducer unlikely-inducer⟩)=(D))
 ((A Enterobacteriaceae common-inducer)=(D Pelvic-Abscess))
 ((A Gram-Positive-Rods unlikely-inducer)=(D Pelvic-Abscess))
 (...)),


where common-unlikely-agent is a descriptor that takes on the values ⟨common-inducer unlikely-inducer⟩.

If the epistemology chosen involves only taxonomic relational networks , for example , K-operators would search K and instantiate elements in K by matching node-chains . Specifically , the body of a K-operator would first find a relational network with a ‘‘header’’ containing an ontological node-chain that matches the one in the input argument of the K-operator . Then , it would scan the network to find entries containing the known nodes in its node-chain argument , based on which it would bind proper instances to the nodes labeled unknown in this node-chain argument . If the epistemology chosen (also) involves compositional relational networks , a K-operator might simultaneously search multiple networks and apply inferences on the search results to produce the binding sought .

In any case , the results generated by K-operators are reported to the I-plane , where they are used to update the current I-SSM (e . g . grow a link between I-SSM nodes , or aggregate I-SSM sub-graphs) .
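The header-matching search that a K-operator body performs over taxonomic relational networks can be sketched as follows. The dictionary layout of a network and the function `k_operator` are assumptions made for illustration; only the Subtype entries come from the example given earlier in this section.

```python
# A relational network: a "header" naming the relation and entity types, plus entries.
SUBTYPE = {
    "header": ("Subtype", "D", "D"),
    "entries": [
        ("Infectious-Disease", "Meningitis"),
        ("Meningitis", "Acute-Meningitis"),
    ],
}


def k_operator(networks, header, known_left=None, known_right=None):
    """Bind the unknown node of a two-node chain by scanning the matching network.

    Exactly one of known_left / known_right is expected to be given; the first
    entry containing the known node is returned with both nodes bound.
    """
    for net in networks:
        if net["header"] != header:
            continue  # header does not match the node-chain argument
        for left, right in net["entries"]:
            if known_left is not None and left == known_left:
                return (left, right)
            if known_right is not None and right == known_right:
                return (left, right)
    return None  # K holds no knowledge that satisfies the proposition


K = [SUBTYPE]
specialized = k_operator(K, ("Subtype", "D", "D"), known_left="Meningitis")
generalized = k_operator(K, ("Subtype", "D", "D"), known_right="Meningitis")
```

A K-operator over compositional networks would have to search several networks and combine the results, which this single-network sketch does not attempt.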

3.4. GENERALITY OF THE ARCHITECTURE

Based on the discussion so far, the architecture presented can be said to be task-independent. It would work for any application task that seeks to construct I-SSMs whose underlying epistemology is that of relational networks. To see this point, we can examine the vocabulary used to express the parts of the inference procedure underlying the architecture and the knowledge constituents that these parts utilize. This vocabulary, D V, consists of:

(1) D V e —terms referring to epistemic entities (e . g . sub-graph , node , link , root , sibling) ;

(2) D V o —terms referring to ontological entities (e . g . actor , malfunction , manifestation) ; and

(3) D V t —task-specific synonyms and descriptors referring to terms in D V o (e . g . agent , treatable disease , finding) .

The parts of the inference procedure we identified but did not elaborate on involve operators for updating the I-SSM and the S-SSM. Under the relational networks epistemology, the I-SSM and S-SSM are directed graphs. Their updating therefore involves standard operations on graphs (e.g. add node, link nodes, append sub-graphs, find root). Accordingly, operators that carry out such operations need to be expressed using only terms in D V e, without having to know what these terms correspond to in D V o and D V t. †
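To make the point about purely epistemic update operators concrete, here is a minimal Python sketch of a graph class whose methods mention only nodes, links and roots. The class and method names are hypothetical; the same code serves a disease taxonomy or a device structure because it never inspects what a node stands for.

```python
class Graph:
    """Directed graph described only in epistemic terms (node, link, root)."""

    def __init__(self):
        self.children = {}  # node -> list of child nodes

    def add_node(self, node):
        self.children.setdefault(node, [])

    def link(self, parent, child):
        self.add_node(parent)
        self.add_node(child)
        if child not in self.children[parent]:
            self.children[parent].append(child)

    def roots(self):
        """Nodes that are not the child of any other node."""
        has_parent = {c for kids in self.children.values() for c in kids}
        return [n for n in self.children if n not in has_parent]


# The same operators applied to two different ontologies.
diagnosis = Graph()
diagnosis.link("Infectious-Disease", "Meningitis")
diagnosis.link("Meningitis", "Acute-Meningitis")

device = Graph()
device.link("engine", "spark plugs")
```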

As to the knowledge constituents used by the inference procedure, these include: I-operators corresponding to axioms in D A, K-operators, K, S-operators, and S. So far these constituents have been expressed using terms in D V t. Yet, besides constituents referencing descriptors of ontological entities (e.g. I-operators corresponding to modality axioms in D A), we could have expressed them using only terms in D V o. After all, except for descriptors, all the task-specific terms in D V t are synonyms of terms in D V o that we chose to use for clarity. We can look more closely at K, S, and their associated operators. K includes relational networks in which only the ‘‘header’’ requires referencing terms in D V o (optionally in D V t). As to K-operators, while their input/output arguments are node-chains that could reference terms in D V o and D V t, their ‘‘body’’ could also involve terms in D V e. For example, the body of a K-operator that seeks to generalize a finding (e.g. ‘‘12 hours headaches’’ into ‘‘CNS headaches duration’’) might do so by searching for the fathers of that finding within some taxonomy of findings. Finally, S includes strategic principles that are expressed in terms of I-SSM sub-graphs and ordered sets of node-chains. Although some of these node-chains reference only terms in D V o (optionally in D V t) and some also reference task-specific descriptors in D V t, S-operators do not need to know what these terms stand for to accomplish their job.

† This can be illustrated in the case of ACCORD (Hayes-Roth, Hewett, Johnson & Gravey, 1988). ACCORD uses a hierarchy of general-purpose operators for manipulating structures similar to what we call I-SSMs. These operators are viewed as verbs, with the subject and object relations being epistemic terms. For example, an operator called YOKE appends two I-SSM sub-graphs that satisfy a certain ‘‘position’’ criterion. The meaning of YOKE's operation can be interpreted only based on the ontological meaning of sub-graphs that are specific to the application where it is being used. If sub-graphs stand for partial device configurations (in design), YOKE would conceptually treat ‘‘position’’ as a spatial constraint between the sub-graphs. On the other hand, if sub-graphs stand for partial disease process descriptions (in diagnosis), YOKE would conceptually interpret ‘‘position’’ as a place on a time-line.

4. Benefits from design knowledge

Having seen how D is used in the construction of explicit I-SSMs , we next focus on avenues that the availability and use of D open with respect to the design goals mentioned in Section 1 (ease of maintenance , reuse , robustness , etc . ) . We relate these avenues to several recent KBSs that were built to meet these goals , and rationalize the way these systems work in terms of how they capture and use D . The next discussion is by no means meant to be complete ; a detailed inquiry into how any one of the design goals can be met is the subject of a separate paper .

4.1. KNOWLEDGE REUSE AND KBS CONSTRUCTION AND MAINTENANCE

Understanding how D can help to ease construction, simplify maintenance and support knowledge reuse requires introducing the notion of a microtheory. Guha & Lenat (1994) define a microtheory to be a ‘‘fairly adequate’’ solution to some application task, recognizing that a task can have several microtheories which correspond to different viewpoints, levels of granularity, etc. These authors say that a microtheory is the set of ground rules used to model, and reason about, certain phenomena in the context of a specific task. These ground rules include:

(1) axioms (assertions , constraints , assumptions) that define the nature of relevant ontological entities ;

(2) an inference procedure for reasoning with the axioms about problem situations the task aims to address ; and

(3) a vocabulary (language , representational constructs) for expressing the axioms , the inference procedure , and the necessary domain knowledge .

We argue that D (D A + D V) can be identified with a microtheory of a specific application task. This is visible from the parallel between D and the three parts of a microtheory. First, D A is a set of axioms describing the generic I-SSM applicable to some task, where this generic I-SSM reflects the nature of relevant ontological entities in terms of assertions and assumptions with which they must comply. Second, D A reflects the underlying epistemology by using epistemic terms in D V (e.g. root, father), and we saw in Section 3.4 that the inference procedure required to reason with D A depends only on the assumed epistemology. Thus, D A identifies the inference procedure needed to construct the kind of I-SSMs the task requires. Finally, D V is the vocabulary used to define the necessary representational constructs using such terms as nodes and node-chains. For example, three of the constructs we discussed earlier are: ordered sets of node-chains for representing preferential constraints in S, rule structures for expressing axioms in D A as I-operators involving node-chains and unsatisfied propositions, and procedural structures with input/output node-chain arguments for representing K-operators.

Returning to the aforementioned design goals, the question is: how does the fact that D can be identified with a microtheory help with respect to these goals? Neches, Fikes, Finin, Patil, Gruber, Senator & Swartout (1991) propose to lower the effort needed to build and maintain KBSs using tools that act as frameworks for handling instances of specific task classes. They suggest developing such frameworks in the form of top-level abstraction hierarchies which KBS designers can reuse and elaborate to create specific applications.

Consider how a KBS shell in the spirit of the above proposal could be developed based on the kind of design knowledge we discuss . Assume the existence of an object-oriented ( O-O ) hierarchy with nodes organized in two tiers , corresponding to epistemic and ontological design choices like those in Figure 2 , where each tier in the hierarchy contains layers of increasingly specialized design knowledge . A node at the top tier would store or inherit an epistemic vocabulary , D V e , and an inference procedure for reasoning in terms of the kind of I-SSMs supported by a specific epistemology (e . g . the inference procedure in Figure 5) . A node at the lower tier would store or inherit from nodes in the same tier an ontological vocabulary , D V o , axioms defining the nature of the generic entities specific to an ontology , D A , and their derivative I-operators and K-operators .

Unlike common KBS shells which only provide representational constructs, a shell involving such an O-O hierarchy of design knowledge would provide reusable partial microtheories of various task classes. These are partial microtheories because their D A would not include modality axioms, which might be unique to instances of these task classes (e.g. the axiom ‘‘an I-SSM root must represent a treatable disease’’ in NEOMYCIN's case). Given such a shell, the KBS design endeavor would involve roughly the following steps.

(1) Make epistemic and ontological design choices from the options available in the O-O hierarchy. This would identify a partial microtheory (inference procedure, D V e, D V o, D A and its derivative I-operators and K-operators) of the intended application task.

(2) Elaborate the partial microtheory identified (i.e. tailor it to the application task).

    (2.1) Make perspective design choices by selecting from D A a subset of axioms, A, that refer to the ontological entities needed to model and examine domain phenomena from the angles of interest.

    (2.2) Provide task-specific synonyms and descriptors of ontological entities in A, and add modality axioms to A. The shell would then present for the modality axioms proper templates that the KBS designer could fill to create the I-operators and K-operators corresponding to these axioms.

(3) Design K and S based on the elaborated microtheory. The shell would assist by:

    (3.1) suggesting relational networks needed in K based on relational constants in axioms (e.g. an axiom embedding the phrase (Cause D F) requires a causal network relating diseases to findings); and

    (3.2) proposing preferential constraints needed in S based on node-chain arguments in K-operators.
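Step 3.1 above lends itself to a small illustration. The Python sketch below assumes, purely for the example, that each axiom has been reduced to the relational phrase it embeds, written as a tuple such as ("Cause", "D", "F"); the function `suggest_networks` is hypothetical and is not a feature of any existing shell.

```python
def suggest_networks(axiom_phrases):
    """Map each relational constant to the entity pairings its network must relate."""
    suggestions = {}
    for relation, *entities in axiom_phrases:
        suggestions.setdefault(relation, set()).add(tuple(entities))
    return suggestions


# Relational phrases embedded in axioms of an elaborated microtheory (illustrative).
phrases = [
    ("Cause", "D", "F"),     # calls for a causal network relating diseases to findings
    ("Subtype", "D", "D"),   # calls for a taxonomy of diseases
    ("Induce", "A", "D"),    # calls for a network linking agents to diseases
    ("Cause", "D", "F"),     # repeated phrases yield a single suggestion
]
needed = suggest_networks(phrases)
```

A real shell would also need to parse the axioms themselves and handle descriptors, which this sketch leaves out.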

We can illustrate some of the things that a KBS designer would do while applying the above steps in the context of a mechanical diagnosis task like the one discussed by Console , Portinale , Dupre & Torasso (1993) . Figure 8 presents the generic I-SSM applicable to , and a sample I-SSM instance constructed for , this task . Following design choices made in step 1 , the shell would identify a partial microtheory similar to the one used in NEOMYCIN . Then , given the ontological entities that axioms in the identified D A refer to (agent , malfunction , etc . ) , design choices in step 2 . 1 would , for example , exclude from D A axioms pertaining to the ‘‘location’’ entity . (Recall from Figure 4 that , in medical diagnosis , ‘‘location’’ pertains to an organ system where the agent inducing a disease resides . ) In step 2 . 2 , one of the task-specific synonyms that would be provided to the shell will distinguish between an ‘‘internal agent’’ and an ‘‘external agent’’ (e . g . extremely hot climate) , and one of the modality axioms that would be added to D A will state that ‘‘an agent can be internal or environmental’’ . In step 3 . 1 , one of the relational networks that K must include is suggested by the remaining axioms in D A ; it is a taxonomy of malfunctions capturing type-of relations between them . Finally , in step 3 . 2 , the above modality axiom newly added to D A would indicate the need for a preferential constraint that specifies whether an internal agent should be pursued before an external agent , or vice versa .

FIGURE 8. Re-use of D from NEOMYCIN (see Figure 4) in the context of a mechanical diagnosis task. (a) Generic I-SSM applicable, (b) sample I-SSM instance (adapted from Console, Portinale, Dupre & Torasso, 1993). [Panel (a) entities: Actor, internal or external (agent); Malfunction (disorder); Manifestation (internal state/action); Manifestation (finding/symptom); links labeled induce, cause and Subtype. Panel (b): an engine fault example involving spark plugs, gas mixture, ignition and engine temperature.]

The link between the KBS design approach outlined above and the aforementioned design goals can be summarized as follows. This design approach could render the KBS construction process more structured and predictable by virtue of facilitating reuse of D (and, in turn, of K-operators, I-operators, and S-operators). It enables a KBS designer to capitalize on the availability of a hierarchical ‘‘library’’ of reusable partial microtheories of various task classes, as in the above mechanical diagnosis example. This KBS design approach could also simplify maintenance in cases where the changes made to a KBS involve a revision of the ontological and perspective design choices (not instance choices) underlying the system. In such cases, we could require a KBS shell like the one described above to handle a revision in three steps. First, the shell would ask the designer to revise the definition of what the KBS seeks to accomplish, that is, to add, delete, and modify axioms in D A. Then, the shell would check for the consistency of axioms in the revised D A. Finally, the shell would compare axioms in the revised D A to axioms in the original D A to identify: (1) which of the existing I-operators and K-operators need to be added, deleted, or modified; (2) which relational networks in K have to be added, deleted, or restructured; and (3) which strategic principles in S must be added, deleted, or revised (based on the exclusion and/or addition of ontological entities, axioms in D A, etc.). In other words, the idea is to require that changes to a KBS would be preceded by a revision of the microtheory the system is utilizing. This idea is applied by KBS development approaches, such as CommonKADS (Schreiber, Wielinga & de Hoog, 1994), which link the knowledge-level model that they develop for a task to the actual KBS design, and thus require that a revision of the KBS design be preceded by a revision of the knowledge-level model.
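The final comparison step of such a revision can be sketched as a set difference over axioms. In the Python sketch below, axioms are identified by short string labels and each axiom is assumed to have exactly one derivative I-operator; both are simplifications introduced here, and the consistency check that precedes the comparison is not modeled.

```python
def revision_plan(original_axioms, revised_axioms):
    """Compare two versions of D A and report which derivative operators are affected."""
    original, revised = set(original_axioms), set(revised_axioms)
    return {
        "add": sorted(revised - original),      # axioms needing new operators
        "delete": sorted(original - revised),   # axioms whose operators are dropped
        "keep": sorted(original & revised),     # axioms whose operators are reused
    }


# Hypothetical axiom labels for the medical and mechanical diagnosis tasks.
medical = ["agent-induces-disease", "agent-resides-in-location", "disease-causes-finding"]
mechanical = ["agent-induces-disease", "agent-is-internal-or-external", "disease-causes-finding"]
plan = revision_plan(medical, mechanical)
```

The same difference could then drive the proposals for relational networks in K and strategic principles in S that the text lists.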


4.2. ROBUSTNESS

Robustness means performing well outside a narrow range of expertise, that is, the ability to avoid brittleness and exhibit novelty. According to David, Krivine & Ricard (1993), the most common approach to increasing robustness is to provide a KBS with multiple knowledge sources, or Ks (e.g. structural and functional models, quantitative and qualitative models), so as to improve domain coverage. This approach usually involves some difficulties. The conceptual ones include: how to identify to which K to shift and when, and how to translate results between Ks. Work on these issues is typically limited to dealing with Ks capturing functional models that are abstractions of each other, and to using predefined task-specific ‘‘switching modules’’ and dictionaries containing knowledge about when and how to switch between Ks as well as how to translate results between Ks (e.g. Hunt & Price, 1993). Implementation difficulties pertain primarily to the integration of reasoning across Ks that are captured using different representation formalisms, such as rules, frames, and logic predicates (Simmons & Davis, 1993). Recent proposals like the one of Guida & Zanella (1993) suggest addressing these difficulties by integrating Ks through their ontologies, representational assumptions, and epistemological types, among other things.

In the spirit of such proposals, we argue that one way to address the above difficulties is to capture the D underlying available Ks and talk about the integration of microtheories. The idea is to allow a system to ‘‘compose’’ the microtheory it needs by combining narrower microtheories of various tasks in the same general domain (to which it has access, e.g. via the O-O hierarchy of microtheories discussed above). We illustrate this idea using two cases. The first case shows the role of D in the composition process, assuming the availability of only two microtheories that are complementary in some sense. The second case also shows the role of D in finding the right microtheories to compose, assuming the availability of a variety of microtheories.

The first case is based on the work of Liu & Farley (1991). Suppose that a KBS for designing electronic devices is faced with the query: given a device whose structure is fixed, how can we lower the current flowing through one of its resistors without changing the resistance of, or voltage across, that resistor? Further assume that the KBS has access to two microtheories (see Figure 9(a) for more details).

(1) The lumped-device microtheory characterizes a component/device using terms like voltage (V), current (I) and resistance (R). Its associated K includes macro behavioural axioms like Ohm’s law and Kirchhoff’s law. To address the query using this microtheory, the KBS must directly question the component model of a resistor; however, this model contains primitive behavioral axioms which cannot be derived using this microtheory (i.e. based on Ohm’s law, I = V / R, the only way to change the current through a resistor is to change R or V; but it is required that R and V remain unchanged). This means that this microtheory lacks certain “perspectives” that would allow reasoning about the component at the micro level.

(2) The charge-carriers (CC) microtheory describes a component in terms of spatial characteristics like cross-sectional area (A) and in terms of the movement of electric charge carriers using parameters like field (E), force (F), CC motion velocity (v), charge flow (C), and current (I). The K associated with this microtheory includes micro behavioral axioms like ∂I = ∂C = ∂A + ∂v (where ∂x denotes a qualitative change in x).

What facilitates the dynamic integration of these two microtheories is two commonalities in the Ds identified with them. As Figure 9(a) shows, both DVs involve common parameters and both DAs involve common axioms. Through these common parameters and axioms, the two microtheories are combined to compose a more encompassing microtheory with which the query can be addressed, as shown in Figure 9(a). †
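The composition step described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `Microtheory` class, the `compose` function, and the string encodings of axioms are hypothetical names introduced here, not part of the systems cited. Two microtheories are merged only when their DVs share parameters and their DAs share axioms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Microtheory:
    """Hypothetical container: DV (vocabulary), DA (axioms), K (domain knowledge)."""
    name: str
    dv: frozenset
    da: frozenset
    k: frozenset


def compose(a, b):
    """Merge a and b if they share parameters (DV) and axioms (DA); else return None."""
    shared_dv = a.dv & b.dv
    shared_da = a.da & b.da
    if not shared_dv or not shared_da:
        return None  # nothing to bridge the two microtheories
    return Microtheory(f"{a.name}+{b.name}", a.dv | b.dv, a.da | b.da, a.k | b.k)


# Axioms (1) and (2) of Figure 9(a), encoded as plain strings for illustration.
PERTURB = "a perturbed parameter perturbs the parameters connected to it"
STEADY = "connected parameters perturb each other until a steady state"

lumped = Microtheory(
    "lumped-device",
    frozenset({"V", "I", "R"}),
    frozenset({PERTURB, STEADY}),
    frozenset({"Ohm's law", "Kirchhoff's law"}),
)
charge_carriers = Microtheory(
    "charge-carriers",
    # V is included because Figure 9(a) names I and V as the common entities.
    frozenset({"Q", "C", "v", "E", "I", "V", "F", "L", "A"}),
    frozenset({PERTURB, STEADY}),
    frozenset({"dE=dQ-dL", "dF=dE", "dv=dF", "dC=dA+dv", "dI=dC"}),
)

combined = compose(lumped, charge_carriers)
print(sorted(combined.dv))
```

The merged microtheory exposes both the macro parameters (V, I, R) and the micro parameters (E, F, v, ...), which is what lets a query posed in lumped-device terms be answered with charge-carrier axioms.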

The second case, which illustrates how D also helps to search for the right microtheories to be composed, is based on the work of Falkenhainer & Forbus (1991); we omit many details here. These authors built a system that uses a set of thermodynamics models to automatically compose new models for analysing whatever device phenomenon is tackled. As Figure 9(b) shows, the system defines a model in terms of the relations it captures (K), the ontological entities it involves (DV), and its underlying ontological, grain, operational, etc. assumptions (DA). Because each model may be suitable for reasoning only about specific phenomena, a model is viewed as a microtheory of the domain, and the collection of models available is considered a theory of the domain. By grouping the assumptions underlying models based on their similarities, the system forms a hierarchy of “assumption classes” and their corresponding models, analogous to the hierarchy of microtheories we described in Section 4.1. Through use of the hierarchy of assumption classes, the system addresses any given query as follows.

(1) Identify the input/output parameters of the artifact (e.g. turbine) to which the query refers.

(2) Compose a model for the query. Use the hierarchy of assumption classes to select models that involve the parameters identified. If the selected models depend on the output of other models, select those other models as well, as long as their underlying assumptions are consistent with the ones selected initially. The selection of models is managed using a dependency network (i.e. an assumption-based truth maintenance system) capturing relationships between the assumptions of selected models.

(3) Use the composed model with a qualitative simulation engine (embedding S) to produce a solution, namely an envisionment of the qualitative behavior of the simulated artifact that forms what we call an I-SSM.

(4) If the solution is deficient (e.g. an empty envisionment), use dependency-directed backtracking with the dependency network of assumptions to identify inadequate assumptions that caused the “failure”.

(5) Return to step (2) to revise the composed model, until the desired solution is obtained.
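Steps (2) to (5) above can be sketched as a simple select, simulate and revise loop. The Python below is a minimal sketch, not the implementation of Falkenhainer & Forbus: the model library `MODELS`, the `CONFLICTS` table, and the functions `select_models`, `solve` and `toy_simulate` are hypothetical, and a plain set of rejected assumptions stands in for the assumption-based truth maintenance system. Step (1) is assumed to have produced the query parameters already.

```python
# Hypothetical model library: assumptions relied on, parameters computed, parameters needed.
MODELS = {
    "ideal-turbine": {"assumes": {"frictionless"}, "outputs": {"work"}, "inputs": {"flow"}},
    "real-turbine": {"assumes": {"friction"}, "outputs": {"work"}, "inputs": {"flow"}},
    "steady-flow": {"assumes": {"steady-state"}, "outputs": {"flow"}, "inputs": set()},
}
# Pairs of mutually exclusive assumptions.
CONFLICTS = [{"frictionless", "friction"}]


def consistent(assumptions):
    return not any(pair <= assumptions for pair in CONFLICTS)


def select_models(params, rejected):
    """Step 2: pick one admissible model per needed parameter, following input dependencies."""
    chosen, assumptions, pending = [], set(), list(params)
    while pending:
        param = pending.pop()
        if any(param in MODELS[name]["outputs"] for name in chosen):
            continue  # already covered by a selected model
        for name, model in MODELS.items():
            merged = assumptions | model["assumes"]
            if (param in model["outputs"]
                    and not (model["assumes"] & rejected)
                    and consistent(merged)):
                chosen.append(name)
                assumptions = merged
                pending.extend(model["inputs"])
                break
        else:
            return None  # no admissible model computes this parameter
    return chosen, assumptions


def solve(params, simulate, max_rounds=10):
    rejected = set()  # stand-in for the dependency network of assumptions
    for _ in range(max_rounds):
        selection = select_models(params, rejected)
        if selection is None:
            return None
        models, assumptions = selection
        solution, culprits = simulate(models, assumptions)  # step 3
        if solution is not None:
            return models, solution
        if not culprits:
            return None
        rejected |= culprits  # step 4: blame assumptions, then step 5: retry
    return None


def toy_simulate(models, assumptions):
    """Toy simulator: yields no envisionment whenever friction is ignored."""
    if "frictionless" in assumptions:
        return None, {"frictionless"}
    return "envisionment from " + ", ".join(models), set()


print(solve({"work"}, toy_simulate))
```

In this toy run the first composition uses the frictionless turbine model, the simulator reports a deficient solution, and the loop recomposes the model with the friction assumption instead.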

The above discussion implies that what enables the system of Falkenhainer and Forbus to work well is that it has available to it the D underlying the Ks it integrates dynamically for problem-solving purposes.

† The ability to integrate these microtheories through use of D also helps with queries involving structural changes to a device. For example, we can also address the query: given a simple DC circuit containing a light bulb to which there is a serially connected resistor, how can we “redesign” the circuit to make the bulb light brighter without changing the current and voltage? (see Liu & Farley, 1991).


FIGURE 9. Two examples of how D is used for robustness purposes. (a) D is used to compose a suitable microtheory from two narrower microtheories. (b) Domain model defined as a microtheory (adapted from Falkenhainer & Forbus, 1991).

(a) QUERY: given a device whose structure is fixed, how can we lower the current flowing through one of its resistors without changing the resistance of, or voltage across, that resistor?

Lumped-Device Microtheory
DV = {V - voltage, I - current, R - resistance, ...}
DA = {(1) if parameter ?x is perturbed and it is connected to ?y, then ?y is also perturbed; (2) two connected parameters will continue to perturb each other until they reach a "steady state"; (3) ...}
K = {macro behavior axioms, e.g., Ohm's and Kirchhoff's laws}

Charge-Carriers Microtheory
DV = {Q - charge, C - charge flow, v - flow velocity, E - field, I - current, F - force, L - component's length, A - cross-sectional area, ...}
DA = {(1) if parameter ?x is perturbed and it is connected to ?y, then ?y is also perturbed; (2) two connected parameters will continue to perturb each other until they reach a "steady state"; (3) ...}
K = {micro behavioral axioms: ∂E=∂Q–∂L, ∂F=∂E, ∂v=∂F, ∂C=∂A+∂v, ∂I=∂C, ∂V=∂A+∂v, ...}

The DVs have I and V as common entities, and the DAs have axioms (1), (2), ... in common. ∂x denotes a qualitative change in x. The accompanying derivation starts from ∂V = ∂R = 0 and ∂I = – (given) and propagates the change through ∂C, ∂v, ∂F and ∂E to ∂L, using the charge-carrier axioms (with ∂Q = 0).

(b) A defModel definition, CONTAINED-LIQUID-GEOMETRY (?CL ?CAN), annotated with its three parts. Relations (K, the model), e.g. (= (pressure (bottom ?can) :absolute) (* (level ?cl) (density ?sub) G)). Individuals (the DV part of a microtheory: ontological entities and their relations), e.g. a fluid-container ?can and a contained-liquid ?cl. Assumptions (the DA part of a microtheory: simplifying assumptions, ontology assumptions, grain assumptions, approximations/abstractions, and operational assumptions, e.g. a system is in a steady state if all its components are in a steady state).

These examples demonstrate how D helps to increase the robustness of a KBS by simplifying the integration of different Ks in two ways. First, availability of D eliminates the need to provide a KBS with predefined task-specific “switching modules” like those used by Hunt & Price (1993). Second, using D with the KBS architecture we presented in Section 3 allows hiding implementation details of the Ks being integrated. Since in our architecture any K is applied only with the K-operators defined by its associated microtheory, neither the inference engine nor the K-operators associated with the other Ks being integrated need to know details pertaining to the representational formalism used to implement that specific K.

4 . 3 . EXPLANATION

Whereas the previously discussed design goals could be better met by virtue of having D available, the ability to enhance explanation capabilities has to do with the I-SSM and S-SSM constructed based on usage of D. The literature on explanation (Swartout & Moore, 1993; Tanner, Keuneke & Chandrasekaran, 1993) distinguishes between issues of generating the content of explanations and issues of generating explanations in a form that meets users’ needs. Because our work does not aim to deal with the subject of explanation per se, we focus here only on issues of content. Swartout and Moore submit that issues of content pertain primarily to the ability of a KBS to generate “what”, “how” and “why” explanations about its domain as well as its behavior. Figure 10 maps these types of explanations to four explanation modules which we associate with specific parts of the KBS architecture presented in Figure 5.

FIGURE 10. Four explanation “modules” and their mapping to our KBS architecture. Module 1 (explanation of the domain): define and justify elements in K based on K′, a body of definitions and finer-granularity domain knowledge. Module 2: justify principles in S based on S′, the domain/world facts and environment assumptions. Module 3 (explanation of solutions): explain why the solution generated and captured in the I-SSM is good. Module 4 (explanation of behaviour and strategy): explain how the strategy was used to derive the solution and justify why this strategy was used, based on the S-SSM.

Module 1 is concerned with defining and justifying knowledge elements that are part of K, or were derived based on K. For example, in the domain of electrical devices, one assertion based on Ohm’s law is: the current through a resistor increases when the voltage across it increases. Referring back to the discussion in Section 4.2, the role of D in generating an explanation that justifies this assertion is one of facilitating the dynamic shifting between the lumped-device and CC microtheories (see Liu & Farley, 1991). In other words, assuming that a finer granularity K′ can be used to justify the above assertion in K, D could facilitate the dynamic linkage between K and K′ in a task-independent manner.

Module 2 is responsible for justifying strategic principles in S, e.g. explaining why a diagnostic KBS “follows an agent that is a common inducer of a malfunction before an unlikely one”. Justifying such a principle requires no use of D. Instead, it normally entails referencing the commonsense world facts and task-specific environmental assumptions underlying that principle.

Module 3 deals with explaining why the solution that the system produced is good. By viewing DA as a documentation of the informational requirements of a task, the I-SSM constructed for that task can be used to show how these requirements are met by a generated solution. For example, referring to the partial I-SSM in Figure 1, it could help explain why the diagnosed disease is (supposedly) Acute-Bacterial-Meningitis by providing such information as: (1) the diagnosed disease is supported by the presence of all reported symptoms (e.g. headaches, high-grade-fever); (2) the disease is induced by the E. coli organism; (3) the presence of E. coli is established by evidence about the patient having a suppressed immune system because of pregnancy, alcoholism, or a recent surgical procedure that is an enabling step in the disease development process; (4) the disease is treatable; and (5) the disease is the most specific one that explains all symptoms. In other words, since DA essentially defines a causal script pertaining to a disease process, explaining a solution boils down to using the I-SSM to show that an instance of the causal script that implicates the diagnosed disease has occurred.

This approach can be complemented using the reconstructive explanation approach used in REX (Wick, 1993). The reconstructive approach assumes the existence of two KBSs. One KBS solves the task using techniques that may not be explainable to the user (since sometimes the best way for a KBS to solve a task efficiently requires using techniques totally foreign to users). The second KBS then receives a trace of the solution (e.g. an I-SSM) and uses different techniques to assemble a plausible story that justifies the solution, where the story may deviate from the actual processing that solved the task in the first place. For example, in the case of diagnosis, the story can support a solution using the arguments: (1) the I-SSM identifies a number of diagnostic hypotheses that might explain the principal symptoms; (2) some of these hypotheses were ruled out because they cannot explain the principal complaints in this instance, or because they are implausible independent of what complaints they might explain; and (3) the diagnostic conclusion is the best of the plausible hypotheses that can explain the symptoms. This story shows how the informational requirements of the task are met by essentially reflecting the general logical structure of diagnostic tasks, and abstracting away many of the task-specific details appearing in the I-SSM.

Module 4 is concerned with “how” explanations of the strategy used to solve the task and “why” explanations that justify usage of this strategy. Like NEOMYCIN, DIVA (Davis, Shrobe & Szolovits, 1993) and ABLE (Patil, Szolovits & Schwartz, 1984), our architecture can produce “how” explanations by reporting on the sub-tasks that were applied step-by-step based on the strategy embedded in the hierarchy of S-operators (representing sub-tasks). Such explanations are useful, but they do not reflect global lines of reasoning followed during problem solving (e.g. top-down refinement with a disease taxonomy). Generating global “how” explanations requires access to a record of the system’s problem-solving process, something lacking from typical KBSs. DIVA is somewhat of an exception in the sense that it treats the stack of active sub-tasks as such a record, and it produces global “how” explanations by collapsing sequences of sub-tasks currently posted on the stack into abstract lines of reasoning; however, DIVA uses special task-dependent explanation routines for this purpose. In contrast, we produce an S-SSM, an explicit record of the entire problem-solving process, which could help to generate line-of-reasoning explanations by collapsing node-chain sequences corresponding to the order in which K-operators were triggered. To illustrate this idea using the S-SSM in the bottom of Figure 6, consider the node-chain sequence D⟨Meningitis⟩ → Ds⟨Acute-Meningitis⟩ → Ds⟨Acute-Bacterial-Meningitis⟩ → F⟨High-Grade-Fever⟩ → ··· corresponding to one S-SSM branch, followed by the sequence D⟨Meningitis⟩ → Dg⟨Infectious-Disease⟩ → F⟨Fever⟩ → ··· corresponding to a later S-SSM branch. By virtue of knowing that Ds and Dg denote a specialization and a generalization of D, respectively, the first sequence tells that the system performed a top-down refinement of the hypothesized disease (based on some specialization taxonomy), and the second sequence indicates a switch to a categorical mode of reasoning about the hypothesized disease. These node-chain patterns would probably be found in other applications involving diagnosis.
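The collapsing of node-chain sequences into lines of reasoning can be sketched as follows. This is a hypothetical illustration rather than part of the architecture as implemented: the `PATTERNS` table, the function `lines_of_reasoning`, and the encoding of a node chain as a list of (node type, label) pairs are assumptions made for the example.

```python
# Hypothetical mapping from node types to the line of reasoning they signal.
PATTERNS = {
    "Ds": "top-down refinement",    # specialization of the hypothesized disease
    "Dg": "categorical reasoning",  # generalization of the hypothesized disease
}


def lines_of_reasoning(chain):
    """Collapse consecutive nodes of the same kind into one labelled line of reasoning."""
    summary, previous = [], None
    for node_type, label in chain:
        kind = PATTERNS.get(node_type)
        if kind is not None:
            if kind == previous:
                summary[-1][1].append(label)  # extend the current run
            else:
                summary.append((kind, [label]))  # start a new run
        previous = kind
    return summary


first_branch = [
    ("D", "Meningitis"),
    ("Ds", "Acute-Meningitis"),
    ("Ds", "Acute-Bacterial-Meningitis"),
    ("F", "High-Grade-Fever"),
]
later_branch = [("D", "Meningitis"), ("Dg", "Infectious-Disease"), ("F", "Fever")]

print(lines_of_reasoning(first_branch))
print(lines_of_reasoning(later_branch))
```

Because the labels are derived only from node types, the same routine would apply to any S-SSM whose vocabulary distinguishes specialization from generalization, which is the sense in which such explanations need no task-dependent routines.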

Regarding “why” explanations of behavior, the S-SSM could also be useful because it posts all node-chain (or sub-task) sequences that have been started but might be incomplete. In NEOMYCIN and DIVA, for example, the so-called end-conditions of active sub-tasks on the stack can interrupt ongoing lines of reasoning, and without keeping track of these lines of reasoning, decisions concerning the state of the inference process are rather ad hoc. In contrast, an S-SSM not only makes it possible to compare deliberately alternative lines of reasoning that were interrupted, but also allows them to be resumed from the point where they were left off. More than that, the idea of how MOLGEN (Stefik, 1980) uses planning operators like “least-commitment” and “guess-undo” to control the problem-solving process can be taken a step further. Intuitively speaking, given an explicit S-SSM, one might consider using a dependency-directed network “on top” of the S-SSM so as to permit capturing as well as explaining the rationale for pursuing, interrupting, retracting, or resuming alternative lines of reasoning. Such a capability would facilitate the generation of true “why” explanations of behavior.

5 . Relation to other work

How does our work on representing and using design knowledge relate to previous work? In brief, we think that our work is essentially the result of another step in the progression of work that focuses on the need to represent explicitly various types of knowledge.

Work on early KBSs aimed to make domain knowledge, K, explicit by separating it from the inference procedure. It yielded KBSs like MYCIN, which represented K as rules and used a rule-chaining inference procedure. Despite the separation of K from the inference procedure, part of the strategic knowledge, S, used to control the processing order of rules was compiled into K, typically in the physical order of rule clauses, the physical order of rules in the KB, and the so-called metarules (Davis & Lenat, 1982). As a result, K and S were neither accessible nor interpretable for purposes of explanation, reuse, and robustness.

FIGURE 11. Our work in relation to work on second generation KBSs. The diagram relates three groups of notions: the componential and task-structure approaches (task, problem-solving methods (PSMs), decomposable sub-tasks, primitive solvable sub-tasks, hierarchy of generic tasks (GTs), primitive GTs); CommonKADS (PSMs as a strategy model, task structure, inference structure, generic inference); and our architecture (S, K, D, hierarchy of S-operators, K-operators, I-operators, the I-SSM and its ontology).

Work on second generation KBSs thus focused on making S explicit by separating it from K. A related issue on which later work focused is separating the knowledge-level modeling of a task in the conceptualization stage from the symbol-level modeling of the task in the KBS design stage. Following Figure 11, the next discussion reviews the major streams of work on these two issues and their key relations to our work.

There are task-specific and method-specific approaches to capturing S separately from K. The task-specific approach assumes that an application task is a hierarchy of sub-tasks which can be represented as procedural operators (e.g. see Figure 7). High level operators capture S and use it to control the order of calls to lower level operators, whereas lowest level (primitive) operators use K to construct a solution. Because some sub-tasks are generic in the sense that they are common to many applications, such procedural operators are viewed as task-independent reusable modules. Chandrasekaran (1987) called these modules generic tasks (GTs). To configure a complete inference procedure for a task, it is necessary to functionally compose a hierarchy of GTs that were possibly specialized for the task. Chandrasekaran & Johnson (1993) discuss several shells that assist in this endeavor. Related to this idea is McDermott’s (1988) method-specific approach, which captures S through the notion of role-filling problem-solving methods (PSMs). A PSM is a complete task-specific inference procedure that is conceptually composed of a hierarchy of GTs. It also relates a task to the K needed to accomplish the task. It can be specialized into various KBS applications, while helping knowledge acquisition (KA) by structuring K in terms of the roles that specific domain entities and relations play in problem solving. This idea was applied in KA tools like SALT (Marcus, 1988).

Capturing S separately from K using GTs, in task- or method-specific ways, improved on first generation systems mainly in two respects. First, when S was expressed using an application-independent vocabulary, it became reusable in the context of KA tools like SALT. Second, making S explicit helped to make control decisions more explicit, hence providing for stronger explanations of behavior (or of problem-solving strategies). Both improvements follow from the focus of second generation KBSs on the aspect of how a task is solved and the way this aspect can be captured using hierarchies of GTs.
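The division of labour between high level and primitive operators can be illustrated with a toy example. The sketch below is hypothetical throughout: the knowledge table `K`, the primitive operators `propose` and `covers`, and the high level operator `diagnose` are invented for illustration and do not reproduce any of the cited shells. The point is only that the strategy (propose first, then filter) lives in the high level operator, while the primitives are the only code that consults K.

```python
# Hypothetical domain knowledge K: finding -> diseases that can explain it.
K = {
    "fever": {"meningitis", "influenza"},
    "stiff-neck": {"meningitis"},
}


def propose(finding):
    """Primitive operator: uses K to propose hypotheses for one finding."""
    return K.get(finding, set())


def covers(disease, findings):
    """Primitive operator: uses K to test whether a disease explains every finding."""
    return all(disease in K.get(finding, set()) for finding in findings)


def diagnose(findings):
    """High level operator: encodes the strategy S that orders the primitive calls."""
    candidates = set()
    for finding in findings:
        candidates |= propose(finding)
    return sorted(d for d in candidates if covers(d, findings))


print(diagnose(["fever", "stiff-neck"]))
```

Swapping in a different K leaves `diagnose` untouched, which is the kind of reuse of S that the GT and PSM approaches aim for.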

Our work considers the “how” aspect to be a means to an end that has to be grounded in the aspect of “what” a task seeks to construct as a solution. For example, take the case of medical diagnosis. The “what” aspect first defines the structure of I-SSMs needed. Since hypotheses about the cause of a disease might be generated and added to the differential on the basis of any data in the I-SSM, the “how” aspect then defines what data in the I-SSM to consider and when, so as to limit the size of the differential. The point is that focusing primarily on the “what” aspect does not prevent our KBS architecture from capturing the “how” aspect using hierarchies of GTs, as explained in Section 3.2; it only requires expressing S using the same graph vocabulary used to express I-SSMs. This means that our architecture preserves the two above improvements of second generation KBSs over earlier systems.

Focusing on the “what” (in addition to the “how”) aspect of a task leads to additional improvements over typical second generation systems. Some improvements are due to the availability of design knowledge, D, reflecting the structure of I-SSMs constructed. First, as the examples in Section 4.2 suggest, D simplifies the dynamic integration of multiple Ks. Compared to second generation KBSs, this helps to avoid a form of brittleness that arises not from lacking knowledge, but from lacking flexibility in using knowledge. Second, expressing K and S using an epistemic and ontological vocabulary (DVe + DVo) not only supports the kind of reuse of S found in the context of GTs and PSMs, but also provides for reuse of D itself and makes our KBS architecture task-independent (see Section 3.4). Third, since DA implies the underlying structure of K, it could help in structuring K during KA, as in PSM-oriented KA tools. Furthermore, compared to GT shells that help to specialize a hierarchy of GTs into a specific KBS application, DA helps in another respect: node-chains defined by axioms in a task-specific DA permit generating a list of the necessary primitive GTs (or K-operators) and checking for its completeness vis-à-vis the task.
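The completeness check mentioned at the end of the preceding paragraph can be sketched as a set comparison. The encoding is an assumption made for illustration: each DA axiom is written as a hypothetical (source, relation, target) triple, and one K-operator is assumed per axiom, named by joining the three parts; `DA_AXIOMS`, `required_operators` and `missing_operators` are not names from the architecture itself.

```python
# Hypothetical DA axioms: each one licenses one kind of link in the I-SSM.
DA_AXIOMS = [
    ("Disease", "specialized-by", "Disease"),
    ("Disease", "manifested-by", "Finding"),
    ("Disease", "induced-by", "Agent"),
]


def required_operators(axioms):
    """Derive one K-operator name per axiom (assumed one-to-one mapping)."""
    return {f"{source}.{relation}.{target}" for source, relation, target in axioms}


def missing_operators(axioms, implemented):
    """List the K-operators the task needs but the KBS does not yet provide."""
    return sorted(required_operators(axioms) - set(implemented))


implemented = {"Disease.specialized-by.Disease", "Disease.manifested-by.Finding"}
print(missing_operators(DA_AXIOMS, implemented))
```

An empty result would indicate that, relative to the axioms listed, every node-chain step the task may need has a K-operator able to produce it.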

Other improvements over second generation systems derive from the explicitness of the I-SSM and S-SSM constructed, both of which serve the function of a well-organized working memory. As we argued in Section 4.3, an S-SSM makes it possible to compare alternative lines of reasoning deliberately, and thus to make explicit the rationale behind global control decisions regarding what lines of reasoning to pursue, interrupt, retract, or resume. Related to this, an S-SSM also helps to generate global explanations of behavior that justify the particular strategy a KBS uses to induce specific lines of reasoning during problem solving. As to the I-SSM, it helps to explain solutions a KBS generates by showing how they meet the informational requirements of the task. Additionally, an I-SSM permits making control decisions genuinely a result of the present state of knowledge about the problem (as reflected by the changing content of the I-SSM). In other words, grounding all “how”-related control decisions in the need to evolve the I-SSM into a state with characteristics specified by DA allows our architecture to view the process of problem solving as modeling. Van de Velde (1993) recognizes that this view reflects more of the real nature of problem solving from a knowledge-level perspective.

The last point brings us to the second issue that is the focus of work on second generation KBSs: the creation of a knowledge-level (KL) model of a task prior to the construction of a symbol-level (SL) model in the KBS design stage. Here we find several streams of work, chief among them the componential approach (Steels, 1990), the task-structure approach (Chandrasekaran & Johnson, 1993), and CommonKADS (Schreiber et al., 1994). Without elaborating on how exactly these approaches go about constructing KL models, suffice it to say that they are all rooted in the “how” aspect of a task, because they use the notion of PSMs to direct their construction process. Although some of them analyze I-SSMs (or case models, as CommonKADS refers to them) while constructing KL models, I-SSMs are not modeled as part of the KL models produced. Consequently, KBSs designed based on such KL models do not reflect the structure of the I-SSMs they construct (implicitly), and they fail to offer the benefits discussed above. For these reasons, Van de Velde (1993) called for the need to focus the KL modeling endeavor primarily on what a task entails, or the I-SSMs it constructs, and Wielinga, Van de Velde, Schreiber & Akkermans (1993) consider making the explicit modeling of I-SSMs an integral part of CommonKADS.

Some KL modeling approaches also seek to simplify the construction of a KL model and its conversion into a working KBS, through the notion of reuse in all stages of KBS development. For example, consider the case of CommonKADS (Schreiber et al., 1994). It provides a library of generic KL models that can be adapted to the task modeled so as to capture postulate system requirements. Once a specific generic KL model has been adapted, CommonKADS seeks to facilitate two other things.

(1) Adapting a generic SL model counterpart. A generic SL model captures “technical” design knowledge which, unlike the conceptual design knowledge we discussed, documents design decisions specifying computational aspects left open in the KL model (e.g. representational constructs and computational methods used to carry out inferences). Using a structure-preserving design approach, CommonKADS first identifies postulate system requirements that changed in the generic KL model upon its adaptation, and these are then used with technical design knowledge to identify design decisions that must be adapted in the generic SL model counterpart. Additional work in this area is discussed by Vanwelkenhuysen (1995).

(2) Mapping an SL model to a reusable K (or KB) through domain ontologies (referred to as domain metamodels). CommonKADS uses several ontologies, each corresponding to a different abstraction of K and interaction type with a generic KL model. The lowest level, least detailed ontology is task-oriented in that it describes the interaction of K with the inference-structure part of a KL model; a higher level, more detailed ontology could be method-oriented in that it might distinguish between finer types of ontological entities based on inferences that a particular PSM in a KL model applies to them. By using different ontologies with different generality, and partitioning Ks accordingly, CommonKADS seeks to identify classes of Ks with different scope, generality and reusability that can be mapped to generic KL models and, in turn, to their SL model counterparts.

Our work provides an alternative way to link the KL and the SL modeling endeavors. The ontology (or structure) of the I-SSMs that tasks of a certain class construct can be considered the core of a generic KL model of these tasks. It can be adapted to any specific task in that class, in the way we explained and illustrated in Section 4.1. Thereafter, carrying out the SL modeling endeavor could be simplified in three ways. First, adapting a generic SL model counterpart could rely on the one-to-one mapping existing between axioms in the DA defined by an adapted I-SSM ontology and the K-operators and I-operators needed in the SL model. Second, K-operators and S-operators required in the SL model counterpart could be further adapted using CommonKADS’ approach, once they are set to capture the so-called technical design knowledge. Finally, since the K needed and its form are directly determined by the I-SSM ontology called for by a task, it is possible to identify which specific parts of (or relational networks in) reusable Ks are relevant to the KL model and its counterpart SL model adapted to the task.

6 . Conclusion

We discussed in this paper one type of conceptual design knowledge, D, which reflects specific design decisions that a KBS developer makes with regard to the structure of I-SSMs that a prospective system seeks to construct. We first showed how D can be represented in the case of systems that capture knowledge using relational networks. We next presented a KBS architecture that uses D to facilitate the construction of explicit I-SSMs. Then, we reviewed potential avenues that this architecture opens with respect to the design of KBSs that are simpler to build and maintain, support knowledge reuse, possess strong explanation capabilities, and are more robust. Last, we showed that the architecture presented is simply the result of another step in the progression of work that focuses on the need to represent various knowledge types explicitly, i.e. D in addition to K and S.

We are working to implement and use the architecture presented in the context of financial risk management applications. (We did not use examples from this domain because of its highly specialized terminology.) Using this architecture to solve risk management tasks is appealing mainly for two reasons. First, explicit I-SSMs make it possible to conduct various “what-if” analyses that probe a solution to find out how it changes under different market scenarios (without re-solving each scenario from scratch). Second, since many risk management tasks involve the same kernel of knowledge about financial securities and markets, availability of explicit design knowledge would facilitate reuse and integration of this knowledge across KBS applications solving related tasks (for details see Benaroch, in press).

Our implementation effort so far has helped us to identify several areas where further research is needed. An important area pertains to the use of existing PSMs (propose and revise, cover and differentiate, etc.) with our architecture. Applying a PSM with some K constructs an I-SSM through a controlled interaction of its constituent primitive GTs (e.g. in ABLE, an I-SSM is called a patient-specific model). However, typical PSMs are not modeled in a way that makes explicit the I-SSMs they construct, and they do not express the S (strategic principles) they employ using the graph vocabulary our architecture uses to analyse I-SSMs and make control decisions. To facilitate and simplify the use of existing PSMs, it is necessary to do two things: reveal the generic I-SSMs that specific PSMs construct, and understand how these PSMs use their S to select at each moment the I-SSM portion (i.e. sub-graph, branch, or node-chain) on which the next problem-solving activity will focus. Doing so requires considering the fact that typical PSMs decompose their task into sub-tasks, which themselves are either solved or further decomposed by “smaller” PSMs (Chandrasekaran & Johnson, 1993). This might require extending our architecture so that it would treat an I-SSM as if it were composed of a hierarchy of I-SSMs mirroring the recurring task/method/sub-task patterns implied by PSMs. Under this scenario, subtask-specific I-SSMs are basically abstractions of the top level I-SSM that hide details which are irrelevant to their sub-task.

Section 4 implicitly identified two other areas for future research that deal with potential benefits that our architecture offers with respect to the aforementioned KBS design goals. The first area has to do with simplifying the construction and maintenance of KBSs through reuse. Research on the ontologies and perspectives underlying the generic I-SSMs that various task classes require is a prerequisite to the development of a KBS shell involving a hierarchy of reusable design knowledge like the one we discussed. Such a shell could also include existing PSMs and facilitate their reuse in the context of the specific generic I-SSMs that various task classes require, assuming that these PSMs are “expressed” in terms that relate to the generic I-SSMs they construct. The second area concerns the use of I-SSMs and S-SSMs for explanation purposes. It is necessary to develop techniques that can use any I-SSM to explain how the solution it captures meets the informational requirements of the task it addresses. As to the S-SSM, research is needed on the generic form of the ontological node-chain sequences that specific PSMs produce during their execution. This research would provide the basis for generating true global explanations of behavior, in terms of the lines of reasoning that a KBS follows in problem solving.

References

Benaroch, M. (in press). Towards the notion of a knowledge repository for risk management. IEEE Transactions on Knowledge and Data Engineering.

Bobrow, D. (1984). Qualitative reasoning about physical systems: an introduction. Artificial Intelligence, 24, 1–3.

Brachman, R. J. (1979). On the epistemological status of semantic networks. In N. V. Findler, Ed. Associative Networks: Representation and Use of Knowledge by Computers. New York: Academic Press.

Chandrasekaran, B. (1981). Towards a functional architecture for intelligence based on generic information processing tasks. Proceedings of the Tenth International Joint Conference on Artificial Intelligence, pp. 1183–1192. Los Altos, CA: Morgan Kaufmann.

Chandrasekaran, B. & Swartout, W. (1991). Explanations in knowledge systems: the role of explicit representation of design knowledge. IEEE Expert, June, 47–49.

Chandrasekaran, B. & Johnson, T. R. (1993). Generic tasks and task structures: history, critique and new directions. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 233–272. London: Springer-Verlag.

Clancey, W. J. (1988). Acquiring, representing and evaluating a competence model of diagnostic strategy. In M. Chi, R. Glaser & M. J. Farr, Eds. The Nature of Expertise, pp. 343–418. Hillsdale, NJ: Lawrence Erlbaum Associates.

Clancey, W. J. (1992). Model construction operators. Artificial Intelligence, 53, 1–115.

Console, L., Portinale, L., Dupre, D. T. & Torasso, P. (1993). Combining heuristic reasoning with causal reasoning in diagnostic problem solving. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 47–68. London: Springer-Verlag.

David, J. M., Krivine, J. P. & Simmons, R. (1993). Second generation expert systems: a step forward in knowledge engineering. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 3–23. London: Springer-Verlag.

David, J. M., Krivine, J. P. & Ricard, B. (1993). Building and maintaining a large knowledge-based system from a “Knowledge-level” perspective: the DIVA experiment. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 376–401. London: Springer-Verlag.

Davis, R. & Lenat, D. (1982). Knowledge-Based Systems in Artificial Intelligence. New York: McGraw-Hill.

Davis, R., Shrobe, H. & Szolovits, P. (1993). What is a knowledge representation? AI Magazine, Spring, 17–33.

Falkenhainer, B. & Forbus, K. D. (1991). Compositional modeling: finding the right model for the job. Artificial Intelligence, 51, 95–143.

Guha, R. V. & Lenat, D. B. (1994). Enabling agents to work together. Communications of the ACM, 37, 127–142.

Guida, G. & Zanella, M. (1993). Knowledge-based design using the multi-modeling approach. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 174–208. London: Springer-Verlag.

Hayes-Roth, B., Hewett, M., Johnson, M. V. & Garvey, A. (1988). ACCORD: a framework for a class of design tasks. Report No. 88-19, Knowledge Systems Laboratory, Stanford University, Stanford, CA.

Hunt, J. E. & Price, C. J. (1993). Integrating functional models and structural domain models for diagnostic applications. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 131–160. London: Springer-Verlag.

Liu, Z. & Farley, A. (1991). Shifting ontological perspectives in reasoning about physical systems. Proceedings of AAAI-1991, pp. 395–400.

Marcus, S. (1988). SALT: a knowledge acquisition tool for propose-and-revise systems. In S. Marcus, Ed. Automating Knowledge Acquisition for Expert Systems, pp. 81–123. Boston, MA: Kluwer Academic.

McDermott, J. (1988). Preliminary steps toward a taxonomy of problem-solving methods. In S. Marcus, Ed. Automating Knowledge Acquisition for Expert Systems, pp. 225–256. Boston, MA: Kluwer Academic.

Neches, R., Fikes, R., Finin, T., Gruber, T., Patil, R., Senator, T. & Swartout, W. R. (1991). Enabling technology for knowledge sharing. AI Magazine, Fall, 37–56.

Newell, A. (1982). The knowledge level. Artificial Intelligence, 19, 87–127.

Patil, R. S., Szolovits, P. & Schwartz, W. B. (1984). Causal understanding of patient illness in medical diagnosis. In W. J. Clancey & E. H. Shortliffe, Eds. Readings in Medical Artificial Intelligence. Reading, MA: Addison-Wesley.

Pople, H. E. Jr (1982). Heuristic methods for imposing structure on ill-structured problems: the structuring of medical diagnosis. In P. Szolovits, Ed. Artificial Intelligence in Medicine. AAAS Selected Symposium 51. Colorado: Westview Press.


Rich, E. (1983). Artificial Intelligence. New York, NY: McGraw-Hill.

Schreiber, G., Wielinga, B. & de Hoog, R. (1994). CommonKADS: a comprehensive methodology for KBS development. IEEE Expert, December, 28–36.

Shortliffe, E. H. (1976). Computer-Based Medical Consultation: Mycin. New York, NY: Elsevier.

Simmons, R. & Davis, R. (1993). The roles of knowledge and representation in problem solving. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 27–45. London: Springer-Verlag.

Steels, L. (1990). Components of expertise. AI Magazine, Summer, 28–49.

Stefik, M. (1980). Planning with constraints. Technical Report No. STAN-CS-80-784, Computer Science Department, Stanford University, Stanford, CA.

Swartout, W. & Moore, J. (1993). Explanation in second generation expert systems. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 543–585. London: Springer-Verlag.

Swartout, W. R., Paris, C. & Moore, J. (1991). Design for explainable expert systems. IEEE Expert, June, 58–64.

Tanner, M. C., Keuneke, A. M. & Chandrasekaran, B. (1993). Explanation using task structure and domain functional models. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 586–613. London: Springer-Verlag.

Van de Velde, W. (1993). Issues in knowledge level modeling. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 211–231. London: Springer-Verlag.

Vanwelkenhuysen, J. (1995). Using DRE to augment generic conceptual design. IEEE Expert, February, 50–56.

Wick, M. R. (1993). Second generation expert system explanation. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 614–640. London: Springer-Verlag.

Wielinga, B., Van de Velde, W., Schreiber, G. & Akkermans, H. (1993). Towards a unification of knowledge modelling approaches. In J. M. David, J. P. Krivine & R. Simmons, Eds. Second Generation Expert Systems, pp. 229–335. London: Springer-Verlag.

Paper accepted for publication by Associate Editor Dr. B. Chandrasekaran.