Int. J. Human–Computer Studies (1996) 44, 689–721
Roles of design knowledge in knowledge-based systems
MICHEL BENAROCH
School of Management, Syracuse University, Syracuse, NY 13244, USA. email: mbenaroc@mailbox.syr.edu
(Received 1 February 1995 and accepted in revised form 28 November 1995)
Recent research suggests that the abilities of a knowledge-based system (KBS) depend in part on the amount of explicit knowledge it has about the way it is designed. This knowledge is often called design knowledge because it reflects design decisions that a KBS developer makes regarding what ontologies to embody in the system, what solution strategies to apply, what system architecture to use, etc. This paper examines one type of design knowledge pertaining to the structure underlying the solutions a KBS produces. (For example, in medical diagnosis, the output might be just a disease name, but the solution is actually a causal argument that the system implicitly constructs to find out how the disease came about.) We define this type of design knowledge, show how it can be represented, and explain how it can be used in problem solving to make the structure underlying solutions explicit. Subsequently, we also present and illustrate new avenues that the availability and use of the design knowledge discussed open with respect to the ability to build KBSs that possess strong explanation capabilities, are easier to maintain, support knowledge reuse, and offer more robustness in problem solving. © 1996 Academic Press Limited
1. Introduction
Knowledge-based system (KBS) designers are usually concerned with producing systems that meet certain design goals. These goals include providing KBSs with the ability to generate good explanations, to be robust (i.e. avoid brittleness and exhibit novelty) in problem solving, and to capture knowledge in a way that simplifies its maintenance and facilitates its reuse.
Recent research suggests that the more a KBS knows about the way it is designed, the better it can meet these design goals. For instance, Falkenhainer & Forbus (1991) show how a KBS can address novel problems when it knows what assumptions underlie the domain models it uses and how the knowledge base (KB) organizes these models in relation to the assumptions. Likewise, Swartout, Paris & Moore (1991) illustrate the way a KBS can generate explanation dialogues that address follow-up questions of the user, provided that the system knows what its explanation capabilities are designed to say and how.
Knowledge pertaining to the way a system is designed is generally referred to as design knowledge. Chandrasekaran & Swartout (1991) explain the intuition behind design knowledge as follows. In the KBS design process, a designer brings to bear substantial amounts of knowledge about the subject matter of the application task, the nature of the task, how its parts work together to accomplish its goal, the range of solution strategies applicable, plausible KBS architectures, etc. KBS design is thus viewed as the process of making specific design decisions that produce the KBS sought (e.g. how to represent needed knowledge and what solution strategies to use), and design knowledge is simply a “documentation” of these decisions in the form of diagrams, narrative descriptions, and the like. These authors recognize that design knowledge is open-ended and cannot all be represented explicitly.
We propose in this paper that a good starting point for attempts to capture design knowledge has to do with Clancey’s (1992) recent realization, namely: the output of KBSs is normally a rational argument that explains their solution, not just a solution. For example, in diagnosis the output is typically a causal argument having the structure of a proof, and in design it is a logical argument having the structure of a plan. In this respect, Clancey observed two things. The structure of the explanatory arguments a KBS implicitly or explicitly constructs depends on the application task being addressed. Moreover, whether a KBS makes these explanatory arguments and their underlying structure explicit depends on the way the KBS is designed. In light of these observations, it is intuitively appealing to try to document design decisions pertaining to a specific KBS in relation to the structure of explanatory arguments which that KBS seeks to construct. In other words, if the general goal of a KBS is to construct explanatory arguments having a specific structure, the KBS design endeavor can be viewed as a goal-driven process involving design decisions concerning how to represent the explanatory arguments sought, how to represent domain knowledge needed to construct them, what strategies to use to control their construction process, and so on.
This paper hence focuses on the representation of design knowledge pertaining to the structure underlying the explanatory arguments KBSs construct, and the roles that this design knowledge can play with respect to the ability of KBSs to meet the above design goals. Most of the examples we use are from the domain of medical diagnosis, and in particular NEOMYCIN (Clancey, 1988), simply because this domain has been studied extensively in the KBS literature. The paper proceeds as follows. Section 2 identifies specific design decisions that determine the structure underlying the explanatory arguments a KBS seeks to construct for its task. It also explains how design knowledge reflecting these decisions can be represented. Section 3 presents a task-independent KBS architecture that uses this design knowledge to drive the construction process of explicit explanatory arguments. Section 4 presents avenues that the availability and use of the design knowledge discussed open with respect to the aforementioned design goals. It illustrates these avenues by looking at how several recently built KBSs work in terms of the way they capture and utilize design knowledge. Section 5 discusses the way our work relates to previous work. Section 6 provides some concluding remarks and discusses several future research questions.
2. Design knowledge and situation-specific models
This section explains how design knowledge relating to the structure underlying the explanatory arguments KBSs construct can be represented. It starts with an example that illustrates what such explanatory arguments typically look like. Upon examining the structure underlying these arguments, it identifies specific design decisions that determine this structure. Finally, it explains how design knowledge reflecting these decisions can be represented.
2.1. SITUATION-SPECIFIC MODELS
For any given problem situation, most KBSs implicitly or explicitly create a rational argument which explains their solution for that situation. Since this explanatory argument is a model that captures what a KBS knows about the specific situation addressed, it is often referred to as a situation-specific model (SSM). Thus, while the KB of a system contains knowledge that applies to all problem situations the system is set to address, an SSM contains knowledge that was derived based on the KB and applies only to a specific problem situation.
For example, consider a medical diagnosis task that seeks to identify the disease causing certain patient symptoms as well as find out how this disease came about. Figure 1 shows, respectively, part of the SSM NEOMYCIN constructs for one specific diagnosis situation, some of the domain knowledge it uses for this purpose, and a trace of how this SSM is constructed. This sample SSM is in fact a directed graph linking diseases, symptoms, pathological bodily structures, etc. Once complete, this SSM would essentially constitute a causal argument which shows that a certain organism (e.g. E. coli) has entered the body (e.g. during some surgical procedure), migrated to a specific organ system (e.g. the meninges of the brain) where it proliferated because the normal immune system is suppressed, inducing a disease (e.g. Bacterial-Meningitis) that causes the pathological bodily structures and symptoms (e.g. long-lasting headaches) observed in the patient.
Believing that a KBS such as NEOMYCIN uses a rational reasoning process to construct SSMs like the one in Figure 1 raises questions regarding the form of these SSMs. Specifically: are all these SSMs directed graphs? If so, do they have a common underlying structure? If so, how does this structure relate to the nature of domain knowledge and/or application task? Is this structure determined by design decisions that a system developer makes in the KBS design process? And, if so, what are these design decisions and how can we represent them? The rest of this section answers these questions, in relation to the type of design knowledge discussed in this paper.
2.2. DESIGN DECISIONS AND THE STRUCTURE OF SITUATION-SPECIFIC MODELS
KBS builders make various design decisions during the KBS design process. Some of these decisions are closely related to conceptual aspects having to do with the way the universe of discourse is modeled, while others are closely related to technical aspects of the prospective KBS (e.g. what knowledge representation formalism to use). The former design decisions are known to be more critical to the abilities that the KBS will possess; they usually ought to precede the more technically oriented design decisions (Newell, 1982).
In relation to the structure underlying SSMs, we focus on conceptual design decisions pertaining to choices that a KBS designer makes with respect to the four conceptual levels presented in Figure 2: epistemology, ontology, perspective and instance (Brachman, 1979; Davis, Shrobe & Szolovits, 1993). Epistemology choices relate to the way knowledge pertaining to the phenomena of interest is expressed, e.g. using relational networks or neural networks. As can be seen from the domain knowledge presented in Figure 1, NEOMYCIN’s designers chose to express knowledge as taxonomic relational networks that classify and link agents, diseases, symptoms, etc.
[Figure 1 graphic not reproduced. Panels: “Situation-Specific Model” (a directed graph with links numbered 1 to 10; legend: induce, subsumption, subtype and causal relations; node types: disease, finding/symptom, pathological bodily structure and/or finding, agent (e.g. organism) inducing a disease), “Domain Knowledge” (taxonomies of surgery, bacteria, diseases and findings/symptoms, connected by “induces” and “causes” links), and “Partial Trace of the Construction Process”, reproduced below.]

Partial Trace of the Construction Process
1. The “12 hours headaches” initial patient symptom is mapped to the more abstract “CNS headache duration” finding.
2. The “12 hours headaches” suggests that the disorder might be Meningitis.
3. Meningitis can be hypothesized if the patient experiences a “stiff neck on flexion”. This triggers the question: “Does the patient have a ‘stiff neck on flexion’?”. The user’s answer is YES.
4. Refinement, or specialization, of the Meningitis hypothesis generates the more specific hypothesis Acute-Meningitis.
5. Similar refinement of Acute-Meningitis generates the more refined hypothesis Acute-Bacterial-Meningitis.
6. Support for Acute-Bacterial-Meningitis using known symptoms is found as the “CNS duration” finding.
7. An attempt to further support Acute-Bacterial-Meningitis triggers the question: “Does the patient have a high-grade fever?” The answer is 105.8 Fahrenheit, which is categorized as the more abstract “high-grade fever” finding.
8. A different attempt is made to support Meningitis by looking for categorical evidence for a more general hypothesis. The attempt identifies Infectious-Disease as a plausible hypothesis.
9. Infectious-Disease implies the presence of certain findings, which may be already known. The “high-grade fever” is thus found to be subsumed by “fever”, providing support to the presence of the needed “fever” symptom.
10. Infectious-Disease is hypothesized, calling for the search for additional findings that may be consistent with the diagnosis so far.
11. ...

FIGURE 1. Part of a situation-specific model (SSM) constructed by NEOMYCIN, some of the domain knowledge used to construct that SSM, and partial trace of the construction process.
[Figure 2 graphic not reproduced. A tree of design choices across four levels. Epistemology (ways to express knowledge about the “world”): relational networks, neural networks, symbolic networks, natural language, with relational networks divided into taxonomic, compositional and transitional. Ontology: labels include process structure, function structure, causal, discourse-state, normal and abnormal process (e.g. disease), chronological process (e.g. staged failure), and “Abnormal Interactive-Historic process”. Perspective: labels include actor (e.g. organism), spatial (e.g. organ system), temporal, malfunction (e.g. disease) and manifestation (e.g. symptom). Instance: labels include overload (e.g. psychogenic), infectious (e.g. bacteria), invalid input or environment (e.g. toxic), developmental (e.g. congenital), infection, cancer, and Meningitis with Acute, Chronic, Bacterial and Viral nodes.]

FIGURE 2. Design choices (epistemic, ontological, perspective and instance choices) that the designer of a KBS makes.
Ontology choices determine the specific ontological entities and relations used to model the phenomena of interest. In NEOMYCIN, the ontology of choice is labeled “Abnormal Interactive-Historic process” in Figure 2. Using this ontology, diseases are modeled as malfunctions, which are the result of abnormal processes involving certain generic ontological entities (actor, location, manifestation, etc.) that follow particular causal scripts and produce specific interaction histories with the body. Depending on the application task, these generic entities might represent different things. For example, in medical diagnosis an actor may be an organism, while in the diagnosis of electronic devices it may be an electric surge. Hence, for reasons of clarity, a KBS designer might choose to associate generic ontological entities with task-specific synonyms and descriptors. For example, in NEOMYCIN, “etiological agent” is used as a task-specific synonym of “actor”, and “common-unlikely” is used as a descriptor corresponding to an attribute of agents that allows one to distinguish between agents that are common or unlikely inducers of diseases (e.g. a Gram-Positive-Rods bacterium is unlikely to induce a Pelvic Abscess). Figure 3 shows other synonyms and descriptors commonly used in the context of medical diagnosis.
Perspective choices determine the facets from which relevant phenomena are examined. Granted that these phenomena can be modeled using the various ontological entities implied by ontological choices, a KBS designer might choose to focus only on those phenomena facets that can be modeled using a specific subset of these entities. In NEOMYCIN, one of the perspectives of choice is that which looks at diseases in terms of the actors (agents) inducing them, and one of the perspectives ignored is that which focuses on the location of diseases. The latter observation is apparent from the fact that, unlike in INTERNIST II (Pople, 1982), the domain knowledge NEOMYCIN uses does not include a taxonomy of organ systems and the diseases they can involve.

[Figure 3 graphic not reproduced. The “Abnormal Interactive-Historic process” ontology involves generic ontological entities (perspectives), which have task-specific synonyms and are associated with task-specific descriptors. Generic entities: Actor (inducer), Location (spatial), Time (temporal), Malfunction (inducee), Manifestation (causal). Task-specific synonyms: etiological agent; entry point, residence site (organ system); entry time, duration of residence; disease; pathological bodily structure, finding (symptom, test result, etc.). Task-specific descriptors: common inducer vs. unlikely inducer; recent vs. non-recent; treatable vs. untreatable; enabling cause (i.e. a step in a disease process) vs. circumstantial cause (for a disease process); normal vs. abnormal.]

FIGURE 3. Sample task-specific synonyms and descriptors (i.e. attributes) of generic ontological entities implied by the “Abnormal Interactive-Historic process” ontology applied to a medical diagnosis task.
Instance choices determine the instances of ontological entities that the KB knows about. In Figure 2, branches below the identified ontological entities correspond to specific relational networks, and choices at this level determine the instances contained in these networks. In NEOMYCIN, one of the design choices corresponding to this level excludes from the disease taxonomy the various types of Viral-Meningitis (since these are all treated similarly). This choice is apparent in Figure 2 through the fact that the node labeled “viral” has no children.
When these conceptual design choices are made for a specific KBS, they imply much of the structure underlying the SSMs that the KBS would construct. More specifically, this structure is determined by epistemic, ontological and perspective choices as well as by the modality of the application task. For example, Figure 4 shows part of the SSM structure implied by the ontological choice labeled “Abnormal Interactive-Historic process” in Figure 2, for the medical diagnosis task addressed by NEOMYCIN. This “generic” SSM is, and any of its instances would be, a directed graph, because its underlying epistemology is that of relational networks. Further, its nodes represent generic entities specific to its underlying ontology and perspectives. Finally, most of its links represent generic relations specific to its underlying ontology, and some of its links (ones with labels in parentheses) represent relations implied by the task-specific organization of instances (e.g. the “subtype” link coming out of the “malfunction” node is implied by the fact that diseases can be conceptually organized in a specialization taxonomy). Relative to this generic SSM, the SSM instance shown in Figure 1 does not contain certain types of nodes because it is incomplete and because NEOMYCIN’s designers chose to disregard some perspectives (e.g. organ system).
[Figure 4 graphic not reproduced. Node labels: Location (entry point), Location (organ system), Actor (agent), Malfunction (disease), Manifestation (pathologic structure), Manifestation (finding/symptom). Link labels: Move, Induce, Cause, and the task-specific links (Subsume) and (Sub-type).]

FIGURE 4. Part of the generic SSM implied by the “Abnormal Interactive-Historic process” ontology, in case of a medical diagnosis task like the one addressed in NEOMYCIN. Nodes represent generic ontological entities, whose task-specific synonyms are depicted in parentheses. Non-vertical links between nodes depict ontological relations indicating that diseases are modeled as malfunctions that follow a certain generic causal script: an agent enters the body, moves to an organ system where it proliferates, induces a disease which, in turn, causes pathological bodily structures and symptoms. Vertical links depict task-specific relations indicating that instances of certain entities bear specialization and subsumption relations among them.
Because most conventional KBSs capture knowledge using relational networks, we will focus in the rest of the paper on the relational networks epistemology.
2.3. CAPTURING CONCEPTUAL DESIGN DECISIONS AS DESIGN KNOWLEDGE
A generic SSM can be described in terms of the axioms (assertions, constraints, assumptions) with which its instances must comply. Some of these axioms are implied by ontology choices, while others are implied by the task’s modality. For example, a few of the axioms needed to describe the generic SSM form shown in Figure 4 are as follows.
(a) Every terminal node (i.e. symptom) in the SSM is an abnormal finding for which there must be a causal link to a supporting disease or pathological bodily structure that explains it.
(b) A disease node is always the father of a finding node (i.e. causality is assumed to be uni-directional).
(c) The root of a sub-graph in an SSM must be a treatable disease (to enable the prescription of drugs).
(d) The root of a sub-graph in an SSM must be the most specific disease possible (so that the most correct drugs can be prescribed).
(e) The SSM must have a single root (to minimize the number of drugs prescribed), and that root must “contain” every known abnormal finding.
The first two axioms are ontological axioms in that they describe the general nature of entities and relations involved. The last three axioms are modality axioms that apply in the case of a task whose goal is to diagnose and prescribe a cure; they describe the nature of ontological entities with respect to task-specific descriptors (e.g. axiom (c) with respect to treatability) and relations (e.g. axiom (d) with respect to specialization relations between diseases).
The axioms needed to describe a generic SSM can be expressed using predicate calculus as quantified logical sentences of the form:
\[
\big(\mathit{quantifier}\ o_1, \ldots, o_n\ (\pi\ o_1, \ldots, o_n)\ [[\land, \lor, \lnot](\pi \cdots)]\big) \Rightarrow \big(\mathit{quantifier}\ o_1, \ldots, o_n\ (\pi\ o_1, \ldots, o_n)\ [[\land, \lor, \lnot](\pi \cdots)]\big),
\]
where \(o_1, \ldots, o_n\) are ontological entities, \(\pi\) is an \(n\)-ary function constant or relational constant, and [ ] is an optional sentence. For example, the axioms labeled (a) and (d) above can be expressed as:
\[
\text{(a}'\text{)}\quad \big(\forall F\ (\mathrm{Present}\ F) \land (\mathit{abnormal}\ F)\big) \Rightarrow \big(\exists D\ (\mathrm{Cause}\ D, F)\big),
\]
\[
\text{(d}'\text{)}\quad \big(\forall D\ (\mathrm{Present}\ D) \land (\mathrm{Root}\ D)\big) \Rightarrow \big(\lnot \exists D_1\ (\mathrm{Subtype}\ D, D_1)\big),
\]
where \(F\) is a finding, \(D\) and \(D_1\) are diseases, abnormal is a relational constant whose argument is an abnormal finding, Present and Root are function constants that check whether their argument satisfies certain conditions (with respect to an SSM instance), and Cause and Subtype are relational constants.
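As a rough illustration only, axioms (a′) and (d′) can be read as executable checks over a small SSM instance. The sketch below is in Python, which the paper does not use; the names `SSM`, `axiom_a`, `axiom_d` and `SUBTYPE`, and the way Present and Root are modeled as set membership, are assumptions made for this example rather than the paper's representation.

```python
from dataclasses import dataclass, field


@dataclass
class SSM:
    """Toy SSM instance (hypothetical layout): present entities plus Cause links."""
    findings: set = field(default_factory=set)   # present findings F
    abnormal: set = field(default_factory=set)   # findings known to be abnormal
    diseases: set = field(default_factory=set)   # present diseases D
    cause: set = field(default_factory=set)      # (D, F) pairs: D causes F
    roots: set = field(default_factory=set)      # diseases that are sub-graph roots


def axiom_a(ssm):
    """(a'): every present abnormal finding is caused by some present disease."""
    return all(
        any((d, f) in ssm.cause for d in ssm.diseases)
        for f in ssm.findings
        if f in ssm.abnormal
    )


def axiom_d(ssm, subtype):
    """(d'): no present root disease D has a subtype D1.

    `subtype` is a set of (D, D1) pairs assumed to come from domain knowledge.
    """
    return all(
        not any(parent == d for parent, _ in subtype)
        for d in ssm.roots
        if d in ssm.diseases
    )


# Assumed taxonomy fragment, using disease names that appear in Figure 1.
SUBTYPE = {
    ("Meningitis", "Acute-Meningitis"),
    ("Acute-Meningitis", "Acute-Bacterial-Meningitis"),
}

ssm = SSM(findings={"High-Grade-Fever"}, abnormal={"High-Grade-Fever"})
print(axiom_a(ssm))  # False: the finding has no causing disease yet

ssm.diseases.add("Meningitis")
ssm.roots.add("Meningitis")
ssm.cause.add(("Meningitis", "High-Grade-Fever"))
print(axiom_a(ssm), axiom_d(ssm, SUBTYPE))  # True False: Meningitis can be refined
```

In this reading, a violated axiom is simply a check that returns false for the current SSM instance, which is the notion of a "gap" used in Section 3.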
A KBS that captures the underlying structure of SSMs it constructs as a set of such axioms can be said to also capture (implicitly) design decisions that imply this structure. The axioms used to describe a generic SSM, say the one in Figure 4, reflect epistemic, ontological and perspective choices in three ways. First, as axioms (a′) and (d′) show, the vocabulary used to express axioms reflects the epistemology chosen through reference to properties of relational networks (e.g. root, father). Second, the ontology chosen is visible because the same vocabulary also involves ontology-specific entities and relations (e.g. finding, disease, cause), and because of the ontology-specific nature of phenomena the axioms define using these entities and relations. Finally, perspective choices are reflected by the subset of ontological entities involved in the specific set of axioms defining the generic SSM applicable. Because each ontological entity is identified with a specific perspective, the lack of axioms that refer to a certain entity would indicate the exclusion of the perspective corresponding to that entity. (Recall that choices relating to the instance level are apparent only from the content of the KB itself.)
Following this observation, design knowledge, denoted D, that reflects the conceptual design decisions discussed is defined to include two things. One is the set of axioms used to describe the generic SSM applicable, denoted \(D_A\). The other is the vocabulary used to express these axioms, denoted \(D_V\).
Given that \(D_A\) can be expressed using a vocabulary that pertains to the epistemology and ontology levels, what vocabulary must be used to express domain knowledge, denoted K, and strategic knowledge that controls the way K is applied in the SSM construction process, denoted S? The answer to this question has implications for the ability of a system to use D for any meaningful purpose. For example, in MYCIN (Shortliffe, 1976), rules comprising K are expressed using a vocabulary that pertains only to the instance level (i.e. rules do not refer to the type of ontological entity each instance stands for), and the inference engine embedding S (rule-chaining and conflict resolution strategy) is built to reason with knowledge that is expressed using this vocabulary. Hence, even if the relevant D were available to MYCIN, it would not be usable because it cannot make “contact” with K and S.
If D is to be usable, K and S have to be expressed using a vocabulary that also pertains to the epistemology and ontology levels. Given that under the relational networks epistemology K is essentially a set of directed graphs, such a vocabulary could be as follows. A node in a directed graph could be represented as the triplet \((o\ i\ [d \cdots])\), where \(o\) is an ontological entity, \(i\) is an instance of \(o\), and \([d \cdots]\) are optional descriptors of \(o\). For example, the node (F High-Grade-Fever abnormal) represents the abnormal finding High-Grade-Fever. A link could be represented as a sentence \((\pi\ \mathit{node}_1 \cdots \mathit{node}_n)\), where \(\pi\) is a relational constant over \(n\) nodes, and as such it can be said to coincide with a node-chain, denoted \(\mathit{node}_1 \cdots \mathit{node}_i = \mathit{node}_{i+1} \cdots \mathit{node}_n\). For example, the link (Cause (D Meningitis) (F High-Grade-Fever abnormal)) represents a binary relation, and it coincides with the node-chain (D Meningitis) = (F High-Grade-Fever abnormal). Following this convention, \(o_1 \cdots = o_i = o_{i+1} \cdots o_n\) will denote the ontological entities in a node-chain (e.g. D = F in the above example). As to operators that apply K in the SSM construction process and the S used to control this process, we will see later how these are expressed using the same vocabulary.
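The node and link vocabulary just described maps naturally onto two small record types. The following Python sketch is only one possible encoding; the class names `Node` and `Link`, their field names, and the `entity_chain` helper are hypothetical and do not come from the paper.

```python
from typing import NamedTuple, Tuple


class Node(NamedTuple):
    """The triplet (o i [d ...]): entity type, instance, optional descriptors."""
    entity: str                        # ontological entity, e.g. "D" or "F"
    instance: str                      # instance name, e.g. "Meningitis"
    descriptors: Tuple[str, ...] = ()  # optional descriptors, e.g. ("abnormal",)


class Link(NamedTuple):
    """The sentence (pi node_1 ... node_n): a relational constant over n nodes."""
    relation: str
    nodes: Tuple[Node, ...]

    def entity_chain(self):
        """Ontological entities along the node-chain, e.g. ('D', 'F')."""
        return tuple(node.entity for node in self.nodes)


fever = Node("F", "High-Grade-Fever", ("abnormal",))
meningitis = Node("D", "Meningitis")
link = Link("Cause", (meningitis, fever))

print(link.entity_chain())  # ('D', 'F')
```

Because every node carries its ontological entity type, axioms phrased over entity types (findings, diseases) can be matched directly against instance-level content, which is the "contact" that an instance-only vocabulary lacks.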
At this point the question is this: how can a KBS use D, and why should it use D? The first part of this question is answered in the next section. The second part will be addressed in Section 4, once we understand the role that D can play in problem solving.
3. Using design knowledge in problem-solving
This section presents a KBS architecture that uses D to direct the construction of explicit SSMs. The idea behind the architecture can be explained using the problem-space metaphor (Rich, 1983; p. 25). Since the goal is to create an SSM instance for a specific problem, we view \(D_A\) as implicitly defining the goal knowledge state about the problem, and the evolving SSM instance being created as defining the current problem-space state. As Figure 5 shows, whatever the current problem-space state is, state difference operators are first used to identify gaps between this state and the goal state (i.e. axioms in \(D_A\) that the SSM violates). Next, when gaps are detected, search control operators are used to choose which gap will be pursued first. Then, search operators that apply K (instantiate elements in K, derive new elements based on K, acquire input suggested by K, etc.) are used to derive the knowledge needed to update the SSM and eliminate the gap pursued. Since the updated SSM defines a new problem-space state for which new gaps might be detected, this iterative process is repeated until all gaps are eliminated (i.e. the SSM violates no axioms in \(D_A\)). This idea can be illustrated using the medical diagnosis example depicted in Figure 1. Starting with an SSM containing only one node for the symptom the system accepts as input, one of the violated axioms is: “a finding must be linked to a causing disease” (axiom (a) in Section 2.3). Supposing that this were the only violated axiom, the system would search K to find a disease that might explain the symptom currently present in the SSM. In turn, the search result would be used to update the SSM, and the process would be repeated.
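The detect, choose, search and update cycle can be sketched as a short loop. This is a minimal Python illustration under stated assumptions, not the paper's implementation: the function `construct_ssm`, the encoding of the SSM as a set of tuples, the `K_CAUSES` table and the iteration cap are all hypothetical.

```python
def construct_ssm(ssm, axioms, choose_gap, search_k, max_iters=100):
    """Repeat: find violated axioms, pick one gap, search K, update the SSM.

    `ssm` is a set of sentences and is updated in place.
    """
    for _ in range(max_iters):
        gaps = [gap for axiom in axioms for gap in axiom(ssm)]
        if not gaps:
            return ssm                # goal state: no axiom is violated
        gap = choose_gap(gaps)        # search control picks the gap to pursue
        new_links = search_k(gap)     # search operator applies K
        if not new_links:
            raise RuntimeError(f"K cannot close gap {gap!r}")
        ssm |= new_links              # update the current problem-space state
    raise RuntimeError("iteration limit reached")


# Assumed fragment of K: finding -> a disease that may cause it.
K_CAUSES = {"High-Grade-Fever": "Meningitis"}


def finding_needs_cause(ssm):
    """Axiom (a) as a gap detector: present findings with no Cause link."""
    findings = {s[1] for s in ssm if s[0] == "Present"}
    explained = {s[2] for s in ssm if s[0] == "Cause"}
    return sorted(findings - explained)


def search_k(finding):
    """Look up a causing disease for the finding in the assumed K fragment."""
    disease = K_CAUSES.get(finding)
    return {("Cause", disease, finding)} if disease else set()


ssm = construct_ssm(
    {("Present", "High-Grade-Fever")},
    axioms=[finding_needs_cause],
    choose_gap=lambda gaps: gaps[0],
    search_k=search_k,
)
print(sorted(ssm))
```

The loop stops either when no axiom reports a gap or when K offers nothing for the chosen gap; a fuller treatment would let search control fall back to another gap instead of failing.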
The architecture we present involves an iterative inference procedure that views the knowledge in a KBS as though it spans the three planes shown in Figure 5: domain knowledge plane (K-plane), inference plane (I-plane), and strategic-planning plane (S-plane). The K-plane contains K and “search” operators that apply K. The I-plane contains \(D_A\), the SSM instance being constructed, and operators for analysing and updating this SSM. The S-plane contains S, operators that use S to control the triggering order of search operators, and another type of SSM that models the KBS’ ongoing behavior for the case tackled. To avoid confusion between the SSMs constructed in the I-plane and S-plane, we will refer to them as I-SSM and S-SSM, respectively.† We next elaborate on the activities taking place in these planes, following the order of steps in the inference procedure depicted in Figure 5.

[Figure 5 graphic not reproduced. The figure shows three planes and a five-step cycle.
Inference Plane (I-Plane), with the current I-SSM, state difference operators and I-SSM update operators: (1) analyze the current I-SSM to identify violations of the generic I-SSM form (stop when no violations are found); (5) update the current I-SSM.
Strategic Planning Plane (S-Plane), with S, search control operators, the S-SSM and S-SSM update operators: (2) identify “search” operators that can apply K to derive knowledge elements necessary to resolve reported violations, and (3) use S to select which of these operators is to be triggered.
Domain Knowledge Plane (K-Plane), with K and the “search” operators: (4) the triggered search operator applies K: instantiate knowledge elements in K, derive new elements based on K, acquire input suggested by K, etc.
Arrow labels between planes: report violations; trigger chosen “search” operator; report “search” results.]

FIGURE 5. A KBS architecture for reasoning with design knowledge. The underlying inference procedure is a five-step iterative process that views the knowledge involved as though it resides in three different planes.
3.1. THE I-PLANE
Since an I-SSM is essentially a directed graph, it is represented using predicate calculus as a set of sentences of the form \((\pi\ \mathit{node}_1 \cdots \mathit{node}_n)\), where \(\pi\) is an \(n\)-ary relational constant over nodes.
Identifying specific ways in which the current I-SSM violates axioms in \(D_A\) requires analysing this I-SSM using state difference operators, called I-operators.
† Several existing KBSs build the I-SSM and S-SSM implicitly, “on top” of K. For example, in INTERNIST II (Pople, 1982), an I-SSM of the patient being diagnosed is dynamically formed by adding to K “constrictor” and “spanning” links that tie together instantiated entities, and something similar to an S-SSM is created at run-time by adding to K “planning” links that help focus the system’s problem-solving activities.
Each I-operator stands for a single axiom in \(D_A\). For example, for the axiom \(\big(\forall F\ (\mathrm{Present}\ F) \land (\mathit{abnormal}\ F)\big) \Rightarrow \big(\exists D\ (\mathrm{Cause}\ D, F)\big)\), a corresponding I-operator would be represented as:
    (I-OP :: DISEASE-CAUSE-FINDING
      IF (AND (Present (F $var1 abnormal))
              (NOT (Present (Cause (D $var2) (F $var1)))))
      THEN post the unsatisfied proposition (Cause ?(D $var2) (F $var1))),
where Present is a function that checks if its node argument is present in the I-SSM , and abnormal is a logical expression requiring $var1 to be bound to an abnormal finding in K . When the IF part of this operator is false , the THEN part posts the unsatisfied proposition ( Cause ?( D $var2)( F $var1)) . This proposition coincides with the node-chain ?( D $var2) 5 ( F $var1) , where ?( D $var2) is a node labeled as ‘‘unknown’’ .
More generally, an I-operator is a rule that posts the right-hand side of an axiom in D_A as an unsatisfied proposition, provided that the axiom is violated (its left-hand side holds in the current I-SSM but its right-hand side does not). An unsatisfied proposition coincides with a specific node-chain, in which nodes corresponding to the ontological entities bound by quantifiers in the right-hand side of the violated axiom are labeled as unknown.
In each iteration of the inference procedure, after applying I-operators on the current I-SSM, some I-SSM nodes might end up having associated with them one or more unsatisfied propositions. Each proposition is an n-ary relational constant, (π node_1 ... node_n), in which some nodes are labeled “unknown”. As Figure 5 indicates, these propositions are sent to the S-plane.
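The behaviour of an I-operator can be made concrete with a small sketch. The encoding below is an assumption made for illustration only (an I-SSM as a Python set of tuples, a node as an (entity, instance) pair, and the names `UNKNOWN` and `disease_cause_finding` are hypothetical); it is not the representation used in the systems discussed.

```python
# Illustrative sketch only: an I-SSM is a set of sentences
# (relation, node, ..., node); a node is an (entity, instance) pair.
UNKNOWN = "?"


def disease_cause_finding(i_ssm, abnormal_findings):
    """I-operator for the axiom: every present abnormal finding F needs
    some disease D with a (Cause D F) sentence in the I-SSM."""
    posted = []
    for sentence in sorted(i_ssm):
        if sentence[0] != "Present":
            continue
        node = sentence[1]
        entity, instance = node
        if entity != "F" or instance not in abnormal_findings:
            continue
        # The axiom is violated when no Cause sentence targets this finding.
        has_cause = any(s[0] == "Cause" and s[2] == node for s in i_ssm)
        if not has_cause:
            posted.append(("Cause", ("D", UNKNOWN), node))
    return posted


i_ssm = {
    ("Present", ("F", "fever")),
    ("Present", ("F", "stiff-neck")),
    ("Cause", ("D", "meningitis"), ("F", "fever")),
}
unsatisfied = disease_cause_finding(i_ssm, {"fever", "stiff-neck"})
```

Here only the finding without a causing disease yields an unsatisfied proposition, with the disease node labeled as unknown.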
3.2. THE S-PLANE
Unsatisfied propositions are processed as follows. For each proposition: (1) identify specific search operators that can apply K to possibly derive the knowledge needed to satisfy that proposition, and (2) select from the identified search operators the one to be triggered.
Before elaborating on these activities, let us first see what search operators are all about. These operators reside in the K-plane, and we will refer to them as K-operators. A K-operator uses K to derive a value to which it can bind the instance variable(s) marked unknown in the specific unsatisfied proposition it aims to address. To illustrate, a K-operator for addressing the unsatisfied proposition (Cause ?(D $var2) (F $var1)) would have the form:
    (K_OP::FIND_DISEASE_CAUSING_FINDING
      (argument: ?(D $var2) → (F $var1))
      (body: <step 1> ... <step n>)
      (return: (D $var2) → (F $var1))),
where “body” is code that searches K for knowledge based on which it can bind $var2 to a disease instance that causes the finding instance already bound to $var1. Hence, a K-operator is generally expressed in terms of the node-chain argument coinciding with the proposition it aims to satisfy, and code that applies K and returns the node-chain argument with the unknown nodes bound to proper instances.
Returning to activities in the S-plane, K-operators identified for the unsatisfied propositions are added to the S-SSM (the SSM modeling the system's behavior). The S-SSM is a directed graph consisting of entries of the form: (operator-name node-chain). Figure 6 shows how the S-SSM evolves during construction of the sample I-SSM shown in Figure 1. As with the I-SSM, there can be constraints on the form of the S-SSM. For example, with respect to the S-SSM in the bottom of Figure 6, the constraint “no cycles are allowed in the S-SSM” prevents an attempt to generalize Acute-Bacterial-Meningitis into Acute-Meningitis, because the latter was already specialized into the former and added to the S-SSM. After enforcing all such constraints, the S-SSM would contain some branches with terminal nodes marked as unknown. Each such branch corresponds to an applicable K-operator.

[FIGURE 6. Evolution of the S-SSM during construction of the sample I-SSM shown in Figure 1. Panels show Iteration 1, Iteration 2, Iteration 3, ..., Iteration N. The legend distinguishes “resolved” node-chains, nonresolved node-chains relating to an applicable K-operator, and inapplicable node-chains violating constraints on S-SSM form (e.g. don't specialize a D that generalizes another D).]
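The “no cycles” constraint on the S-SSM can be sketched as a reachability test performed before an entry is added. This is a minimal illustration under stated assumptions: a node-chain is reduced to a (source, target) pair of instance names, and the operator names `SPECIALIZE` and `GENERALIZE` as well as the helper functions are hypothetical.

```python
def reachable(edges, start, goal):
    """Depth-first search over directed (source, target) edges."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(t for s, t in edges if s == node)
    return False


def add_entry(s_ssm, operator_name, node_chain):
    """Append (operator-name, node-chain) unless it would close a cycle."""
    source, target = node_chain
    edges = [chain for _, chain in s_ssm]
    if reachable(edges, target, source):
        return False  # inapplicable: violates the no-cycles constraint
    s_ssm.append((operator_name, node_chain))
    return True


s_ssm = []
first = add_entry(s_ssm, "SPECIALIZE",
                  ("Acute-Meningitis", "Acute-Bacterial-Meningitis"))
second = add_entry(s_ssm, "GENERALIZE",
                   ("Acute-Bacterial-Meningitis", "Acute-Meningitis"))
```

The second call is rejected because the specialization already recorded in the S-SSM would be undone by the generalization, which is the situation described for the bottom of Figure 6.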
Choosing which one of the applicable K-operators will be triggered means making control decisions about which sub-goal (or unsatisfied proposition) posted in the I-SSM to pursue. For example, in NEOMYCIN's case, one such decision involves choosing which D on the differential (i.e. the set of most specific hypothesized diseases) will be the “focus” of the next problem-solving activity. To make such control decisions, the S-plane employs search control operators, called S-operators, which use S to analyse the current I-SSM on several levels. To understand the different levels of I-SSM analysis, it is useful to look at how Figure 7 relates them to the way NEOMYCIN makes the above sample control decision. Granted that the I-SSM is a directed graph:
• global I-SSM analysis chooses a sub-graph containing Ds that each could be a good candidate focus;
• intermediate I-SSM analysis next chooses a focus D within that sub-graph; and
• local I-SSM analysis finally chooses which of the K-operators corresponding to node-chains that represent immediate children of that focus D (i.e. D → A, D → F, or D → D) is to be triggered.
Switching from a local to a global mode of analysis takes place only when special events occur, e.g. when a new hypothesized D was added to the I-SSM that is not subsumed by any existing I-SSM sub-graph (i.e. a “wider differential” event). The occurrence of such events is checked for while updating the S-SSM.
S-operators capture strategic principles in S hierarchically, the same way the NEOMYCIN hierarchy of subtasks depicted in Figure 7 captures and organizes these principles. However, the difference compared to NEOMYCIN is that we express these principles using the same graph vocabulary used to express the I-SSM. High level S-operators capture principles for global and intermediate I-SSM analysis, e.g. “anchor the diagnostic reasoning to the I-SSM sub-graph whose root (Disease) contains the largest number of abnormal manifestation (Finding) nodes”. The lowest level S-operators capture principles for local I-SSM analysis. These principles are essentially preferential constraints on node-chains. They are therefore expressed as ordered sets of ontological node-chains. For example, the principle “test a hypothesized malfunction (Disease) before refining it” would be expressed as:
    (D → ?D/F : D → {F} > D → {D}),
where D/F means either D or F, ? denotes the unknown node in a node-chain, and { } denotes the multiplicity of the surrounded entity. The way this preferential constraint works in local I-SSM analysis can be illustrated in terms of how NEOMYCIN applies the sub-task labeled Pursue-Hypothesis in Figure 7. This sub-task induces a breadth-first search of the disease taxonomy by always invoking sub-task Test-Hypothesis, whose goal is to look for additional Findings to support the focus Disease, before invoking sub-task Refine-Hypothesis, whose goal is to specialize the focus Disease based on the accumulated findings. Correspondingly, a local I-SSM analysis involving the preferential constraint (D → ?D/F : D → {F} > D → {D}) would induce the same search pattern. Because Test-Hypothesis and Refine-Hypothesis are basically K-operators that take the ontological node-chains D → ?{F} and D → ?{D}, respectively, Test-Hypothesis will always be triggered earlier.

[FIGURE 7. Control decisions made through a hierarchical analysis of the I-SSM using S-operators. The example illustrates the global, intermediate and local I-SSM analysis performed corresponding to the way specific NEOMYCIN sub-tasks control the diagnostic reasoning process.

Sub-tasks shown: Establish-Hypothesis-Space, Group & Differentiate, Explore-and-Refine, Pursue-Hypothesis, Test-Hypothesis, Refine-Hypothesis. The legend distinguishes satisfied and unsatisfied propositions posted in the I-SSM.

S-operator for global I-SSM analysis. Subtask Establish-Hypothesis-Space iteratively applies the following strategic principles to choose a subset of the differential* (or a subgraph in the I-SSM) on which to focus.
1. If there are ancestors of hypotheses on the differential not yet tested by Test-Hypothesis, then perform Group & Differentiate on them.
2. If there are hypotheses on the differential not yet pursued by Pursue-Hypothesis, then perform Explore-and-Refine on them.
3. ...

S-operator for intermediate I-SSM analysis. Subtask Explore-and-Refine iteratively applies the following strategic principles to choose a disease on which to focus (from the chosen I-SSM subgraph). The subtask aborts when a “wider-differential”† end condition is encountered.
1. If the current focus is now less likely than another hypothesis on the differential, then perform Pursue-Hypothesis on the stronger hypothesis.
2. If there is a child of the current focus that has not been pursued, then perform Pursue-Hypothesis on the child of the current focus. (This is true only after the current focus was just refined and removed from the differential.)
3. If there is a sibling of the current focus that has not been pursued, then perform Pursue-Hypothesis on the sibling of the current focus.
4. If there is any other hypothesis on the differential that has not been pursued, then pursue it (e.g., perform Pursue-Hypothesis on the strongest hypothesis not yet pursued).

S-operator for local I-SSM analysis. Subtask Pursue-Hypothesis applies the following strategic principle to choose the next activity to carry out with regard to the focus (or the I-SSM node-chain that is the child of the focus to be addressed next).
1. Test the focus before refining it: perform Test-Hypothesis on the focus, and mark the focus as pursued.

K-operator. (Primitive) subtask Refine-Hypothesis adds taxonomic children of the focus to the differential.

* The differential is the set of most specific disease hypotheses the solver is considering.
† A wider-differential means that a new hypothesis was added to the I-SSM that is not subsumed by any existing subgraph.]

Note that preferential constraints could refer to ontological entities as well as task-specific descriptors, e.g. “pursue a common inducer of a hypothesized malfunction (Disease) before an unlikely one”. This sample preferential constraint would be expressed as:
    (?(A common-unlikely) → D : (A common-inducer) → D > (A unlikely-inducer) → D),
where A stands for an Agent, and common-inducer and unlikely-inducer are the values that descriptor common-unlikely can assume.
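One way to read a preferential constraint is as a ranking over node-chain patterns that the local I-SSM analysis consults when several K-operators are applicable. The sketch below assumes a simplified encoding (a pattern is a pair of entity labels, and `TEST_BEFORE_REFINE` and `choose_operator` are hypothetical names); it only illustrates the ordering idea.

```python
# Illustrative sketch only: a preferential constraint is an ordered list of
# node-chain patterns, most preferred first. ("D", "F") stands for D -> {F}.
TEST_BEFORE_REFINE = [("D", "F"), ("D", "D")]


def choose_operator(applicable, preference):
    """Return the applicable K-operator whose pattern ranks first."""
    def rank(item):
        _, pattern = item
        if pattern in preference:
            return preference.index(pattern)
        return len(preference)  # patterns not mentioned rank last
    return min(applicable, key=rank)[0]


applicable = [
    ("Refine-Hypothesis", ("D", "D")),
    ("Test-Hypothesis", ("D", "F")),
]
chosen = choose_operator(applicable, TEST_BEFORE_REFINE)
```

With this ordering, Test-Hypothesis is selected whenever both operators are applicable to the focus, which mirrors the “test before refining” principle.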
The control scheme we have just described is used to identify which applicable K-operator is to be triggered. Once a K-operator has been chosen and triggered by the S-plane, control is transferred to the K-plane. (We did not elaborate here on the reasons we construct the S-SSM; these reasons are discussed in Section 4.)
3.3. THE K-PLANE
A triggered K-operator uses K to derive a value to which it can bind the nodes marked as unknown in its node-chain argument. For example, the K-operator labeled K_OP::FIND_DISEASE_CAUSING_FINDING in Section 3.2 receives the node-chain argument ?(D $var2) → (F $var1), and it seeks to bind $var2 to an appropriate disease instance in K.
Under the relational networks epistemology, K is a set of relational networks. Each network captures a certain relation between instances of specific ontological entities. It can take the form:
    ((π node-chain [<CF>]) (node-chain [<CF>]) ... (node-chain [<CF>])),
where π is a relational constant and [<CF>] is an optional certainty factor. For example, a disease taxonomy and a network linking agents with diseases they induce would look as follows, respectively:
    ((Subtype (D) → (D))
     ((D Infectious-Disease) → (D Meningitis))
     ((D Meningitis) → (D Acute-Meningitis))
     (...)),

    ((Induce (and (A) [{(F)}]) → (D) <CF>)
     ((and (A E. coli) (F Ventricular-Ureteral-Shunt)) → (D Bacterial-Meningitis) <0.3>)
     ((and (A Klebsiella-Pneumoniae) (F Ventricular-Ureteral-Shunt)) → (D Bacterial-Meningitis) <0.3>)
     (...)).
In the latter network, the first entry can be viewed as saying: if the hypothesized disease is Bacterial-Meningitis and the patient has had a Ventricular-Ureteral-Shunt procedure, the inducing agent is E. coli with a certainty of 0.3. Relational networks involving descriptors of ontological entities can be represented in a similar fashion. To illustrate, a network that distinguishes between agents that are common or unlikely inducers of diseases would look as follows:
    ((common-unlikely-agent (A <common-inducer unlikely-inducer>) → (D))
     ((A Enterobacteriaceae common-inducer) → (D Pelvic-Abscess))
     ((A Gram-Positive-Rods unlikely-inducer) → (D Pelvic-Abscess))
     (...)),
where common-unlikely-agent is a descriptor that takes on the values <common-inducer unlikely-inducer>.
If the epistemology chosen involves only taxonomic relational networks, for example, K-operators would search K and instantiate elements in K by matching node-chains. Specifically, the body of a K-operator would first find a relational network with a “header” containing an ontological node-chain that matches the one in the input argument of the K-operator. Then, it would scan the network to find entries containing the known nodes in its node-chain argument, based on which it would bind proper instances to the nodes labeled unknown in this node-chain argument. If the epistemology chosen (also) involves compositional relational networks, a K-operator might simultaneously search multiple networks and apply inferences on the search results to produce the binding sought.
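The header-matching search just described can be sketched as follows. The dictionary layout of a relational network and the function name `k_operator` are assumptions made for this illustration; certainty factors and compositional networks are left out.

```python
# Illustrative sketch only: a relational network has a header naming the
# relation and its ontological node-chain, plus instance-level entries.
UNKNOWN = "?"

TAXONOMY = {
    "header": ("Subtype", "D", "D"),
    "entries": [
        (("D", "Infectious-Disease"), ("D", "Meningitis")),
        (("D", "Meningitis"), ("D", "Acute-Meningitis")),
    ],
}


def k_operator(networks, relation, source, target):
    """Bind the node marked UNKNOWN in the node-chain source -> target."""
    pattern = (relation, source[0], target[0])
    bindings = []
    for net in networks:
        if net["header"] != pattern:
            continue  # header does not match the node-chain argument
        for src, dst in net["entries"]:
            if target[1] == UNKNOWN and src == source:
                bindings.append(dst)
            elif source[1] == UNKNOWN and dst == target:
                bindings.append(src)
    return bindings


children = k_operator([TAXONOMY], "Subtype", ("D", "Meningitis"), ("D", UNKNOWN))
parents = k_operator([TAXONOMY], "Subtype", ("D", UNKNOWN), ("D", "Meningitis"))
```

The same function serves both specialization (unknown target) and generalization (unknown source), because only the position of the unknown node in the node-chain differs.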
In any case, the results generated by K-operators are reported to the I-plane, where they are used to update the current I-SSM (e.g. grow a link between I-SSM nodes, or aggregate I-SSM sub-graphs).
3.4. GENERALITY OF THE ARCHITECTURE
Based on the discussion so far, the architecture presented is said to be task-independent. It would work for any application task that seeks to construct I-SSMs whose underlying epistemology is that of relational networks. To see this point, we can examine the vocabulary used to express the parts of the inference procedure underlying the architecture and the knowledge constituents that these parts utilize. This vocabulary, D_V, consists of:
(1) D_Ve: terms referring to epistemic entities (e.g. sub-graph, node, link, root, sibling);
(2) D_Vo: terms referring to ontological entities (e.g. actor, malfunction, manifestation); and
(3) D_Vt: task-specific synonyms and descriptors referring to terms in D_Vo (e.g. agent, treatable disease, finding).
The parts of the inference procedure we identified but not elaborated on involve operators for updating the I-SSM and the S-SSM. Under the relational networks epistemology, the I-SSM and S-SSM are directed graphs. Their updating therefore involves standard operations on graphs (e.g. add node, link nodes, append sub-graphs, find root). Accordingly, operators that carry out such operations need to be expressed using only terms in D_Ve, without having to know what these terms correspond to in D_Vo and D_Vt.†
As to the knowledge constituents used by the inference procedure, these include: I-operators corresponding to axioms in D_A, K-operators, K, S-operators, and S. So far these constituents have been expressed using terms in D_Vt. Yet, besides constituents referencing descriptors of ontological entities (e.g. I-operators corresponding to modality axioms in D_A), we could have expressed them using only terms in D_Vo. After all, except for descriptors, all the task-specific terms in D_Vt are synonyms of terms in D_Vo that we chose to use for clarity. We can look more closely at K, S, and their associated operators. K includes relational networks in which only the “header” requires referencing terms in D_Vo (optionally in D_Vt). As to K-operators, while their input/output arguments are node-chains that could reference terms in D_Vo and D_Vt, their “body” could also involve terms in D_Ve. For example, the body of a K-operator that seeks to generalize a finding (e.g. “12 hours headaches” into “CNS headaches duration”) might do so by searching for the fathers of that finding within some taxonomy of findings. Finally, S includes strategic principles that are expressed in terms of I-SSM sub-graphs and ordered sets of node-chains. Although some of these node-chains reference only terms in D_Vo (optionally in D_Vt) and some also reference task-specific descriptors in D_Vt, S-operators do not need to know what these terms stand for to accomplish their job.

† This can be illustrated in the case of ACCORD (Hayes-Roth, Hewett, Johnson & Gravey, 1988). ACCORD uses a hierarchy of general-purpose operators for manipulating structures similar to what we call I-SSMs. These operators are viewed as verbs, with the subject and object relations being epistemic terms. For example, an operator called YOKE appends two I-SSM sub-graphs that satisfy a certain “position” criterion. The meaning of YOKE's operation can be interpreted only based on the ontological meaning of sub-graphs that are specific to the application where it is being used. If sub-graphs stand for partial device configurations (in design), YOKE would conceptually treat “position” as a spatial constraint between the sub-graphs. On the other hand, if sub-graphs stand for partial disease process descriptions (in diagnosis), YOKE would conceptually interpret “position” as a place on a time-line.
4. Benefits from design knowledge
Having seen how D is used in the construction of explicit I-SSMs, we next focus on avenues that the availability and use of D open with respect to the design goals mentioned in Section 1 (ease of maintenance, reuse, robustness, etc.). We relate these avenues to several recent KBSs that were built to meet these goals, and rationalize the way these systems work in terms of how they capture and use D. The following discussion is by no means meant to be complete; a detailed inquiry into how any one of the design goals can be met is the subject of a separate paper.
4.1. KNOWLEDGE REUSE AND KBS CONSTRUCTION AND MAINTENANCE
Understanding how D can help to ease construction, simplify maintenance and support knowledge reuse requires introducing the notion of a microtheory. Guha & Lenat (1994) define a microtheory to be a “fairly adequate” solution to some application task, recognizing that a task can have several microtheories which correspond to different viewpoints, levels of granularity, etc. These authors say that a microtheory is the set of ground rules used to model, and reason about, certain phenomena in the context of a specific task. These ground rules include:
(1) axioms (assertions, constraints, assumptions) that define the nature of relevant ontological entities;
(2) an inference procedure for reasoning with the axioms about problem situations the task aims to address; and
(3) a vocabulary (language, representational constructs) for expressing the axioms, the inference procedure, and the necessary domain knowledge.
We argue that D (D_A + D_V) can be identified with a microtheory of a specific application task. This is visible from the parallel between D and the three parts of a microtheory. First, D_A is a set of axioms describing the generic I-SSM applicable to some task, where this generic I-SSM reflects the nature of relevant ontological entities in terms of assertions and assumptions with which they must comply. Second, D_A reflects the underlying epistemology by using epistemic terms in D_V (e.g. root, father), and we saw in Section 3.4 that the inference procedure required to reason with D_A depends only on the assumed epistemology. Thus, D_A identifies the inference procedure needed to construct the kind of I-SSMs the task requires. Finally, D_V is the vocabulary used to define the necessary representational constructs using such terms as nodes and node-chains. For example, three of the constructs we discussed earlier are: ordered sets of node-chains for representing preferential constraints in S, rule structures for expressing axioms in D_A as I-operators involving node-chains and unsatisfied propositions, and procedural structures with input/output node-chain arguments for representing K-operators.
Returning to the aforementioned design goals, the question is: how does the fact that D can be identified with a microtheory help with respect to these goals? Neches, Fikes, Finin, Patil, Gruber, Senator & Swartout (1991) propose to lower the effort needed to build and maintain KBSs using tools that act as frameworks for handling instances of specific task classes. They suggest developing such frameworks in the form of top-level abstraction hierarchies which KBS designers can reuse and elaborate to create specific applications.
Consider how a KBS shell in the spirit of the above proposal could be developed based on the kind of design knowledge we discuss. Assume the existence of an object-oriented (O-O) hierarchy with nodes organized in two tiers, corresponding to epistemic and ontological design choices like those in Figure 2, where each tier in the hierarchy contains layers of increasingly specialized design knowledge. A node at the top tier would store or inherit an epistemic vocabulary, D_Ve, and an inference procedure for reasoning in terms of the kind of I-SSMs supported by a specific epistemology (e.g. the inference procedure in Figure 5). A node at the lower tier would store or inherit from nodes in the same tier an ontological vocabulary, D_Vo, axioms defining the nature of the generic entities specific to an ontology, D_A, and their derivative I-operators and K-operators.
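The two-tier hierarchy can be pictured with ordinary class inheritance. The sketch below is only one possible reading: the class names, the sample axioms written as plain strings, and the choice to let lower-tier classes inherit from a top-tier class are all assumptions, and the list of inference steps is a loose paraphrase of Section 3 rather than a transcription of Figure 5.

```python
class RelationalNetworksEpistemology:
    """Top tier: epistemic vocabulary D_Ve and an inference procedure."""
    epistemic_vocabulary = {"sub-graph", "node", "link", "root", "sibling"}

    def inference_procedure(self):
        # Paraphrased steps; placeholders for the real operators.
        return [
            "apply I-operators to the I-SSM",
            "update the S-SSM with applicable K-operators",
            "select a K-operator using S-operators",
            "trigger the K-operator",
            "update the I-SSM",
        ]


class DiagnosisOntology(RelationalNetworksEpistemology):
    """Lower tier: ontological vocabulary D_Vo and axioms D_A."""
    ontological_vocabulary = {"actor", "malfunction", "manifestation"}
    axioms = {"an abnormal manifestation is caused by some malfunction"}


class SpecializedDiagnosisOntology(DiagnosisOntology):
    """A more specialized lower-tier layer that extends the inherited D_A."""
    axioms = DiagnosisOntology.axioms | {
        "a malfunction may be a subtype of another malfunction"
    }


shell_choice = SpecializedDiagnosisOntology()
```

Selecting a node such as `shell_choice` would hand the designer the inherited vocabulary, axioms and inference procedure in one step, which is the sense in which the hierarchy offers reusable partial microtheories.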
Unlike common KBS shells which only provide representational constructs, a shell involving such an O-O hierarchy of design knowledge would provide reusable partial microtheories of various task classes. These are partial microtheories because their D_A would not include modality axioms, which might be unique to instances of these task classes (e.g. the axiom “an I-SSM root must represent a treatable disease” in NEOMYCIN's case). Given such a shell, the KBS design endeavor would involve roughly the following steps.
(1) Make epistemic and ontological design choices from the options available in the O-O hierarchy. This would identify a partial microtheory (inference procedure, D_Ve, D_Vo, D_A and its derivative I-operators and K-operators) of the intended application task.
(2) Elaborate the partial microtheory identified (i.e. tailor it to the application task).
    (2.1) Make perspective design choices by selecting from D_A a subset of axioms, A, that refer to the ontological entities needed to model and examine domain phenomena from the angles of interest.
    (2.2) Provide task-specific synonyms and descriptors of ontological entities in A, and add modality axioms to A. The shell would then present for the modality axioms proper templates that the KBS designer could fill to create the I-operators and K-operators corresponding to these axioms.
(3) Design K and S based on the elaborated microtheory. The shell would assist by:
    (3.1) suggesting relational networks needed in K based on relational constants in axioms (e.g. an axiom embedding the phrase (Cause D F) requires a causal network relating diseases to findings); and
    (3.2) proposing preferential constraints needed in S based on node-chain arguments in K-operators.
We can illustrate some of the things that a KBS designer would do while applying the above steps in the context of a mechanical diagnosis task like the one discussed by Console, Portinale, Dupre & Torasso (1993). Figure 8 presents the generic I-SSM applicable to, and a sample I-SSM instance constructed for, this task. Following design choices made in step 1, the shell would identify a partial microtheory similar to the one used in NEOMYCIN. Then, given the ontological entities that axioms in the identified D_A refer to (agent, malfunction, etc.), design choices in step 2.1 would, for example, exclude from D_A axioms pertaining to the “location” entity. (Recall from Figure 4 that, in medical diagnosis, “location” pertains to an organ system where the agent inducing a disease resides.) In step 2.2, one of the task-specific synonyms that would be provided to the shell will distinguish between an “internal agent” and an “external agent” (e.g. extremely hot climate), and one of the modality axioms that would be added to D_A will state that “an agent can be internal or environmental”. In step 3.1, one of the relational networks that K must include is suggested by the remaining axioms in D_A; it is a taxonomy of malfunctions capturing type-of relations between them. Finally, in step 3.2, the above modality axiom newly added to D_A would indicate the need for a preferential constraint that specifies whether an internal agent should be pursued before an external agent, or vice versa.
[FIGURE 8. Re-use of D from NEOMYCIN (see Figure 4) in the context of a mechanical diagnosis task. (a) Generic I-SSM applicable, with nodes Actor, internal or external (agent); Malfunction (disorder), with Subtype links; Manifestation (internal state/action); and Manifestation (finding/symptom), connected by induce and cause links. (b) Sample I-SSM instance (adapted from Console, Portinale, Dupre & Torasso, 1993), with nodes such as bad spark plugs, irregular spark ignition, irregular gas mixture, high engine temperature, auto-ignition and power decrease.]

The link between the KBS design approach outlined above and the aforementioned design goals can be summarized as follows. This design approach could render the KBS construction process more structured and predictable by virtue of facilitating reuse of D (and, in turn, of K-operators, I-operators, and S-operators). It enables a KBS designer to capitalize on the availability of a hierarchical “library” of reusable partial microtheories of various task classes, as in the above mechanical diagnosis example. This KBS design approach could also simplify maintenance in cases where the changes made to a KBS involve a revision of the ontological and perspective design choices (not instance choices) underlying the system. In such cases, we could require a KBS shell like the one described above to handle a revision in three steps. First, the shell would ask the designer to revise the definition of what the KBS seeks to accomplish: add, delete, and modify axioms in D_A. Then, the shell would check for the consistency of axioms in the revised D_A. Finally, the shell would compare axioms in the revised D_A to axioms in the original D_A to identify: (1) which of the existing I-operators and K-operators need to be added, deleted, or modified; (2) which relational networks in K have to be added, deleted, or restructured; and (3) which strategic principles in S must be added, deleted, or revised (based on the exclusion and/or addition of ontological entities, axioms in D_A, etc.). In other words, the idea is to require that changes to a KBS be preceded by a revision of the microtheory the system is utilizing. This idea is applied by KBS development approaches, such as CommonKADS (Schreiber, Wielinga & de Hoog, 1994), which link the knowledge-level model that they develop for a task to the actual KBS design, and thus require that a revision of the KBS design be preceded by a revision of the knowledge-level model.
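The comparison step of such a revision can be sketched as a set difference over axioms. This is a deliberately small illustration: axioms are treated as opaque labels, the mapping from axioms to their derivative operators is assumed to be available, and the function name `plan_revision` and the operator names in the example are hypothetical. The consistency check and the effects on K and S are omitted.

```python
def plan_revision(original_axioms, revised_axioms, operators_by_axiom):
    """Compare two versions of D_A and list the operators affected."""
    added = revised_axioms - original_axioms
    deleted = original_axioms - revised_axioms
    return {
        # New axioms need new I-operators and K-operators.
        "axioms_needing_new_operators": sorted(added),
        # Operators derived from dropped axioms become obsolete.
        "operators_to_delete": sorted(
            op for axiom in deleted for op in operators_by_axiom.get(axiom, [])
        ),
    }


plan = plan_revision(
    original_axioms={"location-of-agent", "disease-causes-finding"},
    revised_axioms={"disease-causes-finding", "agent-internal-or-external"},
    operators_by_axiom={"location-of-agent": ["I_OP::LOCATE", "K_OP::FIND_LOCATION"]},
)
```

In this example the revision drops one axiom and adds another, so the plan lists the two obsolete operators and the one axiom for which new operators must be written.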
4.2. ROBUSTNESS
Robustness means performing well outside a narrow range of expertise, that is, the ability to avoid brittleness and exhibit novelty. According to David, Krivine & Ricard (1993), the most common approach to increasing robustness is to provide a KBS with multiple knowledge sources, or Ks (e.g. structural and functional models, quantitative and qualitative models), so as to improve domain coverage. This approach usually involves some difficulties. The conceptual ones include: how to identify to which K to shift and when, and how to translate results between Ks. Work on these issues is typically limited to dealing with Ks capturing functional models that are abstractions of each other, and to using predefined task-specific “switching modules” and dictionaries containing knowledge about when and how to switch between Ks as well as how to translate results between Ks (e.g. Hunt & Price, 1993). Implementation difficulties pertain primarily to the integration of reasoning across Ks that are captured using different representation formalisms, such as rules, frames, and logic predicates (Simmons & Davis, 1993). Recent proposals like the one of Guida & Zanella (1993) suggest addressing these difficulties by integrating Ks through their ontologies, representational assumptions, and epistemological types, among other things.
In the spirit of such proposals, we argue that one way to address the above difficulties is to capture the D underlying available Ks and talk about the integration of microtheories. The idea is to allow a system to “compose” the microtheory it needs by combining narrower microtheories of various tasks in the same general domain (to which it has access, e.g. via the O-O hierarchy of microtheories discussed above). We illustrate this idea using two cases. The first case shows the role of D in the composition process, assuming the availability of only two microtheories that are complementary in some sense. The second case also shows the role of D in finding the right microtheories to compose, assuming the availability of a variety of microtheories.
The first case is based on the work of Liu & Farely (1991). Suppose that a KBS for designing electronic devices is faced with the query: given a device whose structure is fixed, how can we lower the current flowing through one of its resistors without changing the resistance of, or voltage across, that resistor? Further assume that the KBS has access to two microtheories (see Figure 9(a) for more details).
(1) The lumped-device microtheory characterizes a component/device using terms like voltage (V), current (I) and resistance (R). Its associated K includes macro behavioural axioms like Ohm's law and Kirchhoff's law. To address the query using this microtheory, the KBS must directly question the component model of a resistor; however, this model contains primitive behavioral axioms which cannot be derived using this microtheory (i.e. based on Ohm's law, I = V/R, the only way to change the current through a resistor is to change R or V; but, it is required that R and V remain unchanged). This means that this microtheory lacks certain “perspectives” that would allow reasoning about the component at the micro level.
(2) The charge-carriers (CC) microtheory describes a component in terms of spatial characteristics like cross-sectional area ( A ) and in terms of the movement of electric charge carriers using parameters like field ( E ) , force ( F ) , CC motion
M . BENAROCH 710
y elocity ( y ) , charge fiow ( C ) , and current ( I ) . The K associated with this microtheory includes micro behavioral axioms like I 5 C 5 A 1 y (where x denotes a qualitative change in x ) .
What facilitates the dynamic integration of these two microtheories is two commonalities in the D associated with each of them. As Figure 9(a) shows, both D_V sets involve common parameters and both D_A sets involve common axioms. Through these common parameters and axioms, the two microtheories are combined to compose a more encompassing microtheory with which the query can be addressed, as shown in Figure 9(a). †
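The composition idea can be illustrated with a minimal Python sketch. It is not the mechanism of Liu & Farely or of our architecture; it only assumes that a microtheory can be reduced to three sets (vocabulary, ontological axioms, and behavioral axioms) and that two microtheories are composable when both their vocabularies and their ontological axioms overlap. The class `Microtheory`, the function `compose`, and the string encodings of the axioms are hypothetical names introduced for illustration; V is listed in the charge-carriers vocabulary because Figure 9(a) names I and V as the common entities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Microtheory:
    name: str
    d_v: frozenset  # ontological vocabulary: the parameters the microtheory talks about
    d_a: frozenset  # ontological axioms that its models obey
    k: frozenset    # behavioral axioms (the domain knowledge proper)


def compose(a, b):
    """Return the union of two microtheories, or None if their D parts do not overlap."""
    if not (a.d_v & b.d_v) or not (a.d_a & b.d_a):
        return None
    return Microtheory(f"{a.name}+{b.name}", a.d_v | b.d_v, a.d_a | b.d_a, a.k | b.k)


# Axioms (1) and (2) of D_A, paraphrased from Figure 9(a).
PROPAGATE = "a perturbed parameter perturbs the parameters connected to it"
SETTLE = "connected parameters perturb each other until a steady state"

lumped = Microtheory(
    "lumped-device",
    frozenset({"V", "I", "R"}),
    frozenset({PROPAGATE, SETTLE}),
    frozenset({"I = V / R"}),
)
carriers = Microtheory(
    "charge-carriers",
    frozenset({"Q", "C", "v", "E", "I", "F", "L", "A", "V"}),
    frozenset({PROPAGATE, SETTLE}),
    frozenset({"dE = dQ - dL", "dF = dE", "dv = dF", "dC = dA + dv", "dI = dC"}),
)

composed = compose(lumped, carriers)
shared = sorted(lumped.d_v & carriers.d_v)  # the parameters that link the two
print(composed.name, shared)
```

The point of the sketch is that the test for composability looks only at D, never at the representation of K, which is why no task-specific switching module is needed.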
The second case, which illustrates how D also helps to search for the right microtheories to compose, is based on the work of Falkenhainer & Forbus (1991) (we omit many details here). These authors built a system that uses a set of thermodynamics models to automatically compose new models for analysing whatever device phenomenon is tackled. As Figure 9(b) shows, the system defines a model in terms of the relations it captures (K), the ontological entities it involves (D_V), and its underlying ontological, grain, operational, etc. assumptions (D_A). Because each model may be suitable for reasoning only about specific phenomena, a model is viewed as a microtheory of the domain, and the collection of available models is considered a theory of the domain. By grouping the assumptions underlying models based on their similarities, the system forms a hierarchy of “assumption classes” and their corresponding models, analogous to the hierarchy of microtheories we described in Section 4.1. Using the hierarchy of assumption classes, the system addresses any given query as follows.
(1) Identify the input/output parameters of the artifact (e.g. turbine) to which the query refers.
(2) Compose a model for the query. Use the hierarchy of assumption classes to select models that involve the parameters identified. If the selected models depend on the output of other models, select those other models as well, as long as their underlying assumptions are consistent with the ones selected initially. The selection of models is managed using a dependency network (i.e. an assumption-based truth maintenance system) capturing relationships between the assumptions of selected models.
(3) Use the composed model with a qualitative simulation engine (embedding S) to produce a solution: an envisionment of the qualitative behavior of the simulated artifact, which forms what we call an I-SSM.
(4) If the solution is deficient (e.g. an empty envisionment), use dependency-directed backtracking with the dependency network of assumptions to identify the inadequate assumptions that caused the “failure”.
(5) Return to step (2) to revise the composed model, until the desired solution is obtained.
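The control flow of steps (2) to (5) can be sketched as follows. This is a deliberately simplified Python illustration, not the system of Falkenhainer & Forbus: the assumption-based truth maintenance system is replaced by a plain set of rejected assumptions, and the qualitative simulator is a stub passed in as a function. All names (`Model`, `compose_model`, `solve`, `toy_simulate`) and the two toy flow models are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    parameters: frozenset
    assumptions: frozenset


def consistent(assumptions, conflicts):
    """True if no set of mutually exclusive assumptions is fully present."""
    return not any(pair <= assumptions for pair in conflicts)


def compose_model(models, query_params, conflicts, rejected):
    """Step 2: select models that mention a query parameter and whose
    assumptions are consistent with those selected so far."""
    chosen, assumed = [], frozenset()
    for m in models:
        if not (m.parameters & query_params) or (m.assumptions & rejected):
            continue
        if consistent(assumed | m.assumptions, conflicts):
            chosen.append(m)
            assumed |= m.assumptions
    return chosen, assumed


def solve(models, query_params, conflicts, simulate, max_rounds=10):
    """Steps 2-5: compose, simulate, and revise until a solution appears."""
    rejected = frozenset()
    for _ in range(max_rounds):
        chosen, assumed = compose_model(models, query_params, conflicts, rejected)
        solution, culprits = simulate(chosen, assumed)  # step 3
        if solution:
            return chosen, solution
        if not culprits:  # nothing to blame, so no revision is possible
            break
        rejected |= culprits  # step 4, then loop back to step 2
    return None, None


def toy_simulate(chosen, assumed):
    """Stand-in simulator: yields no envisionment whenever 'inviscid' is assumed."""
    if "inviscid" in assumed:
        return None, frozenset({"inviscid"})
    return f"envisionment from {[m.name for m in chosen]}", frozenset()


models = [
    Model("inviscid-flow", frozenset({"flow-rate"}), frozenset({"inviscid"})),
    Model("viscous-flow", frozenset({"flow-rate"}), frozenset({"viscous"})),
]
conflicts = {frozenset({"inviscid", "viscous"})}
chosen, solution = solve(models, frozenset({"flow-rate"}), conflicts, toy_simulate)
print([m.name for m in chosen], solution)
```

In the toy run the first composition selects the inviscid model, the stub simulator blames the "inviscid" assumption, and the second composition falls back on the viscous model. A real dependency network would record why each assumption was adopted rather than merely which ones were rejected.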
The above discussion implies that what enables the system of Falkenhainer and Forbus to work well is that it has available to it the D underlying the Ks it integrates dynamically for problem-solving purposes.
† The ability to integrate these microtheories through use of D also helps with queries involving structural changes to a device. For example, we can also address the query: given a simple DC circuit containing a light bulb to which there is a serially connected resistor, how can we “redesign” the circuit to make the bulb light brighter without changing the current and voltage? (see Liu & Farely, 1991).
[Figure 9(a)] QUERY: given a device whose structure is fixed, how can we lower the current flowing through one of its resistors without changing the resistance of, or voltage across, that resistor?
Lumped-Device Microtheory: D_V = {V - voltage, I - current, R - resistance, ...}; D_A = {(1) if parameter ?x is perturbed and it is connected to ?y, then ?y is also perturbed; (2) two connected parameters will continue to perturb each other until they reach a "steady state"; (3) ...}; K = {macro behavior axioms, e.g. Ohm's and Kirchhoff's laws}.
Charge-Carriers Microtheory: D_V = {Q - charge, C - charge flow, v - flow velocity, E - field, I - current, F - force, L - component's length, A - cross-sectional area, ...}; D_A = {(1) if parameter ?x is perturbed and it is connected to ?y, then ?y is also perturbed; (2) two connected parameters will continue to perturb each other until they reach a "steady state"; (3) ...}; K = {micro behavioral axioms: ∂E = ∂Q − ∂L, ∂F = ∂E, ∂v = ∂F, ∂C = ∂A + ∂v, ∂I = ∂C, ∂V = ∂A + ∂v, ...}.
Commonalities: the D_V sets have I and V as common entities, and the D_A sets have axioms (1), (2), ... in common.
Derivation over the composed microtheory (∂x denotes a qualitative change in x): 0) ∂V = ∂R = 0 and ∂I = − (given); 1) ∂C = − (axiom ∂I = ∂C); 2) ∂v = − (axiom ∂C = ∂A + ∂v); 3) ∂F = − (axiom ∂v = ∂F); 4) ∂E = − (axiom ∂F = ∂E); 5) ∂L = − (axiom ∂E = ∂Q − ∂L and ∂Q = 0).
[Graph omitted: signed influence links among the parameters V, R, I, C, v, F, E, Q, A and L; a Device/Component and its part-of Component are each characterized by a State of Parameter(s), connected by cause-change links.]
[Figure 9(b)] A defModel form for CONTAINED-LIQUID-GEOMETRY (?cl, ?can), annotated with its microtheory parts.
K (the model), Relations: (Quantity (level ?cl)); (= (level ?cl) (/ (* 4 (mass ?cl)) (* (density ?sub) PI (expt (diameter ?can) 2)))); (= (pressure (bottom ?can) :absolute) (* (level ?cl) (density ?sub) G)).
D_V part of a microtheory (ontological entities and their relations), Individuals: ?can with conditions (fluid-container ?can); ?cl with conditions (contained-liquid ?cl), (container-of ?cl ?can), (substance-of ?cl ?sub).
D_A part of a microtheory (simplifying assumptions, ontology assumptions, grain assumptions, approximations/abstractions, and operational assumptions), Assumptions: CONSIDER clauses over (exists ?can), (viscous ?cl), (fluid-cs ?cl), (geometric-properties ?can), ...; and an operational assumption stating that a system is in a steady state if all its components are in a steady state: ((steady-state ?system ?q-type) (part-of ?component ?system) (steady-state (?q-type ?component))).
FIGURE 9. Two examples of how D is used for robustness purposes. (a) D is used to compose a suitable microtheory from two narrower microtheories. (b) Domain model defined as a microtheory (adapted from Falkenhainer & Forbus, 1991).
These examples demonstrate how D helps to increase the robustness of a KBS by simplifying the integration of different Ks in two ways. Firstly, the availability of D eliminates the need to provide a KBS with predefined task-specific “switching modules” like those used by Hunt & Price (1993). Secondly, using D with the KBS architecture we presented in Section 3 allows hiding the implementation details of the Ks being integrated. Since in our architecture any K is applied only with the K-operators defined by its associated microtheory, neither the inference engine nor the K-operators associated with the other Ks being integrated need to know details pertaining to the representational formalism used to implement that specific K.
4 . 3 . EXPLANATION
Whereas the previously discussed design goals could be better met by virtue of having D available, the ability to enhance explanation capabilities has to do with the I-SSM and S-SSM constructed based on usage of D. The literature on explanation (Swartout & Moore, 1993; Tanner, Keuneke & Chandrasekaran, 1993) distinguishes between issues of generating the content of explanations and issues of generating explanations in a form that meets users' needs. Because our work does not aim to deal with the subject of explanation per se, we focus here only on issues of content. Swartout and Moore submit that issues of content pertain primarily to the ability of a KBS to generate what, how and why explanations about its domain as well as its behavior. Figure 10 maps these types of explanations to four explanation modules which we associate with specific parts of the KBS architecture presented in Figure 5.
Module 1 is concerned with defining and justifying knowledge elements that are part of K, or were derived based on K. For example, in the domain of electrical devices, one assertion based on Ohm's law is: the current through a resistor increases when the voltage across it increases. Referring back to the discussion in Section 4.2, the role of D in generating an explanation that justifies this assertion is one of facilitating the dynamic shifting between the lumped-device and CC microtheories (see Liu & Farely, 1991). In other words, assuming that a finer-granularity K′ can be used to justify the above assertion in K, D could facilitate the dynamic linkage between K and K′ in a task-independent manner.
[Figure 10 diagram. Module 1 (explanation of the domain): define and justify elements in K based on K′, where K′ holds definitions and gradually finer domain knowledge. Module 2: justify principles in S based on S′, where S′ holds domain/world facts and environment assumptions. Module 3 (explanation of solutions): why, explain why the solution generated and captured in the I-SSM is good, relative to the generic I-SSM form (D_A). Module 4 (explanation of behaviour and strategy): how, explain the strategy used to derive the solution; why, justify usage of this strategy, relative to the current S-SSM.]
FIGURE 10. Four explanation “modules” and their mapping to our KBS architecture.
Module 2 is responsible for justifying strategic principles in S, e.g. explaining why a diagnostic KBS “follows an agent that is a common inducer of a malfunction before an unlikely one”. Justifying such a principle requires no use of D. Instead, it normally entails referencing the commonsense world facts and task-specific environmental assumptions underlying that principle.
Module 3 deals with explaining why the solution that the system produced is good. By viewing D_A as a documentation of the informational requirements of a task, the I-SSM constructed for that task can be used to show how these requirements are met by a generated solution. For example, the partial I-SSM in Figure 1 could help explain why the diagnosed disease is (supposedly) Acute-Bacterial-Meningitis by providing such information as: (1) the diagnosed disease is supported by the presence of all reported symptoms (e.g. headaches, high-grade-fever); (2) the disease is induced by the E. coli organism; (3) the presence of E. coli is established by evidence about the patient having a suppressed immune system because of pregnancy, alcoholism, or a recent surgical procedure that is an enabling step in the disease development process; (4) the disease is treatable; and (5) the disease is the most specific one that explains all symptoms. In other words, since D_A essentially defines a causal script pertaining to a disease process, explaining a solution boils down to using the I-SSM to show that an instance of the causal script that implicates the diagnosed disease has occurred.
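A minimal Python sketch of this idea follows. It assumes, purely for illustration, that D_A can be flattened into a list of required relations and that an I-SSM can be represented as a dictionary from relation names to their fillers; the relation names (`supported-by`, `induced-by`, `enabled-by`) and the function `explain_solution` are hypothetical and do not come from our architecture.

```python
# Hypothetical flattening of D_A for diagnosis: (relation, what it documents).
REQUIREMENTS = [
    ("supported-by", "symptoms supporting the diagnosed disease"),
    ("induced-by", "agent inducing the disease"),
    ("enabled-by", "evidence establishing the presence of the agent"),
]

# Hypothetical fragment of an I-SSM for the example in the text.
i_ssm = {
    "disease": "Acute-Bacterial-Meningitis",
    "supported-by": ["headaches", "high-grade-fever"],
    "induced-by": ["E. coli"],
    "enabled-by": ["suppressed immune system"],
}


def explain_solution(i_ssm, requirements):
    """Report which informational requirements the I-SSM meets and which it leaves open."""
    met, unmet = [], []
    for relation, meaning in requirements:
        fillers = i_ssm.get(relation, [])
        if fillers:
            met.append(f"{meaning}: {', '.join(fillers)}")
        else:
            unmet.append(meaning)
    return met, unmet


met, unmet = explain_solution(i_ssm, REQUIREMENTS)
for line in met:
    print(line)
```

An empty `unmet` list corresponds to the claim that a complete instance of the causal script has occurred; a non-empty one points at the part of the script for which the solution lacks support.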
This approach can be complemented using the reconstructive explanation approach used in REX (Wick, 1993). The reconstructive approach assumes the existence of two KBSs. One KBS solves the task using techniques that may not be explainable to the user (since sometimes the best way for a KBS to solve a task efficiently requires using techniques totally foreign to users). The second KBS then receives a trace of the solution (e.g. an I-SSM) and uses different techniques to assemble a plausible story that justifies the solution, where the story may deviate from the actual processing that solved the task in the first place. For example, in the case of diagnosis, the story can support a solution using the arguments: (1) the I-SSM identifies a number of diagnostic hypotheses that might explain the principal symptoms; (2) some of these hypotheses were ruled out because they cannot explain the principal complaints in this instance, or because they are implausible independent of what complaints they might explain; and (3) the diagnostic conclusion is the best of the plausible hypotheses that can explain the symptoms. This story shows how the informational requirements of the task are met by essentially reflecting the general logical structure of diagnostic tasks, and abstracting away many of the task-specific details appearing in the I-SSM.
Module 4 is concerned with how explanations of the strategy used to solve the task and with why explanations that justify usage of this strategy. Like NEOMYCIN, DIVA (Davis, Shorbe & Szolovits, 1993) and ABLE (Patil, Szolovits & Schwartz, 1984), our architecture can produce how explanations by reporting on the sub-tasks that were applied step-by-step based on the strategy embedded in the hierarchy of S-operators (representing sub-tasks). Such explanations are useful, but they do not reflect global lines of reasoning followed during problem solving (e.g. top-down refinement with a disease taxonomy). Generating global how explanations requires access to a record of the system's problem solving process, something lacking from typical KBSs. DIVA is somewhat of an exception in the sense that it treats the stack of active sub-tasks as such a record, and it produces global how explanations by collapsing sequences of sub-tasks currently posted on the stack into abstract lines of reasoning; however, DIVA uses special task-dependent explanation routines for this purpose. In contrast, we produce an S-SSM, an explicit record of the entire problem solving process, which could help to generate line-of-reasoning explanations by collapsing node-chain sequences corresponding to the order in which K-operators were triggered. To illustrate this idea using the S-SSM in the bottom of Figure 6, consider the node-chain sequence D{Meningitis} → Ds{Acute-Meningitis} → Ds{Acute-Bacterial-Meningitis} → F{High-Grade-Fever} → ··· corresponding to one S-SSM branch, followed by the sequence D{Meningitis} → Dg{Infectious-Disease} → F{Fever} → ··· corresponding to a later S-SSM branch. By virtue of knowing that Ds and Dg denote a specialization and a generalization of D, respectively, the first sequence tells that the system performed a top-down refinement of the hypothesized disease (based on some specialization taxonomy), and the second sequence indicates a switch to a categorical mode of reasoning about the hypothesized disease. These node-chain patterns would probably be found in other applications involving diagnosis.
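How node-chain sequences might be collapsed into lines of reasoning can be sketched in a few lines of Python. The sketch assumes that an S-SSM branch is available as a list of (node kind, label) pairs and that a single Ds or Dg node is enough to name the line of reasoning; both are simplifications made for illustration, and the names `LINE_OF_REASONING`, `classify` and `summarize` are hypothetical.

```python
from itertools import groupby

# Assumed reading of node kinds: Ds = specialization of D, Dg = generalization of D.
LINE_OF_REASONING = {
    "Ds": "top-down refinement of the hypothesized disease",
    "Dg": "categorical reasoning about the hypothesized disease",
}


def classify(chain):
    """Name the line of reasoning suggested by the first Ds or Dg node in a chain."""
    for kind, _label in chain:
        if kind in LINE_OF_REASONING:
            return LINE_OF_REASONING[kind]
    return "unclassified"


def summarize(branches):
    """Collapse consecutive branches that follow the same line of reasoning."""
    return [label for label, _group in groupby(classify(b) for b in branches)]


first = [("D", "Meningitis"), ("Ds", "Acute-Meningitis"),
         ("Ds", "Acute-Bacterial-Meningitis"), ("F", "High-Grade-Fever")]
later = [("D", "Meningitis"), ("Dg", "Infectious-Disease"), ("F", "Fever")]

for step in summarize([first, later]):
    print(step)
```

Because the classification depends only on node kinds and not on the disease names, the same routine would apply unchanged to another diagnostic application, which is the task-independence that DIVA's explanation routines lack.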
Regarding why explanations of behavior, the S-SSM could also be useful by virtue of it posting all node-chain (or sub-task) sequences that have been started but might be incomplete. In NEOMYCIN and DIVA, for example, the so-called end-conditions of active sub-tasks on the stack can interrupt ongoing lines of reasoning, and without keeping track of these lines of reasoning, decisions concerning the state of the inference process are rather ad hoc. In contrast, an S-SSM not only makes it possible to deliberately compare alternative lines of reasoning that were interrupted, but also allows resuming them from the point where they were left off. More than that, the way MOLGEN (Stefik, 1980) uses planning operators like “least-commitment” and “guess-undo” to control the problem solving process can be taken a step further. Intuitively speaking, given an explicit S-SSM, one might consider using a dependency-directed network “on top” of the S-SSM so as to permit capturing as well as explaining the rationale for pursuing, interrupting, retracting, or resuming alternative lines of reasoning. Such a capability would facilitate the generation of true why explanations of behavior.
5 . Relation to other work
How does our work on representing and using design knowledge relate to previous work? In brief , we think that our work is essentially the result of another step in the progression of work that focuses on the need to represent explicitly various types of knowledge .
Work on early KBSs aimed to make domain knowledge, K, explicit by separating it from the inference procedure. It yielded KBSs like MYCIN, which represented K as rules and used a rule-chaining inference procedure. Despite the separation of K from the inference procedure, part of the strategic knowledge, S, used to control the processing order of rules was compiled into K, typically in the physical order of rule clauses, the physical order of rules in the KB, and the so-called metarules (Davis & Lenat, 1982). As a result, K and S were neither accessible nor interpretable for purposes of explanation, reuse, and robustness.
[Figure 11 diagram: relates the componential and task-structure approaches (task, problem-solving methods, decomposable sub-tasks, primitive solvable sub-tasks, hierarchy of generic tasks, primitive GTs) and CommonKADS (strategy model of PSMs, task structure, inference structure, generic inferences) to the elements of our architecture (S, K, D, hierarchy of S-operators, I-operators, K-operators, I-SSM, and I-SSM ontology) through links such as map to, abstracted into, part of, analogous to, define roles of, and documented by.]
FIGURE 11. Our work in relation to work on second generation KBSs.
Work on second generation KBSs thus focused on making S explicit by separating it from K . A related issue on which later work focused is separating the knowledge-level modeling of a task in the conceptualization stage from the symbol-level modeling of the task in the KBS design stage . Following Figure 11 , the next discussion reviews the major streams of work on these two issues and their key relations to our work .
There are task-specific and method-specific approaches to capturing S separately from K. The task-specific approach assumes that an application task is a hierarchy of sub-tasks which can be represented as procedural operators (e.g. see Figure 7). High level operators capture S and use it to control the order of calls to lower level operators, whereas lowest level (primitive) operators use K to construct a solution. Because some sub-tasks are generic in the sense that they are common to many applications, such procedural operators are viewed as task-independent reusable modules. Chandrasekaran (1987) called these modules generic tasks (GTs). To configure a complete inference procedure for a task, it is necessary to functionally compose a hierarchy of GTs that were possibly specialized for the task. Chandrasekaran & Johnson (1993) discuss several shells that assist in this endeavor. Related to this idea is McDermott's (1988) method-specific approach, which captures S through the notion of role-filling problem-solving methods (PSMs). A PSM is a complete task-specific inference procedure that is conceptually composed of a hierarchy of GTs. It also relates a task to the K needed to accomplish the task. It can be specialized into various KBS applications, while helping knowledge acquisition (KA) by structuring K in terms of the roles that specific domain entities and relations play in problem solving. This idea was applied in KA tools like SALT (Marcus, 1988).
Capturing S separately from K using GTs, in task- or method-specific ways, improved on first generation systems mainly in two respects. First, when S was expressed using an application-independent vocabulary, it became reusable in the context of KA tools like SALT. Second, making S explicit helped to make control decisions more explicit, hence providing for stronger explanations of behavior (or of problem-solving strategies). Both improvements follow from the focus of second generation KBSs on the aspect of how a task is solved and the way this aspect can be captured using hierarchies of GTs.
Our work considers the how aspect to be a means to an end that has to be grounded in the aspect of what a task seeks to construct as a solution. For example, take the case of medical diagnosis. The what aspect first defines the structure of I-SSMs needed. Since hypotheses about the cause of a disease might be generated and added to the differential on the basis of any data in the I-SSM, the how aspect then defines what data in the I-SSM to consider and when, so as to limit the size of the differential. The point is that focusing primarily on the what aspect does not prevent our KBS architecture from capturing the how aspect using hierarchies of GTs, as explained in Section 3.2; it only requires expressing S using the same graph vocabulary used to express I-SSMs. This means that our architecture preserves the two above improvements of second generation KBSs over earlier systems.
Focusing on the what (in addition to the how) aspect of a task leads to additional improvements over typical second generation systems. Some improvements are due to the availability of design knowledge, D, reflecting the structure of I-SSMs constructed. First, as the examples in Section 4.2 suggest, D simplifies the dynamic integration of multiple Ks. Compared to second generation KBSs, this helps to avoid a form of brittleness that arises not from lacking knowledge, but from lacking flexibility in using knowledge. Second, expressing K and S using an epistemic and ontological vocabulary (D_Ve + D_Vo) not only supports the kind of reuse of S found in the context of GTs and PSMs, but also provides for reuse of D itself and makes our KBS architecture task-independent (see Section 3.4). Third, since D_A implies the underlying structure of K, it could help in structuring K during KA, as in PSM-oriented KA tools. Furthermore, compared to GT shells that help to specialize a hierarchy of GTs into a specific KBS application, D_A helps in another respect: node-chains defined by axioms in a task-specific D_A permit generating a list of the necessary primitive GTs (or K-operators) and checking its completeness vis-a-vis the task.
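The completeness check mentioned last can be illustrated with a short Python sketch. It rests on an assumption that goes beyond the text: that each link between adjacent node types in a D_A node-chain calls for exactly one K-operator. The node types (`Finding`, `Disease`, `Agent`, `Treatment`) and the function names are hypothetical.

```python
def required_operators(d_a_chains):
    """Assume one K-operator per link between adjacent node types in a node-chain."""
    needed = set()
    for chain in d_a_chains:
        needed.update(zip(chain, chain[1:]))
    return needed


def missing_operators(d_a_chains, available):
    """K-operators that the node-chains call for but the KBS does not yet provide."""
    return sorted(required_operators(d_a_chains) - set(available))


# Hypothetical node-chains implied by the axioms of a diagnostic D_A.
chains = [
    ("Finding", "Disease", "Agent"),
    ("Disease", "Treatment"),
]
# K-operators implemented so far, keyed by the link they traverse.
available = {("Finding", "Disease"), ("Disease", "Treatment")}

print(missing_operators(chains, available))
```

Here the check reports that the link from `Disease` to `Agent` has no K-operator, which is the kind of gap a GT shell without access to D_A could not detect.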
Other improvements over second generation systems derive from the explicitness of the I-SSM and S-SSM constructed, both of which serve the function of a well-organized working memory. As we argued in Section 4.3, an S-SSM makes it possible to deliberately compare alternative lines of reasoning, and thus to make explicit the rationale behind global control decisions regarding what lines of reasoning to pursue, interrupt, retract, or resume. Related to this, an S-SSM also helps to generate global explanations of behavior that justify the particular strategy a KBS uses to induce specific lines of reasoning during problem solving. As to the I-SSM, it helps to explain solutions a KBS generates by showing how they meet the informational requirements of the task. Additionally, an I-SSM permits making control decisions genuinely a result of the present state of knowledge about the problem (as reflected by the changing content of the I-SSM). In other words, grounding all how-related control decisions in the need to evolve the I-SSM into a state with characteristics specified by D_A allows our architecture to view the process of problem solving as modeling. Van de Velde (1993) recognizes that this view reflects more of the real nature of problem-solving from a knowledge-level perspective.
The last point brings us to the second issue that is the focus of work on second generation KBSs: the creation of a knowledge-level (KL) model of a task prior to the construction of a symbol-level (SL) model in the KBS design stage. Here we find several streams of work, chief among them the componential approach (Steels, 1990), the task-structure approach (Chandrasekaran & Johnson, 1993), and CommonKADS (Schreiber et al., 1994). Without elaborating on how exactly these approaches go about constructing KL models, suffice it to say that they are all rooted in the how aspect of a task, because they use the notion of PSMs to direct their construction process. Although some of them analyze I-SSMs (or case models, as CommonKADS refers to them) while constructing KL models, I-SSMs are not modeled as part of the KL models produced. Consequently, KBSs designed based on such KL models do not reflect the structure of the I-SSMs they construct (implicitly), and they fail to offer the benefits discussed above. For these reasons, Van de Velde (1993) called for focusing the KL modeling endeavor primarily on what a task entails, that is, on the I-SSMs it constructs, and Wielinga, Van de Velde, Schreiber & Akkermans (1993) consider making the explicit modeling of I-SSMs an integral part of CommonKADS.
Some KL modeling approaches also seek to simplify the construction of a KL model and its conversion into a working KBS , through the notion of reuse in all stages of KBS development . For example , consider the case of CommonKADS (Schreiber et al . , 1994) . It provides a library of generic KL models that can be adapted to the task modeled so as to capture postulate system requirements . Once a specific generic KL model has been adapted , CommonKADS seeks to facilitate two other things .
(1) Adapting a generic SL model counterpart . A generic SL model captures ‘‘technical’’ design knowledge which , unlike the conceptual design knowledge we discussed , documents design decisions specifying computational aspects left open in the KL model (e . g . representational constructs and computational methods used to carry out inferences) . Using a structure-preserving design approach , CommonKADS first identifies postulate system requirements that changed in the generic KL model upon its adaptation , and these are then used with technical design knowledge to identify design decisions that must be adapted in the generic SL model counterpart . Additional work in this area is discussed by Vanwelkenhuysen (1995) .
(2) Mapping an SL model to a reusable K (or KB) through domain ontologies (referred to as domain metamodels). CommonKADS uses several ontologies, each corresponding to a different abstraction of K and interaction type with a generic KL model. The lowest level, least detailed ontology is task-oriented in that it describes the interaction of K with the inference-structure part of a KL model; a higher level, more detailed ontology could be method-oriented in that it might distinguish between finer types of ontological entities based on inferences that a particular PSM in a KL model applies to them. By using different ontologies with different generality, and partitioning Ks accordingly, CommonKADS seeks to identify classes of Ks with different scope, generality and reusability that can be mapped to generic KL models and, in turn, to their SL model counterparts.
Our work provides an alternative way to link the KL and the SL modeling endeavors. The ontology (or structure) of I-SSMs that tasks of a certain class construct can be considered the core of a generic KL model of these tasks. It can be adapted to any specific task in that class, in the way we explained and illustrated in Section 4.1. Thereafter, the SL modeling endeavor could be simplified in three ways. First, adapting a generic SL model counterpart could rely on the one-to-one mapping existing between axioms in the D_A defined by an adapted I-SSM ontology and the K-operators and I-operators needed in the SL model. Second, K-operators and S-operators required in the SL model counterpart could be further adapted using the CommonKADS approach, once they are set to capture the so-called technical design knowledge. Finally, since the K needed and its form are directly determined by the I-SSM ontology called for by a task, it is possible to identify which specific parts of (or relational networks in) reusable Ks are relevant to the KL model and its counterpart SL model adapted to the task.
6 . Conclusion
We discussed in this paper one type of conceptual design knowledge , D , which reflects specific design decisions that a KBS developer makes with regard to the structure of I-SSMs that a prospective system seeks to construct . We first showed how D can be represented in the case of systems that capture knowledge using relational networks . We next presented a KBS architecture that uses D to facilitate the construction of explicit I-SSMs . Then , we reviewed potential avenues that this architecture opens with respect to the design of KBSs that are simpler to build and maintain , support knowledge reuse , possess strong explanation capabilities , and are more robust . Last , we showed that the architecture presented is simply the result of another step in the progression of work that focuses on the need to represent various knowledge types explicitly , i . e . D in addition to K and S .
We are working to implement and use the architecture presented in the context of financial risk management applications. (We did not use examples from this domain because of its highly specialized terminology.) Using this architecture to solve risk management tasks is appealing mainly for two reasons. Explicit I-SSMs make it possible to conduct various “what-if” analyses that probe a solution to find out how it changes under different market scenarios (without re-solving for each scenario from scratch). Additionally, since many risk management tasks involve the same kernel of knowledge about financial securities and markets, availability of explicit design knowledge would facilitate reuse and integration of this knowledge across KBS applications solving related tasks (for details see Benaroch, in press).
Our implementation effort so far has helped us to identify several areas where further research is needed. An important area pertains to the use of existing PSMs (propose and revise, cover and differentiate, etc.) with our architecture. Applying a PSM with some K constructs an I-SSM through a controlled interaction of its constituent primitive GTs (e.g. in ABLE, an I-SSM is called a patient-specific model). However, typical PSMs are not modeled in a way that makes explicit the I-SSMs they construct, and they do not express the S (strategic principles) they employ using the graph vocabulary our architecture uses to analyse I-SSMs and make control decisions. To facilitate and simplify the use of existing PSMs, it is necessary to do two things: reveal the generic I-SSMs that specific PSMs construct, and understand how these PSMs use their S to select at each moment the I-SSM portion (i.e. sub-graph, branch, or node-chain) on which the next problem-solving activity will focus. Doing so requires considering the fact that typical PSMs decompose their task into sub-tasks, which themselves are either solved or further decomposed by “smaller” PSMs (Chandrasekaran & Johnson, 1993). This might require extending our architecture so that it would treat an I-SSM as if it were composed of a hierarchy of I-SSMs mirroring the recurring task/method/sub-task patterns implied by PSMs. Under this scenario, subtask-specific I-SSMs are basically abstractions of the top level I-SSM that hide details which are irrelevant to their sub-task.
Section 4 implicitly identified two other areas for future research that deal with potential benefits that our architecture offers with respect to the aforementioned KBS design goals. The first area has to do with simplifying the construction and maintenance of KBSs through reuse. Research on the ontologies and perspectives underlying generic I-SSMs that various task classes require is a prerequisite to the development of a KBS shell involving a hierarchy of reusable design knowledge like the one we discussed. Such a shell could also include existing PSMs and facilitate their reuse in the context of specific generic I-SSMs that various task classes require, assuming that these PSMs are “expressed” in terms that relate to the generic I-SSMs they construct. The second area concerns the use of I-SSMs and S-SSMs for explanation purposes. It is necessary to develop techniques that can use any I-SSM to explain how the solution it captures meets the informational requirements of the task it addresses. As to the S-SSM, research is needed on the generic form of ontological node-chain sequences that specific PSMs produce during their execution. This research would provide the basis for generating true global explanations of behavior, in terms of the lines of reasoning that a KBS follows in problem solving.
References
BENAROCH, M. (in press). Towards the notion of a knowledge repository for risk management. IEEE Transactions on Knowledge and Data Engineering.
BOBROW, D. (1984). Qualitative reasoning about physical systems: an introduction. Artificial Intelligence, 24, 1–3.
BRACHMAN, R. J. (1979). On the epistemological status of semantic networks. In N. V. FINDLER, Ed. Associative Networks: Representation and Use of Knowledge by Computers. New York: Academic Press.
CHANDRASEKARAN, B. (1981). Towards a functional architecture for intelligence based on generic information processing tasks. Proceedings of the Tenth International Joint Conference on Artificial Intelligence, pp. 1183–1192. Los Altos, CA: Morgan Kaufmann.
CHANDRASEKARAN, B. & SWARTOUT, W. (1991). Explanations in knowledge systems: the role of explicit representation of design knowledge. IEEE Expert, June, 47–49.
CHANDRASEKARAN, B. & JOHNSON, T. R. (1993). Generic tasks and task structures: history, critique and new directions. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 233–272. London: Springer-Verlag.
CLANCEY, W. J. (1988). Acquiring, representing and evaluating a competence model of diagnostic strategy. In M. CHI, R. GLASER & M. J. FARR, Eds. The Nature of Expertise, pp. 343–418. Hillsdale, NJ: Lawrence Erlbaum Associates.
CLANCEY, W. J. (1992). Model construction operators. Artificial Intelligence, 53, 1–115.
CONSOLE, L., PORTINALE, L., DUPRE, D. T. & TORASSO, P. (1993). Combining heuristic reasoning with causal reasoning in diagnostic problem solving. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 47–68. London: Springer-Verlag.
DAVID, J. M., KRIVINE, J. P. & SIMMONS, R. (1993). Second generation expert systems: a step forward in knowledge engineering. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 3–23. London: Springer-Verlag.
DAVID, J. M., KRIVINE, J. P. & RICARD, B. (1993). Building and maintaining a large knowledge-based system from a “Knowledge-level” perspective: the DIVA experiment. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 376–401. London: Springer-Verlag.
DAVIS, R. & LENAT, D. (1982). Knowledge-Based Systems in Artificial Intelligence. New York: McGraw-Hill.
DAVIS, R., SHROBE, H. & SZOLOVITS, P. (1993). What is a knowledge representation? AI Magazine, Spring, 17–33.
FALKENHAINER, B. & FORBUS, K. D. (1991). Compositional modeling: finding the right model for the job. Artificial Intelligence, 51, 95–143.
GUHA, R. V. & LENAT, D. B. (1994). Enabling agents to work together. Communications of the ACM, 37, 127–142.
GUIDA, G. & ZANELLA, M. (1993). Knowledge-based design using the multi-modeling approach. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 174–208. London: Springer-Verlag.
HAYES-ROTH, B., HEWETT, M., JOHNSON, M. V. & GARVEY, A. (1988). ACCORD: a framework for a class of design tasks. Report No. 88-19, Knowledge Systems Laboratory, Stanford University, Stanford, CA.
HUNT, J. E. & PRICE, C. J. (1993). Integrating functional models and structural domain models for diagnostic applications. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 131–160. London: Springer-Verlag.
LIU, Z. & FARLEY, A. (1991). Shifting ontological perspectives in reasoning about physical systems. Proceedings of AAAI-1991, pp. 395–400.
MARCUS, S. (1988). SALT: a knowledge acquisition tool for propose-and-revise systems. In S. MARCUS, Ed. Automating Knowledge Acquisition for Expert Systems, pp. 81–123. Boston, MA: Kluwer Academic.
MCDERMOTT, J. (1988). Preliminary steps toward a taxonomy of problem-solving methods. In S. MARCUS, Ed. Automating Knowledge Acquisition for Expert Systems, pp. 225–256. Boston, MA: Kluwer Academic.
NECHES, R., FIKES, R., FININ, T., GRUBER, T., PATIL, R., SENATOR, T. & SWARTOUT, W. R. (1991). Enabling technology for knowledge sharing. AI Magazine, Fall, 37–56.
NEWELL, A. (1982). The knowledge level. Artificial Intelligence, 19, 87–127.
PATIL, R. S., SZOLOVITS, P. & SCHWARTZ, W. B. (1984). Causal understanding of patient illness in medical diagnosis. In W. J. CLANCEY & E. H. SHORTLIFFE, Eds. Readings in Medical Artificial Intelligence. Reading, MA: Addison-Wesley.
POPLE, H. E. Jr (1982). Heuristic methods for imposing structure on ill-structured problems: the structuring of medical diagnosis. In P. SZOLOVITS, Ed. Artificial Intelligence in Medicine. AAAS Selected Symposium 51. Colorado: Westview Press.
RICH, E. (1983). Artificial Intelligence. New York, NY: McGraw-Hill.
SCHREIBER, G., WIELINGA, B. & DE HOOG, R. (1994). CommonKADS: a comprehensive methodology for KBS development. IEEE Expert, December, 28–36.
SHORTLIFFE, E. H. (1976). Computer-Based Medical Consultation: Mycin. New York, NY: Elsevier.
SIMMONS, R. & DAVIS, R. (1993). The roles of knowledge and representation in problem solving. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 27–45. London: Springer-Verlag.
STEELS, L. (1990). Components of expertise. AI Magazine, Summer, 28–49.
STEFIK, M. (1980). Planning with constraints. Technical Report No. STAN-CS-80-784, Computer Science Department, Stanford University, Stanford, CA.
SWARTOUT, W. & MOORE, J. (1993). Explanation in second generation expert systems. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 543–585. London: Springer-Verlag.
SWARTOUT, W. R., PARIS, C. & MOORE, J. (1991). Design for explainable expert systems. IEEE Expert, June, 58–64.
TANNER, M. C., KEUNEKE, A. M. & CHANDRASEKARAN, B. (1993). Explanation using task structure and domain functional models. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 586–613. London: Springer-Verlag.
VAN DE VELDE, W. (1993). Issues in knowledge level modeling. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 211–231. London: Springer-Verlag.
VANWELKENHUYSEN, J. (1995). Using DRE to augment generic conceptual design. IEEE Expert, February, 50–56.
WICK, M. R. (1993). Second generation expert system explanation. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 614–640. London: Springer-Verlag.
WIELINGA, B., VAN DE VELDE, W., SCHREIBER, G. & AKKERMANS, H. (1993). Towards a unification of knowledge modelling approaches. In J. M. DAVID, J. P. KRIVINE & R. SIMMONS, Eds. Second Generation Expert Systems, pp. 229–335. London: Springer-Verlag.
Paper accepted for publication by Associate Editor Dr. B. Chandrasekaran.