ELSEVIER

Artificial Intelligence in Medicine 8 (1996) 267-298

Knowledge-based temporal abstraction in clinical domains

Yuval Shahar *, Mark A. Musen

Section on Medical Informatics, Knowledge Systems Laboratory, School of Medicine, Medical School Office Building (MSOB) x215, Stanford University, Stanford, CA 94305-5479, USA

Received 1 February 1995; revised 1 July 1995; accepted 2 October 1995

Abstract

We have defined a knowledge-based framework for the creation of abstract, interval-based concepts from time-stamped clinical data, the knowledge-based temporal-abstraction (KBTA) method. The KBTA method decomposes its task into five subtasks; for each subtask we propose a formal solving mechanism. Our framework emphasizes explicit representation of knowledge required for abstraction of time-oriented clinical data, and facilitates its acquisition, maintenance, reuse and sharing. The RÉSUMÉ system implements the KBTA method. We tested RÉSUMÉ in several clinical-monitoring domains, including the domain of monitoring patients who have insulin-dependent diabetes. We acquired diabetes-therapy temporal-abstraction knowledge from a diabetes-therapy expert. Two diabetes-therapy experts (including the first one) created temporal abstractions from about 800 points of diabetic-patients’ data. RÉSUMÉ generated about 80% of the abstractions agreed on by both experts; about 97% of the generated abstractions were valid. We discuss the advantages and limitations of the current architecture.

Keywords: Temporal reasoning; Knowledge acquisition; Clinical decision support; Diabetes

* Corresponding author. Tel.: (1-415) 725-3393; Fax: (1-415) 725-7944; e-mail: [email protected]

0933-3657/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved

PII S0933-3657(95)00036-4

1. Temporal abstraction in clinical domains

Most clinical tasks require measurement and capture of numerous patient data, often on electronic media. Physicians who have to make diagnostic or therapeutic decisions based on these data may be overwhelmed by the number of data if the physicians’ ability to reason with the data does not scale up to the data-storage capabilities. Most stored data include a time stamp in which the particular datum was valid; an emerging pattern over a stretch of time has much more significance than an isolated finding or even a set of findings. Experienced physicians are able to combine several significant contemporaneous findings, to abstract such findings into clinically meaningful higher-level concepts in a context-sensitive manner, and to detect significant trends in both low-level data and abstract concepts. Thus, it is desirable to provide short, informative, context-sensitive summaries of time-oriented clinical data stored on electronic media, and to be able to answer queries about abstract concepts that summarize the data. Providing these abilities would benefit both a human physician and an automated decision-support tool that recommends therapeutic and diagnostic measures based on the patient’s clinical history up to the present. Such concise, meaningful summaries, apart from their immediate value to a physician, would support the automated system’s further recommendations for diagnostic or therapeutic interventions, provide a justification for the system’s or for the human user’s actions, and monitor plans suggested by the physician or by the decision-support system. A meaningful summary cannot use only time points, such as dates when data were collected; it must be able to characterize significant features over periods of time, such as ‘5 months of decreasing liver enzyme levels in the context of recovering from hepatitis’.

We define the temporal-abstraction (TA) task as follows [28-30]: The input includes a set of time-stamped (clinical) parameters (e.g., blood glucose values), events (e.g., insulin injections) and abstraction goals (e.g., therapy of patients who have insulin-dependent diabetes). The output includes a set of interval-based, context-specific parameters at the same or at a higher level of abstraction and their respective values (e.g., ‘a period of 5 weeks of grade III toxicity of the bone marrow in the context of therapy with AZT’). The structure <<parameter, value, context>, interval> denotes that the logical proposition ‘the parameter has a particular value given a specific interpretation context’ is interpreted over a specific time interval. Such a structure is called an abstraction. Output abstractions should be relevant for clinical decision-making purposes in the given or implied clinical contexts. The goal of the TA task is to evaluate and summarize the state of the patient over a period, to identify problems, to assist in a revision of an existing therapy plan, or to support the generation of a new plan. In addition, clinical guidelines (skeletal plans for therapy) can be represented as TA patterns to be achieved, maintained, or avoided. Finally, generating meaningful abstractions supports explanation of a decision-support system’s plans to its users. Fig. 1 shows an example of input for the TA task, and the resulting output, in the case of a patient who is being treated by a clinical protocol (detailed guideline) for treatment of chronic graft-versus-host disease (GVHD), a complication of bone-marrow transplantation.
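The abstraction structure defined above can be pictured as a small record type. The following is a minimal sketch only, not the representation used by the authors' system; the class names, field names, and the choice of days as the time unit are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed time interval; start == end models a single time point."""
    start: int  # assumed unit: days on the patient's time line
    end: int

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("interval must not end before it starts")


@dataclass(frozen=True)
class Abstraction:
    """<<parameter, value, context>, interval> as a flat record."""
    parameter: str
    value: str
    context: str
    interval: Interval

    def describe(self) -> str:
        length = self.interval.end - self.interval.start
        return (f"{self.parameter} = {self.value} for {length} days "
                f"in the context of {self.context}")


# The paper's example: 5 weeks of grade III bone-marrow toxicity under AZT.
example = Abstraction("bone_marrow_toxicity", "GRADE_III", "AZT_therapy",
                      Interval(0, 35))
print(example.describe())
```

Keeping the interpretation context inside the proposition, rather than as an afterthought, mirrors the point made in the text: the same parameter value can mean different things in different contexts.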

A method solving the TA task encounters several conceptual and computational problems: (1) the input parameter values might be of several data types and at various abstraction levels (e.g., blood-glucose level = 178 mg/dl; Glucose_state = HIGH), and similarly for the required output parameters and query patterns; (2) the input data might arrive out of temporal order, and existing interpretations must be revised accordingly; (3) several alternate interpretations might need to be maintained and followed over time; (4) clinical parameters have context-specific temporal properties, such as expected persistence of measured values, but much of the knowledge is implicit (for instance, in clinical guidelines for physicians); (5) acquisition of knowledge from domain experts should be facilitated, as well as maintenance of that knowledge. The method should enable reusing its domain-independent knowledge for solving the TA task in other domains, and enable sharing of the domain-specific knowledge with other tasks in the same domain.

[Figure 1 plot: time line in days (0 to 400), starting at the BMT event.]

Fig. 1. Typical inputs to and outputs of the temporal-abstraction task in a clinical domain. The figure presents examples of abstractions of platelet and granulocyte values during administration of the PAZ protocol for treating patients who have chronic graft-versus-host disease (CGVHD). The time line starts with a bone-marrow transplantation (BMT) event. (*) platelet counts; (A) granulocyte counts; (Dashed bar) event; (Shaded arrow) open context interval; (Solid bars) closed abstraction interval; M[n] = myelotoxicity (bone-marrow toxicity) grade n.

1.1. The structure of this paper

In Section 2, we present briefly the knowledge-based temporal-abstraction methodology, which is described in detail elsewhere [28-30]. In Section 3, we discuss the architecture of the RÉSUMÉ system, which implements that methodology. We then present in detail in Section 4 a study we performed in the domain of therapy of patients who have insulin-dependent diabetes. In Section 5 we discuss the results of our experiment, and the limitations and advantages of the overall framework. In Section 6, we compare our approach to other approaches for solving the TA task in clinical domains. Section 7 summarizes the work and its conclusions.

2. The knowledge-based temporal-abstraction method and mechanisms

We have defined a general problem-solving method for interpreting data in time-oriented, knowledge-intensive domains, such as those common to clinical applications [30]. We propose a highly modular approach, with clear semantics for both the problem-solving method and for the domain-specific knowledge needed by it: the knowledge-based temporal-abstraction (KBTA) method (Fig. 2).

The KBTA method can be thought of as a knowledge-level [25] representation of the TA task and the knowledge required to solve that task. The KBTA method has a formal model of input and output entities, their relations and the domain-specific properties that are associated with these entities, the KBTA ontology [30]. The KBTA method decomposes the TA task into five parallel subtasks: (1) temporal context restriction: creation of relevant interpretation contexts crucial for focusing and limiting the scope of the inference, (2) vertical temporal inference: inference from contemporaneous propositions into higher-level concepts, (3) horizontal temporal inference: inference from similar-type clinical propositions, attached to different time intervals, (4) temporal interpolation: bridging gaps between similar-type disjoint point- or interval-based clinical propositions to create longer intervals, and (5) temporal pattern matching: creation of intervals by matching patterns over disjoint intervals, associated with clinical propositions of various types. The five TA tasks are, in fact, the basic subtasks solved in most temporal-reasoning systems in medicine, although they do not always appear explicitly [30].

[Figure 2 diagram: the temporal-abstraction task (Task), the problem-solving method, its Subtasks, the problem-solving mechanisms, and the required knowledge types.]

Fig. 2. The knowledge-based temporal-abstraction method. The temporal-abstraction Task is decomposed into five Subtasks. Each Subtask can be solved by one of five temporal-abstraction mechanisms. The temporal-abstraction mechanisms depend on four domain- and task-specific knowledge types. (Striped arrow) DECOMPOSED-INTO relation; (Dashed arrow) SOLVED-BY relation; (Solid arrow) USED-BY relation.

The five subtasks of the KBTA method are solved, respectively, by five TA mechanisms (methods that are computationally nondecomposable, at least at the knowledge level; see Fig. 2). The mechanisms, which we have described previously [28-30], include one mechanism for creating relevant temporal contexts, three basic TA mechanisms, and one TA mechanism for matching temporal patterns. The TA mechanisms produce output abstractions of several abstraction types: state (e.g., LOW), gradient (e.g., INCREASING), rate (e.g., FAST) and pattern (e.g., CRESCENDO). (Note that an abstraction of a parameter is a new parameter, usually with new TA properties.)

The context-forming mechanism creates interpretation-context intervals, a temporal frame of reference for interpretation that enables TA mechanisms to create context-specific abstractions. Interpretation contexts also enable anticipation of future complications, and interpretation of past findings, in the light of the present interpretation. Interpretation contexts are formed dynamically by the presence in the runtime database of a context-forming proposition. The relation between a context interval and its inducing task, event, or abstraction can be any of Allen’s thirteen temporal-interval relations [2] (see Fig. 8 in Section 4), thus allowing the creation of contemporaneous, prospective and retrospective interpretation contexts. Creating contexts requires knowledge about the structure of clinical tasks, events, and abstractions, and their relationship with the interpretation contexts that they induce in the past, present or future.
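To make the thirteen interval relations concrete, the sketch below classifies the relation between an inducing interval and a context interval. It is an illustrative helper, not code from the authors' system; the function name and the representation of intervals as (start, end) pairs with start < end are assumptions.

```python
def allen_relation(a, b):
    """Return the name of Allen's relation of interval a to interval b.

    Intervals are (start, end) pairs with start < end.
    """
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if b2 < a1:
        return "after"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Remaining cases: the intervals overlap with all four endpoints distinct.
    return "overlaps" if a1 < b1 else "overlapped-by"


event = (10, 20)
print(allen_relation(event, (10, 20)))  # contemporaneous context
print(allen_relation(event, (20, 50)))  # prospective context starting at the event's end
print(allen_relation(event, (0, 10)))   # retrospective context ending at the event's start
```

In this reading, a contemporaneous context corresponds to "equals", while prospective and retrospective contexts correspond to relations such as "meets" or "before" and their inverses.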

The contemporaneous-abstraction mechanism abstracts one or more parameters and their values, attached to contemporaneous time points or time intervals, into a value of a new, abstract parameter. It performs the subtasks of classification and computational transformation.
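As one concrete instance of the classification subtask, the sketch below evaluates the maximal-OR range table that the paper gives as Table 1 (Section 3.3) for the Systemic-toxicity parameter. The table entries are transcribed from that table; the function names and the dictionary layout are assumptions, as is the tie-breaking rule that an entry listed under two grades (Rigor, or a fever above 40 °C) maps to the lower grade, which agrees with the worked example in the table's footnote.

```python
# Symbolic rows of Table 1: finding -> toxicity grade (1 = Grade I, ...).
SYMBOLIC_ROWS = {
    "Chills": {"None": 1, "Shaking": 2, "Rigor": 3},
    "Skin": {"Erythema": 1, "Vesiculation": 2,
             "Desquamation": 3, "Exfoliation": 4},
    "Allergy": {"Edema": 1, "Bronchospasm": 2,
                "Bronchospasm requiring medication": 3, "Anaphylaxis": 4},
}


def fever_grade(celsius):
    """Numeric row of Table 1: temperature range -> toxicity grade."""
    if celsius <= 38.5:
        return 1
    if celsius <= 40.0:
        return 2
    return 3  # assumed: "> 40" resolves to the lower of Grades III and IV


def systemic_toxicity(findings):
    """Grade each contemporaneous finding by its row, then take the maximum."""
    if not findings:
        raise ValueError("at least one finding is required")
    grades = []
    for name, value in findings.items():
        if name == "Fever":
            grades.append(fever_grade(value))
        else:
            grades.append(SYMBOLIC_ROWS[name][value])
    return max(grades)


# Footnote example of Table 1: Fever = 39 C and Chills = Rigor.
print(systemic_toxicity({"Fever": 39.0, "Chills": "Rigor"}))
```

Each row is consulted independently, so adding a new index parameter adds one row rather than multiplying the number of combinations.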

The temporal-inference mechanism performs two subtasks: temporal-semantic inference infers specific types of interval-based logical conclusions, given interval-based propositions, using a deductive extension of Shoham’s temporal semantic properties [32]. For instance, unlike two anemia periods, two episodes of 9-month pregnancies can never be summarized as an episode of an 18-month pregnancy, even if they followed each other, since they are not concatenable, a temporal-semantic property. Similarly, a week-long episode of coma implies an abstraction of coma during each day (i.e., it has the downward-hereditary temporal-semantic property); that is not necessarily true for the abstraction ‘a week of oscillating blood pressure’. Temporal horizontal inference determines the domain value of an abstraction created from two joined abstractions (e.g., for most parameters and interpretation contexts, DECREASING and SAME might be concatenated into NONINCREASING).
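The interplay of the concatenable property and horizontal inference can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' implementation: the two tables are hypothetical fragments, concatenability is keyed by parameter alone (the paper attaches it to parameter values in a context), and abstractions are plain (parameter, value, start, end) tuples.

```python
# Hypothetical temporal-semantic knowledge: may two meeting episodes be merged?
CONCATENABLE = {"anemia": True, "pregnancy": False, "hb_gradient": True}

# Hypothetical horizontal-inference table for gradient values.
HORIZONTAL_JOIN = {
    ("DECREASING", "SAME"): "NONINCREASING",
    ("SAME", "DECREASING"): "NONINCREASING",
    ("INCREASING", "SAME"): "NONDECREASING",
    ("SAME", "INCREASING"): "NONDECREASING",
}


def join_meeting(first, second):
    """Join two abstractions whose intervals meet, or return None.

    Each abstraction is a (parameter, value, start, end) tuple.
    """
    p1, v1, s1, e1 = first
    p2, v2, s2, e2 = second
    if p1 != p2 or e1 != s2:
        return None  # different parameters, or the intervals do not meet
    if not CONCATENABLE.get(p1, False):
        return None  # e.g., two pregnancies never form one long pregnancy
    value = v1 if v1 == v2 else HORIZONTAL_JOIN.get((v1, v2))
    if value is None:
        return None  # no horizontal-inference entry for this pair of values
    return (p1, value, s1, e2)


print(join_meeting(("anemia", "TRUE", 0, 3), ("anemia", "TRUE", 3, 7)))
print(join_meeting(("pregnancy", "TRUE", 0, 9), ("pregnancy", "TRUE", 9, 18)))
print(join_meeting(("hb_gradient", "DECREASING", 0, 5),
                   ("hb_gradient", "SAME", 5, 9)))
```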

The temporal-interpolation mechanism bridges gaps between time points or time intervals, using domain-specific dynamic-change knowledge about the parameters involved. In particular, it uses local (forward and backward from an abstraction) and global (between two abstractions) truth-persistence functions to model a belief in the value of an abstraction [30]. Global truth-persistence (Δ) functions return the maximal temporal-gap threshold that can be bridged between two temporally disjoint abstractions, given the parameter involved, its value(s), the length of each abstraction, and the interpretation context of the abstractions.
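A global truth-persistence function can be read as a lookup that says how large a gap may be bridged. The sketch below illustrates that reading for one parameter in one context; the persistence function `max_gap` and its numbers are invented for the example and carry no clinical meaning, and the tuple layout (value, start, end) is an assumption.

```python
def max_gap(value, len_before, len_after):
    """Hypothetical global persistence function (illustrative numbers only).

    The bridgeable gap grows with the lengths of the two abstractions,
    up to a fixed cap. A real function would also depend on the parameter
    and on the interpretation context.
    """
    return min(0.5 * (len_before + len_after), 7.0)


def interpolate(first, second, delta=max_gap):
    """Bridge two disjoint abstractions (value, start, end) if the gap is small enough."""
    v1, s1, e1 = first
    v2, s2, e2 = second
    if v1 != v2 or s2 <= e1:
        return None  # different values, or no gap between the intervals
    gap = s2 - e1
    if gap <= delta(v1, e1 - s1, e2 - s2):
        return (v1, s1, e2)  # one longer abstraction spanning the gap
    return None


print(interpolate(("LOW", 0, 4), ("LOW", 6, 10)))  # gap of 2, threshold 4
print(interpolate(("LOW", 0, 1), ("LOW", 6, 7)))   # gap of 5, threshold 1
```

Short, isolated abstractions therefore support only short bridges, while long abstractions on both sides justify believing the value persisted over a longer unmeasured stretch.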

The temporal-pattern-matching mechanism matches predefined complex temporal patterns or runtime temporal queries with the abstractions created by the other TA mechanisms. The output is a parameter of the pattern abstraction type, such as REBOUND HYPERGLYCEMIA.


To be useful for a particular clinical domain, the TA mechanisms require instantiation with domain-specific knowledge. This domain-specific knowledge (mostly declarative) is the only interface between the KBTA method and the knowledge engineer or the domain expert. Thus, the development of a TA system particular to a new domain relies only on creating or editing a predefined set of knowledge categories. We distinguish among four domain knowledge types used by the TA mechanisms (see Fig. 2): (1) structural knowledge (e.g., IS-A, ABSTRACTED-FROM, and PART-OF relations and statistical measurement scales such as ORDINAL); (2) classification knowledge (including vertical classification knowledge, such as definition of a blood-glucose range as LOW, and horizontal classification knowledge, such as definition of temporal patterns); (3) temporal-semantic knowledge (e.g., the relations among propositions attached to intervals and to their subintervals, or to two meeting intervals); and (4) temporal dynamic knowledge (e.g., temporal persistence of the value of a parameter when not measured and the value of a minimal significant change within a certain interpretation context). The TA mechanisms use the parameterized domain knowledge in a predefined fashion [30].

The domain-specific knowledge required by the TA mechanisms is represented as the TA ontology of the domain. A TA ontology includes a parameter-properties ontology: a theory of the relevant parameters and their temporal properties in the domain and the relations among these parameters (e.g., IS-A, ABSTRACTED-FROM). The parameter-properties ontology is used by all the TA mechanisms. The context-forming mechanism also requires an event ontology, which includes event interrelations (e.g., PART-OF relations) and properties, and a context ontology, which includes relations among interpretation contexts (e.g., the SUBCONTEXT relation). The domain’s TA ontology also includes a set of abstraction goals and a set of dynamic induction relations of context intervals (DIRCs), which represent temporal relations among interval-based propositions (events, parameters and abstraction goals) and the context intervals that these propositions induce when asserted or computed in the runtime temporal fact base [30].

3. The RÉSUMÉ system

We have developed a computer program, RÉSUMÉ, that implements the KBTA method to create temporal abstractions when given time-stamped patient data, clinical events, and the domain’s TA ontology [29]. The RÉSUMÉ architecture is shown in Fig. 3. RÉSUMÉ is implemented in CLIPS [12] and is composed of a temporal-reasoning module, a static domain knowledge base (the domain’s TA ontology), and a dynamic temporal fact base, i.e., the input and output data. The temporal fact base is coupled loosely to an external database, where primitive patient data and clinical events are stored and updated externally. The inferred abstractions are stored with the input data in the temporal fact base. The TA mechanisms in RÉSUMÉ do not operate in a fixed order; they are activated by the currently available data and the previously derived abstractions. The effects of updates to input parameter and event intervals, which might cause deletion of existing, previously concluded contexts and abstractions, are propagated through a truth-maintenance system augmented by the semantics of the TA process [29]. (The dynamic temporal fact base is thus essentially a historic database [33]). The truth-maintenance system, which maintains logical dependencies among parameters (and contexts) and the abstractions formed from them, is usually activated by modifications to the input data or when inconsistencies are detected by the temporal-inference mechanism [30]. The control structure of the RÉSUMÉ system allows several levels of task-specific control, including a form of goal-directed control (e.g., specifying the desired output abstraction types and parameter classes, the TA mechanisms to be used, and the relevant contexts).

[Figure 3 diagram: input primitive data; domain knowledge base (event ontology, context ontology, parameter ontology); temporal fact base (input events, inferred contexts, inferred abstractions); temporal-reasoning mechanisms (including the context-forming mechanism).]

Fig. 3. A schematic view of the RÉSUMÉ system’s architecture. The temporal fact base stores intervals representing external events, abstractions, and raw data, as well as system-created, interval-based interpretation contexts and abstractions. Initial data in the temporal fact base are derived from an external database. The context-forming mechanism is triggered by events, abstractions and existing contexts to create or remove contexts. The temporal-abstraction mechanisms are triggered by intervals and contexts in the temporal fact base to create or retract abstracted intervals. Both mechanism types use domain-specific knowledge represented in the domain’s ontology of events, contexts and parameters. (Dashed bar) Event; (Shaded bar) closed context interval; (Solid bar) abstraction interval; (→) data or knowledge flow.

3.1. The parameter-properties ontology

The RÉSUMÉ system represents the domain-specific parameter ontology in a special knowledge structure called the domain’s parameter-properties ontology. The parameter-properties ontology represents the parameter entities in the domain, their properties, and the relations between them [29]. Fig. 4 shows a small part of the parameter-properties ontology used for the task of managing patients who are being treated by clinical guidelines, such as experimental clinical protocols. The parameter-properties ontology can be viewed as an IS-A frame hierarchy that specializes parameters and their properties by their type (e.g., ABSTRACT), by their domain class (e.g., hematology), and by their relevant interpretation contexts (e.g., certain classification ranges are true only within a particular chemotherapy protocol). Relations such as ABSTRACTED-FROM are represented as frame slots. Property types common to all parameters include, for instance, the allowed values (or ranges), the amount of change considered to be clinically significant in each context, and the type of scale with which the parameter can be measured and reasoned (nominal, ordinal, interval, or ratio). Properties can be inherited from one or more classes. For instance, all primitive (raw data) laboratory parameters inherit the interval scale as a default, whereas the default for abstract parameters is an ordinal scale. Properties common to all abstract parameters include, for instance, ABSTRACTED-FROM relations to the defining parameters and qualitative dependencies on these parameters (e.g., POSITIVE MONOTONIC). Relations to dynamically induced contexts also appear (see Section 3.2).

Fig. 4. A portion of the RÉSUMÉ parameter-properties ontology for the domain of protocol-based care, showing a specialization of the temporal-abstraction properties for the granulocyte_state_abstraction (PSA) abstract parameter in the context of the prednisone/azathioprine (PAZ) experimental protocol for treating chronic graft-versus-host disease and in the context of each part of that protocol. (Oval) Class; (0) property; (Solid arrow) IS-A relation; (Striped arrow) PROPERTY-OF relation; (Dashed arrow) ABSTRACTED-FROM relation.

An important feature of the representation scheme shown in Fig. 4 is organization of abstract parameters by four output-abstraction types (STATE, GRADIENT, RATE and PATTERN). For example, the frame of the Granulocyte-gradient parameter specialized to the interpretation context of the PAZ protocol for treatment of chronic-GVHD patients has an IS-A link to the Granulocyte-gradient class, which has an IS-A link to the Gradient-abstractions class. Thus, it would inherit the ABSTRACT parameter type, the GRADIENT abstraction type, the default allowed values (SAME, INCREASING, DECREASING, etc.), the default horizontal-classification function (e.g., INCREASING ⊕ SAME = NONDECREASING), and the default gradient temporal-semantic-properties table, which includes tuples such as <INCREASING, concatenable, TRUE>. The context-specific Δ-function table (if the default is overridden) uses the default Granulocyte-gradient time unit (DAY) and includes tuples such as <INCREASING, 4, 5, 3> (i.e., f: value, time-before, time-after → maximal time gap). The ABSTRACTED-FROM relation is to Granulocyte_level with a SINGLE-cardinality dependence. Similarly, the knowledge requirements for the Granulocyte-state-abstractions class are inherited from the State-abstractions class, and include also vertical-classification functions. The Pattern-abstractions class includes also properties such as a TA pattern, which includes a defining input pattern (a set of parameter intervals), conditions (a set of value and time constraints), and concluded pattern (abstraction interval). Each abstraction class (e.g., state) is specialized for particular abstract parameters (e.g., state abstractions of platelets) and for specific relevant interpretation contexts (e.g., the PAZ protocol). This organization is very flexible for representing and modifying quickly TA knowledge in several domains.
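The inheritance of properties along IS-A links can be sketched with a dictionary of frames. This is a toy model for illustration, not the frame system used by the authors: the slot names, the frame key for the PAZ-specialized node, and the lookup function are assumptions, while the property values are the ones mentioned in the text.

```python
# A three-level fragment of the hierarchy described in the text.
FRAMES = {
    "Gradient-abstractions": {
        "is_a": None,
        "parameter_type": "ABSTRACT",
        "abstraction_type": "GRADIENT",
        "allowed_values": ("SAME", "INCREASING", "DECREASING"),
    },
    "Granulocyte-gradient": {
        "is_a": "Gradient-abstractions",
        "time_unit": "DAY",
        "abstracted_from": "Granulocyte_level",
    },
    # Context-specialized node (the key format is hypothetical).
    "Granulocyte-gradient/PAZ": {
        "is_a": "Granulocyte-gradient",
        # (value, time-before, time-after) -> maximal time gap
        "delta_table": {("INCREASING", 4, 5): 3},
    },
}


def get_property(frame, name):
    """Return the nearest definition of a property, walking up IS-A links."""
    while frame is not None:
        slots = FRAMES[frame]
        if name in slots:
            return slots[name]
        frame = slots["is_a"]
    raise KeyError(name)


node = "Granulocyte-gradient/PAZ"
print(get_property(node, "abstraction_type"))  # inherited from the top class
print(get_property(node, "time_unit"))         # inherited from the parameter class
print(get_property(node, "delta_table"))       # defined for the PAZ context
```

A context-specialized frame only needs to state what differs from its parent, which is why editing knowledge for a new context tends to be a small change.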

The parameter-properties ontology does not contain a context-specialized node corresponding to every potential interpretation context. The nonexistence of a specialization signifies that, for that particular context, the abstraction is not relevant, thus cutting down on unnecessary inferences.

3.2. The event and context ontologies

The event ontology in the RÉSUMÉ system is a frame hierarchy with IS-A and PART-OF relations among frames [29] (see Fig. 6 in Section 4 for an example in the diabetes-therapy domain). Typically, a pair of interpretation contexts that are induced by event types that belong to a PART-OF relation in the event ontology belongs to a SUBCONTEXT relation in the context ontology (see Figs. 6 and 7 in Section 4). An event chain is a directed path in the event-ontology graph, starting from the topmost (EVENTS) node and ending in any nonterminal or terminal node, walking over PART-OF relations (see Fig. 6). Contemporary events whose event types can be found in the event ontology along the same event chain often induce at runtime a composite interpretation context. Such a context is formed from a chain of interpretation contexts where each pair belongs to a SUBCONTEXT relation in the context ontology.

The event ontology includes all relevant events, subevents, and event dynamic induction relations of context intervals (DIRCs), which the context-forming mechanism can use at runtime (see Fig. 8 in Section 4 for an example in the diabetes-therapy domain). The default DIRC involving an event comprises a single DIRC representing an interpretation context concurrent with the event, whose name is the event’s name (DIRCs are referenced in similar fashion by context-inducing parameter propositions in the parameter-properties ontology). In addition, the event ontology is used by the context-forming mechanism to disambiguate the relationship (e.g., PART-OF) of several coexisting events, and thus to decide when it is reasonable to form a new, more specialized, context when two or more events coexist (e.g., when both a CCTG-522 protocol event and an AZT event that is a part of it are detected), and which event is the subevent (i.e., the subcontext) of the other. Similar reasoning is useful also for purposes of acquisition and maintenance of the context-forming knowledge.
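The use of PART-OF relations to decide whether coexisting events form a composite context can be sketched as below. The sketch makes several simplifying assumptions that are not in the paper: the ontology fragment contains a single PART-OF link, the topmost EVENTS node is omitted from the chain, each event induces a context that carries the event's own name (the default DIRC), temporal extents are ignored, and the "+"-joined name is an invented notation.

```python
# Hypothetical fragment of an event ontology: child event type -> parent type.
PART_OF = {"AZT": "CCTG-522"}


def event_chain(event_type):
    """Return the PART-OF path from the topmost ancestor down to event_type."""
    chain = [event_type]
    while chain[-1] in PART_OF:
        chain.append(PART_OF[chain[-1]])
    return list(reversed(chain))


def composite_context(active_events):
    """Name a composite context if all coexisting events lie on one event chain.

    Returns the chain-ordered context name, or None when the events are unrelated.
    """
    if not active_events:
        return None
    longest = max((event_chain(e) for e in active_events), key=len)
    if set(active_events) <= set(longest):
        return "+".join(e for e in longest if e in active_events)
    return None


print(event_chain("AZT"))
print(composite_context(["AZT", "CCTG-522"]))  # subevent and its protocol
print(composite_context(["AZT", "insulin"]))   # no shared chain
```

The ordering of the result also records which event is the subevent: the more specific event always appears after the event it is a part of.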


The set of all the potentially relevant interpretation contexts and subcontexts of the domain and their properties defines a context ontology for the domain. The context ontology, like the parameter and event ontologies, can be represented as a frame hierarchy [30] (see Fig. 7 in Section 4 for an example in the diabetes-therapy domain). The types of semantic links among context nodes in the context ontology include IS-A and SUBCONTEXT relations. The knowledge represented in the context ontology complements the knowledge represented in the parameter ontology and the event ontology and assists the context-forming mechanism in forming correctly context intervals from several contemporaneous context intervals. For instance, the interpretation context induced by an event, or one of that event’s subevents (subparts), does not necessarily bear the name of its inducing event, nor does it necessarily have the same temporal scope (see Figs. 6 and 8 in Section 4); the only indication for a SUBCONTEXT relationship exists in that case in the context ontology (see Fig. 7 in Section 4). Thus, an AZT subevent of the CCTG-522 protocol event induces a potential AZT-TOXICITY interpretation context that has a different name from the inducing event, whose context interval has a different temporal scope from its inducing event (see Fig. 8 for a diabetes-therapy-domain example), and that has a SUBCONTEXT relation to the CCTG-522 interpretation context induced by a CCTG-522 protocol event.

The context ontology has other uses in addition to enabling the representation of knowledge about the interpretation contexts induced by events, abstractions, or abstraction goals, rather than about the propositions inducing these contexts. For instance, in the domain of diabetes therapy, the PREBREAKFAST interpretation context (induced by a morning meal), important for interpreting correctly fasting blood-glucose values, has an IS-A relationship to the more general PREPRANDIAL interpretation context (induced before any meal; see Fig. 7 in Section 4). Such a relationship can be represented explicitly only in the context ontology of the diabetes-monitoring domain. Finally, special types of interpretation contexts such as generalized interpretation contexts (representing a generalization of two or more interpretation contexts) and nonconvex contexts (representing a context interval whose subintervals are disjoint, that is, a nonconvex temporal interval [20]) appear explicitly only in the context ontology [30].

The TA mechanisms (apart from the context-forming mechanism), which operate within the temporal span of interpretation-context intervals, do not depend on the event and context ontologies. These mechanisms assume the existence of interpretation-context intervals and of parameter propositions that include interpretation contexts. The context-forming mechanism is thus the only interface with the domain’s events (or rather, with the task-specific representation of these events in the event ontology), and shields the rest of the TA mechanisms from any need to know about these events, their structure, or the interpretation contexts they induce.

3.3. Semantics of classification functions

For representing various types of classifi$ation and functional knowledge in as declarative a manner as possible, the RESUME system uses several types of tables for representing knowledge. The internal representation of these tables is uniform. The Rl%UMfi system interprets the same table in different ways, depending on the table’s


Table 1
A 4:1 (numeric and symbolic to symbolic) maximal-OR range table for the Systemic-toxicity parameter in the context of the CCTG-522 experimental AIDS-treatment protocol

Parameter   Toxicity-grade value
            Grade I     Grade II       Grade III                           Grade IV
Fever       ≤ 38.5°C    ≤ 40.0°C       > 40°C                              > 40°C
Chills      None        Shaking        Rigor                               Rigor
Skin        Erythema    Vesiculation   Desquamation                        Exfoliation
Allergy     Edema       Bronchospasm   Bronchospasm requiring medication   Anaphylaxis

The toxicity-grade value is determined for every one of the index parameters, or rows, in the table (e.g., Fever = 39°C, Chills = Rigor); then, the maximal value (in this case, Grade III) is selected.

declarative semantic type (i.e., the classification-function type that the table represents)

[30]. The table semantic types are combined from several different orthogonal dimensions, or axes, such as whether the classification function represents a conjunction (AND table) or a disjunction (OR table), and whether a maximal or minimal value is

being selected (Table 1). OR tables are quite common in clinical medicine. Given n parameters, each with k possible values or ranges that are being abstracted into an

abstract parameter with j possible values, an OR table reduces the space complexity from O(k^n) to O(n * j). Adding attributes to an OR table increases the size of the table linearly and not exponentially (as in the case of AND tables), a considerable gain if an OR representation is possible. Tables are a very concise, declarative representation for many parameterized inference rules. Table objects can be inherited by

more specialized parameter classes. For instance, classification and global-persistence (Δ) function values for the same parameter in a more restricted context are typically inherited as defaults, and require that a developer perform only minor editing of the table. Table objects also have simple, well-defined semantics for interpreting the

function that the table represents, depending on the table’s semantic type. By knowing the semantics of a given function, the TA mechanisms can also perform more sophisticated reasoning during runtime, such as inferring reasonable missing values. The tables have a uniform representation and are easily acquired and modified. Their semantics rely only

on the small set of interpretation axes [30].

When a table is not sufficient, a general ‘black box’ function type is available; several measures are used in the RÉSUMÉ system to reduce the ambiguity of such functions and to increase the amount of automated reasoning possible regarding such functions. Measures include, for instance, the disciplined use, when possible, of classification tables with predefined semantics, and the explicit representation of ABSTRACTED-INTO relations and qualitative-dependencies relations in the parameter ontology for all functions.
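The maximal-OR semantics of Table 1 can be illustrated with a minimal Python sketch. It is not the RÉSUMÉ implementation; the names `TOXICITY_TABLE`, `row_grade` and `systemic_toxicity` are hypothetical. Because Grades III and IV share the same cell for Fever and for Chills, the sketch assumes that the first (lowest) matching grade in a row is taken, which agrees with the worked example in the table note.

```python
from typing import Callable, Dict, List, Union

Value = Union[float, str]
Cell = Callable[[Value], bool]

GRADES = ["Grade I", "Grade II", "Grade III", "Grade IV"]


def at_most(limit: float) -> Cell:
    return lambda value: value <= limit


def above(limit: float) -> Cell:
    return lambda value: value > limit


def equals(symbol: str) -> Cell:
    return lambda value: value == symbol


# One row per index parameter, one cell per grade (contents of Table 1).
TOXICITY_TABLE: Dict[str, List[Cell]] = {
    "Fever": [at_most(38.5), at_most(40.0), above(40.0), above(40.0)],
    "Chills": [equals("None"), equals("Shaking"),
               equals("Rigor"), equals("Rigor")],
    "Skin": [equals("Erythema"), equals("Vesiculation"),
             equals("Desquamation"), equals("Exfoliation")],
    "Allergy": [equals("Edema"), equals("Bronchospasm"),
                equals("Bronchospasm requiring medication"),
                equals("Anaphylaxis")],
}


def row_grade(parameter: str, value: Value) -> int:
    """Return the index of the first (lowest) grade whose cell accepts value.

    ASSUMPTION: the first match wins when two grades share the same cell.
    """
    for index, cell in enumerate(TOXICITY_TABLE[parameter]):
        if cell(value):
            return index
    raise ValueError(f"{parameter}={value!r} matches no cell")


def systemic_toxicity(findings: Dict[str, Value]) -> str:
    """Maximal-OR semantics: grade every row, then select the maximum."""
    return GRADES[max(row_grade(p, v) for p, v in findings.items())]


print(systemic_toxicity({"Fever": 39.0, "Chills": "Rigor"}))  # Grade III
```

The table holds n rows of j cells each, so adding a fifth index parameter adds one row rather than multiplying the number of entries.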

4. Applications of the KBTA framework

In this section, we demonstrate (1) the work involved in acquiring and representing the four types of knowledge that the knowledge-based temporal-abstraction method


requires, for the purpose of building a TA system in a new clinical area, (2) the results of that knowledge-acquisition effort and (3) the results of using the RÉSUMÉ system in a particular clinical domain.

We have tested the RÉSUMÉ system in several different clinical domains: protocol-based care (and three of its subdomains) [29,30], monitoring of children’s growth [19,30], and therapy of insulin-dependent diabetes patients [30]. We have applied the RÉSUMÉ methodology to each domain in varying degrees. Sometimes, we focused on evaluating the feasibility of knowledge acquisition (including the time required for that process), knowledge representation and knowledge maintenance (i.e., modifications to the resultant knowledge base). In other cases, we emphasized application of the resultant instantiated TA mechanisms to several clinical test cases. In the diabetes-therapy domain, we applied the RÉSUMÉ system, instantiated by the proper domain ontology, to a larger set of clinical data. We thus examined most of the expected life cycle in the development and maintenance of a TA system. In this paper we present the application of RÉSUMÉ to a set of clinical cases in the domain of diabetes therapy.

4.1. KBTA in the domain of therapy for insulin-dependent diabetes

We have applied the KBTA methodology to the monitoring task in the domain of treatment of insulin-dependent diabetes mellitus (DM) patients. Two endocrinologists

who are experts in therapy of diabetes were the domain experts for this experiment.

4.1.1. Creation of the diabetes-therapy TA ontology

We created a temporal-abstraction ontology for the domain of insulin-dependent diabetes (Figs. 5-8). The ontology included a parameter-properties ontology (Fig. 5), an event ontology (e.g., insulin therapy, meals, physical exercise) (Fig. 6), a context

ontology (e.g., preprandial [measured at fasting time, before a meal] and postprandial

[after a meal] contexts and subcontexts, and postexercise contexts) (Fig. 7), and the relevant dynamic induction relations of context intervals (DIRCs) (Fig. 8). Acquisition

of the ontologies and filling of all tables required about four 2-h meetings with one expert (three additional meetings with each of the two experts were needed for carrying out the particular experiment that we describe in this section).

In the diabetes-therapy ontology, administrations of regular insulin and of isophane insulin suspension (NPH) are events (see Fig. 6), inducing different insulin-action interpretation contexts that are subcontexts of the DM interpretation context (see Fig. 8a), which represents the context of treating diabetes. Meals are events that induce preprandial and postprandial contexts (see Fig. 8b). Thus, values for the Glucose_state_DM_prebreakfast (the state of glucose in the context of DM and

measurement before breakfast) parameter (see Fig. 5) can be created, when relevant,

regardless of absolute time. The Glucose_state parameter is a new parameter with six values defined from corresponding ranges used by the domain expert (HYPOGLYCEMIA, LOW, NORMAL, HIGH, VERY HIGH, EXTREMELY HIGH). These values are sensitive to the context in which they are generated; for instance, postprandial values allow for a higher range of the normal value. Glucose_state propositions (for all allowed values) have the value TRUE for the temporal-semantic property concatenable (see Section 2) in the same meal-phase context.


Fig. 5. Part of the diabetes parameter-properties ontology. The Glucose parameter is abstracted into the Glucose_state parameter. This abstract parameter has a specialized subclass in the DM context, and is abstracted in that context into the Glucose_state_state parameter. The Glucose_state_DM class is further specialized in the preprandial and postprandial contexts, each of which has several subclasses corresponding to the different relevant premeal contexts. (Oval) Class; (0) property; (Solid arrow) IS-A relation; (Dashed arrow) ABSTRACTED-FROM relation; (Striped arrow) PROPERTY-OF relation; DM, diabetes mellitus.

The Glucose_state_state parameter is a higher-level abstraction of the Glucose_state parameter, which maps its six values into three (LOW, NORMAL, HIGH, or L, N, H for short). It has different semantic properties, and allows creation of daily horizontal-inference patterns within a nonconvex preprandial context (see Section 3.2) representing abstraction over several meal phases, such as LLH (LOW, LOW and HIGH Glucose_state_state values over breakfast, lunch, and supper, respectively). Patterns such as LLH values for the Glucose_state_state parameter, especially in the preprandial subcontext, are extremely useful when a physician must decide how to modify a patient’s insulin regimen. Furthermore, once created, the prevalence of such patterns can be calculated, an important step in determining whether the pattern is a common one for the patient.
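As an illustration of the mapping from Glucose_state to Glucose_state_state and of the daily patterns built from it, consider the following Python sketch. The grouping of the six values into three is our assumption for the example (the text states only that six values map into three), and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional

# ASSUMPTION: this particular six-to-three grouping is illustrative only.
STATE_TO_STATE_STATE = {
    "HYPOGLYCEMIA": "L",
    "LOW": "L",
    "NORMAL": "N",
    "HIGH": "H",
    "VERY HIGH": "H",
    "EXTREMELY HIGH": "H",
}

PHASES = ("prebreakfast", "prelunch", "presupper")


def daily_pattern(day: Dict[str, str]) -> Optional[str]:
    """Build a pattern such as 'LLH' from one day's preprandial states.

    Returns None when any of the three phases was not measured.
    """
    if any(phase not in day for phase in PHASES):
        return None
    return "".join(STATE_TO_STATE_STATE[day[phase]] for phase in PHASES)


def pattern_prevalence(days: List[Dict[str, str]], pattern: str) -> float:
    """Fraction of fully measured days that show the given pattern."""
    complete = [p for p in map(daily_pattern, days) if p is not None]
    if not complete:
        return 0.0
    return Counter(complete)[pattern] / len(complete)


days = [
    {"prebreakfast": "LOW", "prelunch": "HYPOGLYCEMIA", "presupper": "VERY HIGH"},
    {"prebreakfast": "NORMAL", "prelunch": "NORMAL", "presupper": "HIGH"},
    {"prebreakfast": "LOW", "prelunch": "LOW"},  # presupper value missing
]
print([daily_pattern(d) for d in days])   # ['LLH', 'NNH', None]
print(pattern_prevalence(days, "LLH"))    # 0.5
```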

Glucose_state_state values that are measured within different phases (e.g., prelunch and presupper), but within the same day, can be joined by interpolation within the same

generalizing (see Section 3.2) interpretation context (e.g., a nonconvex version of the more general PREPRANDIAL context interval) creating an abstraction comprising several preprandial abstractions, up to 6-8 h apart. The maximal gap is defined by an interphase


Fig. 6. Part of the diabetes event ontology. (Oval) Class; (0) induced interpretation context; (Solid arrow) IS-A relation; (Striped arrow) PART-OF relation; (Dashed arrow) INDUCED-BY relation.

Δ function. Diurnal state abstractions that are measured in the same phase but over different (usually consecutive) days, such as several values of the Glucose_state_DM_prebreakfast parameter, can be joined by interpolation within the same interpretation context (e.g., a nonconvex PREBREAKFAST context interval that comprises all breakfasts within a given interval), up to 24-48 h apart, using another interphase Δ function.
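The joining of same-valued abstractions across a bounded gap can be sketched in Python as follows. This is a simplified illustration, not the temporal-interpolation mechanism itself: it assumes that the maximal gap is a single constant (here 48 h, the upper bound quoted above), that all inputs already belong to one interpretation context, and that times are given in hours. The names `Interval` and `join_by_max_gap` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interval:
    start: float  # hours since an arbitrary origin
    end: float
    value: str    # e.g. a Glucose_state_state value


def join_by_max_gap(items: List[Interval], max_gap_hours: float) -> List[Interval]:
    """Merge consecutive intervals that share a value and lie close enough.

    ASSUMPTION: a constant maximal gap stands in for the interphase
    maximal-gap function, which in general may depend on more than the gap.
    """
    joined: List[Interval] = []
    for item in sorted(items, key=lambda i: i.start):
        last = joined[-1] if joined else None
        if (last is not None and last.value == item.value
                and item.start - last.end <= max_gap_hours):
            last.end = max(last.end, item.end)   # extend the open interval
        else:
            joined.append(Interval(item.start, item.end, item.value))
    return joined


# Presupper measurements at 18:00 on days 1, 2, 3 and 7 (all HIGH).
presupper = [Interval(18, 18, "HIGH"), Interval(42, 42, "HIGH"),
             Interval(66, 66, "HIGH"), Interval(162, 162, "HIGH")]
for interval in join_by_max_gap(presupper, max_gap_hours=48):
    print(interval)
```

With these inputs the first three points are joined into one HIGH interval from hour 18 to hour 66, whereas the fourth point, 96 h later, remains separate.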


Fig. 7. Part of the diabetes context ontology. (Oval) Class; (Solid arrow) IS-A relation; (Striped arrow)

SUBCONTEXT relation; DM, diabetes mellitus.


Fig. 8. Induction of interpretation contexts in the diabetes-treatment domain, using dynamic induction relations of context intervals (DIRCs). (a) Creation of a Regular_insulin_action context, induced by a Regular_insulin administration event, and of the corresponding DM_regular_insulin_action composite context. (b) Creation of the Postprandial and Preprandial (prospective and retrospective, respectively) context intervals, induced by a Meal event, and formation of the corresponding DM composite contexts, using the SUBCONTEXT relation. (Dashed bar) event; (Solid bar) closed context interval; DM, diabetes mellitus (therapy context).
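The declarative character of the induction relations in Fig. 8 can be conveyed by a small Python sketch. It treats every inducing proposition as a time point and every DIRC as a pair of offsets from that point, which is a simplification. The meal offsets (1 h before, 2 h after) are invented for illustration; only the 2-week retrospective DM context induced by the DM_planning abstraction goal is taken from Section 4.1.2. The names `Dirc`, `DIRCS` and `induce_contexts` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Dirc:
    inducing: str          # name of the inducing event or abstraction goal
    context: str           # name of the induced interpretation context
    start_offset_h: float  # negative offsets give retrospective contexts
    end_offset_h: float


DIRCS = [
    Dirc("Meal", "Preprandial", -1.0, 0.0),        # ASSUMED duration
    Dirc("Meal", "Postprandial", 0.0, 2.0),        # ASSUMED duration
    Dirc("DM_planning", "DM", -14 * 24.0, 0.0),    # preceding 2 weeks
]


def induce_contexts(proposition: str, time_h: float) -> List[Tuple[str, float, float]]:
    """Return (context, start, end) for every DIRC the proposition triggers."""
    return [(d.context, time_h + d.start_offset_h, time_h + d.end_offset_h)
            for d in DIRCS if d.inducing == proposition]


print(induce_contexts("Meal", 8.0))
print(induce_contexts("DM_planning", 336.0))
```

Changing the time window then amounts to editing one entry of `DIRCS`, without touching the induction code.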

4.1.2. Materials and methods

The TA process is initiated by asserting in an appropriate place in the temporal fact base an abstraction-goal proposition named DM_planning. Asserting this proposition initiates the reasoning by inducing a DM retrospective-context interval for the preceding 2 weeks. This interpretation context enables, within its scope, creation of the DM domain abstractions. The time window of 2 weeks is used by the domain expert in

practice. It can be easily modified for particular applications, by changing the declarative definition of the dynamic induction relation of a context interval (DIRC) associated

with the abstraction-goal proposition (see Section 2). The data used for the study were taken from a set of electronic clinical records of

patients who have insulin-dependent diabetes and who were followed for several months each. The raw data were originally stored in a text format (patient identification number, parameter code, time, value). Most of the glucose-parameter codes referred to specific meal times (e.g., a presupper-glucose code) but some referred simply to ‘non-specific’ glucose values, in which case only the time stamp was known. The data


included mostly glucose and insulin codes. Special events (e.g., physical exercise and larger-than-usual meals) were sometimes reported too, as were symptoms of hypoglycemia.

The data were converted into tuples in a relational database. From the database, it was relatively straightforward to map the data into the RÉSUMÉ temporal fact base as event and parameter intervals, adding the implied contexts when relevant. From the database, it was also possible to map the data into a spreadsheet, organizing the data by common measurement times, thus highlighting contemporaneous events or parameters.

The spreadsheet was useful for the production of graphical charts and detailed tables during the knowledge-acquisition and evaluation processes.

We submitted to the two diabetes-therapy experts (separately) eight data segments

from eight different patients, each segment consisting of 2 consecutive weeks (14 days)

of glucose-measurement and insulin-administration data. The data were presented to the experts as graphical charts and as detailed tables. The tables were included for additional reference, such as for looking up the precise dose of insulin administered or a particular

glucose value. Overall, each expert examined 112 days of data. Each day included 3-4 insulin-administration events and 3-5 time-stamped glucose measurements. Thus, each expert examined approximately 800 data points. We asked the experts to mark on the

charts the significant point- or interval-based abstractions that they would make from the data. We also asked for their therapy recommendations.

An example of running the RÉSUMÉ system on some of the data, using as inputs both the diabetes ontology (Figs. 5-8) and the patient-specific raw data, is shown in Fig. 9.

[Fig. 9 plot: blood-glucose values (axis marked at 100 and 200) versus time (7/22 to 7/26, dates in 1990), together with the DM_planning event, the DM context interval, and the presupper and preprandial Glucose_state_state abstraction intervals.]

Fig. 9. Abstraction of data by the RÉSUMÉ system. The figure shows abstractions of the Glucose_state_state parameter in the diabetes-therapy presupper and preprandial contexts. (Striped arrow) (open) Context interval; (Solid bar) abstraction interval; (0) prebreakfast glucose; (0) prelunch glucose; (A) presupper glucose; DM, diabetes mellitus therapy interpretation context.


In the particular time window shown, two significant abstractions are demonstrated:

(1) A period of 5 days of HIGH presupper blood-glucose values was created by the abstraction process. This abstraction was returned in response to an external (runtime) query for a period of at least 3 days of the Glucose_state_state parameter, with value HIGH, in the presupper [nonconvex] context; (2) a set of three Glucose_state_state abstractions representing a repeating diurnal pattern, consisting of NORMAL or LOW blood-glucose levels during the morning and lunch measurements, and HIGH glucose levels during the presupper measurements. Individual abstractions in the set were created by the abstraction process; the whole set was returned in response to an external query for Glucose_state_state values in the preprandial [nonconvex] context. The combined pattern suggests that an adjustment of the intermediate-acting insulin (e.g., NPH) may be

indicated. This pattern was noted in the data by both experts. Both of the abstractions involved in the combined pattern shown in Fig. 9 also can be

predefined as internal PATTERN-type parameters, if desired. In the general case, however, the second pattern involves a counting step (n occurrences of pattern P within

a certain period). Counting is a simple statistical abstraction, but is currently, for reasons of simplicity, outside the scope of the RÉSUMÉ temporal-pattern-matching language,

which focuses on time and value constraints (statistical patterns can, in principle, be queried by saving the abstractions in the external database and employing additional

temporal-query mechanisms).
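The two kinds of query discussed above can be sketched in Python over abstractions that have already been saved outside the system. The first function expresses a time and value constraint (a value holding for at least a minimal number of days); the second performs the counting step (n occurrences of a pattern within a period). The tuple layout and function names are assumptions made for the example and do not reflect the RÉSUMÉ query language.

```python
from typing import List, Tuple

# (first day, last day, value); days are inclusive integers.
Abstraction = Tuple[int, int, str]


def query_min_duration(abstractions: List[Abstraction], value: str,
                       min_days: int) -> List[Abstraction]:
    """Intervals with the given value that span at least min_days days."""
    return [a for a in abstractions
            if a[2] == value and a[1] - a[0] + 1 >= min_days]


def count_pattern(daily_patterns: List[Tuple[int, str]], pattern: str,
                  first_day: int, last_day: int) -> int:
    """Number of days in [first_day, last_day] showing the pattern."""
    return sum(1 for day, observed in daily_patterns
               if observed == pattern and first_day <= day <= last_day)


presupper = [(1, 5, "HIGH"), (6, 7, "NORMAL"), (9, 10, "HIGH")]
print(query_min_duration(presupper, "HIGH", min_days=3))  # [(1, 5, 'HIGH')]

daily = [(1, "LLH"), (2, "NLH"), (3, "LLH"), (8, "LLH")]
print(count_pattern(daily, "LLH", first_day=1, last_day=7))  # 2
```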

4.1.3. Results

The results of the abstraction experiment are listed in Table 2. Together, the two experts noted 402 abstractions. The abstractions included 218 different interval-based abstractions (185 temporal abstractions, such as ‘increasing glucose values from prebreakfast values to presupper values during the second week’ and 29 different statistical abstractions, such as ‘large variance of presupper glucose values’). 188 different

abstractions (164 temporal and 24 statistical) were noted by both experts. Of the

remaining 30 abstractions mentioned by only one of the experts, most were noted by only that expert because the other expert omitted to comment explicitly on the same data point(s), but his interpretation either was compatible with the first expert’s abstraction, or clearly supported or implied that interpretation, as judged by the experts in retrospect (each expert was asked to comment about other, alternative interpretations, that included the other expert’s abstractions). We recorded (see Table 2) how many abstractions were compatible or incompatible with the interpretation of the other expert (incompatible

Table 2
Abstractions formed by two experts in the diabetes domain

Expert     Temporal abstractions         Statistical abstractions      Total
           Compatible   Incompatible     Compatible   Incompatible
I          179          2                24           1                206
II         164          4                26           2                196
Subtotal   343          6                50           3                402
Total      349                           53                            402


meaning mutually exclusive with the abstractions created for that time period by the

other expert). Only nine abstractions overall (six temporal, three statistical), each of which was

created by just one of the experts, were noncompatible abstractions of the same data. For example, in one abstraction, one expert interpreted a particular pattern of early morning hypoglycemia followed by a high prebreakfast glucose value as a possible Somogyi effect¹. The other expert interpreted the same sequence as rebound hyperglycemia caused by increased food intake due to the hypoglycemia symptoms. Other idiosyncratic

abstractions included noting a pattern of very high glucose levels following larger than usual meals during particular phases of the day, a pattern of low prebreakfast and high

presupper glucose values in the context of an UltraLente insulin regimen, a high variance in bedtime glucose measurements, and a weekend ‘large brunch’ phenomenon

causing a pattern of elevation of only weekend postlunch glucose values. Since all the 164 compatible abstractions mentioned by Expert II were mentioned by

Expert I, we considered that set as the set of common abstractions. The RÉSUMÉ system created 132 (80.4%) of the 164 temporal abstractions noted by both experts. None of the nine noncompatible abstractions mentioned by only one of the experts were created; these abstractions usually involved complex temporal or statistical contexts, such as contexts defined by particular insulin regimens or by glucose variance and minima during specific diurnal phases. Additional characteristics of the abstractions that

proved difficult to create are discussed in Section 5.1.

As was the case in the domain of monitoring children’s growth [19], the RÉSUMÉ system produced many additional abstractions, most of which were low- and intermediate-level abstractions such as glucose-value range classifications and gradient abstractions, and intermediate-length abstractions. Most intermediate abstractions could be viewed as partial computations that are invisible to users who define particular internal patterns or who ask particular external queries, but that are useful for anticipating potential queries. For reasons of limited expert time, not all these low-level abstractions were examined. Examination of the output for the first three cases with one of the experts showed that the expert agreed with almost all (97%) of the produced abstractions, a similar result to the one presented in the domain of growth monitoring [19,30]. This high predictive value was expected, since the domain ontology directly reflected that expert’s knowledge about these low- and intermediate-level abstractions. In the few

cases with which the expert did not agree, the reason was often the rigid classification of ranges (e.g., a blood-glucose value of 69 mg/dl should not have been interpreted as LOW, even though the expert agreed that, in general, normal values lie between 70 mg/dl and 110 mg/dl).

Of considerable interest (and quite different from the result we expected) is the fact that, although the abstractions identified by the two domain experts were remarkably

¹ A Somogyi effect occurs when an excess of insulin lowers the blood glucose to a state of hypoglycemia, thus invoking several hormonal defense mechanisms that increase the blood glucose. The overall effect is a paradoxical (rebound) elevation of the blood glucose in the context of a sufficient and even too large insulin dosage.


similar, the therapy recommendations suggested by the experts differed for all of the eight cases [30]. We discuss the implications of this result in Section 5.1 and Section

5.3.3. Another somewhat surprising result was that both experts arrived at most of their

conclusions by looking only at the graphical charts of the data, whose representation

was rather qualitative and more suitable for general temporal-pattern matching. The detailed data tables were used almost exclusively when the therapy options were considered (e.g., increasing the regular-insulin dose in the morning required looking at the current dose of that and other insulin types). The phenomenon would seem to

suggest that the experts base most of their conclusions on temporal patterns of

blood-glucose values, sensitive mainly to general insulin-therapy regimens (e.g., HIGH

prebreakfast glucose values in a patient who is being treated by a regimen of NPH and regular insulin twice a day), and need the actual doses of insulin only infrequently.

5. Discussion

We have presented the KBTA method, a knowledge-based framework for representation and application of the knowledge required for abstraction of higher-level concepts from time-oriented clinical data. The KBTA method decomposes the TA task into five subtasks, each of which is solved by a separate TA mechanism. The five mechanisms rely only on explicitly represented TA knowledge, the

domain’s TA ontology. We have implemented the KBTA method as the RÉSUMÉ problem solver and have applied it to several clinical test domains. In the domain of therapy for patients who have insulin-dependent diabetes, we modeled in a short time a diabetes-therapy TA ontology. Given the diabetes TA ontology as part of its input, and eight 2-week data samples from eight different diabetes-therapy cases, the RÉSUMÉ system generated most of the abstractions that were noted by both experts in the same data, while almost all of RÉSUMÉ’s overall abstractions were deemed valid.

The diabetes-therapy domain represented several interesting challenges to the KBTA

framework, but not all possible challenges. Each clinical domain in which we tested the RÉSUMÉ system has demonstrated different facets of the system. Due to their diversity, none of the domains tested all of the aspects of the RÉSUMÉ system and the underlying KBTA methodology. For instance, the growth-monitoring domain, whose TA ontology included multiple levels of abstraction [19], challenged mainly the capabilities of the contemporaneous-abstraction and temporal-pattern-matching mechanisms to perform multiple-parameter abstraction, given any level of abstraction in the input. The protocol-based-care domains, in which events had many specific subevents, demonstrated mainly the use of the context-forming mechanism and of the event and context ontologies [29]. In addition, the hematological abstractions over weeks and months required extensive use of dynamic temporal knowledge (i.e., local and global persistence knowledge) by the temporal-interpolation mechanism. Finally, due to laboratory data coming out of temporal order, or modifications in existent laboratory-data values, the protocol-based-care domains demonstrated the importance of maintaining logical dependencies by the truth-maintenance system.


5.1. RÉSUMÉ and the diabetes-therapy domain

The results of application of the RÉSUMÉ problem solver to the domain of therapy for insulin-dependent diabetes were quite encouraging. The knowledge-acquisition effort was reasonable, the RÉSUMÉ system generated about 80% of the abstractions that were mentioned by both experts (who examined about 800 points of time-oriented data), and

about 97% of its overall abstractions were valid (though some were only intermediate

abstractions). The input in the diabetes-therapy domain included only one laboratory parameter

(glucose values at different phases of the day, such as prebreakfast), one qualitative parameter (hypoglycemia symptoms) and several events (different types of insulin

administrations, meals and physical exercise). Multiple-parameter abstractions, therefore, were nonexistent, so that several aspects of the KBTA method were not relevant. However, the emphasis on temporal and statistical dimensions in the diabetes domain made abstracting concepts that are meaningful for therapy purposes quite challenging, and demonstrated the full breadth of the temporal-inference and temporal-pattern-matching mechanisms.

The results of this experiment demonstrate the potential for formalization and automation of the process of abstraction of higher-level concepts from clinical data.

However, they also show that additional features are needed for a complete TA architecture that is suitable for supporting the treatment of insulin-dependent patients. We noticed three limitations of the current RÉSUMÉ system in general, and in the diabetes-therapy domain in particular:

(1) There are difficulties in querying general cyclical (e.g., diurnal) patterns, especially patterns dependent on external times. In the current design of the diabetes knowledge base, for instance, diurnal patterns are detected through horizontal-inference tables (e.g., creating an LLH abstraction for prebreakfast, prelunch and presupper times).

It would have been awkward to represent, say, a repeating pattern of LOW bedtime glucose followed by HIGH prebreakfast glucose. First, the ‘bedtime’ abstraction would have to be created as a repeating, cyclic abstraction, say, the absolute, external, time

being about 10 p.m. each day. But absolute time is currently converted in RÉSUMÉ to the more common internal time (e.g., patient-age in the growth-monitoring domain; days

of therapy in the protocol-based care domains), and only one time stamp is kept. Similarly, a ‘weekend’ abstraction would be awkward to define. A possible solution would be to represent both external time (e.g., 10 p.m.) and internal time (e.g., 5 days of

therapy). Second, bridging the gap between bedtime and prebreakfast time would be prevented by the preprandial interphase Δ functions that currently permit bridging only shorter gaps (e.g., from lunch to dinner), thus keeping the abstractions within the same day. The join might need to be defined as a temporal pattern, or as an extension of Δ functions that enables joining different contexts, specializing the function for particular context combinations.

(2) There is a difficulty in integrating statistical queries (e.g., means and standard deviations) with temporal queries. The statistical queries are straightforward (mean, variance, minimum, maximum, general distribution, count) and usually can be queried through the underlying database. However, the statistical computation needs a temporal


context (e.g., prelunch times) in which to operate, and a temporal scope to limit the computation (e.g., previous 2 weeks).

(3) It might have been desirable to detect patterns defined by events, such as insulin use, in addition to those defined by parameters, such as glucose states. Such a requirement has not been noted in other domains, since the external events were usually assumed to be controlled by the human or automated therapy planner (e.g., chemotherapy administration in the domains of protocol-based care was controlled by the

physician and the current protocol was well known). Such event-based patterns in the diabetes domain might generate more meaningful interpretation contexts, by detecting

periods in which the patient varied her own therapy significantly, or by detecting insulin-therapy regimens implicit in the data but never asserted explicitly by the

physician or the patient. Such early work in the diabetes domain has been described by Kahn and his colleagues [16], with encouraging results. Alternatively, complex contexts might be asserted in the temporal fact base by the user before the RÉSUMÉ system is called for performance of abstraction.
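The integration called for in limitation (2), a statistical computation restricted to a temporal context and bounded by a temporal scope, can be sketched in Python as below. This is an illustration of the requirement and not a feature of the current system; the record layout (time in days, context name, glucose value) and the function name `context_statistics` are assumptions.

```python
from statistics import mean, pvariance
from typing import Dict, List, Optional, Tuple

# (time in days, interpretation context, glucose value in mg/dl)
Measurement = Tuple[float, str, float]


def context_statistics(measurements: List[Measurement], context: str,
                       scope_start: float, scope_end: float
                       ) -> Optional[Dict[str, float]]:
    """Summarize the values measured in one context within a time scope.

    Returns None when no measurement satisfies both constraints.
    """
    values = [value for time, ctx, value in measurements
              if ctx == context and scope_start <= time <= scope_end]
    if not values:
        return None
    return {
        "count": len(values),
        "mean": mean(values),
        "variance": pvariance(values),  # population variance
        "minimum": min(values),
        "maximum": max(values),
    }


data = [
    (1.5, "prelunch", 100.0),
    (2.5, "prelunch", 200.0),
    (3.5, "prelunch", 150.0),
    (3.8, "presupper", 300.0),   # other context: excluded
    (20.5, "prelunch", 400.0),   # outside the 2-week scope: excluded
]
print(context_statistics(data, "prelunch", scope_start=0.0, scope_end=14.0))
```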

In Section 5.3.3 we mention a possible solution to the current architecture’s limitations. In Section 5.2 and Section 5.3, we discuss the requirements for application and the advantages of application of the KBTA approach and of the RÉSUMÉ architecture. The

advantages and limitations of the system provide insights that seem more significant than standard measures such as sensitivity and predictive value. The specific number of

generated abstractions or their validity is useful for some purposes but might be highly

misleading with respect to evaluating a TA system. As we have mentioned, many of the abstractions that RÉSUMÉ generates can be viewed as partial computations. These partial computations are cached, and have an important role for truth maintenance

purposes and for answering multiple-level queries. Most of these abstractions are

invisible to the user, whether a human or an automated therapy planner, and are used by the temporal-pattern-matching mechanism for detecting internal (predefined) or external

(online) temporal patterns. Therefore, the question we have attempted to answer is “would the system be able to answer the temporal queries that the user needs for

performance of the task at hand?” In this respect the results were quite encouraging, although, as we have listed above, there is some room for extensions.

Although the abstractions identified by the two domain experts were remarkably

similar (see Table 2), the therapy recommendations suggested by the experts differed for

all of the eight cases. This observation would seem to validate one of the basic premises underlying the goal of the TA task, namely, that intermediate conclusions from the data

(the interval-based abstract concepts and patterns) are significantly more stable than are specific therapy rules predicated on these conclusions. Thus, knowledge about forming intermediate temporal abstractions should be represented explicitly and separately from knowledge about prescribing therapy, which might depend on factors such as the expert’s level of comfort and experience with particular forms of therapy and on the

patient’s preferences. It is significant that both experts in the diabetes domain seemed to rely almost

exclusively on the rather qualitative graphical charts for interpretation of the data, and used the detailed data tables only for final decisions on therapy modification. This


observation lends further support to the use of the qualitative interpretation contexts as a major tool for organizing TA knowledge. For instance, both experts noted immediately, before creating a single abstraction, particular temporal patterns of insulin administration

(e.g., one NPH-insulin shot in the morning and three regular-insulin shots during

morning, lunch and supper). The precise doses involved did not seem to matter. This behavior is analogous to the way the context-forming mechanism creates context intervals: Events and abstractions induce interpretation contexts that are mostly qualitative, such as ‘regular-insulin action’ or ‘therapy by high-dose AZT’. Within these

context intervals, creation of abstractions is highly focused.

5.2. Knowledge-acquisition requirements for the KBTA method

Our initial assessment of the knowledge-acquisition requirements for typical clinical domains is based on the experience we have gained in applying the KBTA method to a variety of clinical domains (protocol-based care of AIDS and of chronic GVHD,

monitoring of children’s growth and therapy of diabetes). For any particular context in one of these domains, a significant but manageable amount of knowledge must be

acquired. This knowledge must be entered either by the knowledge engineer who

designs the system, or by a clinical expert from whom specific knowledge (e.g., classification tables) is acquired and represented in the domain’s TA ontology. Both

these users can benefit from an automatically generated knowledge-acquisition tool that can acquire the appropriate domain-specific knowledge needed to instantiate the TA mechanisms [23].

The minimal amount of clinical knowledge that needs to be acquired includes (1) the primitive or abstract clinical parameters relevant to the task and their structural relations, in particular IS-A and ABSTRACTED-INTO relations (including all the relevant abstractions of these parameters, classified into the four basic abstraction types: state, gradient, rate and pattern); (2) a clinically significant change for each relevant primitive parameter, if gradient abstractions are required; (3) the list of potential state- and rate-abstraction values for all parameters relevant to the task for which these abstraction types are required; and (4) the maximal-gap Δ functions, when interpolation is required in the task, for each relevant parameter and context.
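As a rough illustration of the four knowledge types listed above, one entry of a parameter ontology might be sketched as follows. This is not the RÉSUMÉ representation: the class `ParameterSpec`, its field names, the helper `missing_knowledge`, the context name and the numeric value are all hypothetical, chosen only to show which items would have to be acquired for a given parameter.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ParameterSpec:
    """One hypothetical parameter-ontology entry (illustrative only)."""
    name: str
    # "state", "gradient", "rate" or "pattern"; None for a primitive parameter.
    abstraction_type: Optional[str] = None
    is_a: Optional[str] = None                          # IS-A parent
    abstracted_into: List[str] = field(default_factory=list)
    significant_change: Optional[float] = None          # needed for gradients
    abstraction_values: List[str] = field(default_factory=list)
    # Context name -> maximal-gap function of the two interval lengths.
    max_gap: Dict[str, Callable[[float, float], float]] = field(default_factory=dict)


def missing_knowledge(spec: ParameterSpec, context: str,
                      needs_gradient: bool, needs_interpolation: bool) -> List[str]:
    """List the knowledge items still to be acquired for one parameter."""
    missing = []
    if needs_gradient and spec.significant_change is None:
        missing.append("significant_change")
    if spec.abstraction_type in ("state", "rate") and not spec.abstraction_values:
        missing.append("abstraction_values")
    if needs_interpolation and context not in spec.max_gap:
        missing.append("max_gap")
    return missing


# Placeholder entries; the threshold is not a clinical value.
glucose = ParameterSpec(name="blood_glucose",
                        abstracted_into=["blood_glucose_state"],
                        significant_change=10.0)
glucose_state = ParameterSpec(name="blood_glucose_state",
                              abstraction_type="state",
                              is_a="state_abstraction")

print(missing_knowledge(glucose, "preprandial", True, True))
print(missing_knowledge(glucose_state, "preprandial", False, False))
```

A check of this kind is one way a knowledge-acquisition tool could prompt the expert only for the items that a given task actually requires.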

The inferential (temporal-semantic) properties, gradient-abstraction horizontal-inference values, and the interpolation-inference tables are more stable than are the knowledge types listed above, and are less dependent on the interpretation context. The default values for these types of TA knowledge either can be inherited through the appropriate abstraction class (e.g., state abstractions), or can be acquired for only the most general parameter class (e.g., platelet state abstractions in the context of the overall task). As additional applications are designed, the gain in development time is more apparent, as we have noticed in the case of the several subdomains of protocol-based care.
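The inheritance of default values can be pictured as a lookup that walks up the IS-A chain until a value is found. The sketch below assumes a simple dictionary encoding; the class names, the property names (`concatenable`, `max_gap_days`) and their values are hypothetical placeholders rather than entries from an actual TA ontology.

```python
from typing import Any, Dict, Optional

# Hypothetical IS-A links: child class -> parent class.
IS_A: Dict[str, str] = {
    "platelet_state_in_protocol_A": "platelet_state",
    "platelet_state": "state_abstraction",
}

# Hypothetical property values, stored at the level where they were acquired.
PROPERTIES: Dict[str, Dict[str, Any]] = {
    "state_abstraction": {"concatenable": True},
    "platelet_state": {"max_gap_days": 7},
}


def lookup(param_class: str, prop: str) -> Optional[Any]:
    """Return the most specific value of `prop`, walking up the IS-A chain."""
    current: Optional[str] = param_class
    while current is not None:
        if prop in PROPERTIES.get(current, {}):
            return PROPERTIES[current][prop]
        current = IS_A.get(current)
    return None


print(lookup("platelet_state_in_protocol_A", "concatenable"))
print(lookup("platelet_state_in_protocol_A", "max_gap_days"))
```

Under this encoding, a new context-specific subclass needs no entries of its own unless it overrides a default, which is consistent with the reduction in development time noted above.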

Instantiating the TA mechanisms in a new domain could involve, in theory, significant amounts of knowledge to be acquired. However, there are several encouraging insights from our experiments in the diabetes-therapy and other domains, with respect to the knowledge-acquisition requirements:


(1) The major knowledge-acquisition effort in these non-trivial domains usually required two to four meetings (each 1 to 2 hours long) with the expert, a tenable amount of time. Also, the size of the ontologies was certainly manageable, even though one or more explicit, intermediate levels of abstraction were added. The addition, however, was

related to the abstraction levels of the expected input and of the expected queries.

Furthermore, maintenance of the resultant knowledge base by the knowledge engineer, including additional classifications by the expert, required significantly less time than

would be needed to create or modify pure programming code. Admittedly, we might not

be representative of the average user (designer), since we are familiar with RÉSUMÉ; in the domain of monitoring children’s growth, however, another physician who was not familiar with the KBTA method, except for the TA ontology terms, was the main knowledge engineer [19].

(2) The knowledge-acquisition process was driven by the KBTA method, thus creating a purposeful structure to the process, and enabling the knowledge engineer to infer the existence of additional abstractions in a systematic way (e.g., a potential

gradient abstraction of a state abstraction of a known parameter). Thus, a certain measure of completeness is guaranteed. Additional sessions served to explore and refine the evolving parameter, event, and context ontologies. In the case of the growth-monitoring domain, for instance, we believe that, without the explicit modeling methodology, we would not have progressed very far in our few meetings with the domain expert [19], nor would the two knowledge engineers have been able to maintain the acquired knowledge [30]. Our experience in the protocol-based-care and diabetes-therapy domains

strengthened that impression.

(3) Most important parameters and their values must be represented in one way or another to perform the TA task at hand. Hematological toxicity tables (in protocols for treatment of AIDS), rate abstractions of the height standard-deviation score (in the growth-monitoring domain), and ranges and patterns of blood glucose in the domain of caring for diabetes patients need to be acquired and represented in some fashion. Thus, most of the knowledge-acquisition effort was spent on organizing in a useful, predetermined architecture a significant amount of knowledge that must be accessed and represented, implicitly or explicitly, to solve the task. Explicit representation, as we have shown, has multiple additional benefits; in the experiments that we have conducted, its cost was not prohibitive, while the results seemed to justify the effort.

5.3. Advantages of the KBTA framework and of the RÉSUMÉ system architecture

The KBTA method and its implementation as the RÉSUMÉ system have several useful conceptual and computational advantages. The advantages can be categorized as advantages for (1) summarization of time-oriented clinical data, (2) context-sensitive interpretation, (3) automated therapy planning and (4) acquisition, representation, sharing and reuse of TA knowledge.

5.3.1. Summarization of clinical databases

An important goal for clinical TA systems is the transformation of large volumes of

clinical data, stored as time-oriented medical records, into a concise representation. The


KBTA framework and the RÉSUMÉ system support several useful aspects of that process.

The parameter ontology represents uniformly all clinical parameters (primitive or

abstract) of all types, numeric and symbolic. Thus, the input data can be of different types and at different levels of abstraction (e.g., both raw data, such as height

measurements, and higher-level concepts, such as the height standard-deviation score abstraction). For instance, some input data can be partially abstracted by the physician,

or by another computational module, before they are entered into the RÉSUMÉ system. Furthermore, RÉSUMÉ enables the expert to define temporal patterns at knowledge-acquisition time, while enabling the user of the application system to query the resulting

temporal fact base for new, arbitrary temporal patterns. Thus, output parameters are available at all levels of abstraction. In fact, the process of abstraction can be controlled in a partially goal-directed manner to support different types of users, who need different

types of output abstractions [29,30].

Another important property for clinical domains is that the RÉSUMÉ system accepts

input data out of temporal order. This property follows from the truth-maintenance

system underlying the temporal-reasoning process (assisted by the temporal-inference

mechanism’s ability to detect contradictions, using the temporal-semantic properties of parameter propositions). The truth-maintenance system can retract conclusions that are

no longer true (notifying the user, if required), and can propagate new abstractions to the rest of the temporal fact base (this recursive process is finite, especially with respect to the potentially troublesome formation of new contexts [30]). Thus, the past can change our view of the present. Furthermore, new data enable the RÉSUMÉ system to modify past interpretations; thus, the present (or future) can change our interpretation of the past, a property referred to as hindsight [27]. The hindsight task is performed by several components of the RÉSUMÉ system’s architecture: (1) the truth-maintenance system;

(2) the context-forming mechanism, which can create both prospective and retrospective

contexts dynamically; and (3) the temporal-interpolation mechanism, which has the ability to reason about both forward and backward persistence of belief in the truth of

parameter propositions. For instance, an additional data point, extending the length of the time interval of a certain abstraction interval, might enable that interval to be concatenated to a previous one of the same type.
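The effect of a late-arriving datum on interval concatenation can be illustrated with a deliberately simplified sketch. It recomputes all intervals from the sorted points instead of retracting and propagating conclusions through a truth-maintenance system, and it uses a constant maximal gap in place of a context-specific maximal-gap function; the function `abstract`, the state value and the time stamps are assumptions made for the example.

```python
from bisect import insort
from typing import List, Tuple

Point = Tuple[int, str]           # (time stamp in hours, state value)
Interval = Tuple[int, int, str]   # (start, end, state value)


def abstract(points: List[Point], max_gap: int) -> List[Interval]:
    """Join consecutive same-valued points whose gap does not exceed max_gap.

    `points` must already be sorted by time stamp.
    """
    intervals: List[Interval] = []
    for t, value in points:
        if intervals:
            start, end, last_value = intervals[-1]
            if value == last_value and t - end <= max_gap:
                intervals[-1] = (start, t, value)   # extend the open interval
                continue
        intervals.append((t, t, value))             # start a new interval
    return intervals


points: List[Point] = [(0, "HIGH"), (4, "HIGH"), (16, "HIGH"), (20, "HIGH")]
before = abstract(points, max_gap=8)

# A datum that arrives late, out of temporal order, is inserted in place.
insort(points, (10, "HIGH"))
after = abstract(points, max_gap=8)

print(before)   # two separate HIGH intervals
print(after)    # one concatenated HIGH interval
```

Here the point at hour 10 bridges a 12-hour gap that exceeded the assumed maximal gap, so two earlier intervals are replaced by one longer interval; a full implementation would reach the same result incrementally rather than by recomputation.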

5.3.2. Context-sensitive interpretation of time-oriented clinical data

A unique feature of the logical framework underlying the KBTA method is that

interpretation contexts are separated logically from the events, abstractions, abstraction goals or the combinations of these entities that induce them. DIRCs (see Section 2) represent inference rules for inducing context intervals whose time interval can have any

of Allen’s [2] 13 temporal relations to the interval over which the inducing entity is interpreted, including quantitative temporal constraints on these relations. Contemporaneous interpretation contexts belonging to the SUBCONTEXT relation can form more specific composite interpretation contexts.
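One way to picture a DIRC is as a rule that places the start and end of the induced context at fixed offsets from the start or end of the inducing interval. The following sketch assumes that simplified encoding; the class `DIRC`, its fields, the context names and the offsets (in minutes) are hypothetical and do not reproduce the formal definition given in Section 2.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DIRC:
    """Hypothetical encoding of a dynamic induction relation of a context interval."""
    context: str
    start_ref: str      # "start" or "end" of the inducing interval
    start_offset: int   # minutes added to the start reference point
    end_ref: str        # "start" or "end" of the inducing interval
    end_offset: int     # minutes added to the end reference point


def induce(dirc: DIRC, inducing: Tuple[int, int]) -> Tuple[str, int, int]:
    """Return (context, start, end) induced by the interval (start, end)."""
    ref = {"start": inducing[0], "end": inducing[1]}
    return (dirc.context,
            ref[dirc.start_ref] + dirc.start_offset,
            ref[dirc.end_ref] + dirc.end_offset)


# A prospective context that begins after the inducing event and outlasts it.
insulin_action = DIRC("regular_insulin_action", "start", 30, "end", 480)
# A retrospective context that lies entirely before the inducing interval.
pre_event = DIRC("pre_event_context", "start", -120, "start", 0)

print(induce(insulin_action, (600, 600)))
print(induce(pre_event, (600, 660)))
```

Because the offsets are independent of each other, the induced context can precede, follow, overlap or contain the inducing interval, which is the flexibility that the paragraph above attributes to DIRCs.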

Abstractions are specialized in the parameter ontology by interpretation contexts. Interpretation contexts both reduce the computational burden and specialize the abstraction process for particular contexts, by enabling within their temporal context the use of


only the TA knowledge (e.g., mapping functions) specific to the interpretation context. The use of explicit interpretation contexts and DIRCs allows us to represent both the induction of several different context intervals (in type and temporal scope) by the same context-forming proposition, and the induction of the same interpretation context by

different context-forming propositions (thus allowing us to represent the properties of the Hb parameter within a bone-marrow-toxicity context interval without the need to list

all the events that can lead to the creation of such a context interval). The expressiveness

of the interpretation-contexts language includes also allowing unified (or generalized) contexts (a union of different, temporally meeting contexts) and non-convex contexts

(interpolation between similar, temporally disjoint contexts), thus enabling, when desired, sharing of abstractions of the same parameter among different contexts and temporal phases (e.g., different therapy regimens within the same chemotherapy protocol).

An important feature for any interpretation task is that several concurrent interpreta-

tion contexts can be induced, maintained and queried, thus creating different interpreta-

tions (different concurrent abstractions of the same type) for the same set of data points (e.g., one in the potential context of the patient having AIDS, and one in the potential

context of having complications of certain drugs given for prevention of AIDS). There is also room for some uncertainty during the interpretation in the expected data

values, and for some uncertainty in the time of the input or the expected temporal pattern. For instance, the context-sensitive significant-change property of each parameter captures the concept of measurement errors or clinically insignificant value changes; context-specific local and global persistence functions capture the notion of persistence of parameter propositions over time (both before and after the latter’s measurement);

temporal patterns leave room for variability in value and in time spans.
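The role of the context-sensitive significant-change property can be shown with a small sketch. The threshold table, the context names and the numeric values below are placeholders invented for the example (they are not clinical values), and the three gradient labels are a simplification of the gradient-abstraction values.

```python
from typing import Dict, Tuple

# Hypothetical context-specific significant-change thresholds;
# the numbers are placeholders, not clinical values.
SIGNIFICANT_CHANGE: Dict[Tuple[str, str], float] = {
    ("blood_glucose", "preprandial"): 10.0,
    ("blood_glucose", "postprandial"): 20.0,
}


def gradient(parameter: str, context: str, previous: float, current: float) -> str:
    """Classify the change between two consecutive values within a context."""
    threshold = SIGNIFICANT_CHANGE[(parameter, context)]
    delta = current - previous
    if abs(delta) < threshold:
        # Treated as measurement error or a clinically insignificant change.
        return "SAME"
    return "INCREASING" if delta > 0 else "DECREASING"


print(gradient("blood_glucose", "preprandial", 100, 115))
print(gradient("blood_glucose", "postprandial", 100, 115))
```

The same pair of values yields different gradient abstractions in the two contexts, because the change of 15 units exceeds one assumed threshold but not the other.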

5.3.3. Automated therapy planning

The KBTA method performs an interpretation task, separate from tasks such as planning actions or executing these actions. Such a separation is useful for knowledge-maintenance reasons and for task-specific reasons (e.g., although the abstractions

identified by the two endocrinologists in the diabetes-therapy domain were remarkably

similar, the therapy recommendations suggested by the experts differed significantly). Furthermore, the separation allows a TA system to reason about the data offline,

possibly accessing data directly from a temporally oriented database, such as an electronic medical-record database.

Another advantage of emphasizing the process of creating temporal abstractions and separating that process from the use of these abstractions is the ability to supply intermediate-level explanations to a health provider questioning, for instance, the advice

of a therapy planner. Since the abstractions are created, reasoned with, and saved independently of the recommendations using these abstractions, they are available for inspection.

The RÉSUMÉ problem solver can be integrated within a broader context, such as a therapy planner that uses the KBTA method to solve one of its subtasks. One possibility we are considering for such an architecture is a temporal mediator architecture [6] that mediates between the user and a set of temporally oriented databases. Such a mediator


will include not only a temporal-reasoning component but also a temporal-maintenance one, such as the Chronus system [5]. The integrated architecture can have a temporal-query language that is both more expressive and more user-friendly, thus solving most of the limitations mentioned in Section 5.1. For instance, external, online temporal queries (as opposed to internal predefined patterns) need a higher-level language, particularly as the user is often the physician.

Another use of the KBTA framework by therapy planners that we are investigating is

the ability to annotate clinical guidelines and policies by temporal patterns that need to be maintained, achieved, or avoided [31]. Such an annotation, combined with knowledge about the semantics of therapy actions and plans in the relevant domain, and knowledge

about generic potential revisions to clinical plans, would enable the planner to reason more intelligently about the physician’s goals and their adherence to the known policies

at different levels of abstraction. Such a capability would enhance considerably the cooperation between the automated planner and the human planner.

5.3.4. Acquisition, representation, maintenance, sharing and reuse of TA knowledge

Our experience with the KBTA method in several experimental test domains has demonstrated that the method is easily generalizable to several quite different clinical domains and tasks. The uniformity of the KBTA method is one of its main characterizing aspects. The five uniform TA mechanisms solve the method’s five predefined subtasks by applying well defined, declarative knowledge roles that are parameterized

(instantiated) for each clinical domain. The domain-specific knowledge is represented as context-specific segments of the parameter ontology.

The domain-specific knowledge used by the KBTA method, its structure, and its semantics are explicit, and are represented as declaratively as possible (as opposed to being represented in procedural code). The explicit representation of the knowledge

roles supports acquisition of the knowledge necessary for applying the method to other domains, maintenance of that knowledge, reuse of the domain-independent TA mechanisms and sharing of the domain-specific TA knowledge with other applications in the same domain. The organization of the knowledge in the RÉSUMÉ system into parameter, event, and context ontologies plays a major role in accomplishing these goals. The RÉSUMÉ system exploits its underlying logical framework, and uses, as much as possible, declarative representations, such as uniform multidimensional tables, where

several semantic axes combine to indicate the proper interpretation of the table [30]. The declarative representations of various classification functions facilitate both modification of the knowledge base by the user and introspective reasoning by the TA mechanisms.
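A declarative classification table of this kind might be sketched as a mapping from a (parameter, context) pair to cut-off values and labels, with a single generic function that interprets any such table. The table contents, the names `STATE_TABLE` and `classify`, and the cut-offs are hypothetical placeholders and are not clinical guidance.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

# Hypothetical table: (parameter, context) -> (ascending cut-offs, labels).
# There is always one more label than there are cut-offs.
STATE_TABLE: Dict[Tuple[str, str], Tuple[List[float], List[str]]] = {
    ("blood_glucose", "preprandial"): ([70, 110], ["LOW", "NORMAL", "HIGH"]),
    ("blood_glucose", "postprandial"): ([70, 140], ["LOW", "NORMAL", "HIGH"]),
}


def classify(parameter: str, context: str, value: float) -> str:
    """Map a raw value to a state label using the table for this context."""
    cutoffs, labels = STATE_TABLE[(parameter, context)]
    # bisect_right counts the cut-offs that are <= value, so a value equal
    # to a cut-off falls into the range that begins at that cut-off.
    return labels[bisect_right(cutoffs, value)]


print(classify("blood_glucose", "preprandial", 120))
print(classify("blood_glucose", "postprandial", 120))
```

Because the knowledge lives in the table rather than in the function, an expert could revise a cut-off without touching code, and a reasoning mechanism could inspect the table to enumerate the possible state values.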

The organization of the knowledge in the parameter ontology as subclasses of the four general abstraction types (state, gradient, rate and pattern) with frame-based inheritance

of general abstraction-type and domain-specific properties, further enhances the ease of designing new systems, of acquiring the necessary knowledge and of maintaining the TA knowledge base.

To demonstrate the generality of the KBTA method, we have collaborated [1] with researchers in the European KADS-II project, a newer version of the KADS methodology [35]. The TA mechanisms can be represented using KADS-II primitive inference actions, a highly non-specific, modular set of inference actions. The tradeoff in such a


modular representation is the loss of specificity to the task of abstracting concepts from time-oriented data. Representing the temporal-interpolation mechanism, for example, as a set of primitive inference actions tends to obscure the way that the domain-specific TA knowledge is used by that mechanism [1].

An explicit representation of TA knowledge also enables a designer to construct an automated knowledge-acquisition tool that might be used directly by an expert physician to augment the clinical knowledge base (in this case, the TA ontology). Constructing such tools, when possible, has major benefits, mainly in facilitating the acquisition of knowledge without the intervention of a knowledge engineer [23]. One solution we are currently exploring for automating the acquisition of TA knowledge is the automatic generation of a knowledge-acquisition tool for the TA mechanisms given the domain’s ontology and the ontology of the KBTA method, using a framework similar to the PROTÉGÉ-II project [9,11,24,26,34].

6. RÉSUMÉ and other clinical TA systems

Several different approaches have been applied in philosophy, in general computer

science, and in AI to tasks that are at least comparable to the TA task as we define it. A comprehensive review of temporal-reasoning approaches in general and their application

to clinical domains in particular, and a comparison of frameworks applied in clinical domains to the KBTA method and its implementation in the RÉSUMÉ system, are presented elsewhere [30]. Systems performing a TA task that were implemented mainly for clinical domains include Fagan’s ventilation management (VM) system [10]; Blum’s Rx system for knowledge discovery from time-oriented clinical databases [3]; Downs’ summarization program for medical records [8]; Kohane’s temporal utilities package [18]; de Zegher-Geets’ IDEFIX system for summarizing patient visits [7]; Russ’ temporal control structure system [27]; Kahn’s TOPAZ system [17]; the Guardian project [14]; Haimowitz and Kohane’s TrenDx system [13]; and Larizza’s TA module in the M-HTP project [21].

Several of the systems mentioned above defined their main task as abstraction of time-stamped clinical data and can be compared more directly with the RÉSUMÉ framework.

The use of the truth-maintenance system in RÉSUMÉ resembles Russ’s temporal

control structure (TCS) system [27]. However, TCS relegates all domain-specific temporal reasoning to user-created procedures. TCS treats the user-defined reasoning modules as black boxes, and supplies only temporal bookkeeping utilities. In this

respect, TCS is highly reminiscent of the time specialist of Kahn and Gorry [15]: It has no knowledge of temporal properties of the domain. In contrast, the RÉSUMÉ domain-independent TA mechanisms perform all of the TA, given a (declarative) representation of the domain’s TA ontology.

The TrenDx system of Haimowitz and Kohane [13] builds on Kohane’s constraint-satisfaction temporal-utilities package [18], and defines domain-specific patterns called trend templates (TTs). TrenDx is useful in detecting that the data are consistent with one


or more TTs, including TTs of which only a part is observed. The goal of TrenDx is different from that of RÉSUMÉ. TrenDx does not create any intermediate abstractions,

since its goal is not to abstract, summarize, or answer queries about the data, as it is in the TA task, but rather to match data efficiently against a set of predefined patterns. Data can only be accepted at the lowest level; thus, no input of intermediate-level

abstractions is possible. No explicit domain ontology of parameters and events exists, and a constraint (e.g., significant change in a parameter) might be repeated with the

same implicit role in different TTs and even at different parts of the same TT. Like RÉSUMÉ, TrenDx assumes implicitly an ill-defined domain that cannot be modeled

easily quantitatively, and therefore requires detection of essentially associative temporal patterns.

Kahn’s TOPAZ system [17] integrates a quantitative physiological model and a

symbolic model for aggregation of clinically significant intervals. TOPAZ can associate interpretation methods with an interval representing a context of interest. RÉSUMÉ extends this capability by the context-forming mechanism, which uses an explicit context ontology to enable creation of context-specific abstractions and activation of specific functions, but does not limit generated interpretation contexts to the temporal extent of the parent event, allowing any desired relation between the generating interval

and the generated context. In the particular domain of therapy for insulin-dependent diabetes, the AIDA system [22] is a diabetes-treatment decision support prototype system, whose underlying model attempts to reflect the (patho)physiology of insulin

action and carbohydrate absorption in quantitative terms. Note that systems such as TOPAZ and AIDA assume a precise underlying mathematical model of the domain; most clinical domains defy complete quantitative modeling.

A system more similar to RÉSUMÉ, at least in its objectives, is the M-HTP system [21]. M-HTP monitors heart-transplant patients by creating interval-based abstractions and checking for predefined temporal patterns, using a relational database. M-HTP has a

taxonomy of domain-specific parameters. However, M-HTP does not have a clear separation between the ontology of its problem-solving method and the ontology of its domain. In addition, abstraction classes, such as the state of blood glucose, are not first-class, new parameters, as in the RÉSUMÉ system. RÉSUMÉ can be viewed as a metatool that might acquire and represent most of the domain-specific knowledge of M-HTP by creating an appropriate domain-specific TA ontology, thus instantiating its task-specific but domain-independent TA mechanisms.

The various systems mentioned in this section vary greatly. On inspection, however, most of the systems performing a significant amount of the TA task solve tasks closely

related to the five fundamental subtasks of the TA task, as that task is decomposed by the KBTA method [30]. Furthermore, such systems rely implicitly on the four types of knowledge defined in Section 2. Unlike the KBTA framework, however, this knowledge is usually not represented explicitly.

Thus, the KBTA method also can be viewed as an INFERENCE STRUCTURE, not unlike Clancey’s heuristic classification inference structure [4]. It is clear that the KBTA method makes explicit the subtasks that need to be solved for most of the variations of the TA interpretation task. These subtasks have to be solved, explicitly or implicitly, by any system whose goal is to generate interval-based abstractions. The TA mechanisms


that we have chosen to solve these subtasks make explicit both the tasks they solve and the knowledge that they require to solve these tasks. None of the approaches we examined focuses on the knowledge acquisition, maintenance, reuse, or sharing aspects of designing and building large knowledge-based medical systems. The approaches described, as applied to the TA task, do not represent their inference strategy at the knowledge level [25]. We might therefore expect full implementations of these approaches to encounter several typical design and maintenance problems of knowledge-based systems. In particular, we would expect difficulties, some perhaps insurmountable, when we attempt (1) to apply these approaches to TA tasks in new domains, (2) to reuse them for similar tasks in the same domain, (3) to maintain the soundness and completeness of their associated knowledge base and its interrelated components through modifications and (4) to acquire the knowledge required to instantiate them in a

particular domain and task in a disciplined and perhaps even automated manner.

7. Summary and conclusions

The knowledge used by domain experts, such as expert physicians, to extract meaningful temporal intervals from a set of data is intricate and is largely implicit. This intricacy is reflected in the complexity of the TA knowledge, when that knowledge is represented explicitly in the KBTA model by a domain-specific TA ontology of parameters, events, contexts, abstraction goals and DIRCs. Designers of medical knowledge-based systems cannot escape this complexity if they wish to support tasks that involve a significant amount of reasoning about time-stamped data.

The TA task and the methodology we propose for solving it are relevant to many different clinical domains. In particular, they are relevant to domains in which abstraction of concepts over time from raw or abstract time-oriented input data is needed, and in which several of the features described in Section 5.3 are desired. The methodology is especially useful when several abstraction levels and data types exist as possible inputs or outputs of the TA task, when data might arrive out of temporal order, when several context-specific interpretations might need to be monitored in parallel, and when large amounts of TA knowledge need to be represented and maintained in a disciplined way.

Most of the TA knowledge can be acquired from domain experts, and can be used for solving the TA task in a particular domain. Since the knowledge requirements of the TA

mechanisms are well defined, the knowledge-acquisition process can use either a manual methodology driven by the knowledge roles defined in the KBTA inference structure or automatically generated knowledge-acquisition tools, tailored to the domain and to the task, such as the knowledge-acquisition tools generated by the PROTÉGÉ-II system [34].

Whatever the knowledge-acquisition methodology chosen, however, understanding

the knowledge required for abstracting clinical data over time in any particular domain is a useful undertaking. A clear specification of that knowledge, and its representation in an ontology specific to the task of abstracting concepts over time, supports designing new knowledge-based systems that perform temporal-reasoning tasks. The formal


specification of the TA knowledge also supports acquisition of that knowledge from domain experts, maintenance of that knowledge once acquired, reusing the problem-solving knowledge for temporal-reasoning tasks in other domains, and sharing the domain-specific knowledge with other problem solvers that might need access to the

domain’s temporal-reasoning knowledge.

Acknowledgements

This work has been supported in part by grant HS06330 from the Agency for Health

Care Policy and Research, by grants LM05157, LM06245 and LM05305 from the National Library of Medicine, and by gifts from Digital Equipment Corporation.

M.A.M. is a recipient of the National Science Foundation Young Investigator Award IRI-9257578. Drs. Fredric B. Kraemer and Lawrence V. Basso were our experts in the diabetes domain and contributed much of their valuable time. Dr. Michael G. Kahn supplied the data in the domain of therapy of insulin-dependent diabetes. Dr. Darrel M. Wilson was our very knowledgeable expert and Dr. Kuilboer was the knowledge engineer in the domain of monitoring children’s growth. Dr. Jeanette Sison assisted us in

the domain of AIDS therapy. We would like to thank Richard Fikes, Barbara Hayes-Roth,

Samson Tu, Amar Das and Michael Kahn for many useful discussions.

References

[1] M. Aben, Y. Shahar and M.A. Musen, Temporal abstraction mechanisms as KADS inferences, in B.R. Gaines and M.A. Musen, eds., Proceedings of the 8th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop, Vol. 2 (SRDG Publications, Department of Computer Science, University of Calgary, 1994) 28-1-28-22.

[2] J.F. Allen, Towards a general theory of action and time, Artif. Intell. 23 (1984) 123-154.

[3] R.L. Blum, Discovery and representation of causal relationships from a large time-oriented clinical database: The RX project, Lect. Notes Med. Inform. 19 (1982).

[4] W.J. Clancey, Heuristic classification, Artif. Intell. 27 (1985) 289-350.

[5] A.K. Das and M.A. Musen, A temporal query system for protocol-directed decision support, Methods Inform. Med. 33 (1994) 358-370.

[6] A.K. Das, Y. Shahar, S.W. Tu and M.A. Musen, A temporal-abstraction mediator for protocol-based decision support, in J.G. Ozbolt, ed., Proceedings of the Eighteenth Annual Symposium on Computer Applications in Medical Care (Hanley and Belfus, Philadelphia, PA, 1994) 320-324.

[7] I.M. de Zegher-Geets, IDEFIX: Intelligent Summarization of a Time-Oriented Medical Database, M.S. Dissertation, Program in Medical Information Sciences, Stanford University School of Medicine, 1987, Knowledge Systems Laboratory Technical Report KSL-88-34, Department of Computer Science, Stanford University, Stanford, CA, 1988.

[8] S.M. Downs, M.G. Walker and R.L. Blum, Automated summarization of on-line medical records, in R. Salamon, B. Blum and M. Jorgensen, eds., MEDINFO ‘86: Proceedings of the Fifth Conference on Medical Informatics (North-Holland, Amsterdam, 1986) 800-804.

[9] H. Eriksson, Y. Shahar, S.W. Tu, A.R. Puerta and M.A. Musen, Task modeling with reusable problem-solving methods, Artif. Intell. 79 (1995) 293-326.


[IO] L.M. Fagan, VM: Representing time dependent relations in a medical setting, Ph.D. dissertation,

Department of Computer Science, Stanford University, Stanford, CA, 1980.

[I 11 J. H. Gennari, S.W. Tu, T.E. Rothenfluh and M.A. Musen, Mapping domains to methods in support of

reuse, Int. J. Hum.-Comput. Stud. 41 (1994) 399-424. [12] J. Giarratano and G. Riley, Expert Systems: Principles and Programming (PWS, Boston, MA, 1994).

[13] I.J. Haimowitz and IS. Kohane, Automated trend detection with alternate temporal hypotheses, in

Proceedings of the Thirteenth International Joint Conference on Arti$cial Intelligence (Morgan Kauf-

mann, San Mateo, CA, 19931 146-151.

[14] B. Hayes-Roth, R. Washington, D. Ash, R. Hewett, A. Collinot, A. Vina and A. Seiver, Guardian: A

prototype intelligent agent for intensive-care monitoring, Artif Intell. Med. 4 (I 992) 165- 18.5.

[ 151 K. Kahn and G.A. Gorry, Mechanizing temporal knowledge, Artif Intell. 9 (1977) 87-108. [16] M.G. Kahn, CA. Abrams, S.B. Cousins, J.C. Beard and M.E. Frisse, Automated interpretation of

diabetes patient data: Detecting temporal changes in insulin therapy, in R.A. Miller, ed., Proceedings of the Fourteenth Annual Symposium on Computer Applications in Medical Care (IEEE Computing Society

Press, Los Alamitos, CA, 1990) 569-573.

[17] M.G. Kahn, Combining physiologic models and symbolic methods to interpret time-varying patient data,

Methods Inform. Med. 30 (1991) 167-178.

[18] I.S. Kohane, Temporal reasoning in medical expert systems, Technical Report 389, Laboratory of

Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 1987.

[19] M.M. Kuilboer, Y. Shahar, D.M. Wilson and M.A. Musen, Knowledge reuse: Temporal-abstraction

mechanisms for the assessment of children's growth, in C. Safran, ed., Proceedings of the Seventeenth Annual Symposium on Computer Applications in Medicine (McGraw-Hill, New York, NY, 1993)

449-453.

[20] P. Ladkin, Time representation: A taxonomy of interval relations, in Proceedings of the Sixth National Conference on Artificial Intelligence, Philadelphia, PA (1986) 360-366.

[21] C. Larizza, A. Moglia and M. Stefanelli, M-HTP: A system for monitoring heart-transplant patients,

Artif. Intell. Med. 4 (1992) 111-126.

[22] E.D. Lehmann, T. Deutsch, E.R. Carson and P.H. Sonksen, AIDA: an interactive diabetes advisor,

Comput. Methods Prog. Biomed. 41 (1994) 183-203.

[23] M.A. Musen, Automated Generation of Model-based Knowledge-Acquisition Tools (Morgan Kaufmann,

San Mateo, 1989).

[24] M.A. Musen, J.H. Gennari, H. Eriksson, S.W. Tu and A.R. Puerta, PROTÉGÉ-II: Computer support for

development of intelligent systems from libraries of components, in MEDINFO '95: Proceedings of the Eighth World Congress on Medical Informatics, Vancouver, British Columbia, Canada (1995) 766-770.

[25] A. Newell, The knowledge level, Artif. Intell. 18 (1982) 87-127.

[26] A.R. Puerta, J.W. Egar, S.W. Tu and M.A. Musen, A multiple-method knowledge acquisition shell for

the automatic generation of knowledge acquisition tools, Knowledge Acquisition 4 (1992) 171-196.

[27] T.A. Russ, Using hindsight in medical decision making, in L.C. Kingsland, ed., Proceedings of the

Thirteenth Annual Symposium on Computer Applications in Medical Care (IEEE Computing Society

Press, Washington, 1989) 38-44.

[28] Y. Shahar, S.W. Tu and M.A. Musen, Knowledge acquisition for temporal-abstraction mechanisms,

Knowledge Acquisition 4 (1992) 217-236.

[29] Y. Shahar and M.A. Musen, RÉSUMÉ: A temporal-abstraction system for patient monitoring, Comput. Biomed. Res. 26 (1993) 255-273, reprinted in J.H. van Bemmel and T. McRay, eds., Yearbook of Medical Informatics 1994 (F.K. Schattauer and The International Medical Informatics Association,

Stuttgart, 1994) 443-461.

[30] Y. Shahar, A knowledge-based method for temporal abstraction of clinical data, Ph.D. dissertation,

Program in Medical Information Sciences, Stanford University School of Medicine, 1994, Knowledge

Systems Laboratory Report No. KSL-94-64, 1994, Department of Computer Science Report No. STAN-CS-TR-94-1529, Stanford University, Stanford, CA, 1994.

[31] Y. Shahar and M.A. Musen, Plan recognition and revision in support of guideline-based care, in

Proceedings of the AAAI Symposium on Representing Mental States and Mechanisms (Stanford University, CA, 1995) 118-126.

[32] Y. Shoham, Temporal logics in AI: Semantical and ontological considerations, Artif. Intell. 33 (1987)

89-104.

[33] R. Snodgrass and I. Ahn, Temporal databases, IEEE Comput. 19 (1986) 35-42.

[34] S.W. Tu, H. Eriksson, J.H. Gennari, Y. Shahar and M.A. Musen, Ontology-based configuration of

problem-solving methods and generation of knowledge-acquisition tools: Application of PROTÉGÉ-II to

protocol-based decision support, Artif. Intell. Med. 7 (1995) 257-289.

[35] B. Weilinga, A.T. Schreiber and J. Breuker, KADS: a modeling approach to knowledge engineering,

Knowledge Acquisition 4 (1992) 5-53.