Representing quantitative and qualitative knowledge in a knowledge-based storm-forecasting system

Int. J. Man-Machine Studies (1986) 25, 523-547


RENÉE ELIO AND JOHANNES DE HAAN†

Department of Computing Science, University of Alberta and † Computing Department, Alberta Research Council, Alberta, Canada

(Received 30 January 1986 and in revised form 8 October 1986)

METEOR is a rule- and frame-based system for short-term severe storm forecasting. Initial predictions are based on interpretations of contour maps generated by statistical predictors of storm severity. To confirm these predictions, METEOR considers additional quantitative measurements, ongoing meteorological conditions and events, and how the expert forecaster interprets these factors. Meteorological events are derived from interpreting human observations of weather conditions in the forecast area. This task requires a framework that supports inferences about the temporal and spatial features of meteorological activities. To accommodate the large amounts of different types of knowledge characterizing this problem, a number of extensions to the rule and frame representations were developed. These extensions include a view scheme to direct property inheritance through intermingled hierarchies and the automatic generation of production system rules from descriptions stored in frames on an as-needed basis.

1. Introduction

Knowledge-based systems are typically concerned with problems that do not have algorithmic solutions. Sometimes, these systems are viewed as an alternative to statistical analysis. However, when algorithms and statistical methods are integral aspects of an expert's problem-solving method, they can also be integral parts of a knowledge-based system. In particular, the combination of numerical or statistical methods within the knowledge-based system framework is important to many applications. This approach characterizes our system, METEOR, which uses a statistical model in conjunction with qualitative data and knowledge to forecast storms.

The knowledge that distinguishes experts from novices on some task includes strategies, heuristics, and conceptual understanding gleaned from experience rather than formal instruction. While these types of knowledge about a problem domain are non-numeric in nature, they are often used to interpret numeric methods. Because sophisticated statistical models and numerical analysis techniques are important problem-solving tools in many real-world problems, it is helpful to characterize the roles these methods play in problem-solving. First, they can serve as preliminary data reduction methods, particularly in data-intensive applications. By data-intensive, we mean problems involving large quantities of data, typically gathered from a host of measurement devices. The complexity of these problems is compounded when data must be processed or interpreted in real-time. An expert may rely on a variety of

Address for correspondence: Professor Renée Elio, Department of Computing Science, 338 Assiniboia Hall, Edmonton, Alberta, T6G 2E7.


0020-7373/86/050523+25$03.00/0 © 1986 Academic Press Inc. (London) Limited


algorithmic or quantitative methods to collect, screen, and reduce the data prior to analysis. For domains like meteorology that have large historical databases, statistical modeling and analysis techniques are quite pervasive and useful. For these kinds of applications, what distinguishes an expert practitioner from a novice may include an ability selectively to apply numeric methods or interpret statistical models with greater accuracy.

For some problems, it is difficult to represent either the data or the analysis of a statistical model with formalisms developed for representing symbolic knowledge. Examples of this kind of data include acoustic signals, oil-well and dipmeter logs, and maps. If an expert reasons about data in these forms, then some representation of them must exist in a knowledge-based system. Knowledge-based systems often employ quantitative methods and numeric representations to extract significant features from data and convert them to some symbolic representation. Some version of this "signal-to-symbol" process (Nii, Feigenbaum, Anton & Rodimore 1982) is needed in many applications. For example, Milios & Nawob (1985) note that people who are skilled at interpreting acoustic signals selectively apply signal processing operators to aid their interpretation. This selection is influenced by the results of previous operators as well as knowledge about the general nature of the signal. Milios and Nawob's system for this task is based on a cycle of representing signal features, selecting operators to refine that representation, and using expert knowledge to formulate plans for subsequent operator selection. Knowledge-based systems in geological applications have similar characteristics. The Dipmeter Advisor uses geological knowledge as well as knowledge about types of dipmeter patterns to recognize particular features in the log patterns as the first step in interpretation (Davis, Austin, Carlbom, Frawley, Prucknik, Sneiderman & Gilreath, 1981). The degree to which symbolic knowledge guides or influences the use of quantitative methods and formalisms varies from application to application. However, the challenge lies in designing systems that integrate both symbolic and non-symbolic information and methods in a coherent knowledge representation framework.†

The problem of forecasting severe storms has many of the features described above. It is a data-intensive problem in which large amounts of quantitative and qualitative information must be analysed in near real time. Forecasting meteorologists often use statistical models as the foundation for a forecast. Our basic premise is that such algorithmic models and methods are just additional tools an expert brings to bear on the problem. In a broader sense, they are another aspect of the problem about which the expert reasons. Therefore, we made the expert's statistical forecasting model, and his knowledge about its interpretation and limitations, an integral part of METEOR. In the remainder of this paper, we will concentrate primarily on those aspects of the problem and the knowledge representation that make METEOR a useful case study on organizing and integrating different types of quantitative and qualitative data and methods in a knowledge-based system framework. Indeed, this has been our main interest and it would be misleading to characterize the current METEOR system as "expert" in terms of its performance. First, we will present some background on the application, noting the problems that influenced our design and development. Then, we will describe aspects of the knowledge representation scheme. Given this framework,

† It has been pointed out to us that some of our observations on integrating quantitative and qualitative knowledge are similar to those recently made by Ganascia (1984).


we will discuss implementation features that clarify the role of the statistical model in METEOR's problem solving. Finally, we will make some general observations on knowledge representation issues for problems of this type.

2. Application background

The Alberta Research Council's Atmospheric Sciences Department has conducted a research program on weather modification and hail suppression for a number of years. During "hail season", an experienced meteorologist and several assistants are responsible for predicting the occurrence, severity, and path of hail storms for this research problem.

The meteorologist begins his task by trying to understand patterns of meteorological activity. He typically consults a large number of maps, both diagnostic and prognostic in nature, generated from meteorological measurements taken at weather stations throughout the continent. These maps provide information such as temperature, humidity, wind direction, and wind speed at several levels of the atmosphere. The foundation of his forecast is a statistical model developed to evaluate potential atmospheric instability, or convective activity (Strong & Wilson, 1983). Very simply, this model (called the Synoptic Index of Convection, or Sc4) combines four predictor variables to evaluate whether the right ingredients--atmospheric instability and moisture--are present in the right amounts to generate severe weather. The model generates a single convective rating (-3 to +5) of the degree of potential convective activity and this is used as a measure of possible storm severity. A higher rating means a more severe storm. For example, -1 means "scattered showers but no thundershowers" while +5 means "hail larger than golfballs". The positive-negative sides of this scale roughly correspond to a hail-no hail distinction.

2.1. EXPERT USE OF THE INDEX: INTERPRETATION OF CONTOUR MAPS

Since the Sc4 index is based on measurements taken at weather stations, there is a computed Sc4 index value for each station in the forecast area. Values for the index are interpolated between stations and the final output is provided in the form of a contour map. An Sc4 map is shown in Fig. 1(a). Each contour is marked with the value of the index.

The index rates the entire day in terms of maximum expected convective intensity. The forecaster must make more fine-grained predictions of where storms will form, what their relative intensity will be, and where they will move. To do this, the forecaster examines the contour map of index values like that given in Fig. 1(a) and interprets it with the aid of additional information. A simple description of this interpretation process goes as follows. The meteorologist, familiar with this model, knows that storms will form not where the maximum index values are, but rather where the index values are changing most rapidly. These areas of strong contour gradient on the map are candidate areas for storm initiation only if they are upwind of the line of maximum Sc4 values. Having located the regions of maximum Sc4, the forecaster considers wind speeds and directions at a particular pressure level. In so doing, some strong contour gradients are eliminated as potential storm initiation areas because of their position with respect to wind direction and the maxima. For the Fig. 1(a) example, winds


FIG. 1. An Sc4 contour map (a) and a surface moisture contour map (b).

coming from the north-east suggest storm initiation areas lie to the left of the maximum index values (e.g. of the contour marked 5) rather than to the right.
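The interpretation step described above (keep areas of strong contour gradient, then discard those that are not upwind of the Sc4 maxima) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gridded index values, the `min_gradient` threshold, the `wind_from` vector convention and both function names are hypothetical and are not taken from METEOR.

```python
def gradient_magnitude(grid):
    """Finite-difference gradient magnitude of a 2-D grid (needs >= 2 rows and columns)."""
    rows, cols = len(grid), len(grid[0])
    mag = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Central differences in the interior, one-sided differences on the edges.
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            dy = (grid[r1][c] - grid[r0][c]) / (r1 - r0)
            dx = (grid[r][c1] - grid[r][c0]) / (c1 - c0)
            mag[r][c] = (dx * dx + dy * dy) ** 0.5
    return mag


def candidate_cells(grid, wind_from, min_gradient):
    """Cells with a strong gradient that lie upwind of every maximum cell.

    wind_from is an assumed (d_row, d_col) vector pointing from any cell
    toward the direction the wind is coming from.
    """
    mag = gradient_magnitude(grid)
    peak = max(max(row) for row in grid)
    maxima = [(r, c) for r, row in enumerate(grid)
              for c, value in enumerate(row) if value == peak]
    candidates = []
    for r, row in enumerate(mag):
        for c, g in enumerate(row):
            if g < min_gradient:
                continue  # not an area of strong contour gradient
            # Upwind test: displacement from each maximum points toward the wind source.
            if all((r - mr) * wind_from[0] + (c - mc) * wind_from[1] > 0
                   for mr, mc in maxima):
                candidates.append((r, c))
    return candidates


# Hypothetical index values with a north-south line of maxima in column 2.
sc4 = [[0, 3, 5, 3, 0]] * 3
print(candidate_cells(sc4, wind_from=(0, -1), min_gradient=2.0))
```

With the wind source placed toward decreasing column indices, only the strong-gradient cells in columns 0 and 1 survive; the equally steep cells in columns 3 and 4 are eliminated because they lie downwind of the maxima.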

To refine his predictions of storm initiation and intensity further, the forecaster considers another contour map depicting levels of surface moisture. An example is given in Fig. 1(b). Unlike the Sc4 index, this contour map is a simple plot of surface moisture measurements, not a statistical predictor. This information is important, because moisture "feeds" the storm and will influence both its intensity and its direction of movement. Having delineated some potential storm initiation regions on the Sc4 map, the expert considers the levels of the surface moisture surrounding these candidate regions as indicated by the surface moisture map. The expert uses the relative degrees of nearby surface moisture to refine the storm initiation areas as well as a direction of movement. The grey areas in Fig. 1(a) marked with an "S" indicate the expert's forecast for severe storms; the areas marked with an "M" indicate his forecast for moderate storms.


2.2. PERFORMANCE OF STATISTICAL INDEX

Like any analysis tool, the Sc4 model's accuracy in predicting the convective rating for a day has undergone considerable development and evaluation. It was shown to be correct within one rating category 66% of the time (i.e. predicting a rating of n when the resulting weather conditions were best described by a rating of n ± 1), and within two rating categories 85% of the time. With modifications made by the expert, these accuracy levels rose to 74% and 91%, respectively. These improvements are primarily associated with the "in between cases", i.e. when the model is predicting neither a very high nor a very low amount of convective activity.

2.3. THE ROLE OF THE INDEX IN EXPERT FORECASTING

The Sc4 index is similar to most quantitative approaches to interpretation, in that it implicitly embodies expert knowledge about how measurable meteorological conditions can be related to infer large-scale weather dynamics. In designing a knowledge-based system for this problem, one could "unpack" the knowledge implicitly represented in the quantitative index and represent it symbolically. However, the Sc4 index is an efficient first-pass analysis of large sets of meteorological data. It yields a concise framework for interpreting other kinds of information and is an integral part of expert forecasting. From a knowledge-engineering standpoint, the Sc4 index is the "language" that our meteorologist and his associates use. For them, it represents a useful and informative snapshot of the necessary storm ingredients. Most importantly, the meteorologist does not use the model as a "black box". He understands the factors that can fool the model's predictor variables and compensates for them. This kind of knowledge distinguishes his use of the model from that of less experienced forecasters. Rather than exclude this statistical index or reimplement it in a symbolic form, we concentrated on modeling the expert's interpretation strategies and his knowledge of their limitations. However, interpreting the Sc4 index is only part of the reasoning METEOR does. Therefore, a framework for using the Sc4 index in conjunction with other types of qualitative data and knowledge was necessary. This qualitative knowledge is described in the following section.

2.4. QUALITATIVE DATA AND KNOWLEDGE ABOUT WEATHER CONDITIONS

Experienced forecasters who are familiar with a particular geographical area often have "local knowledge" about how weather forms in the area and its implications for storm development. Data collected at weather stations includes information about current weather conditions. As an example, a weather station report is given in Fig. 2. This report contains a set of meteorological measurements, some of which are used in the Sc4 index. Two portions of this report provide the qualitative information

2006 YEG SA 0600 E50 BKN 90 OVC 15+TRW- 148/16/13/3010/996/CB7AC3

(a)

LTGIC - CC - CG SW QUAD. SHWRS HVIER N,W PRES UNSTDY 3012

(b)

FIG. 2. An example weather-station report.


that METEOR uses. The first, marked (a) in the figure, indicates which types of clouds are observed and what observable portion of the sky they cover. In this example, the first cloud layer the observer sees, cumulonimbus (CB), covers 7/10 of the sky. A second cloud layer of altocumulus (AC) covers the remaining 3/10 of the sky the observer can see. Other data in the report describe the state of this cloud cover. For example, the CB clouds are "broken" (BKN), meaning the observer can see through this layer. If an earlier report from this station had indicated the layer was overcast, this shift to broken might signify certain dynamics or processes are occurring.
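To make the cloud-cover coding concrete, the following Python sketch splits a cloud group such as the "CB7AC3" of Fig. 2 into cloud types and tenths of sky covered. It is a minimal illustration only: the function names and the two-entry lookup table are assumptions introduced here (the table holds just the two codes explained in the text), and nothing is implied about how METEOR itself parses reports.

```python
import re

# Only the two cloud codes explained in the text; any fuller table is an assumption.
CLOUD_NAMES = {"CB": "cumulonimbus", "AC": "altocumulus"}


def parse_cloud_group(group):
    """Split a group like 'CB7AC3' into (cloud code, tenths of sky) pairs."""
    layers = re.findall(r"([A-Z]+)(\d+)", group)
    return [(code, int(tenths)) for code, tenths in layers]


def describe(layers):
    """Render each layer as a readable phrase, falling back to the raw code."""
    return [f"{CLOUD_NAMES.get(code, code)} covers {tenths}/10 of the sky"
            for code, tenths in layers]


layers = parse_cloud_group("CB7AC3")
print(layers)            # [('CB', 7), ('AC', 3)]
print(describe(layers))
```

Keeping the parsed layers as simple pairs makes it straightforward to compare a station's current report with an earlier one, which is the kind of change over time the text notes may be significant.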

The other important information is the remarks section of the report, marked (b) in the figure. In this section, the human observer at the weather station provides additional information on current conditions that are not easily expressed in coded format. What these remarks report is the following: "There is lightning in the clouds, from cloud to cloud, and from cloud to the ground in the south-west direction (relative to the station). Showers are heavier in the north and west directions. The pressure (at the station) is unsteady".

Both the cloud-cover information and the information contained in the remarks section have gone unanalysed in the past because there is no simple way for traditional methods to "understand" their information and codify it in a form usable by statistical models. However, these qualitative conditions are meaningful to an experienced meteorologist familiar with this particular forecast area. For example, our expert has a number of informal heuristics based on cloud formations. An example from this "look-out-the-window" forecast strategy is: "If there are cirrostratus clouds, then it's not likely a storm will occur". Like most heuristics, this rule comes with a set of qualifications and hedges. However, it is not simply an empirical association, but reflects the expert's causal understanding of storm formation and thermodynamics. Other observations are more direct evidence of the kinds of dynamic processes that the Sc4 index is, indirectly, trying to measure. They can also alert the expert to situations that can fool the statistical model's predictor variables into yielding an inaccurate index rating. A simple example of this occurs when current precipitation increases surface moisture. High surface moisture is an important ingredient for subsequent severe storm activity, but not when caused by current precipitation. Lacking knowledge about such weather conditions and their implications, the statistical model can only treat all moisture measurements the same.

2.5. SUMMARY

We can characterize this task in a number of ways. It is data-intensive: information like that shown in Fig. 2 arrives from nearly 100 stations every hour for our forecast area alone. Special reports can arrive during the hour and the data must be processed in pseudo-real time, i.e. forecast updates can be done as quickly as once every hour. The resulting situation is a typical case of too much data of too many different types. The data that is amenable to numerical methods is the data that is incorporated into statistical models. Data that cannot be interpreted by these traditional computer methods or interpreted fast enough by people goes unanalysed, despite its potential importance to the forecast process. To work effectively in this environment, the expert relies on visual presentation of both the data and statistical analyses.

There are also some unique spatial and temporal aspects to this problem. To interpret a map like the one in Fig. 1(a), the expert identifies particular features of the map


such as closed contours, rapid changes in contour gradient, and spatial adjacency. To interpret the Sc4 index, a knowledge-based system must recognize and manipulate features of contour maps and integrate them over time. Our solution was an internal representation of the forecast "world" onto which the contour maps could be projected. Because spatial inferencing is done continually during the reasoning process, this representation of the "world" and its associated methods is necessary not only for understanding maps, but for deriving and organizing spatial information about all observed or inferred meteorological activities. A spatial representation, augmented with a scheme for representing temporal changes, is an integral part of the framework that supports interpreting contour maps as well as qualitative weather conditions. The design and organization of METEOR's knowledge representation framework is presented in the following section.

3. Knowledge representation and organization

3.1. OVERVIEW

METEOR combines several different knowledge representations. Some features of the forecasting problem were compatible with a rule-based approach: there were many subtasks METEOR must perform to acquire and interpret data; there was a relatively well-defined strategic approach to interpreting the quantitative predictors and maps for making and fine-tuning predictions; and the heuristics about weather conditions could be captured as IF-THEN associations. On the other hand, much of the domain knowledge lends itself to a hierarchic organization that will permit inheritance of properties. For example, there is knowledge about the concept "cloud" that is shared by both "non-convective clouds" and "convective clouds". There are many such hierarchically related concepts, some related to weather conditions, others related to areas recognized on contour maps like those in Fig. 1. More importantly, METEOR repeatedly instantiates these concepts dynamically for each forecast. The instantiated concepts must be organized in a way that is efficient for both storage and inferencing. Finally, there are relationships among hierarchically related weather observations, their properties, and their implications about longer-term processes that underlie the meteorologist's qualitative heuristics. We chose a frame-based representation for these kinds of knowledge.

Not all of METEOR's knowledge is symbolic, because not all of the data can be efficiently represented symbolically. In particular, the processing and analysis of contour maps is handled by LISP objects. However, all information, whether the result of contour map analysis or the interpretation of observed conditions, is organized within the frame system and accessible by the rule components.

Figure 3 presents a schematic view of major types of knowledge in METEOR and how they interact. The frame system organizes spatial knowledge as well as domain knowledge. The spatial knowledge includes special geometric methods for manipulating meteorological activities and domain concepts as spatial objects. The geometric methods use a numeric representation of the forecast world. This representation (labelled "mapel map" in Fig. 3) allows other activities, such as human observations of weather conditions, to be oriented in space. Production systems manipulate, extend, and inspect this frame system as they guide METEOR through the forecast task. The

[Figure 3 panels: contour map; mapel map; production systems (strategies and subtasks, domain inference rules, remarks parsing, region identification, contour map analysis); production rule actions (frame creation, changes, searches; automatic rule generation; algorithmic methods); hierarchical frame system (spatial concepts and methods, domain concepts and methods, concept instantiations, e.g. "Altocumulus clouds in southern Alberta").]

FIG. 3. A schematic view of knowledge organization and interaction in METEOR.

link between the production systems and the frame systems is accomplished through production rule actions. Some of these actions are quite specialized algorithmic methods for manipulating contour maps and their projection onto the mapel map. We will refer to Fig. 3 as we discuss each of these aspects in more detail. Among the most important properties associated with concepts represented in the frame system are those relating them to METEOR's numeric representation of the forecast area. We will describe this representation and its relation to the frame system next.

3.2. SPATIAL AND TEMPORAL INFERENCING

Given maps like those in Fig. 1, METEOR must be able to locate areas of absolute or relative maximum, identify areas of strong contour gradient, combine them with the location of other spatially defined areas on other contour maps, and create new objects corresponding to them. A number of questions are implicitly or explicitly raised


during map analysis: Does this region intersect with another region? Was there a particular activity south-east of this region 2 h ago? What does the data look like west of this region? To answer these, a framework must support creating and reasoning about entities located in space, but with characteristics that might change over time.

The basis for these processes is an internal representation of the geographical area from which METEOR collects weather station data for forecasting. This constitutes METEOR's "world", about 2.3 × 10⁶ km². This geographical area is represented as a two-dimensional array of map elements or "mapels" (see Fig. 3). Mapels have no meteorological significance, although their size determines the resolution of activities that populate this world.

The representation of the forecast world includes spatial objects and methods for manipulating them with respect to the mapel map. There are three major objects that populate the world: STATIONs, GEOREGIONs, and REGIONs. A STATION is a source of data, located by a single mapel. GEOREGIONs are collections of mapels that represent a higher-level organization of the forecast world. An example georegion would be "Central Alberta". They are used primarily to speed certain inferences about the relative locations of REGIONS. A REGION is a collection of mapels to denote some particular geographical area that might correspond to a map feature or to some meteorological activity. Example types of regions are "Sc4 maximum", "storm initiation", or "cirrostratus observations". The knowledge associated with these objects and their members is represented in frames within a type hierarchy. Our implementation of frames includes standard features of inheritance, procedural attachment, and associations to other objects. This frame system was extended with a view scheme for organizing domain and spatial concepts. Views are described in a later section.

The most important knowledge associated with concepts like STATION, REGION, and GEOREGION are geometric methods that use the mapel map to derive spatial properties of their instantiations. For example, the method for deriving all the stations contained within CENTRAL-ALBERTA, an instantiation of GEOREGION, is a procedure stored on the GEOREGION frame. Similarly, methods for computing a property like "surrounding quadrants" are inherited from REGION for answers that directly or indirectly depend on the quadrants property of a particular region. To derive spatial relationships such as "contained-within", "west-of", "northern quadrant", these methods make extensive use of the mapel property stored on the concepts' frames. By listing a set of mapels from the mapel map, this property associates instantiations of the domain concepts with the geometric representation of the forecast world.
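The way a mapel property can support spatial relationships such as "contained-within" and "west-of" is sketched below in Python. The sketch rests on assumptions that are ours, not the paper's: mapels are modelled as (row, column) index pairs, column indices are taken to increase eastward, and the class and function names are hypothetical. METEOR's own methods are procedures attached to frames, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpatialObject:
    """A STATION, GEOREGION or REGION reduced to a name and its set of mapels."""
    name: str
    mapels: frozenset  # (row, col) indices into the two-dimensional mapel map


def intersects(a, b):
    """True if the two objects share at least one mapel."""
    return not a.mapels.isdisjoint(b.mapels)


def contained_within(inner, outer):
    """True if every mapel of inner is also a mapel of outer."""
    return inner.mapels <= outer.mapels


def west_of(a, b):
    """True if all of a lies west of all of b (assumes columns increase eastward)."""
    return max(c for _, c in a.mapels) < min(c for _, c in b.mapels)


def stations_in(georegion, stations):
    """Names of the stations whose single mapel falls inside the georegion."""
    return [s.name for s in stations if contained_within(s, georegion)]


# Hypothetical layout: a 3 x 3 georegion, two stations and one region further east.
central = SpatialObject("CENTRAL-ALBERTA",
                        frozenset((r, c) for r in range(3) for c in range(3)))
edmonton = SpatialObject("Edmonton", frozenset({(1, 1)}))
elsewhere = SpatialObject("Station-X", frozenset({(5, 5)}))
region = SpatialObject("Region-087", frozenset({(1, 4), (1, 5)}))

print(stations_in(central, [edmonton, elsewhere]))  # ['Edmonton']
print(west_of(central, region))                     # True
```

Because every relationship is computed from the mapel sets alone, the same functions apply unchanged to stations, georegions and dynamically created regions.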

While STATION and GEOREGION have a fixed set of instantiations, the concept REGION is dynamically instantiated to represent particular instances of contour map features and meteorological activities observed or inferred in the forecast world. For example, Region-087 might be an instantiation of REGION that signifies "the area of broken-to-scattered cirrostratus clouds in southern Alberta". Figure 3 illustrates the conceptual distinction between METEOR's fixed spatial and domain concepts and the instantiations it creates of these concepts during problem solving.

The forecast world representation--the mapel map and geometric methods associated with spatial concepts in the frame system--allows METEOR to relate objects spatially. The frame system was augmented to keep track of how objects change over time. How an object changes over time is maintained on a property-by-property basis. The mapels


property is a good example of this, because it indicates location. Some instantiations of REGION move, over time, through the forecast area. For example, this occurs for areas of cloud cover or precipitation. In contrast, instantiations of other spatial objects like STATION and GEOREGION (e.g. Edmonton or Central Alberta) do not move-- the value of their mapel property does not change during a forecast. Thus, mapels is an example of a property that can be either time-varying or time-invariant, depending on the kind of object it is describing. Conversely, any given object can have a mix of time-varying and time-invariant properties.

We represent time-varying properties by specifying a path to a generic time-frame. Essentially, a time-frame has an add/delete list format for indicating a value for a property at a particular time. Time-frames are used extensively throughout METEOR's knowledge representation, since many objects have properties changing over time. Not only does this approach provide a way of maintaining what properties were, but it also serves as a consistent way of representing predictions as changes to object properties for future times. Under this scheme, answers to the kinds of questions like the ones given above ("Does Region A intersect with Region B?" and "Was there a Sc4 maximum south-west of Region C 4 h ago?") are handled the same way by the same mechanisms.
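One way to picture an add/delete list for a time-varying property is the short Python sketch below. It is an interpretation offered for illustration, not a description of METEOR's time-frames: the class name, the use of plain numbers for time and the replay strategy in `value_at` are all assumptions.

```python
class TimeFrame:
    """Set-valued property whose history is stored as timed add/delete lists."""

    def __init__(self):
        self._changes = []  # (time, added, deleted) entries, kept sorted by time

    def record(self, time, add=(), delete=()):
        """Store a change; a time in the future can stand for a prediction."""
        self._changes.append((time, frozenset(add), frozenset(delete)))
        self._changes.sort(key=lambda change: change[0])

    def value_at(self, time):
        """Replay every change up to and including the given time."""
        value = set()
        for when, added, deleted in self._changes:
            if when > time:
                break
            value -= deleted
            value |= added
        return value


# Hypothetical mapels property of a region drifting east, with hours as time units.
mapels = TimeFrame()
mapels.record(10, add={(0, 0), (0, 1)})
mapels.record(12, add={(0, 2)}, delete={(0, 0)})
mapels.record(14, add={(0, 3)}, delete={(0, 1)})  # could equally be a forecast

print(mapels.value_at(11) == {(0, 0), (0, 1)})  # True
print(mapels.value_at(12) == {(0, 1), (0, 2)})  # True
```

Past values and predicted values are retrieved by the same call, which mirrors the point made above that questions about earlier and future states can be handled by one mechanism.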

3.3. DOMAIN KNOWLEDGE: OBSERVATIONS AND EVENTS

In this task, the meteorologist reasons about meteorological processes that occur over time. These processes are inferred from direct observations that are made at particular points in time. This is analogous to the distinction some expert medical diagnostic systems make between symptoms and inferred pathological states (e.g. Weiss, Kulikowski, Amarel & Safir, 1978). Understanding the relationship between observations and events is critical, because much of the expert's qualitative knowledge concerns the recognition of certain meteorological events.

We distinguish between an observation and event in the knowledge representation primarily in terms of their spatial and temporal characteristics. Observations occur at a particular place and time. Events, inferred from observations, range over time and space. Where an event occurs is derived from the location of its observations. When it occurred can also be inferred, within limits, from when the observations occurred. We cannot unequivocally know when the event really started or stopped from a single observation. For example, observing a cirrostratus cloud might be taken as evidence for an event that included downward movement of air. This in turn might imply that air is drying-out in that region (a relatively longer-term process). However, if this observation were made when these clouds were dissipating, rather than forming, the dynamics might be quite different. Similar distinctions between observations and activities spanning time have been made by temporal logic systems (e.g. Vilain, 1982).

Knowledge about observations and events forms part of METEOR's domain knowledge and is organized hierarchically in the frame system (see Fig. 3). An event's most distinguishing characteristic is that it has clearly defined stages that may have different inferencing implications. Each stage specifies a set of required observations. More importantly, the stages indicate particular properties these required observations must have and how the observations must be temporally and spatially related. For example, the concept for altocumulus-standing-lenticularis (ACSL), a high-level cigar-shaped cloud, will be instantiated when ACSL is observed. An ACSL observation is one of the required observations for an event that signals no further convective activity will


occur. The event will be recognized, however, only when this ACSL observation occurs after a storm event and to the west of certain other observations. In combination with a different set of observations, or with slightly different properties, this ACSL observation might have somewhat different significance. Similarly, a "mid-level stabilization event" requires observations of mid-level cloud types (like altocumulus) that have progressed from a broken or scattered condition to overcast, and have covered an increasingly larger geographical area as well. The event frames specify these temporal and spatial relationships, as well as other features, for instantiations of its required observations.

Knowledge that relates observations and events is represented in frames, because this formalism provides an easily understood and maintainable specification of concepts and properties that are related through type hierarchies. However, recognizing which events are occurring given a set of observations is a matching problem. In particular, the same type of observation (where "type" might be ACSL clouds, or heavy showers) might be required by different events, and each event typically requires several different observations. In this regard, an event's required observations are a pattern that must be matched. Some production system languages, such as OPS (Forgy, 1982), provide powerful matching algorithms that are designed to efficiently handle problems in which there is a good deal of overlap among to-be-matched patterns. Therefore, although knowledge about events (such as domain heuristics and required observations) is maintained on frames, this knowledge is transferred to the production system to recognize events, a task for which the capabilities of the production system are better suited.

METEOR separates event-matching into two stages. First, it finds a set of potential events by matching each event's required observations to actual observations. This first stage is handled by production rules that METEOR automatically generates from specifications of required observations on event frames. All the required observations of all the events to be matched are gathered up and formulated first as goals to find the observations, and then as separate production rules associating instantiations of these observations to their events. For example, a required observation might have the following specifications: "cloud-type is ACC"; "condition is scattered"; "thickness is 0.5"; "time of occurrence is before noon"; and "location is east of mountains". Each set of specifications such as these is translated into a production rule designed to match information that an area corresponding to the required observation has been found. If METEOR has not already found observations of this type (and has not already established there aren't any), it attempts to find evidence for the observation. If these observations do exist (i.e. their specifications correspond to actual sightings) they are instantiated as regions in the frame system. Additionally, information corresponding to these instantiations is added to the production system's working memory. If all of an event's required observations are instantiated, the automatically generated production rule which associates these instantiations with their event fires, causing the second stage of event matching to begin. In this stage, METEOR checks whether the actual relationships among an event's instantiated observations match the spatial and temporal relationships specified on the event's frame. For example, an event's frame might include the constraint that the first of its required observations be west of the second of its required observations.
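The two matching stages described above can be illustrated with a minimal Python sketch. It is not METEOR's implementation: the original uses automatically generated OPS production rules, whereas this sketch substitutes a plain linear scan, and the names `Region`, `EventSpec`, `stage_one`, `stage_two`, and `match_events` are all hypothetical. For simplicity it takes the first sighting of each required type, and it assumes a single east-west coordinate `x` that grows towards the east.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Region:
    """An instantiated observation (hypothetical structure)."""
    kind: str     # observation type, e.g. "ACSL"
    x: float      # east-west position; larger values lie further east
    time: float   # time of the sighting, in hours


@dataclass
class EventSpec:
    """Declarative description of an event, as it might sit on a frame."""
    name: str
    required: list                                   # required observation types, in order
    constraints: list = field(default_factory=list)  # predicates over the matched regions


def stage_one(event, sightings):
    """Stage 1: find one sighting per required observation, or None if any is missing."""
    found = []
    for kind in event.required:
        matches = [r for r in sightings if r.kind == kind]
        if not matches:
            return None
        found.append(matches[0])  # simplification: take the first sighting of that type
    return found


def stage_two(event, regions):
    """Stage 2: check the spatial and temporal relationships among the matched regions."""
    return all(check(regions) for check in event.constraints)


def match_events(events, sightings):
    """Return the names of the events whose observations and relationships both match."""
    recognized = []
    for event in events:
        regions = stage_one(event, sightings)
        if regions is not None and stage_two(event, regions):
            recognized.append(event.name)
    return recognized
```

A constraint such as "the first required observation is west of the second" would then be written as `lambda rs: rs[0].x < rs[1].x` and placed in the event's `constraints` list.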

The goal of event-matching is to use the expert's heuristics about weather conditions. These heuristics are initially stored on event and observation frames. This means that heuristics, like any property, can be inherited through the type hierarchy. Thus, if an instantiation of VIRGA (precipitation in clouds that does not reach the ground) occurs, then heuristics associated with VIRGA as well as those inherited from PRECIPITATION will be used. Only when METEOR matches a required event or observation does it transform the declarative form of these heuristics into production rule form so they can influence analysis.

3.4. THE VIEW SYSTEM FOR ORGANIZING KNOWLEDGE

In this domain, the same concept can participate in several different conceptual hierarchies. For example, cirrostratus is a kind of cloud, a kind of high-level cloud, and a kind of non-convective cloud. These distinctions do not represent different levels in a superset-subset hierarchy, but rather different inheritance hierarchies. We could use unconnected hierarchies and duplicate knowledge when necessary, but there would be no easy way to maintain duplicated knowledge or to know all the hierarchies in which a concept participates. We have approached this problem by developing a view scheme that combines unconnected hierarchies into a single network.

The notion of views as a means for organizing diverse or changing information about a concept has been around since Minsky first introduced frames as a means of structuring knowledge (Minsky, 1975). Minsky used "view" in a literal sense for organizing visual information about an object as our perception (view) of the object changed by, say, moving around the object. The intent of views was more abstract than that, since it can be extrapolated to processes like analogy: viewing a chair as a ladder, for example, if it is serving a ladder's function. Hayes (1977) developed a scheme in which collections of nodes and links were organized into frame-like structures called "depictions". An instantiation of a depiction was essentially a mapping between some nodes in the depiction and new nodes. For example, given a depiction of "human" which included nodes like "arm" and "leg" and links like "part-of", a representation of "John" as an instantiation of this depiction would include nodes that correspond to "John's leg" and "John's arm", mapped into the corresponding depiction nodes. In this manner, the relationships specified among the depiction nodes were inherited by the nodes representing the instantiation. Because the same nodes could be mapped into other depictions as well, this framework could be used to represent and organize different views of the same entity. In KRL, Bobrow & Winograd (1977) provided more direct ways of describing a concept in terms of other concepts from possibly distinct categories. By creating different manifestations of a concept, a set of disparate descriptions could be grouped with the concept in a single memory unit. For example, a 17th century chair might have a manifestation as a museum piece, another as an art object, and a third as a piece of furniture. Depending on how this chair was regarded, it could have quite distinct, non-overlapping features and associated expectations. Yet all this knowledge would be accessible from a single unit in memory. This capability for knowledge-based systems is becoming more prevalent, particularly as more systems are designed and implemented with object-oriented programming techniques. For example, multiple inheritance was a feature for objects in the SWIRL system for simulating military air battles (Klahr, McArthur & Narain, 1982).

Our scheme of keeping conceptually different types of knowledge about a concept separate but organized around a single unit is similar in spirit and aim, if not complexity, to these other frameworks. In our approach, repeated concepts in the unconnected hierarchies become a single node in this new network. A node is a set of independent frames that represents the collection of all knowledge about a concept. The independent frames associated with a node correspond to different views; these views are organized around a central frame and serve to subdivide the knowledge about a concept. Knowledge that was particular to a single hierarchy defines a certain view of the concept. Thus, each node in the network has as many views as the number of hierarchies in which the concept originally participated. Not all knowledge about a concept has to be associated with a view; it is possible to declare knowledge as being true of the concept in general. In this case, it resides on the central frame.

An example is illustrated in Fig. 4. In this case, the concept CLOUD is a member of both the OBS (observation) hierarchy and EVENT hierarchy. The complete concept for CLOUD includes both its views as an OBS and an EVENT. Knowledge generally true of CLOUD (that would have been found in both the OBS and EVENT hierarchies) resides on the CLOUD frame (frame 3). Knowledge unique to a CLOUD view (that would have been found in only one of the original hierarchies) is represented in view frames (frames 4 and 5) associated with the central CLOUD frame (frame 3). There are three ways to separate the total knowledge about CLOUD. It can be viewed as an EVENT, in which case knowledge is collected from frames 4, 3, and 1. When viewed as an OBS, knowledge from frames 5, 3, and 2 is accessible. Finally, all the knowledge associated with CLOUD (frames 1-5), previously split over different hierarchies, is also accessible from CLOUD. This includes knowledge about what views it has.

[Figure: a network of twelve numbered frames. Frame 3 is Cloud (AKO Event, AKO Obs), frame 4 is Cloud viewed as an Event, frame 5 is Cloud viewed as an Obs, frame 6 is Cirrus (AKO Cloud), frame 7 is Cirrus viewed as a Cloud Event, frame 8 is Cirrus viewed as a Cloud Obs, frame 10 is Obs-12 (AKO Cirrus Cloud Obs, AKO Region), frame 11 is Obs-12 viewed as a Cirrus Cloud Obs, and frame 12 is Obs-12 viewed as a Region. Event-47 (AKO Cirrus Cloud Event) is also shown.]

FIG. 4. The use of views in a portion of METEOR's frame hierarchy.

A concept does not need to have more than one parent in order to have views. Although CIRRUS has only one parent in this example network, it would have been repeated in the unconnected hierarchies. So, just like CLOUD, CIRRUS has views that reflect the unconnected hierarchies. For example, CIRRUS can be viewed as a CLOUD EVENT, in which case knowledge is collected from frames 7, 6, 4, 3, and 1. When viewed as a CLOUD OBS, knowledge is collected from frames 8, 6, 5, 3, and 2. In effect, CIRRUS inherits knowledge from its parents' views (e.g. frames 4 and 5).
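The inheritance paths quoted above for Fig. 4 can be reproduced with a small Python sketch. The frame numbers follow the text; everything else is an assumption made for illustration: the `Concept` class, the `inheritance_path` and `lookup` functions, and the example properties stored in `FRAMES` are hypothetical and are not taken from METEOR.

```python
class Concept:
    """A node: one central frame plus one view frame per hierarchy (hypothetical)."""

    def __init__(self, name, central, parents=None, views=None):
        self.name = name
        self.central = central        # number of the central frame
        self.parents = parents or {}  # hierarchy name -> parent Concept
        self.views = views or {}      # hierarchy name -> number of the view frame


def inheritance_path(concept, hierarchy):
    """Frames consulted when a concept is seen through one view, nearest first."""
    path = []
    while concept is not None:
        if hierarchy in concept.views:
            path.append(concept.views[hierarchy])
        path.append(concept.central)
        concept = concept.parents.get(hierarchy)
    return path


# Frames 1-8 of Fig. 4; the properties stored on them are invented for illustration.
FRAMES = {number: {} for number in range(1, 9)}
FRAMES[1]["extent"] = "ranges over time and space"   # on EVENT
FRAMES[2]["extent"] = "a particular place and time"  # on OBS
FRAMES[6]["level"] = "high"                          # on CIRRUS

EVENT = Concept("EVENT", 1)
OBS = Concept("OBS", 2)
CLOUD = Concept("CLOUD", 3,
                parents={"EVENT": EVENT, "OBS": OBS},
                views={"EVENT": 4, "OBS": 5})
CIRRUS = Concept("CIRRUS", 6,
                 parents={"EVENT": CLOUD, "OBS": CLOUD},
                 views={"EVENT": 7, "OBS": 8})


def lookup(concept, hierarchy, prop):
    """Return the first value found for prop along the chosen view's path."""
    for number in inheritance_path(concept, hierarchy):
        if prop in FRAMES[number]:
            return FRAMES[number][prop]
    return None
```

With these definitions, `inheritance_path(CIRRUS, "EVENT")` yields frames 7, 6, 4, 3, and 1, and `inheritance_path(CIRRUS, "OBS")` yields frames 8, 6, 5, 3, and 2, matching the paths given in the text. Choosing a view therefore decides which root-level properties a lookup can reach.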

Frames 1-8 in Fig. 4 represent part of METEOR's static domain knowledge. The concepts at the lowest level of this hierarchy can be dynamically instantiated. It is possible to instantiate a particular view of a concept, thereby restricting the inheritance of other views associated with that concept. Frames 9 and 10 illustrate two examples of view instantiations: EVENT-47 and OBS-12. OBS-12 is an instantiation of CIRRUS viewed as a CLOUD OBS. It inherits only those properties on an inheritance path that includes frames 8, 6, 5, 3, and 2. Since most of METEOR's dynamic instantiations have some spatial definition, they will be attached to the REGION hierarchy and have a corresponding view as a REGION. That is why OBS-12 has at least those views shown in Fig. 4. Similarly, EVENT-47, as a-kind-of CIRRUS CLOUD EVENT, inherits only those properties on a path consisting of frames 7, 6, 4, 3, and 1 (we have not shown its view frames in Fig. 4).

A few details about the mechanisms for constructing, maintaining, and searching views are necessary to put this scheme in perspective with other view approaches. As in most frame systems, if search mechanisms fail to find a property directly associated with a concept, they will follow inheritance paths. At present, our search mechanism is quite unguided when confronted with diverging paths out of a node (e.g. the event and observation paths out of cloud in Fig. 4). That is, there is no basis for deciding a priori that the information needed is associated with one hierarchy or another. Representing properties as concepts and associating them with the domain and spatial taxonomies could guide the search mechanism to the most likely inheritance path.

Another feature of the view scheme is that retrieval processes restructure knowledge among an instantiated concept's views. To understand how this is done, it is necessary to have an idea of how views are created. Properties are always added to a concept with reference to a parent. The specification of the parent relegates those properties to the appropriate view, if it exists, or to a newly created view corresponding to the new distinct parent. Thus, creating A0014 as a-kind-of region means that A0014 will have a view as a region and properties given at this time will reside on that view frame. Thus, the system's default method of defining objects as a collection of views of existing concepts is quite KRL-like in spirit (Bobrow & Winograd, 1977).


The mechanism that adds properties to an object does not care whether the same information is unduly separated on different views. Recognizing that knowledge is common to all of an instantiation's views, and hence should be shared by all those views, is a by-product of information retrieval. In the simplest case, information retrieval is done by examining the frame associated with a particular view. In that case, only that view and its inheritance path are examined. When no view is specified ("what is OBS-12's time?"), the search mechanism examines the central concept frame and, if unsuccessful, it examines all the concept's views. In doing so, it may find that the property it is looking for exists on all the views. In this case, this knowledge is migrated from the view frames to the central concept frame, and marked as "shared". Because the retrieval has already collected information from different views, this is the best time to do this restructuring. If the object were to receive another view, the shared properties are checked to see whether they conflict with the new view. For example, the new view could contain a property that is marked "shared" on the central frame. If this new view has its own idiosyncratic property value, this property cannot reside on the central node. It is moved back down to all the view frames. This reorganization also occurs if the new view does not include a shared property or if shared properties are changed on old views. In sum, our design of the view scheme included a means by which shared information is maintained and reorganized dynamically, so that processes using the frame system do not need to know whether information is view-specific or not. With this scheme, we have cast the "recognition" that alternative views share common knowledge as part of the retrieval processes. Although this approach accomplishes only limited knowledge reorganization, the important point is that the responsibility is left to the processes that operate on the knowledge structure.
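The retrieval-time reorganization described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the INTERLISP-D implementation: the `Node` class and its method names are hypothetical, a property counts as common only when every view holds the same value, and inheritance from parent concepts is left out.

```python
class Node:
    """A concept instance with a central frame and named view frames (hypothetical)."""

    def __init__(self):
        self.central = {}    # properties true of the concept in general
        self.views = {}      # view name -> dict of view-specific properties
        self.shared = set()  # properties migrated to the central frame

    def add(self, view, prop, value):
        """Properties are always added with reference to a view."""
        self.views.setdefault(view, {})[prop] = value

    def get(self, prop, view=None):
        if view is not None:
            # View-specific retrieval: the view frame first, then the central frame.
            return self.views[view].get(prop, self.central.get(prop))
        if prop in self.central:
            return self.central[prop]
        values = [frame[prop] for frame in self.views.values() if prop in frame]
        if not values:
            return None
        on_all_views = len(values) == len(self.views)
        if on_all_views and all(v == values[0] for v in values):
            # Assumed criterion: same value on every view, so migrate and mark shared.
            for frame in self.views.values():
                del frame[prop]
            self.central[prop] = values[0]
            self.shared.add(prop)
        return values[0]

    def add_view(self, name, props):
        """Attach a new view, demoting shared properties that it lacks or contradicts."""
        for prop in list(self.shared):
            if props.get(prop) != self.central[prop]:
                value = self.central.pop(prop)
                self.shared.discard(prop)
                for frame in self.views.values():
                    frame[prop] = value  # moved back down to the existing views
        # Properties that are still shared stay on the central frame only.
        self.views[name] = {k: v for k, v in props.items() if k not in self.shared}
```

In this sketch the caller never needs to know where a property lives: `get` answers from the central frame or from a view, and the migration is a side effect of answering.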

In this manner, the view scheme serves a very useful organizational role: geometric methods and spatial knowledge are distinct from domain-related methods and knowledge, yet a single concept can be defined in terms of both. Domain knowledge itself includes several distinct hierarchies that can be linked to a single concept via the view method (the cloud taxonomies based on height and convectivity mentioned earlier are an example of this). In sum, the main advantages to this scheme are: (a) properties common to all views can be shared; (b) the structure of the network need not be repeated for every unconnected hierarchy; (c) the number of ways of viewing a particular concept is part of the knowledge available to the system; and (d) the property inheritance mechanism of unconnected hierarchies is preserved.

3.5. SUMMARY

One of the most difficult aspects of this problem was organizing different types of qualitative and quantitative knowledge, data, and methods within a coherent representation scheme. The other difficult feature was that METEOR must use methods from several levels (e.g. from production systems to geometric algorithms). The knowledge representation scheme allows the methods operating at different levels to communicate. For example, our frame system serves as an important communication channel between the production systems and the geometric methods associated with spatial inferencing, since they both access and change the frame system. However, the production systems and the geometric methods deal with spatial knowledge on very different levels. To the geometric methods, a concept's mapel property has some numeric significance with respect to the mapel map. On the other hand, mapels and methods that manipulate them are transparent to the production systems. The production systems pose higher-level spatial questions (e.g. "Was Region A west of Region B 2 h ago?") to the frame system and are ignorant of the means by which such questions are answered.

The representation of observations and events provides the clearest example of the interplay among knowledge types and how the view scheme contributes to it. The observations required for an event are stored as properties of events' stages. Other properties indicate how these required observations must be spatially and temporally related. If "altocumulus clouds" can be instantiated, a node is created with two views: one as an altocumulus cloud observation and the other as a REGION, on which its spatial features are stored. The production systems capitalize on the view scheme, particularly for accessing spatial methods and representations for manipulating domain concepts during event matching.

4. Interpreting quantitative and qualitative information

As Fig. 3 illustrates, METEOR dynamically creates instantiations of particular spatial and domain concepts to represent activities that are inferred or observed. Most of these instantiations have views as REGIONs. In this section, we outline the method of identifying and creating new REGION instantiations from the analysis of contour maps (Fig. 1) and weather station reports (Fig. 2).

4.1. INTERPRETING THE Sc4 INDEX

METEOR treats the Sc4 index like a preliminary diagnosis that must be confirmed or disconfirmed. Positive or negative evidence accrues for the two storm characteristics that the index predicts: storm initiation location and intensity. Interpreting the index involves several distinct subtasks: acquiring the Sc4 and surface moisture maps, checking the wind direction, locating maxima and strong contour gradients, consulting the surface moisture map, and delineating one or more storm initiation regions. Intensity predictions and directions are then adjusted using meteorological knowledge about the model's predictor variables and qualitative influences on the model's performance.

4.1.1. Regions from contour maps

To glean the same information from contour maps as the expert does, METEOR must recognize areas of maxima, locate the interesting areas of strong contour gradient, and create new entities in its internal storm world corresponding to these areas. Plotted contour maps like those in Fig. 1 are produced by a standard contouring package. For METEOR, these contour maps are represented by character arrays, with solid bands of characters used to denote individual contour regions. To simplify understanding these maps, some relationships between contours and local maxima and minima are identified. For example, information that one contour is contained within another contour is provided by the contouring package, as well as the identification of local maxima and minima. With this type of information, locating regions of Sc4 maximum is relatively straightforward.

The more complicated aspects of map analysis involve the location of strong contour gradients. The potentially interesting gradients are those with the greatest changes in slope that occur downwind of the maximum. METEOR consults wind speed and direction data stored in the frame system. To get the changes in the slope of the contours, METEOR uses contour maps generated by taking the first derivative of the original contour map along the direction of the wind. These derivative maps are automatically computed ahead of time by the contouring package for each contour map. Derivatives are taken at four angles (0°, 45°, 90°, and 135°) to generate four maps, each of which can be used for two directions (north-south, north-east-south-west, east-west, and south-east-north-west, respectively). METEOR selects one of these maps after checking wind direction information stored as meteorological measurements with certain STATIONS. Given the original contour map, a derivative map, and the mapel-space map, METEOR examines those portions of the derivative map that are upwind of the line of maximum. Wind direction knowledge indicates a direction in which METEOR should search for strong gradients on the derivative maps. Knowledge of where the maximum values are located on the Sc4 map indicates when METEOR should stop looking for gradients. Those derivative contours that represent strong gradients on the original contour map are selected as potential storm initiation areas. In this way, METEOR can identify areas like those outlined by the forecaster in Fig. 1.
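The choice among the four derivative maps can be illustrated with a short Python sketch. It assumes that wind direction is a compass bearing in degrees and that the four maps correspond to the axes north-south (0°), north-east-south-west (45°), east-west (90°), and south-east-north-west (135°). The function name `select_derivative_angle` is hypothetical, and the paper does not state how METEOR resolves directions that fall between two axes; nearest-axis snapping is used here as one plausible reading.

```python
DERIVATIVE_ANGLES = (0, 45, 90, 135)  # N-S, NE-SW, E-W, SE-NW (assumed bearings)


def select_derivative_angle(wind_bearing_deg):
    """Return the derivative-map angle whose axis lies closest to the wind bearing."""
    # Opposite bearings share an axis, so fold the bearing into [0, 180).
    axis = wind_bearing_deg % 180

    def gap(angle):
        difference = abs(axis - angle)
        return min(difference, 180 - difference)  # axes wrap around at 180 degrees

    return min(DERIVATIVE_ANGLES, key=gap)
```

For instance, a wind bearing of 329° folds to 149° and selects the 135° (south-east-north-west) map, while a bearing of 270° selects the 90° (east-west) map.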

4.1.2. Fine-tuning the predictions

For each storm initiation area, an initial and final intensity is predicted, as well as a direction of movement. The initial intensity is based on the Sc4 index's rating for the closest maximum. Using the map in Fig. 1(a) as an example, the uppermost region marked S1 would be given an initial intensity rating of 5, since that is the rating of the closest maximum. METEOR's final predicted intensity is an adjustment, upwards or downwards, of this initial value. The adjustments are guided by production rules that represent the expert's knowledge about predictor variables and his knowledge about qualitative factors. The value of each predictor variable in the vicinity of a given storm initiation region is rated as positive, negative, or neutral evidence, relative to the initial intensity prediction for the storm. For example, if a storm region were initially rated "4" in intensity, but certain predictor variables in surrounding areas were neutral, the final intensity rating may be downplayed. The earlier situation of high surface moisture ratings caused by precipitation illustrates the effect of qualitative influences on the model's interpretation. If surface moisture in the vicinity of a storm initiation region is high but that vicinity coincides with a precipitation region, then the surface moisture would be re-rated as neutral or negative evidence. This in turn would impact METEOR's intensity predictions as well as its predictions about storm movement. Knowledge about precipitation is an example of a qualitative observation gleaned from the remarks sections of station reports.

4.2. REGIONS FROM HUMAN REMARKS

METEOR's information about ongoing weather conditions comes from observations made by trained observers and included in the "remarks" section of the weather station report (Fig. 2). As Fig. 2 might suggest, there is considerable variability in the nature of abbreviations, in whether punctuation is used to join elements or signal the end of a phrase, and in word order. This variability occurs both within and across remarks. The job of METEOR's Remarks Parser module (a production system plus lexical knowledge associated with domain concepts in the frame system) is to extract the remark from the station data, translate the abbreviations into full words, and parse the remark into its proper constituents.

First, remarks are extracted from the station report and represented as a string of tokens. To match each token to the concept it represents, METEOR uses a lexicon for all concepts commonly used in the remarks. Each word in the lexicon has an associated list of common abbreviations. If a token is merely one of these abbreviations, then the matching process is trivial. Otherwise, the token is compared with a set of probable words, and a rating is assigned to each comparison. The word with the highest rating (greater than a pre-defined minimum) is deemed to match the token. Each token is examined in isolation, and context (i.e. surrounding abbreviations or concepts) is not used. After this process is completed, the abbreviations in the remarks have been expanded into full words and identified as concepts.
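The token-matching step above can be sketched in Python. The paper does not say how the rating is computed, so this sketch swaps in the similarity ratio from the standard library's `difflib.SequenceMatcher` as a stand-in. The lexicon entries, the threshold `MIN_RATING`, and the function name `match_token` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical lexicon: full word -> list of common abbreviations.
LEXICON = {
    "SHOWERS": ["SHWRS", "SHWR"],
    "HEAVIER": ["HVIER"],
    "VIRGA": [],
}
MIN_RATING = 0.6  # assumed pre-defined minimum rating


def match_token(token, lexicon=LEXICON, min_rating=MIN_RATING):
    """Expand one token to a lexicon word, or return None if nothing rates high enough."""
    token = token.upper()
    # Trivial case: the token is a known word or one of its listed abbreviations.
    for word, abbreviations in lexicon.items():
        if token == word or token in abbreviations:
            return word
    # Otherwise rate the token against each candidate word, in isolation from context.
    best_word, best_rating = None, 0.0
    for word in lexicon:
        rating = SequenceMatcher(None, token, word).ratio()
        if rating > best_rating:
            best_word, best_rating = word, rating
    return best_word if best_rating > min_rating else None
```

A listed abbreviation such as `"SHWRS"` resolves directly, an unlisted variant such as `"SHOWRS"` resolves through the rating, and a token with no close candidate is left unmatched.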

Once the remarks have been expanded into concepts, it is the job of the remarks parser production system to group those concepts into phrases. The remarks parser relies solely on a set of syntactic heuristics gleaned from a corpus of remarks, rather than on semantic information. Although semantic information is necessary to parse a small percentage (5%) of these remarks, this approach works because the vocabulary and the structure of the remarks are very constrained. The parser assumes that all phrases are noun phrases (that is, a noun followed or preceded by some modifiers), and its major task is to find the separations between phrases. The remarks are parsed from left to right, and phrases are built as the parse progresses. Production rules are used to detect phrase separation patterns. For instance, a noun phrase followed by a noun usually signals a separation point, as does punctuation between words in different syntactic classes (e.g. adjective + punctuation + noun). As an example, in Fig. 2(b) the noun phrase "SHWRS HVIER N, W" followed by the noun "PRES" indicates a phrase separation, as does "QUAD, SHWRS". Other rules detect patterns indicating words which belong to the same phrase. For instance, punctuation between words in the same syntactic class usually indicates that they both belong to the same phrase (e.g. location + punctuation + location, or even noun + punctuation + noun). In the case of "N, W", the comma occurs between words from the same syntactic class, so these words are grouped into the same phrase. Exceptions to these general rules are detected by other productions. The remarks parser bundles each recognizable phrase into a standard format so that it can be easily used later on.
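The three separation heuristics named above can be sketched as a left-to-right pass in Python. This is a simplified illustration, not the OPS rule set: the function name `split_phrases`, the class labels (`"noun"`, `"adj"`, `"loc"`, `"punct"`), and the class assigned to each example token are assumptions, and the exception-handling productions are not modeled.

```python
def split_phrases(tokens):
    """Group (text, syntactic_class) tokens into phrases using three heuristics.

    1. Punctuation between words of the same class keeps them in one phrase.
    2. Punctuation between words of different classes separates phrases.
    3. With no punctuation, a noun arriving after a phrase that already has
       a noun starts a new phrase.
    """
    phrases, current = [], []
    pending_punctuation = False
    for text, cls in tokens:
        if cls == "punct":
            pending_punctuation = True
            continue
        if current:
            previous_cls = current[-1][1]
            if pending_punctuation:
                start_new = cls != previous_cls
            else:
                start_new = cls == "noun" and any(c == "noun" for _, c in current)
            if start_new:
                phrases.append([t for t, _ in current])
                current = []
        current.append((text, cls))
        pending_punctuation = False
    if current:
        phrases.append([t for t, _ in current])
    return phrases
```

Applied to the tokens of "SHWRS HVIER N, W PRES" (with N and W classed as locations and PRES as a noun), the pass keeps "N, W" together and starts a new phrase at "PRES".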

In addition to the remarks section, METEOR also extracts information about cloud cover [Fig. 2(b)]. Both the parsed remarks and cloud cover information are stored, as time-varying properties, on the frame of the station that reported them, since that serves to locate these conditions in mapel space. When METEOR is looking for a particular observation (e.g. as evidence for certain events), a description of the observation ("broken altocumulus at 1000 h") is matched against the observations associated with particular stations. Having found a set of stations with the requested observations, METEOR decides how to join these individual observations into one or more regions of activity. Several heuristics come into play, such as the absolute distances between the observations and whether intervening stations report a conflicting pattern of activity. METEOR then computes a set of mapels to define the region(s), and instantiates the corresponding concept ("altocumulus") with particular properties (e.g. time, condition), storing the spatial information on its view as a REGION.
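One of the joining heuristics mentioned above, the absolute distance between observations, can be sketched in Python as a simple chaining procedure. This is only an assumption about how such a heuristic might be realized: the function name `group_stations`, the threshold `max_gap`, and the coordinates are hypothetical, and the check for conflicting reports from intervening stations is omitted.

```python
from math import hypot


def group_stations(stations, max_gap):
    """Chain reporting stations into regions when neighbours lie within max_gap.

    stations maps a station name to hypothetical (x, y) map coordinates.
    Returns a list of sets of station names, one set per region of activity.
    """
    unvisited = set(stations)
    regions = []
    while unvisited:
        seed = min(unvisited)  # deterministic starting point
        unvisited.remove(seed)
        region, frontier = {seed}, [seed]
        while frontier:
            ax, ay = stations[frontier.pop()]
            near = {name for name in unvisited
                    if hypot(stations[name][0] - ax, stations[name][1] - ay) <= max_gap}
            unvisited -= near
            region |= near
            frontier.extend(near)
        regions.append(region)
    return regions
```

Each resulting set of stations would then be turned into a set of mapels and stored on the REGION view of the instantiated concept.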


5. System organization and operation

Figure 5 shows the current METEOR system and its major parts in terms of implementation. METEOR runs on a Xerox 1100 Lisp Machine and is written in INTERLISP-D and OPS4 (Forgy, 1979). Initial data collection is done on a VAX 11/780 running VMS. These machines are connected via an Ethernet.

[Figure: the weather data circuit line feeds a VAX 11/780 running VMS, where the METEOR project account handles storm world configuration, automatic processing of data, and the organizational scheme for data and maps. An Ethernet link connects the VAX to the Xerox 1100 Lisp machine, which runs Interlisp-D and OPS4 (Forecaster PS, Remarks parser PS, Region finder PS) and holds the frame system (stations, regions, georegions; observations and events; meteorological data; remarks lexicon), the mapel map, and the contour maps.]

FIG. 5. METEOR system organization.

The frame and rule-based components reside on the 1100. The rule-based component is currently implemented as three OPS production systems. The Forecaster production system initiates the configuration of METEOR's forecast world as represented by the mapel map and orients the fixed spatial objects (e.g. STATIONs and GEOREGIONs) with respect to the mapel map. The Forecaster production system is also responsible for data acquisition (moving data and results across the Ethernet), interpretation, and forecasting subtasks. To accomplish this, it invokes two other production systems that have very specialized tasks: the Remarks Parser production system, which parses the human observations contained in the station report remarks (Fig. 2) and the Region Builder production system, which identifies regions of meteorological activity from the parsed remarks and cloud reports (Fig. 2) in response to requests during event matching.

The frame system and view component were implemented in INTERLISP-D. The production systems interact with the frame representation through a feature of the OPS language that allows user-defined functions as part of both conditions and actions. In addition to the interaction illustrated in Fig. 3 and described earlier, production rules can transfer knowledge from frames into working memory and initiate processes for transferring maps from the VAX to the 1100. METEOR currently starts with 140 production rules (across the three PSs) and 300 frames. In the course of forecasting, many new frames are dynamically created that represent particular instantiations of the concepts in the type hierarchies. This could be an additional 300 or more frames per forecast. New OPS rules are dynamically generated from information stored in frames on an as-needed basis, particularly to aid the event-matching described earlier.

Certain features of METEOR are configured by information that resides on the VAX. These features include names and locations of weather stations, geographical details about the forecast area, the division of the forecast area into GEOREGIONS, and the resolution of the mapel map (i.e. the size of each mapel). Data from weather stations (Fig. 2) arrives hourly on a dedicated circuit line. This station data is automatically collected by processes that monitor a particular set of stations in the forecast area (currently about 70 stations). When the statistical model is run on the VAX, other processes automatically create the contour maps for the model, the surface moisture, and their associated derivative maps. A directory on the VAX organizes the station reports and all the contour maps. The important role of this directory is to coordinate the transfer of data from the VAX to the 1100. By using this directory, METEOR finds newly arrived station reports and maps.

METEOR currently takes about 50 min to do the tasks involved in making a forecast: interpreting Sc4 and surface moisture maps, parsing remarks, identifying qualitative weather conditions, and producing its report. METEOR outputs its predictions by creating maps of the forecast area and outlining the location of one or more storm initiation regions. An example is shown in Fig. 6. For each storm region, METEOR reports its predicted intensity and direction, as well as the factors that support these predictions [see Fig. 6(a)]. The four predictor variables used in the Sc4 index (SPLI, EDGEX, DTH7, DTH24) plus the surface moisture measure (Theta-E) are evaluated for consistency with the Sc4 index's initial intensity rating. Comments about the qualitative effects of current conditions on the model are also provided. An example is shown in Fig. 6(b). In this case, METEOR did not use the information about precipitation to downplay the Theta-E rating, but just alerted the forecaster to this situation.

The remainder of METEOR's report consists of comments about current weather conditions it has discovered through its analysis of weather station reports and observations. These conditions are outlined on accompanying maps. An example is given in Fig. 7. This kind of presentation and analysis of current conditions was previously unavailable to the forecaster in real-time.


[Map 1: Storm Initiation Regions at 1215]

(a) Date and time for this forecast is 1215

Storm Initiation Region: STORM-152
Predicted direction: SOUTH-EAST
Sc4 CDC rating for this region: 5.0
Evidence in support of this intensity level...
SPLI WEAK-NEGATIVE
EDGEX POSITIVE
DTH24 POSITIVE
DTH7 POSITIVE
THETA.E NEUTRAL
Final predicted intensity: 4.0

Storm Initiation Region: STORM-160
Predicted direction: SOUTH-EAST
Sc4 CDC rating for this region: 5.0
Evidence in support of this intensity level...
SPLI POSITIVE
EDGEX POSITIVE
DTH24 POSITIVE
DTH7 POSITIVE
THETA.E POSITIVE
Final predicted intensity: 5.0

(b) SC4 MODEL COMMENTS

For this forecast, theta.e gradients were intersected with Sc4 gradients to [illegible] storm initiation regions.

It is possible that precipitation occurring at the time the Sc4 model was run caused theta.e values to be inflated. This suggests that the above theta.e interpretations should be tempered.

The 850 winds used for this forecast were ((SPEED (YYE 18)) ((DIRECTION (YYE 329)) and were based on data taken at time 1212.

The 700 winds used for this forecast were ((SPEED (YQF 17) (YYE 18)) ((DIRECTION (YQF 342) (YYE 315)) and were based on data taken at time 1212.

FIG. 6. METEOR's predictions of storm initiation areas and interpretation of the Sc4 index.

[Map 4: Precipitation at 1215. The accompanying report text lists the affected regions (including Parkland-Eastern-Alberta, Central-Western-Alberta and Central-Eastern-Alberta) and states that the observations, sighted at 1215, include sightings of showers, thundershowers and virga.]

FIG. 7. A portion of METEOR's analysis of current precipitation areas, located by analyzing human remarks in the weather-station reports.


6. System performance

A short field test of the METEOR prototype was conducted for 2 weeks in August 1985. During the test, 3 days had to be omitted because of data communications problems. For 9 of the remaining 11 days, METEOR correctly predicted that no severe storms would occur (the prototype was not configured to forecast the intensity of weak-intensity storms). On 3 August, METEOR successfully predicted the intensity (CDC = +5) and location of the most severe storm of the season. However, it made its most notable error on the following day, when it over-predicted the intensity of a weak convective storm. The forecaster that day had the advantage of knowing that the shortwave trough involved had already passed, so that conditions negating convective storms were already influencing the region at forecast time. METEOR does not currently have access to this information, but similar types of information could be incorporated in an improved version.

7. Observations and conclusions

From a knowledge-engineering standpoint, it is helpful to place METEOR in the context of weather forecasting as an applied research effort. Forecast meteorologists develop prediction methods that do not simply reflect empirical associations, but rather represent complex, sometimes competing, causal models of weather. The research issues revolve around understanding more precisely the thermodynamic factors that can be measured and combined to explain, and hence predict, weather development. Forecasting meteorologists are pushing their existing models to be more accurate for even shorter time frames ("nowcasting"). In some cases, this research on quantitative models has the implicit goal of removing the variation in "subjective" human interpretation. Our expert's statistical model is a good example of this kind of research.

METEOR's practical success as a forecasting system is due to its ability to apply expert knowledge to the interpretation of the Sc4 statistical index. However, METEOR is by no means equivalent to an experienced forecasting meteorologist, even on this constrained task. METEOR's general meteorological knowledge is limited to knowledge of weather patterns and observations that has direct bearing on the interpretation of the index. Unlike a human expert, who has access to a wide range of meteorological data, METEOR needs the index to make a good storm forecast.

For domains like meteorology, the ability to interpret existing numerical models may be as important to the acceptance of knowledge-based systems as, for example, explanation is for medical applications. In the short term, knowledge-based meteorological systems that combine statistical approaches with expert interpretation may provide more accurate forecasts. The local expertise and idiosyncratic factors (e.g. topographical influences) are often diluted, or ignored, in general weather models; this is precisely the kind of information that is best represented in a knowledge-based system. In the long term, forecasting may be another area in which the "side-effect" of designing knowledge-based systems--compelling experts to think about their domain and knowledge in new ways--might be the more significant benefit. In our case, the quantity of the data, particularly qualitative information on weather conditions, was so great that it could not be processed by human forecasters in real-time. METEOR's ability to "understand" this information and its attempts to integrate it with the analysis of the statistical index gave the meteorologist his first opportunity to consider how he might characterize and use his qualitative knowledge with his numerical models for real-time forecasting. In this sense, METEOR tried to go beyond what the expert currently did, and the knowledge-engineering and design effort was correspondingly more complicated.

For a domain so clearly marked by temporal characteristics, we have taken a pragmatic approach to representing how objects change over time. We recognize that representing time and temporal relations is a complex issue, be it for everyday temporal reasoning (Allen, 1983; McDermott, 1982) or expert reasoning about time-critical events (Tsotsos, 1984). We do not have a way of reasoning about uncertain times or unknown time boundaries and it is not clear whether our current approach would be extendable to include these features. Indeed, just as our geometric representation of the world is well-suited for spatial inferencing, there are special-purpose representations that can be exploited for computing temporal relations as well (e.g. Schubert, Papalaskaris & Taugher, 1983). Even without including a special representation for computing temporal relationships, we could imagine a forecasting system in which knowledge is organized around "time-slices" rather than around objects with changing properties. METEOR includes an adjunct organization like this--it keeps track of meteorological activities (events and observations) as a function of particular times. This redundancy allows METEOR to recreate what the world looked like at any particular time without searching through all its instantiated concepts. Given this feature, we can ask whether some activity occurred before, during, or after some other activity. However, because we are more interested in the nature of the changes over time rather than temporal relations per se, keeping track of changes at the property level is necessary. In sum, METEOR views the world as a series of static snapshots and infers dynamic conditions from these snapshots. This can be contrasted with representing dynamic processes more directly as, say, a set of differential equations that a cloud physicist might use to model clouds. It is not clear whether this kind of approach is feasible or suitable for qualitative reasoning about objects and their changing properties.
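The adjunct "time-slice" organization can be illustrated with a small index that files each activity under the time at which it was observed. This is a sketch under stated assumptions rather than METEOR's frame-based implementation: the class `ActivityLog`, its method names, the activity labels, and the use of integer clock times such as 1215 are all hypothetical, and the before/during/after test is deliberately coarse (any overlap counts as "during").

```python
from collections import defaultdict


class ActivityLog:
    """Index meteorological activities by observation time (illustrative only)."""

    def __init__(self):
        self._by_time = defaultdict(list)  # time -> activities seen at that time
        self._span = {}                    # activity -> (first time, last time)

    def record(self, time, activity):
        self._by_time[time].append(activity)
        first, last = self._span.get(activity, (time, time))
        self._span[activity] = (min(first, time), max(last, time))

    def snapshot(self, time):
        """Activities recorded at `time`, without scanning every activity."""
        return list(self._by_time.get(time, []))

    def relation(self, a, b):
        """Coarse temporal relation of activity `a` to activity `b`."""
        a_first, a_last = self._span[a]
        b_first, b_last = self._span[b]
        if a_last < b_first:
            return "before"
        if a_first > b_last:
            return "after"
        return "during"  # the two observation spans overlap


log = ActivityLog()
log.record(1115, "showers-YEG")
log.record(1215, "showers-YEG")
log.record(1215, "virga-YYC")
log.record(1315, "hail-YQF")
print(log.snapshot(1215))                         # ['showers-YEG', 'virga-YYC']
print(log.relation("showers-YEG", "hail-YQF"))    # before
print(log.relation("virga-YYC", "showers-YEG"))   # during
```

Keeping the per-time lists alongside the per-activity spans mirrors the redundancy described above: a snapshot is a single lookup, while the spans support the before/during/after question. Note that the sketch stores exact times only, consistent with the stated limitation that uncertain or unknown time boundaries are not handled.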

The inclusion of special-purpose representations (e.g. geometric representations for spatial reasoning, time-graphs for temporal reasoning) and inferencing methods is likely to be increasingly common in the next generation of knowledge-based systems. In addition, the combination of frame- and rule-based representations seems to characterize the current trend, since these representations complement each other's expressive power (e.g. Aikens, 1983; Johnson, Duran, Hassebrock, Moller & Prietula, 1981). As a representational scheme, frames afford a structure that can be exploited for both organizational and inferential purposes. While METEOR takes only a small and carefully controlled step towards automatically generating procedural forms of knowledge from declarative representations, this ability will, we believe, characterize more flexible knowledge-based systems. For example, the KEE language (Fikes & Kehler, 1985) formalizes several of the ideas we have used to design METEOR's knowledge base, including the representation of production rules in a frame system. From the cognitive science viewpoint, frames or schemes with associated inference rules, and the transition of "what" knowledge into "how" knowledge, have been useful frameworks for modeling the acquisition of skills (Anderson, Green, Kline & Neves, 1981). The transition of knowledge from one representation to another during problem-solving is likely to be an important issue for knowledge-based systems, particularly as they come to include declarative representations of causal knowledge and models.

This research was supported in part by a Natural Sciences and Engineering Research Council of Canada Industrial Research Fellowship to R. Elio while at the Alberta Research Council and by NSERC operating grant 55-47809 to R. Elio while at the University of Alberta. Portions of the work described here were presented at the Ninth International Joint Conference on Artificial Intelligence, August 1985. We thank Peter Dixon for comments on earlier versions of this manuscript. We are especially grateful for the cooperation and growing enthusiasm of Geoff Strong and Chris Sackiw, meteorologists at the Alberta Research Council. Correspondence about this article should be sent to Renée Elio, Department of Computing Science, University of Alberta, Edmonton, Alberta, T6G 2E7.

References

AIKENS, J. S. (1983). Prototypical knowledge for expert systems. Artificial Intelligence, 20, 163-210.

ALLEN, J. F. (1983). Maintaining knowledge about temporal intervals. Communications of the ACM, 26, 832-843.

ANDERSON, J. R., GREEN, J. G., KLINE, P. J. & NEVES, D. (1981). Acquisition of problem-solving skills. In ANDERSON, J. R. Ed., Cognitive Skills and their Acquisition. New Jersey: Erlbaum.

BOBROW, D. G. & WINOGRAD, T. (1977). An overview of KRL, a knowledge representation language. Cognitive Science, 1, 3-46.

DAVIS, R., AUSTIN, H., CARLBOM, I., FRAWLEY, B., PRUCHNIK, P., SNEIDERMAN, R. & GILREATH, J. A. (1981). The Dipmeter advisor: interpretation of geologic signals. Proceedings of the Seventh International Conference on Artificial Intelligence (IJCAI-81), pp. 846-849.

FIKES, R. & KEHLER, T. (1985). The role of frame-based representation in reasoning. Communications of the ACM, 28, 904-920.

FORGY, C. L. (1979). The OPS4 reference manual. Technical Report, Department of Computer Science, Carnegie-Mellon University.

FORGY, C. L. (1982). Rete: A fast algorithm for the many pattern/many object pattern match problem. Artificial Intelligence, 19, 17-37.

GANASCIA, J. G. (1984). Using an expert system in merging qualitative and quantitative data analysis. International Journal of Man-Machine Studies, 20, 319-330.

HAYES, P. J. (1977). On semantic nets, frames and associations. Proceedings of the Fifth International Joint Conference on Artificial Intelligence, (IJCAI-77), pp. 99-107.

JOHNSON, P. E., DURAN, A. S., HASSEBROCK, F., MOLLER, J. & PRIETULA, M. (1981). Expertise and error in diagnostic reasoning. Cognitive Science, 5, 235-283.

KLAHR, P., MCARTHUR, D. & NARAIN, S. (1982). SWIRL: An object-oriented air battle simulator. Proceedings of the National Conference on Artificial Intelligence (AAAI-82), Pittsburgh, PA., pp. 331-334.

MCDERMOTT, D. (1982). A temporal logic for reasoning about processes and plans. Cognitive Science, 6, 101-155.

MILIOS, E. E. & NAWAB, E. H. (1985). Interpretation-guided signal processing via protocol analysis. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Tampa, Florida.

MINSKY, M. (1975). A framework for representing knowledge. In WINSTON, P. Ed., The Psychology of Computer Vision. New York: McGraw-Hill.

NII, P., FEIGENBAUM, E., ANTON, J. & ROCKMORE, A. J. (1982). Signal-to-symbol transformation: HASP/SIAP case study. The AI Magazine, Spring, 23-35.

SCHUBERT, L. K., PAPALASKARIS, M. A. & TAUGHER, J. (1983). Determining type, part, color, and time relationships. IEEE Computer, 16, 53-60.


STRONG, G. S. & WILSON, W. D. (1983). The synoptic index of convection: application to the Fort Collins Hailstorm of 30 July 1979. Paper presented at the American Meteorological Society Conference, Tulsa.

TSOTSOS, J. K. (1984). The role of symbolic processing in the computer evaluation of left ventricular wall motion: the ALVEN system. Department of Computer Science Technical Report, LCM-TR84-1, University of Toronto, May.

VILAIN, M. B. (1982). A system for reasoning about time. Proceedings of the National Conference on Artificial Intelligence, 1982, pp. 197-201.

WEISS, S. M., KULIKOWSKI, C. A., AMAREL, S. & SAFIR, A. (1978). A model-based method for computer-aided medical decision-making. Artificial Intelligence, 11, 145-172.
