Int. J. Man-Machine Studies (1986) 25, 523-547
Representing quantitative and qualitative knowledge in a knowledge-based storm-forecasting system
RENÉE ELIO AND JOHANNES DE HAAN†
Department of Computing Science, University of Alberta and † Computing Department, Alberta Research Council, Alberta, Canada
(Received 30 January 1986 and in revised form 8 October 1986)
METEOR is a rule- and frame-based system for short-term severe storm forecasting. Initial predictions are based on interpretations of contour maps generated by statistical predictors of storm severity. To confirm these predictions, METEOR considers additional quantitative measurements, ongoing meteorological conditions and events, and how the expert forecaster interprets these factors. Meteorological events are derived from interpreting human observations of weather conditions in the forecast area. This task requires a framework that supports inferences about the temporal and spatial features of meteorological activities. To accommodate the large amounts of different types of knowledge characterizing this problem, a number of extensions to the rule and frame representations were developed. These extensions include a view scheme to direct property inheritance through intermingled hierarchies and the automatic generation of production system rules from descriptions stored in frames on an as-needed basis.
Knowledge-based systems are typically concerned with problems that do not have algorithmic solutions. Sometimes, these systems are viewed as an alternative to statistical analysis. However, when algorithms and statistical methods are integral aspects of an expert's problem-solving method, they can also be integral parts of a knowledge-based system. In particular, the combination of numerical or statistical methods within the knowledge-based system framework is important to many applications. This approach characterizes our system, METEOR, which uses a statistical model in conjunction with qualitative data and knowledge to forecast storms.
The knowledge that distinguishes experts from novices on some task includes strategies, heuristics, and conceptual understanding gleaned from experience rather than formal instruction. While these types of knowledge about a problem domain are non-numeric in nature, they are often used to interpret numeric methods. Because sophisticated statistical models and numerical analysis techniques are important problem-solving tools in many real-world problems, it is helpful to characterize the roles these methods play in problem-solving. First, they can serve as preliminary data reduction methods, particularly in data-intensive applications. By data-intensive, we mean problems involving large quantities of data, typically gathered from a host of measurement devices. The complexity of these problems is compounded when data must be processed or interpreted in real time. An expert may rely on a variety of
Address for correspondence: Professor Renée Elio, Department of Computing Science, 338 Assiniboia Hall, Edmonton, Alberta, T6G 2E7.
0020-7373/86/050523+25$03.00/0 © 1986 Academic Press Inc. (London) Limited
algorithmic or quantitative methods to collect, screen, and reduce the data prior to analysis. For domains like meteorology that have large historical databases, statistical modeling and analysis techniques are quite pervasive and useful. For these kinds of applications, what distinguishes an expert practitioner from a novice may include an ability to apply numeric methods selectively or to interpret statistical models with greater accuracy.
For some problems, it is difficult to represent either the data or the analysis of a statistical model with formalisms developed for representing symbolic knowledge. Examples of this kind of data include acoustic signals, oil-well and dipmeter logs, and maps. If an expert reasons about data in these forms, then some representation of them must exist in a knowledge-based system. Knowledge-based systems often employ quantitative methods and numeric representations to extract significant features from data and convert them to some symbolic representation. Some version of this "signal-to-symbol" process (Nii, Feigenbaum, Anton & Rockmore 1982) is needed in many applications. For example, Milios & Nawob (1985) note that people who are skilled at interpreting acoustic signals selectively apply signal processing operators to aid their interpretation. This selection is influenced by the results of previous operators as well as knowledge about the general nature of the signal. Milios and Nawob's system for this task is based on a cycle of representing signal features, selecting operators to refine that representation, and using expert knowledge to formulate plans for subsequent operator selection. Knowledge-based systems in geological applications have similar characteristics. The Dipmeter Advisor uses geological knowledge as well as knowledge about types of dipmeter patterns to recognize particular features in the log patterns as the first step in interpretation (Davis, Austin, Carlbom, Frawley, Prucknik, Sneiderman & Gilreath, 1981). The degree to which symbolic knowledge guides or influences the use of quantitative methods and formalisms varies from application to application. However, the challenge lies in designing systems that integrate both symbolic and non-symbolic information and methods in a coherent knowledge representation framework.†
The problem of forecasting severe storms has many of the features described above. It is a data-intensive problem in which large amounts of quantitative and qualitative information must be analysed in near real time. Forecasting meteorologists often use statistical models as the foundation for a forecast. Our basic premise is that such algorithmic models and methods are just additional tools an expert brings to bear on the problem. In a broader sense, they are another aspect of the problem about which the expert reasons. Therefore, we made the expert's statistical forecasting model, and his knowledge about its interpretation and limitations, an integral part of METEOR. In the remainder of this paper, we will concentrate primarily on those aspects of the problem and the knowledge representation that make METEOR a useful case study on organizing and integrating different types of quantitative and qualitative data and methods in a knowledge-based system framework. Indeed, this has been our main interest and it would be misleading to characterize the current METEOR system as "expert" in terms of its performance. First, we will present some background on the application, noting the problems that influenced our design and development. Then, we will describe aspects of the knowledge representation scheme. Given this framework,
† It has been pointed out to us that some of our observations on integrating quantitative and qualitative knowledge are similar to those recently made by Ganascia (1984).
we will discuss implementation features that clarify the role of the statistical model in METEOR's problem solving. Finally, we will make some general observations on knowledge representation issues for problems of this type.
2. Application background
The Alberta Research Council's Atmospheric Sciences Department has conducted a research program on weather modification and hail suppression for a number of years. During "hail season", an experienced meteorologist and several assistants are responsible for predicting the occurrence, severity, and path of hail storms for this research problem.
The meteorologist begins his task by trying to understand patterns of meteorological activity. He typically consults a large number of maps, both diagnostic and prognostic in nature, generated from meteorological measurements taken at weather stations throughout the continent. These maps provide information such as temperature, humidity, wind direction, and wind speed at several levels of the atmosphere. The foundation of his forecast is a statistical model developed to evaluate potential atmospheric instability, or convective activity (Strong & Wilson, 1983). Very simply, this model (called the Synoptic Index of Convection, or Sc4) combines four predictor variables to evaluate whether the right ingredients (atmospheric instability and moisture) are present in the right amounts to generate severe weather. The model generates a single convective rating (-3 to +5) of the degree of potential convective activity and this is used as a measure of possible storm severity. A higher rating means a more severe storm. For example, -1 means "scattered showers but no thundershowers" while +5 means "hail larger than golfballs". The positive-negative sides of this scale roughly correspond to a hail-no hail distinction.
2.1. EXPERT USE OF THE INDEX: INTERPRETATION OF CONTOUR MAPS
Since the Sc4 index is based on measurements taken at weather stations, there is a computed Sc4 index value for each station in the forecast area. Values for the index are interpolated between stations and the final output is provided in the form of a contour map. An Sc4 map is shown in Fig. 1(a). Each contour is marked with the value of the index.
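As an illustration only, the step from per-station index values to a gridded field that could be contoured might be sketched as follows. The interpolation scheme actually used to produce the Sc4 maps is not described here; inverse-distance weighting is our assumption, and the station coordinates, values, and function names below are hypothetical.

```python
import math


def idw_interpolate(stations, x, y, power=2.0):
    """Estimate an index value at (x, y) by inverse-distance weighting.

    `stations` is a list of (sx, sy, value) tuples. The weighting scheme is
    an assumption made for this sketch, not the method used for Sc4 maps.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # the point coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def index_grid(stations, nx, ny):
    """Interpolate onto an nx-by-ny grid; grid[j][i] holds the value at (i, j)."""
    return [
        [idw_interpolate(stations, float(i), float(j)) for i in range(nx)]
        for j in range(ny)
    ]


# Hypothetical stations: (x, y, computed index value).
stations = [(0.0, 0.0, -1.0), (4.0, 0.0, 3.0), (2.0, 3.0, 5.0)]
grid = index_grid(stations, nx=5, ny=4)
```

A contouring routine would then trace lines of equal value through `grid`; that step is omitted here.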
The index rates the entire day in terms of maximum expected convective intensity. The forecaster must make more fine-grained predictions of where storms will form, what their relative intensity will be, and where they will move. To do this, the forecaster examines a contour map of index values like that given in Fig. 1(a) and interprets it with the aid of additional information. A simple description of this interpretation process goes as follows. The meteorologist, familiar with this model, knows that storms will form not where the maximum index values are, but rather where the index values are changing most rapidly. These areas of strong contour gradient on the map are candidate areas for storm initiation only if they are upwind of the line of maximum Sc4 values. Having located the regions of maximum Sc4, the forecaster considers wind speeds and directions at a particular pressure level. In so doing, some strong contour gradients are eliminated as potential storm initiation areas because of their position with respect to wind direction and the maxima. For the Fig. 1(a) example, winds
FIG. 1. An Sc4 contour map (a) and a surface moisture contour map (b).
coming from the north-east suggest storm initiation areas lie to the left of the maximum index values (e.g. of the contour marked 5) rather than to the right.
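The interpretation strategy just described (strong gradients count as candidates only when they lie upwind of the maximum) can be sketched on a gridded version of the index. This is a simplified illustration and not METEOR's implementation: it assumes a regular grid with at least two rows and two columns, reduces the "line of maximum Sc4 values" to the single largest cell, and treats the wind as one vector giving the direction the wind blows toward. All names and the sample grid are hypothetical.

```python
def gradient_magnitude(grid):
    """Finite-difference gradient magnitude; grid[j][i] is the value at (i, j).

    Central differences are used in the interior and one-sided differences
    at the edges. Assumes at least two rows and two columns.
    """
    ny, nx = len(grid), len(grid[0])
    mag = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            i0, i1 = max(i - 1, 0), min(i + 1, nx - 1)
            j0, j1 = max(j - 1, 0), min(j + 1, ny - 1)
            dx = (grid[j][i1] - grid[j][i0]) / (i1 - i0)
            dy = (grid[j1][i] - grid[j0][i]) / (j1 - j0)
            mag[j][i] = (dx * dx + dy * dy) ** 0.5
    return mag


def initiation_candidates(grid, wind_to, min_gradient):
    """Return (i, j) cells with a strong gradient that lie upwind of the maximum.

    `wind_to` is the (u, v) direction the wind blows toward. A cell is taken
    to be upwind when its offset from the maximum points against that vector.
    """
    ny, nx = len(grid), len(grid[0])
    mag = gradient_magnitude(grid)
    cells = [(j, i) for j in range(ny) for i in range(nx)]
    jmax, imax = max(cells, key=lambda c: grid[c[0]][c[1]])
    u, v = wind_to
    candidates = []
    for j, i in cells:
        upwind = (i - imax) * u + (j - jmax) * v < 0
        if upwind and mag[j][i] >= min_gradient:
            candidates.append((i, j))
    return candidates


# Hypothetical index field: a ridge of high values running north-south.
row = [0.0, 1.0, 4.0, 5.0, 4.0, 1.0, 0.0]
field = [list(row) for _ in range(3)]

# Wind blowing toward the east, so the upwind side is west of the ridge.
west_side = initiation_candidates(field, wind_to=(1.0, 0.0), min_gradient=2.0)
```

In this sample field both flanks of the ridge have equally strong gradients, and the wind test alone decides which flank is kept.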
To refine his predictions of storm initiation and intensity further, the forecaster considers another contour map depicting levels of surface moisture. An example is given in Fig. 1(b). Unlike the Sc4 index, this contour map is a simple plot of surface moisture measurements, not a statistical predictor. This information is important, because moisture "feeds" the storm and will influence both its intensity and its direction of movement. Having delineated some potential storm initiation regions on the Sc4 map, the expert considers the levels of the surface moisture surrounding these candidate regions as indicated by the surface moisture map. The expert uses the relative degrees of nearby surface moisture to refine the storm initiation areas as well as a direction of movement. The grey areas in Fig. 1(a) marked with an "S" indicate the expert's forecast for severe storms; the areas marked with an "M" indicate his forecast for moderate storms.
2.2. PERFORMANCE OF STATISTICAL INDEX
Like any analysis tool, the Sc4 model has undergone considerable development and evaluation of its accuracy in predicting the convective rating for a day. It was shown to be correct within one rating category 66% of the time (i.e. predicting a rating of n when the resulting weather conditions were best described by a rating of n + 1), and within two rating categories 85% of the time. With modifications made by the expert, these accuracy levels rose to 74% and 91%, respectively. These improvements are primarily associated with the "in between cases", i.e. when the model is predicting neither a very high nor a very low amount of convective activity.
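To make the "within one rating category" and "within two rating categories" measures concrete, a minimal sketch of how such a figure could be computed is given below. We assume that "within k categories" means the absolute difference between predicted and observed ratings is at most k. The ratings shown are invented for illustration and are not the evaluation data behind the percentages quoted above.

```python
def within_k_accuracy(predicted, observed, k):
    """Fraction of days on which the predicted rating is within k of the observed one."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty rating lists of equal length")
    hits = sum(1 for p, o in zip(predicted, observed) if abs(p - o) <= k)
    return hits / len(predicted)


# Invented convective ratings on the -3 to +5 scale (not the paper's data).
predicted = [2, 0, -1, 4, 3]
observed = [3, 0, 2, 4, 1]

within_one = within_k_accuracy(predicted, observed, k=1)  # 3 of 5 days
within_two = within_k_accuracy(predicted, observed, k=2)  # 4 of 5 days
```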
2.3. THE ROLE OF THE INDEX IN EXPERT FORECASTING
The Sc4 index is similar to most quantitative approaches to interpretation, in that it implicitly embodies expert knowledge about how measurable meteorological conditions can be related to infer large-scale weather dynamics. In designing a knowledge-based system for this problem, one could "unpack" the knowledge implicitly represented in the quantitative index and represent it symbolically. However, the Sc4 index is an efficient first-pass analysis of large sets of meteorological data. It yields a concise framework for interpreting other kinds of information and is an integral part of expert forecasting. From a knowledge-engineering standpoint, the Sc4 index is the "language" that our meteorologist and his associates use. For them, it represents a useful and informative snapshot of the necessary storm ingredients. Most importantly, the meteorologist does not use the model as a "black box". He understands the factors that can fool the model's predictor variables and compensates for them. This kind of knowledge distinguishes his use of the model from that of less experienced forecasters. Rather than exclude this statistical index or reimplement it in a symbolic form, we concentrated on modeling the expert's interpretation strategies and his knowledge of their limitations. However, interpreting the Sc4 index is only part of the reasoning METEOR does. Therefore, a framework for using the Sc4 index in conjunction with other types of qualitative data and knowledge was necessary. This qualitative knowledge is described in the following section.
2.4. QUALITATIVE DATA AND KNOWLEDGE ABOUT WEATHER CONDITIONS
Experienced forecasters who are familiar with a particular geographical area often have "local knowledge" about how weather forms in the area and its implications for storm development. Data collected at weather stations includes information about current weather conditions. As an example, a weather station report is given in Fig. 2. This report contains a set of meteorological measurements, some of which are used in the Sc4 index. Two portions of this report provide the qualitative information
2006 YEG SA 0600 E50 BKN 90 OVC 15+TRW- 1481161131301019961CB7AC3
LTGIC - CC - CG SW QUAD. SHWRS HVIER N,W PRES UNSTDY 3012
FIG. 2. An example weather-station report.
that METEOR uses. The first, marked (a) in the figure, indicates which typ...