IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, VOL. 23, NO. 2, MARCH/APRIL 1993

A Semiautomated Methodology for Knowledge Elicitation

Claudie Faure, Simonetta Frediani, and Lorenza Saitta

Abstract-A semiautomated methodology for eliciting knowledge from an expert is described. The methodology suggests using a manual interview technique first and then applying a machine learning program to structure the elicited information. In this way the experts are free to express their knowledge in a familiar format; on the other hand, the automated processing helps the knowledge engineer formalize the knowledge. This interaction between manual and automatic techniques is particularly useful when the expert meets with difficulties in formulating general rules and prefers, instead, to reason by cases. The application of the methodology to an image interpretation task, namely speech spectrogram reading, is described as a test case.

I. INTRODUCTION

IN RECENT YEARS great attention has been devoted to the development of methodologies for knowledge acquisition in expert systems in order to cope with increasingly complex applications. On one hand, efforts have been made to identify and systematize principles underlying manual interview techniques [1]-[3], [7]. On the other hand, automated or semiautomated tools have been designed to reduce the dependency upon the knowledge engineer or the expert [4]. Programs allowing experts to create a knowledge base mostly by themselves have been proposed [5], [6], [8], [10], [11], [13], [14], whereas machine learning techniques have been applied to acquire the knowledge directly from the environment [16]-[19].

In this paper we shall present a hybrid methodology for knowledge acquisition, in which classical acquisition techniques and automated learning cooperate toward the same acquisition task. The system QUANTIF, implementing this methodology, helps the knowledge engineer integrate and organize sparse pieces of knowledge given by an expert. The proposed approach stems from the following considerations: on one hand, there are applications in which automated learning is not feasible or suitable; on the other, it may be difficult for the experts to transfer their knowledge in terms close to the format used in the final expert system [37]. As a consequence, the knowledge engineer faces heavy work to paraphrase knowledge, to discover general behaviors and to resolve conflicts. In fact, possible incompletenesses and inconsistencies in the elicited knowledge cannot be naively interpreted as weak performance or irrationality on the part of the experts, who describe mental processes that they are not aware of. Several approaches to modeling human processes (such as problem solving, judgement, decision) have been proposed and discussed [15].

Manuscript received January 26, 1991; revised June 22, 1992. C. Faure is with ENST, Département Signal, 46, Rue Barrault, Paris, France. S. Frediani is with the Politecnico di Torino, Dipartimento di Automatica e Informatica, CENS-CNR, Corso Duca degli Abruzzi 24, 10129 Torino, Italy. L. Saitta is with the Università di Torino, Dipartimento di Informatica, Corso Svizzera 185, 10149 Torino, Italy. IEEE Log Number 9205790.

The proposed methodology proved useful to the knowledge engineers in their job of transforming the elicited knowledge into machine-usable information, while at the same time leaving the experts free to express themselves in the most familiar terms. In this way, the experts can devote greater attention to what they say rather than to how to say it. This aspect is particularly important when the expertise involves introspection on a perceptual activity.

While performing an expert task, knowledge at different levels comes into play, including general problem-solving concepts, task-specific knowledge and domain-specific information [7], [9], [12]. The methodology for knowledge acquisition presented in this paper mainly concerns the task level, which constitutes a trade-off between the high degree of abstractness of the top level and the specificity of the lowest one.

The test case presented in this paper is one of image interpretation, namely speech spectrogram reading. In this application the task-level knowledge involves information about the time evolution of the acoustic and phonetic features and its relation to speech production, in particular phoneme co-articulation.

The proposed methodology is domain-independent. The choice of this specific test case, for exemplification purposes, was motivated by the particularly interesting feedback on the expert’s behavior generated by the interaction with the automated learning procedure.

A project, aimed at building a knowledge-based pattern recognition system, simulating an expert in visual speech spectrogram interpretation, provided the occasion to apply the methodology to this test case. The interpretation process starts with sensorial input data and the final decision arises from perceptual and cognitive processes, during which visual infor- mation treatment and domain-dependent knowledge interact; the experts seemingly have a model-driven strategy when they try to match the input data with an a priori finite set of mental references. Acoustic and phonetic knowledge is necessary to perform the task [26]-[29].

Knowledge elicitation is performed by recording the expert's thinking-aloud sessions. The use of freely produced verbal data, combined with a later refinement cycle, allows an adequate knowledge representation to be built up and the acquisition task to be simplified. A good way to help the expert report the domain-dependent knowledge is to use refinement cycles to progressively replace the interpretation task with a correction task. The information elicited at each cycle is processed by the domain-independent system QUANTIF, which gives as output a set of decision rules.

The paper is organized as follows. Section II describes in more detail the characteristics of the knowledge acquisition task, whereas Section III briefly introduces the system QUANTIF. Section IV illustrates the application of the proposed methodology to the chosen task and Section V reports the results. Finally, Section VI presents some conclusions.

II. KNOWLEDGE ACQUISITION FOR PERCEPTUAL TASKS

As mentioned in Section I, the difficulties inherent in the process of knowledge acquisition (KA) from domain experts are well known, and some approaches have been proposed in the literature to ease this task. Nevertheless, no general methodology exists at present, and none will exist until deeper studies of the different categories of knowledge and of the cognitive processes in which they are involved have been carried out.

One of the tasks for which automated interview tools are available is the classification task based on symbolic input data (see, for instance, [1]), encompassing problems from pattern interpretation to diagnosis. In this task, the relevant knowledge can be formulated and transferred more easily than in other ones, because the decision process mostly happens at the conscious level. Different is the case of tasks involving a great deal of intuition or perceptual processes, which are performed by deeply encoded routines, not available to introspection.

The task considered here belongs to both types. Spectrogram segmentation involves the activation of perceptual routines, direct recognition and a step-by-step conscious reasoning. Image processing combines bottom-up visual mechanisms, not related to a specific knowledge, and top-down processes that constrain the detected image cues to be assigned to problem- dependent symbols. The domain knowledge involved in the task is of interest to our work.

Verbalization is the most direct and widely used method to obtain information about the domain knowledge. For each processed image, the verbal report is a trace of the domain knowledge made active on a particular example. Although the information processing underlying the interpretation task cannot be completely elicited through introspection, there are strong arguments in favor of verbal reporting. The fact that the experts teach other people the way the images are visually inspected and analyzed suggests that at least a part of their knowledge is transferable and learnable. Moreover, we expect that specific knowledge is not deeply encoded; most of the expert's knowledge has not reached a final state, therefore it has to be accessible for modification and learning.

The main characteristics of the proposed knowledge acquisition protocol are the use of freely produced verbal data, the thinking-aloud procedure and the progressive replacement of the interpretation task by a correction task. The data that are obtained during expertise transfer are an indirect observation of the expert's mental processes. Their analysis must include the interaction between the observation protocol and the indirectly observed phenomenon. In actual experimental situations, a complete model quantifying this interaction does not exist. The risk of making erroneous interpretations is minimized when the mutual influence of the observation protocol and the observed process is kept as weak as possible. The protocol for expertise transfer will be defined according to this remark. The experts will not be constrained with fixed representation schemes or a strict ordering of questions, in order to respect as much as possible their natural framework.

The expertise transfer becomes more efficient if a priori knowledge on the mental functioning of the expert in action is available. This helps the elicitation protocol to adapt to the nature of the problem-solving process and it also predicts some limitations. An important property of image interpretation is the nonhomogeneity of the visual information treatment. Analytical and holistic processes are combined during image interpretation; therefore, the symbols processed may arise at different levels of information granularity. At the finest level of granularity, the expert associates a symbol to the pattern basic components. At the opposite, the symbols may correspond to the recognized complex pattern itself when the global appearance of the image constitutes the relevant information for its interpretation. The knowledge base will represent how a set of basic image cues are combined during a decision process. The suitable level of granularity for the symbols corresponds to an image description involving basic image cues. Such a description comes out from analytical processes that are favored by verbalization. Recording the nonverbal stimuli into a verbal form produces a load that may slow down the main task and forces the expert to a more analytical process [7].

A distinction is made between concurrent verbalization, where the subject simultaneously performs the task and pro- duces the verbal data, and a posteriori verbalization, where the subject describes the problem-solving processes that occurred earlier in time. In the case of a posteriori explanation, the verbal data are often reduced to arguments in favor of the final decision and do not incorporate knowledge about selection processes, from which this decision arises. The deletion of irrelevant information and/or the conflict resolutions between several decisions may be ignored. The expected knowledge base must contain all the kinds of information detected on the image that participate to the decision process. Therefore, a posteriori explanation does not constitute the relevant proce- dure to access knowledge about the intermediate stage of the decision process.

The thinking-aloud procedure, where a concurrent verbalization is performed during the interpretation process, is better adapted to transferring knowledge about the on-going process itself. This procedure favors the recruitment of analytical processes and a better awareness of the intermediate stages of the decision. Nevertheless, when the expert is required to think aloud, the distinction between simultaneously given and a posteriori given verbal reports is not so drastic in real expertise transfer conditions. Prototypical patterns can be directly recognized by effortless and spontaneously activated automatic processes. Only the final state is available to consciousness, with no awareness of the intermediate stages. The thinking-aloud procedure can turn into an a posteriori explanation when the final decision pops out from the image.

The interpretation is a very fast process and the flow of information is too great to be completely reported. If the image interpretation task is replaced by a correction task, then the global problem is replaced by a subproblem. Only a part of the specific knowledge needs to be activated. Correction acts as a forced focus of attention to the expert and verbal reporting becomes easier.

The automatization of image interpretation involves the recognition of features that are relevant to the domain, their identification as a symbol of the domain-specific vocabulary, the evaluation of their structural attributes and the reasoning by which the features are combined to make a decision.

The first goal of knowledge acquisition is to learn from the expert which visual cues are the relevant features. It is the vocabulary definition step: a finite set of basic features and the list of their attributes are defined and associated to the corresponding cues in the image. Numerical procedures are then implemented in the pattern recognition system to ensure the best possible match between the features described by the expert and those automatically detected by the system. The knowledge that is used to combine the basic features and that allows the decision to be taken is explicitly represented in the system as a set of rules. This paper is focused on the acquisition of this explicit knowledge.

III. AUTOMATED KNOWLEDGE ACQUISITION

Among the different tasks addressed by machine learning, concept acquisition from examples is mature enough to be applied to real-world problems. A great variety of algorithms and systems exist today (see, e.g., [20]-[25]).

In the present work, learning from examples is used in a peculiar way; in fact, the presented system QUANTIF, starting from a preliminary set of simple rules, extracted manually from the transcript of the expert's verbal reports, generates a reduced set of higher level decision rules, each one corresponding to a generalization of a subset of the original ones. From this perspective, the system QUANTIF can be considered to perform a generalizing reorganization of a knowledge base [33].

A. Input Rules

Initially, a set K1 of decision rules is manually extracted from the verbal reports. These rules have the following structure:

φ → H    (1)

In rule (1), H is one among a set of alternative decisions (classes), defined according to the semantics of the problem, and φ is a formula of a first-order logic language L1.¹ We will say that an example f verifies (is covered by) rule (1) iff the left-hand side (LHS) φ is true of f. Given a set F of examples, EXT(φ, F) will denote the subset of F verifying φ. The expert will label all the examples in F with the correct class they belong to; then, F is partitioned, with respect to each class H, into a set E(H, F) of positive examples and a set C(H, F) of negative examples.

¹As usual, variables occurring in both sides of (1) are universally quantified, whereas variables occurring only in the left-hand side of (1) are existentially quantified.

The language L1 = L1(P, C, Q1) is defined by the three sets P, C, and Q1, which contain the basic predicates, the connectives and the quantifiers, respectively. The set P is domain-dependent, whereas C and Q1 are defined as

C = {∧, ¬},    Q1 = {∀, ∃}.

Disjunction is implicitly represented in the language by allowing several rules, sharing the right-hand side H, to exist.
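As an illustration only (this sketch is not the original QUANTIF code, and it adopts a simplified propositional reading in which an example is just the set of ground atoms true of it), a K1 rule and the coverage set EXT(φ, F) can be represented as follows; all identifiers are hypothetical.

```python
# Minimal sketch (assumed representation): an example is the set of ground
# atoms that hold for it; a K1 rule is a conjunction of atoms plus a class H.
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    lhs: frozenset   # conjunction of atoms, e.g. {"EF1(t)", "EF2(t)"}
    rhs: str         # decision class H

def verifies(example: set, rule: Rule) -> bool:
    """An example f verifies (is covered by) a rule iff its LHS is true of f."""
    return rule.lhs <= example

def ext(phi: frozenset, F: list) -> list:
    """EXT(phi, F): the subset of F whose descriptions satisfy phi."""
    return [f for f in F if phi <= f]

# Hypothetical usage with invented atoms:
r = Rule(lhs=frozenset({"EF1(t)", "EF2(t)"}), rhs="YES")
F = [{"EF1(t)", "EF2(t)", "SSIL(t)"}, {"VTT(t)"}]
print([verifies(f, r) for f in F])   # [True, False]
print(len(ext(r.lhs, F)))            # 1
```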

B. Output Rules

The system QUANTIF transforms the rule set K1 into another set K2, expressed in a different language L2 = L2(P, C, Q2), which shares with L1 the sets P and C. The new set Q2 is defined as

Q2 = {∀, ∃, ATL m, ATM m, EX m}

and the new rules, generated by QUANTIF, have the following format:

q m in An → H    (m ≥ 1, q ∈ {ATL, ATM, EX})    (2)

In (2), An is a set of n atoms and the symbols ATL, ATM, and EX stand for at least, at most, and exactly, respectively. Equation (2) is a nonstandard quantified formula, stating that the conclusion H can be asserted if At Least (At Most, Exactly) m out of the n conditions belonging to An are true.

Rules of the form (2) are often used in medicine [30]-[32] and are well suited to the problem at hand; in fact, they proved to be very helpful in discovering equivalence classes of features, in reducing redundancy while keeping flexibility and in limiting knowledge fragmentation while preserving the information content.
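To make the semantics of the nonstandard quantifiers concrete, the small sketch below (an illustration of ours, using the same simplified propositional reading as above) counts how many of the n conditions of An hold in an example and tests the At Least / At Most / Exactly condition of a rule of the form (2).

```python
# Illustrative semantics of the ATL/ATM/EX quantifiers of L2 (simplified:
# an example is the set of atoms true of it).
def q_holds(quantifier: str, m: int, A_n: set, example: set) -> bool:
    true_count = len(A_n & example)      # how many of the n conditions are true
    if quantifier == "ATL":              # at least m
        return true_count >= m
    if quantifier == "ATM":              # at most m
        return true_count <= m
    if quantifier == "EX":               # exactly m
        return true_count == m
    raise ValueError(quantifier)

# Hypothetical rule "ATL 2 in {EF1(t), EF2(t), EF3(t), EF4(t)} -> H":
example = {"EF1(t)", "EF3(t)", "SSIL(t)"}
print(q_holds("ATL", 2, {"EF1(t)", "EF2(t)", "EF3(t)", "EF4(t)"}, example))  # True
```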

C. Generalization Algorithm

The fundamental ideas underlying the system QUANTIF will be illustrated in this subsection; for the interested reader a complete description can be found in [33].

Firstly, the set of rules K1 is partitioned into subsets K1(m) such that all the rules belonging to K1(m) have exactly m conjuncts in their left-hand sides. The possibility of separately handling these sets greatly reduces the computational complexity of the global process. In the following, let A denote the union of the sets of atoms occurring in any of the rules belonging to K1(m).

The transformation algorithm works in two phases; the first one inspects the rules in K1(m) in order to find out whether they can be re-expressed using the "EX m" quantifier only. Afterwards, it searches for the "ATL m" and "ATM m" constructs. The output of the first phase is a forest, whose leaves are formulas φj occurring as LHSs of rules belonging to K1(m), whereas every internal node is a formula Ψ with the following structure:

Ψ = A1 ∧ A2 ∧ ... ∧ Ak ∧ (EX m1 in CM1) ∧ ... ∧ (EX mr in CMr)    (3)

The Ai's are atoms belonging to A, each CMq is a subset of A, and

k + m1 + ... + mr = m.    (4)

The forest of formulas is built up starting from the leaves. Without entering into details, we can illustrate the basic heuristic used to add a new quantified formula to the current forest. If there exist T = C(N, n) formulas (one for each choice of n conditions out of N) with the following structure:

rj = Bj1 ∧ Bj2 ∧ ... ∧ Bjn → H    (1 ≤ j ≤ T)    (5)

where the Bjs's are n out of N conditions belonging to a subset CN of A, then the quantified formula Ξ = (EX n in CN) can be generated as the LHS of a new rule. Actually, this requirement turns out to be too strong and is very seldom verified in practice. In fact, it is very unlikely that the expert explicitly enumerates all the possible combinations of conditions. In general, if, for instance, the two rules A → H and B → H are given, the rule A ∧ B → H is also assumed, unless the contrary is explicitly stated. Then, typically, only some of the rules (5) actually occur in K1(m). A heuristic procedure decides whether the missing rules can be hypothesized and the global quantified formula accepted or not. Missing rules added by this procedure are marked deduced.
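One possible operational reading of this heuristic is sketched below; the threshold-based acceptance criterion and the data layout are assumptions of ours, not the actual criteria of [33]. The function counts how many of the C(N, n) possible n-atom combinations over CN occur among the observed LHSs and, if the fraction is high enough, hypothesizes the missing ones, which would be marked deduced.

```python
# Sketch of the "EX n in CN" formation heuristic (assumed threshold-based
# acceptance; the real heuristic criteria in [33] may differ).
from itertools import combinations

def propose_ex_quantifier(observed_lhs, CN, n, accept_ratio=0.5):
    """observed_lhs: set of frozensets, the n-atom LHSs found in K1(m).
    Returns (accepted, deduced), where deduced are the hypothesized missing LHSs."""
    all_combos = {frozenset(c) for c in combinations(sorted(CN), n)}
    observed = {lhs for lhs in observed_lhs if lhs in all_combos}
    missing = all_combos - observed
    accepted = len(observed) / len(all_combos) >= accept_ratio
    return accepted, (missing if accepted else set())

# Hypothetical usage: the expert gave two of the C(3, 2) = 3 combinations.
CN = {"A", "B", "C"}
given = {frozenset({"A", "B"}), frozenset({"A", "C"})}
ok, deduced = propose_ex_quantifier(given, CN, 2)
print(ok, deduced)   # True {frozenset({'B', 'C'})}
```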

When the first processing phase halts, the nodes of the generated forest are candidates to be LHSs of rules replacing subsets of rules in K1(m). Usually, the number of nodes is quite large, so that some heuristics have to be used to select the most suitable ones [33]. The new sets K2(m) are generated starting from m = 1. Finally, the rules currently contained in K2(m) are inspected to see whether groups of them can collapse into one rule containing "ATL m" or "ATM m" instead of "EX m." The final set K2 is selected according to an equivalence criterion stating that the new rules should cover at least as many positive examples and no more negative examples than the old rules in K1.
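The equivalence criterion itself can be stated operationally; a minimal sketch follows, assuming (as in the earlier sketches, and purely for illustration) that each rule is represented as a predicate over an example description.

```python
# Sketch of the equivalence criterion for accepting K2 in place of K1:
# the new rules must cover at least as many positive examples and no more
# negative examples than the old ones.
def covered(rules, examples):
    """Indices of the examples covered by at least one rule (rule: example -> bool)."""
    return {i for i, ex in enumerate(examples) if any(r(ex) for r in rules)}

def equivalent(old_rules, new_rules, positives, negatives) -> bool:
    return (len(covered(new_rules, positives)) >= len(covered(old_rules, positives))
            and len(covered(new_rules, negatives)) <= len(covered(old_rules, negatives)))
```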

The system QUANTIF is domain-independent; when changing applications, only the heuristic criteria may need to be modified. In order to facilitate modification, these heuristics are given to the system in the form of a body of declarative knowledge.

IV. A CASE STUDY

In this section an example of use of the automated system QUANTIF, coupled with protocol analysis, will be described in some detail, and the benefits obtained in the elicitation process will be discussed.

A. Domain and Purposes

As mentioned in Section II, the knowledge elicitation was aimed at building a knowledge base containing the specific

knowledge that an expert uses to interpret speech spectrogram images. These images result from a spectral analysis of speech signals: the image is a representation of the spoken word in the time/frequency space. The expert interpretation process is based on the knowledge of the links between the production of speech and the resulting acoustic evidence. Speech spectrogram interpretation associates a sequence of phonemically labelled segments to the processed image. As an example, the spectrogram of the French word "candidat," corresponding to the phoneme sequence "kãdida," is reported in Fig. 1.

Fig. 1. Example of a speech spectrogram, corresponding to the French word "candidat" (candidate). The phonemic transcription of the word, i.e., "kãdida," is reported above the spectrogram.

Most often, segmentation and segment identification are merged into a single task by the skillful experts, because they are performed on the same features. Fig. 2 shows some of these features with their associated names. The differences between the two tasks arise when the features’ attributes are defined: the segmentation emphasizes the detection and location of some changes in the features, while identification is made by a structural analysis based on feature properties such as their location and intensity, their shape and the size of the image area they cover.

The interest in acquiring and representing the expert knowledge that allows speech spectrogram reading is mainly based on the fact that there exist experts who are better than automatic systems for acoustic or phonetic decomposition. The integration of this knowledge into a speech recognition system would increase its performance.

At the time we started this study, an expert system was available [34], [35]. This system identifies a sequence of phonemes from a sequence of manually described segments. Our purpose was to feed this expert system automatically, through a segmentation of the spectrogram image and a description of the obtained segments.

Segmentation may rely on the occurrence of local changes in the features or may result from the direct recognition and location of phonemes. As we said in Section II, the knowledge acquisition methodology will help elucidate the analytical processes of the expert and will let the segmentation appear as a recognition of structural changes that occur in the features. The limit between two phonemes can be precisely located at

one instant or can have a certain duration. Different location candidates can also be present, but only one will be chosen for the final segmentation. The knowledge elicitation is not concerned with how the features are detected by the expert, because this stage involves perceptual routines that are not consciously activated. The sought information concerns the combination of detected features from which the final segmentation arises. This information is obtained by asking the expert to think aloud during a segmentation task in order to access the knowledge in action [35].

Fig. 2. Spectrogram of the French word "tracasserie" (phonemic transcription "t R a k a s R i"), in which some of the relevant features are marked. In particular, F denotes a formant, N a noise region, S a silence and V the voicebar.

B. Collecting Verbal Data

Preliminary discussions with the expert were aimed at defining, at least in part, the set of vocabulary terms describing speech spectrograms. The recording of the expert's verbalization during a segmentation task starts with examples that she considered easy to interpret; the difficulty presented by the processed data is increased during the interview. The expert is asked to think aloud when processing the images, and the verbal report is recorded.

This report will have three stages of representation: the first one is the faithful transcription of the recording; the second one is the result of a manual text analysis, in which the descriptions of each segmentation between two phonemes (the left and the right ones) are collected into a frame called BOUNDARY, shown in Fig. 3; the third representation is a set of formulas of the logic language L1 introduced in Section III-A. The automatic step of rule acquisition starts with this third representation.

A look at the frame BOUNDARY will help one to understand what kind of information is given by the expert. The frame, which is filled by the knowledge engineer, is the result of an analysis of the rough verbal data. Each frame represents the description of a segmentation between the LEFT phoneme that ends and the RIGHT phoneme that starts. The expert description contains the features and events perceived, the final decision and (possibly) the events' combination rules used to reach this final decision.


Fig. 3. Example of the frame BOUNDARY, containing an intermediate representation of the description of a segmentation act, given by the expert. The features whose END (START) is perceived in the LEFT (RIGHT) speech segment are reported. Moreover, continuity or gradual variations (INCREASE or DECREASE) in the feature intensity are reported in the slots with the corresponding names. Finally, the slot RULES contains the decision rules (if any) explicitly mentioned by the expert as having been used to support the decision. This last can be either a precise boundary (PREC-BOUND) or an imprecise one (ZONE) occurring in the time interval (ti, tf). The slots labelled by "p" optionally contain a subjective reliability evaluation of the expert's own decision.

The relevant features that can be recognized on a speech spectrogram image are explicitly listed in the frame BOUND- ARY: Formants (FORM), Voicebar (VOICE), Silence (SIL) and Noise (NOISE). More than the features themselves, their changes are important, because feature changes are the events on which the expert bases her reasoning.

An event is a local property of a feature (or a set of features) and its complete description is given by the property, the feature and the time at which it is visible. Only two basic properties are necessary to describe our set of examples: START and END.

The events are linked through the time variable, because they can happen at the same time or not, according to the content of the slot CONDITIONS.

When the decisions are supported by the expert with explicit rules, these are reported in the slot RULES. These rules do not have to be confused with the rules that the generalization algorithm will process. These last derive from a global anal- ysis of the pieces of information contained in BOUNDARY (including the content of the slot RULES), as described later.

If the expert was able to determine a precise boundary, the corresponding time instant is placed in the PREC-BOUND slot; if not, the expert gives two boundaries, delimiting a segmentation zone by two instants or a tolerance of some milliseconds around an instant. These two limits (or tolerance)

are put into the ZONE slot. There may be more than one decision, when the expert considers more than one alternative as possible.

The INTERPRETATION slot contains the segment's identification as a member of a phonemic class; however, its content is not used for the segmentation process.

Fig. 4. Example of a precise segmentation at tα. (a) Part of the speech spectrogram. (b) The detected events are represented in a schematic way.

In the example shown in Fig. 3, the expert reported² the end of the first three formants and of the voicebar at the same time instant (t1 = t2 = t3 = tV). However, a complete voice termination (VTT) is typically reported to occur later (tT > tV). Moreover, a high frequency noise region ends simultaneously with the formants (tE = t1), leaving room for a silence interval (tS ≈ t1), and then starts again roughly at the same time as the VTT occurs (tB ≈ tT). The segmentation corresponding to the frame of Fig. 3 is reported in Fig. 4. As said before, the expert describes two boundary candidates at times tα = t1 = t2 = t3 = tV = tS = tE and tβ = tT = tB. The decision is a precise boundary at tα.

During verbal reporting, the expert sometimes mentions "metarules," exploited to achieve the decision: these metarules do not involve information about the processed data, but are rather general laws. Nevertheless, the influence of their verbalization on the decision seems to be valuable for discussion. The metarules are reported in the following, whereas their influence will be discussed in Section VI.

MR1: If there exists a time instant where events occur both on the left and on the right segment, then this instant is preferred for a precise segmentation.

MR2: Low frequency information is more important than high frequency information.

MR3: If a set of events is sufficient to take a decision, then other existing events are ignored.

MR4: If several instants are competing for a precise segmentation, the instant where the maximum number of events occurs is chosen.

MR5: (This rule mentions features but it is, nevertheless, considered as a metarule, for it mainly concerns the vocabulary definition)

MR5-a: Voicing starts (finishes) where the energy in low frequency starts (finishes) to be regular.

MR5-b: A short silence following a high or medium frequency noise is considered as part of the noise.

The MR1 through MR4 metarules were spontaneously generated by the expert. They are general rules supposedly controlling the interpretation process; however, they are not a consistent set, as the application of all of them may lead to contradictory decisions. Rules MR1, MR2 and MR4 contain strategic knowledge, i.e., criteria used by the expert to reach her conclusion; they are not of immediate interest for our knowledge acquisition task. On the contrary, metarules MR3 and MR5 do influence the verbalization process: rule MR3 has the effect that the expert does not report some existing events, thus making the elicited knowledge incomplete and also biasing the further automatic processing. The influence of metarule MR5 is more subtle and has been the root of inconsistencies in the first acquisition cycle. It refers to the flexibility with which a human expert switches between considering a feature as a whole and taking into account its component parts; in an artificial system this originates some confusion that has to be removed.

²Figs. 3 and 4 refer to the second interview with the expert (see later). During the first interview this case had been classified as a zone.

C. From Verbal Data to Rules

In order to automatically handle the information gathered from the expert, it is necessary to encode the content represented in the frame BOUNDARY into formulas of the language L1. Each predicate p(t) of the set P has the following meaning:

p(t) = "The event p occurs at time t."

The defined predicates are reported in Table I. Besides the previously defined predicates, two others, describing relations among time instants, have been defined:

SAME(t1, t2, ..., tn) = "Time instants t1, t2, ..., tn coincide"
BEFORE(t1, t2) = "Time t1 precedes time t2."

One more predicate completes the description language:

EX m [CAND(t)] = "There are exactly m candidates for boundary."

This predicate states that there are exactly m time instants sufficiently close to each other to be candidates for the same segmentation point or range. The variable t is bound to the candidate time instants.

In order to understand how the frame BOUNDARY is converted into a logic formula, let us consider Figs. 3 and 4 again. By analyzing the relations among the various instants mentioned, we notice that only two of them are different:

tα = t1 = t2 = t3 = tV = tS = tE,    tβ = tT ≈ tB.


TABLE I
PREDICATES DEFINED TO DESCRIBE RELEVANT EVENTS IN THE SPEECH SPECTROGRAM IN BOTH LANGUAGES L1 AND L2

Predicate                                    Meaning
START(t), END(t)                             The start (end) of some feature is detected at time t.
SFORM(t), EFORM(t)                           One or more formants start (end) at time t.
SFj(t), EFj(t)  (j ∈ {1, 2, 3, 4})           Formant Fj starts (ends) at time t.
SVOIC(t), EVOIC(t)                           Voicing starts (ends) to be regular at time t.
VOT(t)                                       Voice onset time is t.
VTT(t)                                       Voice termination time is t.
SN(t), EN(t)                                 Noise starts (ends) at t.
SαFN(t), EαFN(t)  (α ∈ {H, M, L})            Noise in the {high, medium, low} frequency range starts (ends) at time t.
SSIL(t), ESIL(t)                             Silence starts (ends) at time t.

This allows the predicate EX 2 [CAND(t)] to be asserted (t = tα and t = tβ) as well as BEFORE(tα, tβ). Then, considering the content of the slots referring to feature events, the following predicates can also be asserted: EF1(tα), EF2(tα), EF3(tα), EVOIC(tα), EHFN(tα), SSIL(tα), VTT(tβ) and SHFN(tβ). According to the above analysis, the segmentation case corresponding to the considered frame (Segm #12) can be described as

Segm #12 = EX 2 [CAND(t)] ∧ BEFORE(tα, tβ) ∧ EF1(tα) ∧ EF2(tα) ∧ EF3(tα) ∧ EVOIC(tα) ∧ EHFN(tα) ∧ SSIL(tα) ∧ VTT(tβ) ∧ SHFN(tβ).

The expert decision about this case has been one of a precise boundary at tα, reported in the PREC-BOUND slot. The content of the RULES slot tells us that the expert used metarule MR3 (just considering the end of the three formants); however, metarule MR4 has also been implicitly used to arrive at the conclusion (at tα most of the detected events occur). In Table II, some other segmentation cases are described, for the sake of exemplification.
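Purely as an illustration of this encoding step (the data layout and function below are ours, not the format actually used by QUANTIF), a frame-like list of detected events can be turned into the ground atoms of a case description such as Segm #12, once coincident instants have been merged:

```python
# Illustrative encoding of a BOUNDARY-like description into ground atoms
# (hypothetical input layout: coincident events share the same time value).
def encode_case(events):
    """events: list of (predicate_name, time) pairs, e.g. ("EF1", 100)."""
    times = sorted({t for _, t in events})
    labels = {t: f"t{i + 1}" for i, t in enumerate(times)}
    atoms = [f"EX {len(times)} [CAND(t)]"]
    for earlier, later in zip(times, times[1:]):
        atoms.append(f"BEFORE({labels[earlier]}, {labels[later]})")
    atoms += [f"{pred}({labels[t]})" for pred, t in events]
    return atoms

# A Segm #12-like case: all events at 100 except VTT and SHFN at 160.
events = [("EF1", 100), ("EF2", 100), ("EF3", 100), ("EVOIC", 100),
          ("EHFN", 100), ("SSIL", 100), ("VTT", 160), ("SHFN", 160)]
print(encode_case(events))
```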

As mentioned previously, the expert describes the events occurring in one or many temporal locations for each segmentation. If the description concerns only one instant, then the decision is a precise boundary. If several instants are involved, then the decision can be a choice of one instant for precise segmentation or a zone between two instants. Whereas Fig. 4 shows an example of a precise segmentation, Fig. 5 reports one of a zone existing between the time instants tα and tβ.

In order to build up a knowledge-based system automatically performing speech segmentation, the most immediate way would be to consider each description given by the expert as the condition of a rule "Condition → Decision," where "Decision" is the expert's preferred decision for that case. However, it is quite obvious that the obtained rules would be at a too low level of abstraction, containing too many details and redundancies and providing a poor prediction power.

TABLE II
LOGICAL FORMULAS DESCRIBING SOME OF THE SEGMENTATION CASES

Segm #  Description and English Paraphrase
2       EX 1 [CAND(t)] ∧ EF1(t) ∧ EF2(t) ∧ EF3(t) ∧ EF4(t)
        (There is one candidate t, where the four formants end.)
7       EX 3 [CAND(t)] ∧ EF3(t1) ∧ EF4(t1) ∧ EF1(t2) ∧ EF2(t2) ∧ VTT(t3) ∧ SSIL(t3) ∧ BEFORE(t1, t2) ∧ BEFORE(t2, t3)
        (There are three candidates t1 ≤ t2 ≤ t3; the third and fourth formants end at t1, the first and second formants end at t2. The VTT is at t3 and a silence starts at t3.)
15      EX 2 [CAND(t)] ∧ EF2(t1) ∧ EF4(t1) ∧ EF3(t2) ∧ BEFORE(t1, t2)
        (There are two candidates t1 ≤ t2; the second and fourth formants end in t1, whereas the third one ends in t2.)
39      EX 1 [CAND(t)] ∧ SF1(t) ∧ SF2(t) ∧ SF3(t) ∧ SVOIC(t) ∧ EHFN(t)
        (There is one candidate t, where the first, second and third formants, as well as the voicebar, start; the high frequency noise ends in t.)

V. BUILDING THE KNOWLEDGE BASE

Another solution could be to use the metarules given by the expert; but these ones turn out to be too abstract and

their application, as already said in Section IV-B, generates inconsistencies. A trade-off between the two levels has been reached by automatically generalizing the case descriptions by means of the learning algorithm QUANTIF.

The algorithm requires that a decision strategy and a set of suitable decision classes be chosen. Two alternative definitions have been investigated. The idea underlying the first one is to consider each boundary candidate in isolation, to evaluate its suitability for being chosen as a precise boundary and, then, to compare the candidate evaluations to take the final decision. This will be called the two-steps decision procedure. The advantage of this approach is the possibility of handling the case of "no boundary," i.e., the case in which candidates have been taken into consideration in a part of the spectrogram where no actual boundary between two phonemes exists. These cases have not been considered so far in the training set, at the expert's explicit request, but they can occur in practice, and the automated system shall face both a "nonboundary/boundary" decision and a "precise segmentation/zone" decision. The second approach considers each case of segmentation as a whole and labels it directly as precise segmentation or zone, according to the expert's indication. This is referred to as the one-step decision procedure and has the advantage of simplicity.

For both approaches, the same set of spectrograms, con- taining 42 cases of precise segmentations and 12 cases of zones, has been used as training set. An independent set of 47 segmentations has been used to test the obtained knowledge base.

A. Two-steps Decision Procedure

By considering each single candidate time instant, two classes of decisions have been defined, namely YES and NO. The YES class contains all those time instants that have eventually been chosen as precise boundaries, whereas the NO class contains all those time instants that have not been chosen as precise boundaries. Then, an example of the NO class corresponds to a time instant that either was not chosen as the precise segmentation location (and was simply discarded) or became part of a zone. For example, in Fig. 4, time tα belongs to the YES class and time tβ to the NO class; in Fig. 5, both tα and tβ belong to the NO class.


Fig. 5. Example of a "zone between tα and tβ" decision. (a) Part of the spectrogram. (b) Schematic representation of the detected events (end of the second formant at tα and end of the third and fourth formants at tβ).

Each candidate time t has its logical description associated with it, as well as its correct label. For instance, for the time instants tα and tβ in Fig. 4 we have

EF1(tα) ∧ EF2(tα) ∧ EF3(tα) ∧ EVOIC(tα) ∧ EHFN(tα) ∧ SSIL(tα) → YES(tα)
VTT(tβ) ∧ SHFN(tβ) → NO(tβ).    (6)

Each description of a time instant, such as the LHSs of rules (6), is an element of the rule set K1. By applying the algorithm QUANTIF to this set, a new rule set K2 has been learned. Examples of the rules are reported in Table 111.

By applying the rules in Table III to the learning examples again, some systematic inconsistencies between the segmentations generated by the rules and those generated by the expert have been noticed. These inconsistencies mostly arose because of metarule MR5, when the expert interchangeably used either the name of a feature or that of a component of the same feature; this flexibility, perfectly normal for a human, could not be managed by the artificial segmentation system. More precisely, only a few strong contradictions were noticed; we call "strong contradiction" a case where the expert decided for a precise segmentation and the system for a zone, or vice versa. Differences arise also when the automatic system did not choose any boundary among a set of candidates, whereas the expert eliminated all but one candidate. Then the expert, presented with the results, was asked to adopt a uniform convention for naming the features.

The major consequences were updating the decision classes and changing the meaning of some vocabulary terms. The segmentation zones were the main source of problems. Most of the zones are imprecise images, in which the precise location of the events is impossible, forcing one to introduce meaningless refinements in the locations. The expert then decided to eliminate them by merging very close instants or by renaming as a precise segmentation a previous decision for a zone, endorsing the fuzziness of the visual cues. Events occurring almost at the same time are now seen as simultaneous.

TABLE I11 EXAMPLES OF RULES FOR THE YES AND NO CLASSES

LEARNED BY THE ALGORITHM QUANTIF

Class Name Rule and English Paraphrase YES r: ATL 2 in {EFl( t ) , EFZ(t), EF3(t). EF4(t)}

YES r: ATL2 IN {SFl( t ) . SF2(t), SF3(t). SF4(t), (At least two formants ends in t )

SVOIC(t)} (At least two among the formants or the voicebar start in t )

NO S; ATM 1 in {SFl( t ) . SF2(t), SF3(t). SF4(t). SVOIC(t). SHFN(t)}A (-END( t ) ) (No more than one among the four formants, the high frequency noise and voicebar start in t; moreover nothing ends in t )

NO .s$ ATM 2 in {EFl( t ) . EF2(t), EF3(t), EF4(t)} (At most two formants end in t )

NO s: EVTT(t) (The voice termination ends in t )

Therefore, the number of candidate instants decreased and the number of precise segmentations increased.

This confrontation also showed that different mappings between the vocabulary terms and the image cues were used for describing the images. New predicates and new meanings of predicates are introduced to define precise conventions on the mapping. This updating concerns composite features like the voicing or the affricate noise. The voicing may be composed of a homogeneous and regular part preceded (followed) by an irregular part called the voice onset time (VOT) (VTT for voice termination time). In the first descriptions, the terms "voicing" and VTT were used. The meaning on the image of the term "end of voicing" could change during the verbal reporting, denoting either the end of the regular part or the complete stop of voicing. The schematically depicted situation in Fig. 4(b) shows the two possible locations for the "end of voicing" event, tα or tβ; the VTT was always located in tβ. This illustrates that a component of a composite pattern can be named by the expert with the symbol of the component or the symbol of the whole pattern. The case of the affricate noise is a typical example in which two events are close enough to be said simultaneous. This feature may appear as a noise followed by a small silence, as shown in Fig. 6. The first verbal report contains the description of the end of noise at time tα and of the end of silence at time tβ. Afterwards, the short silence between tα and tβ is no longer considered as an independent feature.

After modifying the descriptions and the labelling of the time instants according to the expert new suggestions, the two- steps process was repeated; now, the YES and NO classes contain 42 and 40 examples, respectively. Some of the newly learned rules are reported in Table IV.

As said before, the rules belonging to the set R (YES class) and S (NO class) only allow a local decision about a single time instant to be taken and, hence, they are not sufficient to reach a final decision about the nature of the boundary; in fact, the case of no boundary and the case of a zone are not distinguishable. To overcome this difficulty, a second step of automated learning has been performed.

Fig. 6. Example of the composite feature "noise," which may or may not include the small silence between tα and tβ. In the first case, the noise is said to end at tβ, hence tα ≈ tβ. (a) Part of the spectrogram. (b) Schematic representation of the events.

TABLE IV
EXAMPLES OF LEARNED RULES FOR THE YES AND NO CLASSES, AFTER PROBLEM UPDATING

Class  Name  Rule and English Paraphrase
YES    r1    ATL 3 in {EF1(t), EF2(t), EF3(t), EF4(t), EVOIC(t)}
             (At least three features, chosen among the formants or the voicebar, end in t)
YES    r3    EVOIC(t) ∨ SVOIC(t) ∨ SHFN(t)
             (End or start of voicebar or end of high frequency noise occur at t)
NO     s1    ATM 1 in {EF1(t), EF2(t), EF3(t), EF4(t), EHFN(t), VTT(t), EVOIC(t)} ∧ ¬START(t)
             (No start event occurs at t and the end of no more than one among the formants, the high frequency noise, the VTT or the voicebar occurs at t)
NO     s4    ATM 2 in {SF1(t), SF2(t), SF3(t), SF4(t)}
             (No more than two formants start at t)

TABLE V
TWO RULES FOR THE SECOND STEP OF THE TWO-STEPS DECISION PROCEDURE

Class    Name  Rule and English Paraphrase
PRECISE  q1    EX 1 [CAND(t)] ∧ EX n in R(t) ∧ EX m in S(t) ∧ Greater(n, m) → PRECISE(t)
               (If there is only one candidate and strictly more YES rules than NO rules fire in t, then t is a PRECISE boundary.)
ZONE     z2    ATL 2 [CAND(t)] ∧ EVOIC(t1) ∧ SHFN(t2) ∧ BEFORE(t1, t2) → ZONE(t1, t2)
               (If there are two candidate times t1 ≤ t2, then (t1, t2) is a ZONE if the voicebar ends in t1 and the high frequency noise starts in t2.)
Note: The symbol R(t) denotes the set of rules belonging to R, evaluated at time t.

Now, groups of time candidates for the same segmentation are supplied jointly to QUANTIF, as single examples, labelled either as ZONE(tα, tβ) or PRECISE(tα). The

descriptions of the new examples are the rules for YES and NO. As an example, let us consider Fig. 4:

Analogously, by considering the ZONE case reported in Fig. 5, we obtain

In (7) and (8) the predicate ρ(t), where ρ is the name of a rule, denotes that the rule is true of t. Running QUANTIF, two sets of rules, Q and Z, are obtained for the second step of the two-steps decision procedure. Q refers to the PRECISE class and Z to the ZONE class. When no rule in the set Q ∪ Z applies, then there is a case of "no-segmentation." One rule per class is reported in Table V, for the sake of exemplification.
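A sketch of this second step may help to fix the idea; the representation below is an assumption of ours (each first-step rule is a predicate over the atoms true at a candidate instant), and the function names are hypothetical. The description of a grouped example is the set of meta-level atoms "rule ri (sj) is true of t," and a q1-like test compares how many YES and NO rules fire on a single candidate.

```python
# Sketch of the second step of the two-steps procedure (assumed layout).
# R: YES rules, S: NO rules; each rule is a predicate over the atoms true at t.
def meta_description(candidates, R, S):
    """candidates: dict mapping a candidate label to the set of atoms true at it.
    Returns the meta-level atoms r_i(t) / s_j(t) describing the grouped example."""
    atoms = []
    for t, facts in candidates.items():
        atoms += [f"r{i}({t})" for i, r in enumerate(R, 1) if r(facts)]
        atoms += [f"s{j}({t})" for j, s in enumerate(S, 1) if s(facts)]
    return atoms

def q1_precise(candidates, R, S) -> bool:
    """q1-like test: a single candidate on which strictly more YES than NO rules fire."""
    if len(candidates) != 1:
        return False
    (facts,) = candidates.values()
    return sum(r(facts) for r in R) > sum(s(facts) for s in S)
```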

When the resulting rules were applied to the training examples, five errors resulted; in particular, four cases of precise segmentation where the expert decided for a zone, and one case of a zone where the expert preferred a precise segmentation.

TABLE VI
EXAMPLES OF LEARNED RULES FOR DISCRIMINATING BETWEEN A PRECISE BOUNDARY AND A ZONE IN THE ONE-STEP DECISION PROCEDURE

Class         Name  Rule and English Paraphrase
PRECISE(t)    h2    ATL 3 in {EF1(t), EF2(t), EF3(t), EF4(t), EVOIC(t), EHFN(t)}
                    [At least three among the formants, the voicebar and the high frequency noise end in t]
ZONE(t1, t2)  o2    ATL 2 [CAND(t)] ∧ ATM 2 in {SF1(t1), SF2(t1), SF3(t1), SF4(t1), SVOIC(t1)} ∧ ATM 2 in {SF1(t2), SF2(t2), SF3(t2), SF4(t2), SVOIC(t2)}
                    [There are at least two candidate times t1 and t2, such that in both t1 and t2 no more than 2 among the formants and the voicebar start]

B. One-Step Decision Procedure

For the one-step procedure, the distinction between a precise boundary and a zone is directly made on each segmentation case. Then, the class names PRECISE and ZONE are directly given as labels to the examples. In this approach, the set of time instants {tα, tβ} appearing in Fig. 4 is one example of the PRECISE class, whereas the set {tα, tβ} appearing in Fig. 5 is one example of the ZONE class. The PRECISE class contains 42 examples and the ZONE class contains 12 examples. By applying the same algorithm as before, the set of rules H, for PRECISE, and the set O, for ZONE, are obtained (see examples in Table VI). If both PRECISE and ZONE rules fire on the same case, the following strategy is applied for classification:

"If the number of PRECISE rules activated on t is different from 0 and greater than that of the ZONE rules, then t is a precise boundary, otherwise t is a zone. If no rule of either kind is activated, then there is no segmentation."
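This strategy is straightforward to express in code. The following minimal sketch assumes, for illustration only, that the rule sets H and O are given as lists of predicates over a case description.

```python
# Sketch of the one-step classification strategy (assumed representation:
# each learned rule is a predicate over the description of a segmentation case).
def classify_one_step(case, precise_rules, zone_rules) -> str:
    n_precise = sum(h(case) for h in precise_rules)
    n_zone = sum(o(case) for o in zone_rules)
    if n_precise == 0 and n_zone == 0:
        return "NO-SEGMENTATION"
    return "PRECISE" if n_precise > n_zone else "ZONE"
```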

By using the rules reported on Table VI with the above strategy, there was a total of two errors on the training set; precisely, the rules gave two cases of segmentation where the expert wanted a zone.

VI. CONCLUSION

As mentioned in Section V-A, a consequence of vocabulary updating is that some decisions have been changed. Some changes are easily explained by the merge of close instants, but others are not. These changes can be explained if the metarules and their influence on the descriptions and on the decisions are considered. The metarule MR1 was spontaneously reported several times by the expert during the first interview. On many examples, an event of the left segment is simultaneous with an event of the right segment. When MR1 is applied, the MR3 metarule influences the expert to ignore other events that can be present on the image. MR1 has a strong influence on the decisions but also competes with other metarules.

Let us consider, for instance, the case of Fig. 4. During the first interview, this case was classified as a zone. At time tβ an event occurs on the left segment (VTT) and another occurs on the right segment (start of noise), so MR1 can be applied. According to MR1, the decision would be a precise boundary at tβ. Several events, which are also relevant for segmentation, occur in the close neighborhood of tβ, at time tα. The choice of tβ would be contradictory with the visual evidence and with MR4. According to MR4, the decision would be a boundary at tα. The choice of tα would be a too obvious counter-example of MR1. The expert decides on a zone. An imprecise segmentation is the decision that makes the best compromise.

In the second interview, MR1 is no longer the most influential metarule: the decisions most often agree with MR4. Another argument in favor of the dominance of MR4 in the second interview is the verbalization of new events, which had not been verbalized before despite their high visual prominence. Their addition favors the application of MR4. The decision for the case of Fig. 4 becomes a boundary at tα, and the events detected on the left segment are explicitly reported.

Our knowledge acquisition methodology is based on example descriptions, and it is thus sensitive to the presentation order of the examples. As shown in [36], the influence of the recent context is one source of variability of descriptions during a categorization task. These data are good indicators that, without a consistent domain theory, there exists an influence of the temporal context on the verbal report. The expert builds metarules during the interpretation process. Once a metarule has been verbalized, the expert tries to use it. The duration of the metarule influence varies with the importance given to the metarule and with the interpreted data: when a decision is taken according to a metarule, its influence is reinforced; when no data match the metarule, it remains unused and can be forgotten.

Whatever the task, a general strategy is the search for a minimum effort strategy. Metarules like MR1, MR3, MR4 generalize example descriptions. Several examples are seen as exemplars of the same category. Description and decision tasks are then facilitated by the categorization: they are performed at the category level and not at the exemplar level. Since the metarules are consistent only with the most recent temporal context of examples, they may introduce inconsistency and incompleteness in verbal reports.

A further test has been performed on new spectrograms by a novice human expert, who is able to recognize the features but has problems in discriminating relevant events and poor performance in segmentation. The one-step and two-steps sets of rules are activated whenever events are perceived in the image: the instants where they happen are candidate boundaries for segmentation (one or many instants for each segmentation). Eight spectrograms, corresponding to

47 segmentations, have been described and the results are reported in Table VII. The segmentations are divided into four sets: A, for which the expert and the two automatic procedures are in accordance; B (C), for which only the two-steps (one-step) decision procedure agrees with the expert; and D, for which no agreement exists. From the previous sets, four cases have been excluded: for these cases, the expert considered acceptable the decision obtained from the rules, even if she herself preferred a different one.

TABLE VII
RESULTS OF THE SEGMENTATION EXPERIMENT ON NEW SPECTROGRAMS

             A     B     C               D
Agreement    29    4     5               5
                         (3 beginning)   (2 beginning)
Acceptable         3                     1

Note: A: The expert's and the two automatic procedures' decisions agree. B: Only the two-steps procedure agrees with the expert. C: Only the one-step procedure agrees with the expert. D: Both the one-step and the two-steps decisions disagree with the expert.

An obvious result is that the beginning of the speech string needs special rules: three out of the five errors in C and two out of the five errors in D correspond to this situation.

The semiautomatic methodology we have proposed is an efficient tool to obtain valuable knowledge bases and to create favorable conditions for knowledge refinement. Many aspects, like recording the complete discourse, accepting natural language verbal data and proposing correction tasks to the experts, avoid overloading them with heavy and time-consuming constraints and preserve as well their natural framework.

The proposed methodology has also been used for acquiring knowledge in medical domains. Abstracting from the specific type of knowledge involved, the pattern of expert-system interaction, described in this paper, emerged as a general one across different domains.

ACKNOWLEDGMENT

This work has been carried out with the help of the expert in spectrogram reading, Maxine Eskenazi, from LIMSI (Orsay, France). The authors thank her for having contributed with her competence and her patience to the achievement of this experiment.

REFERENCES

[ l ] M. Shaw and B. Gaines, “Knowledge acquisition, some foundations, manual methods and future trends,” Proc. EKAW, Paris, France, 1989, pp. 3-18.

[2] S. Marcus, Ed., Machine Learning, vol. 4, nos. 3/4, 1989.

[3] N. J. Belkin, H. M. Brooks, and P. J. Daniels, “Knowledge elicitation using discourse analysis,” Int. J. Man-Machine Studies, vol. 27, pp. 127-144, 1987.

[4] J. Boose, “A survey of knowledge acquisition techniques and tools,” in Proc. 3rd Knowledge Acquisition for Knowledge-Based Systems Workshop, Banff, Canada, 1988, pp. 3.1-3.23.

[5] M. Shaw and B. Gaines, “KITTEN: Knowledge initiation and transfer tools for experts and novices,” Int. J. Man-Machine Studies, vol. 27, pp. 251-280, 1987.

[6] J. H. Boose and J. M. Bradshaw, “Expertise transfer and complex problems, using AQUINAS as a knowledge acquisition workbench for knowledge-based systems,” Int. J. Man-Machine Studies, vol. 26, pp. 3-28, 1987.

[7] K. A. Ericsson and H. A. Simon, “Verbal reports as data,” Psychol. Rev., vol. 87, pp. 215-251, 1980.



[8] J. S. Bennett, “ROGET: A knowledge-based system for acquiring the conceptual structure of a diagnostic expert system,” Int. J. Automated Reasoning, vol. 1, pp. 49-74, 1985.

[9] B. J. Wielinga and J. A. Breuker, “Models of expertise,” in Advances in Artificial Intelligence, B. du Bouley, D. Hogg, and L. Steels, Eds. New York: Elsevier Science, 1987, pp. 497-509.

[10] G. Klinger, C. Boyd, S. Genetet, and J. McDermott, “A KNACK for knowledge acquisition,” in Proc. AAAI-87, Seattle, WA, 1987, pp. 488-493.

[11] G. Kahn, S. Nowlan, and J. McDermott, “Strategies for knowledge acquisition,” IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-7, pp. 511-522, 1985.

[12] P. E. Johnson, I. Zualkernan, and S. Garber, “Specification of expertise,” Int. J. Man-Machine Studies, vol. 26, pp. 29-40, 1987.

[13] L. Eshelman, D. Ehret, J. McDermott, and M. Tan, “MOLE: A tenacious knowledge-acquisition tool,” Int. J. Man-Machine Studies, vol. 26, pp. 41-54, 1987.

[14] A. Kawaguchi, R. Mizoguchi, T. Yamaguchi, and O. Kakusho, “SIS: Shell for interview systems,” in Proc. IJCAI-87, Milan, Italy, 1987, pp. 359-361.

[15] J. Rasmussen, Information Processing and Human-Machine Interaction. New York: North Holland, 1986.

[16] R. S. Michalski, J. G. Carbonell, and T. Mitchell, Eds., Machine Learning: An Artificial Intelligence Approach, Vol. I. Palo Alto, CA: Tioga, 1983.

[17] R. S. Michalski, J. G. Carbonell, and T. Mitchell, Eds., Machine Learning: An Artificial Intelligence Approach, Vol. II. Palo Alto, CA: Morgan Kaufmann, 1986.

[18] R. S. Michalski and Y. Kodratoff, Eds., Machine Learning: An Artificial Intelligence Approach, Vol. III. Palo Alto, CA: Morgan Kaufmann, 1990.

[19] J. Carbonell, Ed., Special Issue on Machine Learning, Artificial Intell., vol. 40, pp. 1-3, 1989.

[20] F. Bergadano, A. Giordana and L. Saitta, “Automated knowledge acquisition in noisy environments,” IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-10, pp. 555-575, 1988.

[21] J. R. Quinlan, “Induction of decision trees,” Machine Learning, vol. 1, pp. 81-106, 1986.

[22] R. S. Michalski, “A theory and methodology of inductive learning,” Artificial Intell., vol. 20, pp. 111-161, 1983.

[23] B. Cestnik, I. Kononenko, and I. Bratko, “ASSISTANT 86: A knowledge elicitation tool for sophisticated users,” in Progress in Machine Learning, I. Bratko and N. Lavrac, Eds. Wilmslow, UK: Sigma, 1987, pp. 31-45.

[24] R. Gemello, F. Mana, and L. Saitta, “RIGEL: An inductive learning system,” Machine Learning, vol. 6, pp. 7-36, 1991.

[25] R. Quinlan, “Learning logical definitions from relations,” Machine Learning, vol. 5, pp. 239-266, 1990.

[26] J. H. Conolly, E. A. Edmonds, J. J. Guzy, S. R. Johnson, and A. Woodcock, “Automatic speech recognition based on spectrogram reading,” Int. J. Man-Machine Studies, vol. 24, pp. 611-621, 1986.

[27] V. W. Zue and L. F. Lamel, “An expert spectrogram reader, a knowledge-based approach to speech recognition,” in Proc. ICASSP-86, Tokyo, Japan, 1986, pp. 23.2.1-23.2.4.

[28] R. Mizoguchi, K. Tsujino, and O. Kankusho, “A continuous speech recognition system based on knowledge engineering techniques,” in Proc. ICASSP-86, Tokyo, Japan, 1986, pp. 23.8.1-23.8.4.

[29] J. Johannsen, J. MacAllister, T. Michalef, and S. Ross, “A speech spectrogram expert,” in Proc. ICASSP-83, Boston, MA, 1983, pp. 746-749.


[30] L. C. Kingsland III and D. A. Lindberg, “The criteria form of knowledge representation in medical artificial intelligence,” in Proc. 5th Conf. Medical Informatics, Washington, DC, 1986, pp. 12-16.

[31] K. Spackman, “Learning categorical decision criteria in biomedical domains,” in Proc. 5th Int. Conf. Machine Learning, Ann Arbor, MI, 1988, pp. 36-46.

[32] P. Politakis and S. M. Weiss, “Using empirical analysis to refine expert system knowledge bases,” Artificial Intell., vol. 22, pp. 23-48, 1984.

[33] S. Frediani and L. Saitta, “Knowledge base organization in expert systems,” Lecture Notes in Computer Science, vol. 286, pp. 217-224, 1987.

[34] L. Saitta, “Spectrogram segmentation, acquisition of the expert’s knowledge,” RI ENST-88D010, 1988.

[35] C. Faure, “Interprétation de spectrogrammes de parole,” RI ENST-880001, 1988 (in French).

[36] L. Barsalou, “Intraconcept similarity and its implications for interconcept similarity,” in Similarity and Analogical Reasoning, S. Vosniadou and A. Ortony, Eds. Cambridge: Cambridge Univ. Press, 1989, pp. 76-121.

[37] M. A. Musen, L. M. Fagan, D. M. Combs, and E. H. Shortliffe, “Use of a domain model to drive an interactive knowledge-editing tool,” Int. J. Man-Machine Studies, vol. 26, pp. 105-121, 1987.

Lorenza Saitta received the degree in nuclear engineering from the Polytechnic School of Torino, Torino, Italy.

She is a Professor of Computer Science at the University of Torino, Torino, Italy. Her research activity started in decision theoretic and syntactic pattern recognition, mostly applied to speech recognition. She then became interested in knowledge-based systems and evidence combination problems and, finally, in machine learning, which currently constitutes her main interest. In particular, she is involved in using causal reasoning and abstraction mechanisms in automated knowledge acquisition. A related interest is in the cognitive aspects of learning.

Claudie Faure received the degree in physics from the University of Nice, Nice, France. She then studied Computer Science and Signal Processing at the University of Paris XI. She received the Doctor of Sciences degree in 1982 with a thesis on structural pattern recognition and signal interpretation.

From 1976 to 1985 she worked in pattern recognition at the University of Compiègne, Compiègne, France. Since 1985, she has been with the SIGNAL department of TELECOM-Paris. She has been a CNRS researcher since 1975. Her research interests are knowledge-based pattern recognition systems, knowledge acquisition, and visual perception.

Simonetta Frediani received the degree in computer science from the University of Torino, Torino, Italy.

She has been a researcher at the Polytechnic School of Torino, with her main research interests in artificial intelligence and, more specifically, in machine learning.