Critiquing As A Design Strategy For Engineering Successful Cooperative Problem-Solving Systems
By
Stephanie Anne Elisabeth Guerlain, Ph.D.
The Ohio State University, 1995
Dr. Philip J. Smith, Adviser
This research focused on the design of cooperative computerized decision aids, looking
in particular at the critiquing approach to providing decision support. Critiquing systems are
proposed to be more cooperative than decision support systems that are based on an
automation philosophy, since critiquing systems can support a human's decision-making
process, allowing the human to stay involved in the task, while providing context-sensitive
feedback when errors or faulty reasoning steps are detected. In addition, critiquing systems are
proposed to mitigate "the brittleness problem," i.e., the difficulty people have in detecting
and correcting faulty reasoning on the part of the computer. To test these proposals, a
part-task simulation study was run, comparing a critiquing system to no decision support on a
set of difficult immunohematology problems. Thirty-two certified medical technologists solved
an initial Pre-Test Case, after which members of the Treatment Group received a checklist
outlining the higher-level goal structure of the computer's knowledge base, and were trained on
the use of the critiquing system. All subjects then solved four Post-Test cases, one of which was
outside the range of cases that the computer was designed to support. (The Treatment Group
continued to use the critiquing system and checklist and the Control Group received no
decision support.) The results showed that the Treatment Group had a lower misdiagnosis rate
on all of the Post-Test Cases, with 100% correct performance on the three cases for which the
system was designed, as compared to misdiagnosis rates of 33%, 38%, and 64% for the
Control Group on the three respective cases (each difference is statistically significant, p < 0.05).
On the case for which the system's knowledge base was not fully competent, the Treatment
Group had an 18.75% misdiagnosis rate as compared to a 50% misdiagnosis rate for the Control
Group (p < 0.10). A detailed analysis of the behavioral protocols indicated that both the
checklist and the critiquing functions significantly contributed to these improvements in
performance and provided insight into how to design effective decision support tools.
Critiquing As A Design Strategy For Engineering Successful Cooperative
Problem-Solving Systems
DISSERTATION
Presented in Partial Fulfillment of the Requirements for the Degree
Doctor of Philosophy in the Graduate School of the Ohio State University
By
Stephanie Anne Elisabeth Guerlain, B.S., M.S.
************
The Ohio State University
1995
Dissertation Committee:
Philip J. Smith, Ph.D.
David D. Woods, Ph.D.
B. Chandrasekaran, Ph.D.

Approved by:
_________________
Adviser
Department of Industrial & Systems Engineering Graduate Program
Copyright by
Stephanie Anne Elisabeth Guerlain
1995
To my husband, Robert J. Haschart
Acknowledgements
I would like to express my sincere thanks to my adviser, Dr. Philip J. Smith, for his
outstanding insight and guidance throughout my graduate career. I would also like to thank
my other committee members, Dr. David D. Woods and Dr. B. Chandrasekaran, for their input
and excellent course offerings. Patricia Strohm and Sally Rudmann provided invaluable blood
bank expertise and Larry Sachs was of great help for the statistical analyses. This project was a
team effort, and I would like to acknowledge fellow graduate students Thomas E. Miller, Susan
Gross, Jodi Obradovich and Craig Tennenbaum who also worked on the design, development,
and evaluation of various aspects of the Antibody IDentification Assistant. Melinda Green has
been an excellent support staff member of the Cognitive Systems Engineering Laboratory,
keeping us all organized. Finally, I would like to acknowledge the support of my family,
friends and, in particular, my husband, Bob, for their undoubting faith and patience. The
education I received at Ohio State was outstanding, and I owe that to the people with whom I
had the pleasure to work.
Vita
October 13, 1967 ............. Born, Norwalk, Connecticut
1990 ......................... B.S., Engineering Psychology, magna cum laude, Tufts University, Medford, Massachusetts
1989 - 1990 .................. Member of the Technical Staff, User-System Interface Group, The MITRE Corporation, Bedford, Massachusetts
1992 ......................... Human Interface Designer, Development Tools Group, Apple Computer, Cupertino, California
1993 ......................... M.S., Industrial & Systems Engineering, The Ohio State University, Columbus, Ohio
1995 ......................... Presidential Fellow, The Ohio State University
1990 - 1995 .................. Graduate Research Associate, Graduate Teaching Associate, Industrial & Systems Engineering, The Ohio State University
Publications Chechile, R., Guerlain, S., & O'Hearn, B. (1989). A comparative study between mental workload and
a priori measures of display complexity. AAMRL/HEF Technical Report, Wright Patterson AFB, Dayton, OH.
Guerlain, S., Smith, P. J., Obradovich, J., Smith, J. W., Rudmann, S., and Strohm, P. (1995). The Antibody Identification Assistant (AIDA), an example of a cooperative computer support system. Proceedings of the 1995 IEEE International Conference On Systems, Man and Cybernetics.
Guerlain, S. (1995). Using the critiquing approach to cope with brittle expert systems, Proceedings of the Human Factors and Ergonomics Society 39th Annual Meeting - 1995, Santa Monica, CA.
Guerlain, S. & Smith, P. J. (1995). Designing critiquing systems to cope with the brittleness problem. Position Paper for CHI '95 Research Symposium (May 6-7, 1995, Denver, Colorado U.S.A.).
Guerlain, S., Smith, P.J., Gross, S.M., Miller, T.E., Smith, J.W., Svirbely, J.W., Rudmann, S., and Strohm, P. (1994). Critiquing vs. partial automation: How the role of the computer affects human-computer cooperative problem solving, In M. Mouloua & R. Parasuraman (Eds.), Human Performance in Automated Systems: Current Research and Trends. (pp. 73-80). Hillsdale, NJ: Lawrence Erlbaum Associates.
Guerlain, S. (1993). Designing and Evaluating Computer Tools to Assist Blood Bankers in Identifying Antibodies. Master's Thesis, The Ohio State University.
Guerlain, S. (1993). Factors influencing the cooperative problem-solving of people and computers. In Proceedings of the Human Factors and Ergonomics Society 37th Annual Meeting, 1 (pp. 387-391). Santa Monica, CA.
Guerlain, S. & Smith, P. J. (1993). The role of the computer in team problem-solving: Critiquing or partial automation? In Proceedings of the Human Factors Society 37th Annual Meeting, 2 (p. 1029). Santa Monica, CA.
Guerlain, S., Smith, P.J., Miller, T., Gross, S., Smith, J.W., & Rudmann, S. (1991). A testbed for teaching problem solving skills in an interactive learning environment. In Proceedings of the Human Factors Society 35th Annual Meeting, 2 (p. 1408). Santa Monica, CA.
Miller, T. E., Smith, P. J., Gross, S. M., Guerlain, S. A., Rudmann, S., Strohm, P., Smith, J. W., & Svirbely, J. (1993). The use of computers in teaching clinical laboratory science. Immunohematology, 9(1), (pp. 22-27).
Rudmann, S., Guerlain, S., Smith, P.J., Smith, J.W., Svirbely, J. & Strohm, P. (1992) Reducing the complexity of antibody identification tasks using case-specific, computerized displays, Transfusion, 32s, (p. 95s).
Smith, P. J., Guerlain, S., Smith, J. W., Denning, R., McCoy, C. E., and Layton, C. (1995). Theory and practice of human-machine interfaces, In Proceedings of ISPE '95 Intelligent Systems in Process Engineering, Snowmass, Colorado.
Smith, P. J., Miller, T., Gross, S., Guerlain, S., Smith, J., Svirbely, J., Rudmann, S., & Strohm, P. (1992). The transfusion medicine tutor: A case study in the design of an intelligent tutoring system. In Proceedings of the 1992 Annual Meeting of the IEEE Society of Systems, Man, and Cybernetics, (pp. 515-520).
Field Of Study
Major Field: Industrial & Systems Engineering Concentration: Cognitive Systems Engineering
Table Of Contents
Acknowledgements...................................................................................................................................ii
Vita ............................................................................................................................................................iii
List of Figures...........................................................................................................................................ix
List of Tables ............................................................................................................................................x
Chapter
I.   Introduction ..... 1
        Document Overview ..... 4
II.  From Traditional Decision Support to More Cooperative Problem-Solving Systems ..... 6
        The Traditional "Consultation" Model of Decision Support ..... 6
        The "Automated Assistant" Model of Decision Support ..... 10
        Supporting the Decision Making Process Rather than Replacing It ..... 13
        Human Error ..... 13
            Slips of Action ..... 15
            Mistakes ..... 15
            Skill Development and Skill Maintenance ..... 16
        Requirements for a Cooperative Problem-Solving System ..... 16
        The Critiquing Model of Decision Support ..... 17
            Previous Critiquing Studies ..... 18
            Potential Problems ..... 21
            Potential Benefits ..... 23
        The Next Step: Designing a Proof-Of-Concept Cooperative Critiquing System ..... 26
III. Antibody Identification as a Testbed ..... 27
        The Practitioners, a.k.a., "Medical Technologists" or "Blood Bankers" ..... 27
        The Goal of Antibody Identification: Finding Compatible Donor Blood ..... 28
        The Antibody Identification Procedure ..... 29
        Types of Knowledge Needed ..... 31
        Characteristics that make antibody identification difficult ..... 33
            Medical Technologists (MTs) arrive in the blood bank with minimal practice ..... 33
            Most MTs rotate ..... 33
            Infrequent encounters with difficult cases ..... 34
            Very little feedback on performance ..... 34
        Expert strategies ..... 35
            1+. Forming early hypotheses ..... 35
                1a+. Hypothesizing the number of antibodies present ..... 36
                1b+. Hypothesizing the type of antibodies present ..... 37
                1c+. Hypothesizing a specific antibody by finding a pattern match ..... 37
            2+. Ruling out ..... 38
                2a+. Ruling out using homozygous, non-reacting cells ..... 40
                2b+. Ruling out using additional cells ..... 40
                2c+. Ruling out masked antigens by inhibiting the positive reactions ..... 41
                2d+. Ruling out only those antibodies that will react on the current panel ..... 41
                2e+. Ruling out the corresponding antibody if the antigen typing is positive ..... 41
            3+. Collecting independent, converging evidence ..... 42
                3a+. Making sure the patient is capable of forming the hypothesized antibodies ..... 42
                3b+. Using a test procedure that is known to change the reactivity of an antibody ..... 43
                3c+. Asking, "Is this an unlikely combination of antibodies?" ..... 43
                3d+. Making sure there are no unexplained positive reactions ..... 43
                3e+. Making sure there are no unexplained negative reactions ..... 44
                3f+. Making sure that all remaining antibodies are ruled out ..... 44
                3g+. Using the "3+/3-" rule ..... 44
            4+. Solving cases in an efficient manner ..... 44
                4a+. Picking additional cells efficiently ..... 45
        Poor problem-solving strategies ..... 45
            2-. Ruling out incorrectly ..... 46
                2f-. Ruling out using reacting cells ..... 46
                2g-. Ruling out regardless of zygosity ..... 46
                2h-. Ruling out the corresponding antibody if the antigen typing is negative ..... 46
            4b-. Not using all information provided by a test result ..... 47
        Kinds of Cases ..... 47
            One antibody reacting strongly ..... 47
            Antibody to a high-incidence antigen ..... 47
            Weak antibody ..... 48
            Multiple antibodies ..... 49
                On separate cells with differing reaction patterns ..... 49
                Reacting in the same pattern as each other ..... 49
                On overlapping cells ..... 50
                Masking ..... 50
            Variable reactions ..... 51
                Antibodies showing dosage ..... 51
            Recent transfusion ..... 52
            Drug interactions ..... 52
            Auto-immune disorders ..... 53
IV.  The Design of the Antibody Identification Assistant (AIDA) ..... 54
        The Task ..... 55
        The Practitioners ..... 55
        The Problem-Solving Tool ..... 56
            Design Principle 1: Use a Direct Manipulation Problem Representation as the Basis for Communication ..... 56
            Design Principle 2: Use Critiquing to Enhance Cooperative Problem-Solving ..... 59
            Design Principle 3: Represent the Computer's Knowledge to the Operator to Establish a Common Frame of Reference ..... 60
V.   Experimental Procedure ..... 63
        Subjects ..... 63
        Experimental Design ..... 64
        Cases used to test AIDA ..... 66
            Pre-Test Case: Two antibodies looking like one ..... 68
            Post-Test Case 1: Two antibodies looking like one ..... 68
            Post-Test Case 2: One antibody reacting weakly, right answer can be ruled out (because system is not fully competent) ..... 69
            Post-Test Case 3: Two antibodies, one masking the other ..... 71
            Post-Test Case 4: Three antibodies, on overlapping cells, reacting on all cells of the main panel ..... 72
        Procedure ..... 73
            Phase 1. Subject Demographic Data ..... 74
            Phase 2. Introduction to the Interface ..... 74
            Phase 3. The Pretest Case ..... 75
            Phase 4. Training and Introduction to the Checklist and Critiquing ..... 75
            Phase 5. Post-Test Cases ..... 78
            Phase 6. Debriefing ..... 78
        Data collection ..... 79
        Data Analysis ..... 80
VI.  Results and Discussion ..... 82
        Unaided Subject Performance ..... 82
        Example Subject Interactions ..... 84
        Gross Performance Measures ..... 86
            Statistical Comparison of Misdiagnosis Rates ..... 86
                Expert Subjects ..... 86
                Less Skilled Subjects ..... 87
            Slips vs. Mistakes ..... 91
            Questionnaire Results ..... 94
        Detailed Analyses ..... 99
            Proactive Training vs. Reactive Feedback (Critiquing) ..... 100
            The Timing of the Critiques ..... 101
            Subjects Overriding the Critiques ..... 103
            Analysis of the Weak D Case ..... 104
        When to use critiquing vs. some other form of decision support ..... 107
VII. Conclusion ..... 110

Appendices
A. Sample Answer Sheet ..... 123
B. Sample Statistical Calculations ..... 124
C. Definition of Classes of Errors Logged by the Computer ..... 134
D. Sample Behavioral Protocol Log ..... 136
E. Sample Error Log ..... 148
F. Number of Mistakes and Slips Made on Each Case by Each Subject in the Control Group (n = 16) ..... 151
G. Number of Mistakes and Slips Made on Each Case by Each Subject in the Treatment Group (n = 16) ..... 160
List of Figures
Figure Page
1. The Antibody Screen .....................................................................................................................30
2. Anti-Fyb looks likely .....................................................................................................................39
3. Anti-E and anti-K can account for the reactions as well ...........................................................39
4. Sample Screen ................................................................................................................................58
5. Sample Checklist ............................................................................................................................61
6. Test Results Available ...................................................................................................................64
7. Experimental Design .....................................................................................................................66
8. Sample Error Message ...................................................................................................................76
9. Sample Summary Screen ...............................................................................................................79
10. Paths Taken to Solve the Weak D Case, Control Group ...........................................................106
11. Paths Taken to Solve the Weak D Case, Treatment Group ......................................................107
List of Tables
Table Page
1. Process Errors Made by Treatment and Control Group Subjects on the Pre-Test Case ..... 83
2. Pre-Test/Post-Test Comparison of Misdiagnosis Rates ..... 87
3. Post-Test Case Results ..... 89
4. Combining p-values Given by the Log-Linear Analysis of Misdiagnosis Rates on the Post-Test Cases, Taking Into Account Performance on the Pre-Test Case ..... 89
5. Correctness of Answers, Treatment Group ..... 90
6. Correctness of Answers, Control Group ..... 91
7. Number of Subjects Committing Each Type of Error at Least Once Per Case ..... 93
8. Combining p-values Across the Five Error Types ..... 94
Chapter I
Introduction
In any domain that requires complex decision making and problem solving,
practitioners are likely to occasionally make errors in problem-solving. There are a number of
potential contributions to the occurrence of such errors, including:
1) Situational factors. The workload is either too high, causing stress and memory
overload, or too low, causing vigilance problems,
2) Decision making strategies. Practitioners may be missing knowledge, using
inappropriate knowledge, or using simplifying strategies or heuristics which are not
adequate in all situations, and
3) The types of information available. Practitioners may not have the right kind of data
available to them for the current situation, or there may be too much data to effectively
integrate and draw appropriate conclusions.
For this reason, researchers have long been interested in designing computerized decision
support systems. In many situations, such a device can aid tremendously by offloading some of
the workload of the practitioner, by performing time-consuming or tedious tasks and by
remembering important details that may only be pertinent in rare cases.
One of the critical problems with advanced decision-support systems, however,
is their potential brittleness, or failure to perform competently in all situations. This brittleness
can arise because of a deliberate design decision to use an oversimplified model of the decision
task (due to cost, time or technological limitations), the inability of the designer to anticipate and
design for all of the scenarios that could arise during the use of the system, a failure of the
designer to correctly anticipate the behavior of the system in certain situations, a failure to
correctly implement the intended design, or because of hardware failures or bugs in the
underlying software environment.
The typical "safety valve" to deal with this problem is to keep a person "in the loop",
requiring that person to apply his or her expertise in making the final decision on what actions
to take. The current view held by the Food and Drug Administration (the agency which
regulates the use of automated medical devices) is that a system is safer if a human is required
to review the automated device's proposed actions (Brannigan, 1991; Gamerman, 1992). Indeed,
this is often the role delegated to the users of many expert systems. In describing one medical
expert system, for example, the authors state that: "The staff themselves would not be displaced
by this tool because their expertise would still be necessary to verify PUFF's output, to handle
unexpected complex cases, and to correct interpretations that they felt were inaccurate" (Aikins,
Kunz, and Shortliffe, 1983). Thus, the designers of this system acknowledge that the computer's
reasoning is not perfect, but they assume that the human will be able to detect and correct any
errors made by the system when it exhibits its brittleness.
Contrary to this assumption, empirical data indicates that people are often not capable
of judging the validity of an expert system's conclusions. People may ignore the advice of a
system, even when it is relevant (as is the case with many "rejected" decision support systems),
or heed the advice of a system, even when it is faulty. "Complacency" may occur when
monitoring for automation failures if the automation reliability is unchanging or if the operator
is responsible for more than one task (Parasuraman et al., 1993; Parasuraman et al., 1994). Less
"obvious" automation failures may cause practitioners to be unduly influenced by an expert
system's proposed intermediate inferences or final solutions when these inferences do not take
into account all of the relevant aspects of the data. This was demonstrated in a recent study in
the domain of flight planning (Layton, Smith and McCoy, 1994). In this study, if the scenario
was one where the computer's brittleness led to a poor recommendation and the computer
generated a suggestion early in the person's own problem evaluation, then the person's own
judgment was negatively influenced, resulting in a 30% increase in inappropriate plan selection
over users of a manual version of the system. A second study by Guerlain, Smith et al. (1994)
found similar results in a medical application, with misdiagnosis rates increasing almost 30%
when an automated support tool led users of the system "down the garden path" to a plausible,
yet incorrect answer.
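Misdiagnosis-rate differences of this magnitude can be checked for statistical significance with a simple two-proportion comparison. As a hedged illustration only (the dissertation's own analyses used log-linear models, and the counts below are constructed to match one reported condition, 0 of 16 Treatment subjects vs. roughly 38%, i.e., 6 of 16, Control subjects misdiagnosing a case), the sketch below implements a two-sided Fisher's exact test from scratch using only the Python standard library:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table
    [[a, b], [c, d]], where rows are groups and columns are
    (misdiagnosed, correct) counts."""
    row1, row2 = a + b, c + d          # group sizes
    col1 = a + c                       # total misdiagnoses
    n = row1 + row2

    # Hypergeometric probability of a table with x misdiagnoses in group 1,
    # holding all margins fixed.
    def p_table(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = p_table(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Two-sided p-value: sum over all tables at least as extreme
    # (i.e., at least as improbable) as the observed one.
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs + 1e-12)

# Illustrative counts: Treatment 0/16 misdiagnoses, Control 6/16 (~38%).
p = fisher_exact_two_sided(0, 16, 6, 10)
print(f"p = {p:.4f}")  # well below the 0.05 threshold reported in the abstract
```

With equal group sizes of 16, a 0-vs-6 split is significant at p < 0.05, consistent with the kind of result reported here; for the actual per-case analyses, see the log-linear results in Chapter VI.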
This phenomenon of people not adequately judging an expert system's conclusions is a
symptom of a larger problem with many advanced decision support systems, namely that they
do not work cooperatively with practitioners within the field of practice for which they were
designed (Malin, Schreckenghost, Woods, Potter, Johannesen, Holloway, and Forbus, 1991).
Such systems may not be integrated with other tools, information, and representations currently
used by practitioners, and may introduce new workload because of the need to manage the
intelligent system. Furthermore, these systems may be difficult to understand and validate in
the context of a particular task situation, especially when other task parameters demand
cognitive resources. Finally, the role played by the human and computer agents involved can
have a large effect on performance. Human practitioners may be delegated to a supervisory
control role when using an expert system, but not have sufficient understanding of the
computer's reasoning process or access to enough relevant data to be able to correctly detect
faulty performance of the computer.
Thus, we have a potentially difficult tradeoff to deal with. One alternative "solution" is
to not introduce an aiding system at all in order to avoid the potential negative consequences of
its presence in certain situations, and to try to enhance the training of the practitioner
population in order to improve unaided performance. A second alternative is to accept the risks
associated with placing a potentially brittle support system in a high-consequence domain. A
third alternative is to have a system that provides some of the aforementioned benefits, while
minimizing the introduction of new forms of errors due to its brittleness and lack of cooperation
with domain practitioners.
A study by Guerlain (1993a) provided objective data indicating how this third
alternative might be possible, even for systems that are brittle. This study showed that when a
problem-solving strategy is encoded into an expert system and its knowledge is applied
automatically by the computer, performance can degrade significantly if the task situation is
outside the computer's range of competence (the classic brittleness problem). Such a
degradation, however, did not occur when the computer used its knowledge to critique the user
who was performing the problem-solving task while the computer looked "over her shoulder."
Furthermore, many aspects of critiquing were identified as promoting cooperative problem-
solving performance between the human and the computer. These results provide initial
evidence that placing an expert system in a critiquing role may be a safer and more effective
form of decision support than automating all or part of the problem solving. The goal of the
research conducted for this dissertation, then, is to examine in much greater detail the critiquing
approach to decision support as a means to promote human-computer cooperative problem
solving.
Document Overview
Chapter II of this document gives an overview of different approaches to designing
decision support systems, identifying many of the problems with the automation philosophy
underlying many such systems and contrasting that philosophy with systems that focus more on a
cooperative interaction with the people who use them. In particular, critiquing systems are
identified as one type of decision aiding system that satisfies many of the requirements of a
cooperative decision support system.
The problem-solving task that was used as a testbed to study human-computer
cooperative problem-solving is antibody identification. Chapter III describes the antibody
identification procedure and the types of knowledge and strategies that experts use to solve
cases. These are contrasted with poor problem-solving strategies used by many medical
technologists.
Chapter IV describes the design and functionality of the critiquing system that was built
as a testbed for this research. A formal experiment was conducted to test the hypothesis that a
well-designed critiquing system can significantly reduce misdiagnosis rates compared to
unaided performance, even when the critiquing system is not fully competent. Chapter V
describes the experimental procedure used, including a description of each of the test cases used
and why they were chosen. An exploratory behavioral protocol analysis was also conducted as
part of this study to examine the influence of the system's design on the practitioners' problem-
solving behaviors. Chapter VI describes the results of the study, including an analysis of the
errors made and the overall system influence on practitioners' choices and uses of strategies.
Chapter VII gives closing comments, relating the results from this research to other domains
and gives a list of guiding principles for the design and evaluation of critiquing systems and
other cooperative problem-solving systems.
Chapter II
From Traditional Decision Support to More Cooperative Problem-
Solving Systems
A significant literature exists in the artificial intelligence community on the design and
evaluation of decision support "consultation" systems. To a large extent, these systems are
evaluated based on the computer's ability to solve problems as compared to expert practitioners
in that field of practice. This kind of evaluation, which focuses on the computer's reasoning
capabilities, hides the fact that this form of decision support is not practical or effective in actual
settings. For one thing, the interface is often poor. More fundamentally, the interaction is not set up
for effective communication between the human and the computer. In response to this, a more
recent wave of research has focused on the design of "cooperative" decision support systems.
The purpose of this chapter is to outline some of the problems with the automation model of
decision support and to provide principles for the design of cooperative decision support
systems that take into account human problem-solving and decision making. Critiquing is
proposed to be a form of decision support that satisfies a number of these principles. As such,
the few previous studies of critiquing systems are discussed and issues that remain to be
understood about critiquing are identified.
The Traditional "Consultation" Model of Decision Support
In examining the artificial intelligence literature, one finds that there have been many
attempts to build decision support systems to act in a "consulting" role. A typical interaction
with such a system in the medical domain, for example, is to have the computer program first
examine the available data about a patient, perhaps with some questions posed to the attending
physician about the patient's history, treatment, or the results of additional tests, and to then
develop a diagnosis and/or treatment plan for consideration by the practitioner. The idea
behind such a system is that, in practice, practitioners would "consult" the system as they might
consult other doctors for advice.
Such systems can vary along several dimensions, such as the intended type of support.
For example, in the medical domain, a system can be designed to aid with the diagnosis of
diseases (e.g., MYCIN, Shortliffe, 1976 and MENINGE, François, Robert, Astruc et al., 1993), or
with the management of a patient's treatment plan (e.g., ONCOCIN, Shortliffe et al., 1981). The
systems can also vary according to the underlying computational model of the problem. Many
systems are knowledge-based, but some use mathematical models, such as Bayesian reasoning
(Sutton, 1989). Almost all of these systems, however, follow the same model of decision
support: The computer tries to solve the problem for the person and then gives its results (possibly
along with an explanation) to the person for review.
Some systems give one diagnosis, while others generate a list of possible diagnoses,
usually rank ordered according to the computer's model of how well the diagnosis accounts for
the data about the patient. Typically, evaluations of such systems focus on whether or not the
computer system is able to generate the "gold standard" (i.e., best answer) as either the top
answer or as at least a highly rated answer on a range of cases (e.g., Wellwood, Johannessen,
and Spiegelhalter, 1992; Hickam, Shortliffe, Bischoff, Scott, and Jacobs, 1985; Shamsolmaali,
Collinson, Gray, Carson, and Cramp, 1989; Bernelot Moens, 1992; François, Robert, Astruc et al.,
1993; Nelson, Blois, Tuttle et al., 1985; Plugge, Verhey, and Jolles, 1990; Sutton, 1989; Verdaguer,
Patak, Sancho et al., 1992; Berner, Webster, Shugerman et al., 1994). Usually, the results of such
evaluations show that the expert system performs better than novice practitioners and close to
or as well as the expert practitioners in terms of this gold standard evaluation. A few long-term
studies have been done comparing overall performance previous to the introduction of the
computer to overall performance during the period of time with the computer present to see if
the presence of the system has had any major affects on the treatment of patients (e.g.,
Wellwood, Johannessen, and Spiegelhalter, 1992), but even these evaluations focus on overall
performance before the introduction of the computer to overall performance with the computer.
Only just recently have researchers in the medical informatics area begun to realize that
focusing only on the computer's performance is a limited and unrealistic evaluation of a
decision support system, if the goal is to successfully incorporate the system into actual practice
(e.g., Forsythe and Buchanan, 1992; Miller and Masarie, 1990; Wyatt and Spiegelhalter, 1992).
Although evaluations of medical expert systems have rarely gone beyond the
computer's ability to identify the "gold standard", other issues have been identified as
potentially problematic with these systems as a decision aid. First, the human interface is
almost always cited as a problem (e.g., Berner, Brooks, Miller et al., 1989; Collinson, Gray,
Carson, and Cramp, 1989; Harris and Owens, 1986; Miller, 1984; Shortliffe, 1990). In particular,
many such systems require that the practitioner enter data into the computer so that it can have
the information necessary to perform its reasoning. It is neither the practitioner's job nor place
to spend time doing data entry. This is why one of the most frequently cited requirements for a successful
medical informatics system is to already have the necessary data on-line (Linnarson, 1993;
Miller, 1984; Shortliffe, 1990). Second, these systems may have an incomplete knowledge base
or use simplifying assumptions that make them brittle, meaning that they can fail on cases that
the system was not designed to handle. This leaves the practitioner in the role of having to
detect and correct any problems generated by faulty computer reasoning (Aikins, Kunz and
Shortliffe, 1983; Andert, 1992; Bankowitz, McNeil, Challinor, Parker, Kapoor and Miller, 1989;
Bernard, 1989; Berner, Brooks, Miller et al., 1989; Gregory, 1986; Guerlain, Smith et al., 1994;
Harris and Owens, 1986; Miller, 1984; Roth, Bennett and Woods, 1988; Sassen, Buiël and
Hoegee, 1994).
A more in-depth evaluation of this "consultation" model of decision support reveals the
possible underlying causes for the poor user acceptance of these kinds of systems. For example,
Roth, Bennett and Woods (1988) conducted a study analyzing how users of an expert system
(designed to aid in the diagnosis of electro-mechanical equipment failures) interacted with and
used the system to diagnose faults. This study focused on the actual interaction with the
computer by particular users diagnosing particular faults, rather than on a global evaluation of
whether or not the expert system's knowledge base was accurate. This kind of evaluation
revealed reasons why the "consultation" mode of decision support is not cooperative at all.
With this consultation model, a human decision maker must give up control of the
problem-solving to the computer. Once the "black box" generates an answer, the human must
decide, often without adequate understanding of the computer's reasoning process, whether or
not to accept the computer's diagnosis or treatment plan. The system's support focuses on the
outcome of a decision, without providing the users of such systems adequate information about
the computer's problem-solving process. Furthermore, since the user is not involved in the
problem-solving, it may be necessary for the person to independently solve the problem in
order to judge whether to accept or reject the computer's proposed solutions. In other words, the
person must be an "expert" at the problem-solving to be able to detect and recover from any
faulty inferences or conclusions generated by the computer. However, by assigning the
computer the task of doing the routine problem-solving, users are giving up control to the
computer and losing skill in the meantime. The only other alternative is for users to always
solve the problems themselves, in which case use of the computer has little utility.
The traditional decision support model of having a computer independently solve a
problem and provide a final answer lessens the possibility of establishing a common ground for
communication. Users of such systems, who must "listen" to the computer as it tries to
retrospectively explain its reasoning (if such a facility is available), can become frustrated very
quickly, since the computer's explanation capabilities generally do not follow the
communication model employed by humans. Rather than engaging the user in the task,
supporting cooperative work, an expert system may actually cause breakdowns in
communication. Users must stop what they are doing and actively try to understand the actions
that the computer has taken without their knowledge, and interpret messages that may not be
commensurate with their skill level or their formulation of the problem (Malin, Schreckenghost,
Woods, Potter, Johannesen, Holloway, and Forbus, 1991). Furthermore, the expert system's
consultation often comes at a time when the practitioner is busiest with other task demands
(Johannesen, Cook and Woods, 1995; Wiener, 1989).
The "Automated Assistant" Model of Decision Support
Many of the problems associated with the classic "consultant" expert system are present
in other forms of decision support that rely on an automation philosophy. For example, there
are many examples of computers serving as automated assistants, performing some subtask for
the person. Usually such systems are fairly well-integrated with the person's current task
environment. For example, the Traffic Alert and Collision Avoidance System (TCAS) is installed
in most commercial airplanes to monitor for surrounding air traffic and provide
warnings if air traffic is detected nearby. Furthermore, it instructs the pilot on how to avoid
traffic if certain safety envelopes are violated. Pilots are instructed to follow the advice of the
system.
Similar to the problems with expert system decision support systems, automated
assistants can be difficult for users to understand or use effectively. TCAS, for example, has
many modes of operation that can potentially confuse the user, since the system will act
differently depending on what mode it is in. Mode error has been cited as a common problem
with automation (Sarter and Woods, 1994) and can be the cause of some major accidents.
A second similarity is that automated assistants can also have brittle performance, such
as when there is noisy data or when a situation is encountered that the designer had not
anticipated. When TCAS was first introduced, for example, the system could not discern
"real" conflicting traffic from normal routine traffic and would generate false alarms: it would
sometimes instruct pilots to descend when they were just taking off, because it detected
traffic from the planes coming in for a landing and the planes on the ground.
Controlled studies have shown that a person using a brittle automated assistant may
not be able to cope well with such failure situations (Guerlain, 1993; Layton, Smith and McCoy
1994). First of all, there may be a biasing effect such that inappropriate inferences made by the
computer may seem reasonable to the person. Layton, Smith and McCoy (1994), for example,
evaluated alternative methods for providing decision support to pilots and dispatchers
rerouting an airplane due to bad weather on the original flight plan. It was found that a
significantly greater number of subjects using a partially automated system (that would
automatically generate alternative routes when problems were detected along the current route)
would select a faulty computer-generated plan over alternatives that they had explored, even
though in retrospect they concluded that they should never have accepted such a solution
because it was very risky. Nine out of ten members of a Control Group, who did not have the
computer generating any solutions, either rejected or did not generate this risky plan.
A similar phenomenon was found in an investigation of computer support systems
designed to aid with the identification of antibodies (Guerlain, 1993). A significantly greater
percentage of subjects who had the partially automated version of the system (that would
automatically rule out antibodies based on the available evidence) ruled out the correct answer
and misdiagnosed the case than subjects who did not have this automatic function available.
Furthermore, nine out of ten subjects who ruled out the correct answer with the automated
function did so without any further analysis of the case. Seven out of ten subjects who did not
have the automated rule-out function available collected additional data before finishing the
case.
Different phenomena have been proposed to account for this biasing effect of a partially
automated system. This may be due to the person: 1) not being skilled enough to judge the
validity of the computer's conclusions, 2) not being actively involved in the task, 3) having an
inappropriate mental model of the system, 4) overrelying on the system, or 5) being subject to
triggered cognitive biases (Fraser et al., 1992; Guerlain, 1993b; Layton, Smith and McCoy, 1994).
The study by Layton et al. (1994) has yielded some insight as to why performance can be better
when the person is actively engaged in the problem-solving him/herself. It was found that
very local, data-driven factors can trigger a person's expertise at the appropriate time. The
verbal reports of the subjects, for example, showed that subjects would consider uncertainty in
the weather when generating a flight plan, but not when evaluating a flight plan that was
generated by the computer. By relegating the person to a higher level supervisory role, subjects
using the automated assistant were not encountering the triggering situations that allowed them
to apply their expertise as they would when doing the task themselves.
In conclusion, many problems have been identified with the automation or partial
automation model of decision support. Users of such systems may lose skill, may become
frustrated with the system because they cannot understand its reasoning process, may
misunderstand its intentions or reasoning, and may not be able to adequately cope with the
system's brittleness.
Supporting the Decision Making Process Rather than Replacing It
An automation philosophy is one that intends to reduce the consequences of human
error by replacing the fallible human. However, if the computer's reasoning is fallible, then this
philosophy breaks down. A different approach is to support the process by which humans make
decisions and solve problems, thus making it less likely that outcome errors will occur due to
faults in the person's reasoning or other errors made along the way. It is often the case that just
one faulty step in the reasoning process can lead a person astray. By focusing on supporting the
human's decision making process, it may be possible to correct the person at the site of the
problem, as s/he begins exploring a faulty path or making a judgment error that could lead to
an incorrect outcome. Such a decision aiding strategy relies on the ability to detect the kinds of
errors that people are likely to make.
Human Error
To a large extent, process errors can be predicted by studying the task domain and
understanding the strategies by which people solve problems in that domain. One major reason
for people's inability to be perfect problem solvers is the limits of their information processing
system, which only allows them to keep a few "chunks" of information in short-term memory at
a time (Miller, 1956; Newell, 1972; Wickens, 1984). Thus, people must use strategies that reduce
the amount of information that must be considered at one time in order to achieve their goals.
One way to do that is to use heuristic reasoning methods that narrow down the search space.
People can use general, or "weak" methods, that are applicable to many problem-solving tasks,
such as a brute force technique (trying all possible solutions to see which ones fit) or means-
ends analysis (working towards a final goal by applying operators that will move the current
state of the problem-solving closer to the goal state). People can also use "strong" methods, that
take advantage of domain-specific knowledge of the task characteristics and problem
constraints. Such heuristic methods, whether "weak" or "strong", help people cope with their
limited information processing capabilities in order to achieve "good" overall performance.
However, such heuristic methods may sometimes fail, leading to poor outcome performance.
Kahneman and Tversky (1974), for example, have noted that the use of some common
judgmental heuristics (such as the representativeness heuristic) can lead to certain biases in
reasoning (such as insensitivity to sample size and the gambler's fallacy). This phenomenon of
heuristics leading to errors is not limited to the judgment of probabilities. All heuristic
reasoning strategies have the capability to fail in certain "garden-path" instances, in which the
assumptions behind the use of the strategy are violated.
For example, one documented problem-solving approach is the elimination-by-aspects
strategy (Tversky, 1972). With this strategy, people prune the search space by selecting one
aspect or characteristic of the problem and eliminating all those solutions that do not meet a
specified criterion on that dimension. This process is repeated on successive dimensions until a
single solution is reached. To illustrate, if one is
searching for a job, the location might be the first aspect considered. Thus, all jobs outside of the
location of interest would be eliminated from consideration. Salary might be the second
dimension, such that all jobs below a certain salary would be eliminated from the search space.
This process would continue until one solution had been reached. However, by reducing the
search space in this manner, a job that is absolutely spectacular on all other aspects and globally
preferred, but not ranked high on a previously considered aspect, would not be taken into
consideration.
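The elimination-by-aspects strategy described above can be sketched in a few lines of code. This is a hypothetical illustration: the job attributes, thresholds, and aspect ordering are invented for the example, not drawn from Tversky's work.

```python
# Hypothetical sketch of the elimination-by-aspects strategy.
# Candidate names, attributes, and thresholds are illustrative only.

jobs = {
    "A": {"location_ok": True,  "salary": 40, "interest": 9},
    "B": {"location_ok": True,  "salary": 55, "interest": 3},
    "C": {"location_ok": False, "salary": 90, "interest": 10},  # globally best, wrong city
}

def eliminate_by_aspects(candidates, aspects):
    """Apply each aspect test in turn, discarding candidates that fail it."""
    remaining = dict(candidates)
    for test in aspects:
        passed = {name: attrs for name, attrs in remaining.items() if test(attrs)}
        if passed:                      # never eliminate every remaining candidate
            remaining = passed
        if len(remaining) == 1:
            break
    return remaining

aspects = [
    lambda j: j["location_ok"],         # first aspect considered: location
    lambda j: j["salary"] >= 50,        # second aspect: salary floor
]

print(eliminate_by_aspects(jobs, aspects))   # job C, best on every later aspect, was pruned first
```

Note that the strategy never revisits an eliminated candidate: job C, spectacular on salary and interest, is discarded on the location aspect and is never reconsidered, which is exactly the failure mode the paragraph above describes.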
Slips of Action
In addition to the errors induced by the use of fallible heuristic strategies, people make
errors because of slips, in which the person has the correct intention but fails to carry it out
correctly, either because the intention was forgotten (an error of omission) or the intention was
incorrectly carried out (an error of commission) (Norman, 1981). Slips of action can account for
many of the errors made in interacting with the environment and can contribute to serious
outcome errors. For example, in the hospital blood bank, a year-long study of performance
showed that by far the most common type of error was a slip, e.g., a transcription error, and that
sometimes such a slip could have dire consequences (such as transfusing the wrong kind of
blood to a patient).
Mistakes
Mistakes are distinguished from slips in that they occur at the level of intention
formation rather than at the level of action selection. Thus, one may perform the correct action
sequence given the intention, but the intention is inappropriate for the given situation. If a
situation is not assessed appropriately, the prerequisites for an inappropriate rule may be met
and thus the rule is correct for the given situation assessment, but the assessment itself is wrong.
Alternatively, the situation may be assessed appropriately, but the wrong rule is instantiated
(Reason, 1990).
Mistakes can also occur because a person does not have the requisite knowledge to
adequately perform or understand the task. It is conceivable that in any complex problem-
solving task, all but true experts at the task will have some missing knowledge which will
hinder their problem-solving for at least some task situations.
Skill Development and Skill Maintenance
One major difference that has been identified between novices and experts is that
experts have a much better mental model of a situation, which allows them to interpret
appropriate cues to guide problem solving. Mental models or schemas are the representational
structures in memory which guide information storage and retrieval. As people learn a task,
they build up a mental model over time that, as it becomes more accurate, allows them to
become better and better at problem-solving. As people develop skill, they continue to test and
refine their knowledge. If such a process does not continue, then people will lose their expertise
over time. Therefore, skill development and skill maintenance are important factors that
contribute to the human's capability to perform well at a task.
Requirements for a Cooperative Problem-Solving System
Thus, when designing a decision support system, it is necessary to conduct an in-depth
cognitive task analysis of the task domain in order to understand the kinds of problem-solving
strategies that people are likely to employ and where those strategies might lead them astray.
This helps to define appropriate opportunities for a decision support system to re-direct
performance or warn the user if his/her solution could be improved by looking at the problem
in a different way. Furthermore, the system should be able to help users detect and recover
from slips, and help people to develop and maintain their skill. A decision support system
should also not hide information that would normally allow a person to detect and recover from
errors. Relegating the user to a supervisory control role, for example, may change a person's
ability to detect anomalous situations (Layton et al., 1994). A decision support system should in
fact encourage and teach the effective use of error detection strategies and supply the user with
the necessary cues and information to be able to do so.
People will also build up a mental model of the tool that they are using. A decision
support system should be designed to encourage a good mental model of the situation and of
how the tool is designed to aid in the analysis of that situation. Many researchers have
identified the importance of providing users with a good understanding of how a decision
support system works so that people can effectively judge the appropriateness of its analysis of
the situation (e.g., Giboin, 1988; Lehner and Zirk, 1987; Muir, 1987; Roth, Bennett and Woods
1988), similar to the way effective human-human teams work together (Serfaty and Entin, 1995).
This requires ample training and hands-on use of the system (Guerlain, 1993a; Muir, 1987). It is
also important for the design of the decision aid to be based on an effective understanding of
how the users of the system view the problem-solving, so that the computer's advice is relevant
(van der Lei, Westerman, and Boon, 1989).
Although much has been studied and written about individual aspects of human
problem solving and decision making, a critical challenge that now confronts us is how to
integrate what has been learned in order to develop decision aids that accommodate these
human characteristics. Simply put, a good cooperative problem-solving system should work
well with people. Not so simply put, it must try to overcome or supplement some of the
limitations of human information processing, reduce the consequences of human error, and
allow people to still apply their skills. Furthermore, it should encourage learning through
practice and feedback. It also needs to fit in with the person's current environment without
being obtrusive, difficult to learn, or overly complicated to use.
The Critiquing Model of Decision Support
The critiquing model is the third form of decision support that will be considered here.
Critiquing systems were originally explored as a decision aiding strategy by Perry Miller. A
critiquing system is a computer program that critiques human-generated solutions (Miller,
1986). In order to accomplish this task, the critiquing system must be able to solve parts of the
problem and then compare its own solution to that of the person. Then, if there is an important
difference, the system initiates a dialogue with the user to give its criticism and feedback. The
primary difference, then, between a critiquing system and an automated or partially automated
system is that the person always initiates actions and the critiquing system only uses its knowledge to
react to the user's understanding of the problem.
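The interaction model just described — the person proposes a solution, the computer compares it with its own and speaks up only when the difference matters — can be sketched as follows. The plan representation, the 10% significance threshold, and the dosage values are assumptions made for illustration; they do not describe any particular system.

```python
# Minimal sketch of the critiquing interaction model. The plan format,
# threshold, and item names are hypothetical.

def critique(user_plan, system_plan, significance=0.10):
    """Compare the user's plan to the computer's own solution and
    return feedback only where the difference is important."""
    critiques = []
    for item, user_value in user_plan.items():
        expected = system_plan.get(item)
        if expected is None:
            continue                    # item outside the system's competence
        # relative difference; the critic stays silent on small deviations
        if abs(user_value - expected) / max(abs(expected), 1e-9) > significance:
            critiques.append(
                f"{item}: you chose {user_value}, expected about {expected}")
    return critiques

# The person always acts first; the computer only reacts.
user_plan   = {"dosage_mg": 120, "interval_h": 8}
system_plan = {"dosage_mg": 100, "interval_h": 8}
for msg in critique(user_plan, system_plan):
    print(msg)          # flags the dosage, says nothing about the interval
```

The key design choice this sketch captures is the direction of initiative: the system never generates a plan for the user, it only reacts to the plan the user has already committed to, and only when the discrepancy crosses a significance threshold.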
Previous Critiquing Studies
The first attempt at building a large-scale critiquing system for the medical community
was made by Miller (1986). He developed a prototype system, called ATTENDING, which was
designed to work in the anesthesiology domain. Based on this initial research, he also
experimented with critiquing systems for hypertension, ventilator management, and
pheochromocytoma workup. All of these prototypes operated in a similar manner. The user
was required to enter information about the patient's status and symptoms, as well as the
proposed diagnosis and treatment. The computer then critiqued the proposed solution,
generating a three-paragraph output summarizing its critique.
Miller saw much potential in the critiquing approach and was able to provide
recommendations to other designers for developing good critiquing systems. First, Miller
concluded that choosing a sufficiently constrained domain was important. ATTENDING was a
system attempting to aid anesthesiologists in treating their patients, a task that takes years for
people to learn and practice. Attempting to build a useful expert system in this field turned out
to be too difficult due to the expanse of knowledge required. This lesson led him to switch to
the more constrained hypertension domain. Second, Miller concluded that critiquing systems
are most appropriate for tasks that are frequently performed, but require the practitioner to
remember lots of information about the treatment procedures, risks, benefits, side effects, and
costs, as these are conditions under which people are more likely to make errors if unaided, thus
making the critiquing system potentially valuable.
A second critiquing system was developed by Langlotz and Shortliffe (1983), who
adapted their expert system ONCOCIN (designed to assist with the treatment of
cancer patients) to be a critiquing system rather than an autonomous expert system, because
they found that: "The most frequent complaint raised by physicians who used ONCOCIN is that
they became annoyed with changing or 'overriding' ONCOCIN's treatment suggestion". It was
found that since a doctor's treatment plan might only differ slightly from the system's treatment
plan (e.g., by a small difference in the prescribed dosage of a medicine), it might be better to let
the physician suggest his/her treatment plan first, and then let the system decide if the
difference is significant enough to mention to the doctor. In this manner, the system would be
less obtrusive to the doctor. Thus, Langlotz and Shortliffe changed ONCOCIN to act as a
critiquing system rather than an autonomous expert system, in the hopes of increasing user
acceptance.
A third critiquing system, called JANUS, was developed by Fischer, Lemke, and
Mastaglio (1990) to aid with the design of kitchens. It is an integrated system, in that the user is
already using the computer to design, and the system uses building codes, safety standards, and
functional preferences (such as having a sink next to a dishwasher) as triggering events to
critique a user's design.
To test the potential value of critiquing systems, Silverman (1992b) compared
performance on two versions of a critiquing system designed to help people avoid common
biases when interpreting word problems that included multiplicative probability. The first
system only used debiasers, meaning that it provided criticism only after it found that the user's
conclusion was incorrect. It had three levels of increasingly elaborate explanation if subjects
continued to get the wrong answer. Performance was significantly better with the critiques
than without (69% correct answers for the Treatment Group after the third critique vs. 4%
correct for the Control Group), but was still far from perfect. Subsequently, a second version of
the critiquing system was built that included the use of influencers, i.e., before-task explanations
of probability theory that would aid in answering the upcoming problems. With the addition of
these influencers, performance improved to 100% correct by the end of the third critique.
In examining these results and the performance of several other critiquing systems on
the market, Silverman (1992b) proposed that to be effective, a critiquing system should have a
library of functions that serve as error-identification triggers, and include the use of influencer,
debiaser, and director strategies. (A director demonstrates a strategy to the user.) He sums up
his definition of a good critiquer by saying: "A good critic program doubts and traps its user
into revealing his or her errors. It then attempts to help the user make the necessary repairs."
The final study that will be discussed was conducted in our lab (Guerlain, Smith et al.,
1993). Knowledge about how to rule out antibodies was encoded into a computer, and
critiquing the user at the task (AIDA1) was compared to having the computer perform that
subtask (AIDA2). There was no statistical difference in outcome errors for cases for which the
computer's knowledge was competent (12% vs. 6% misdiagnosis rate). On a case for which the
system's knowledge was brittle, however, the critiquing system reduced misdiagnosis rates
from 72% down to 43% (p < 0.05).
Thus, the design of critiquing systems has been explored in a number of domains, but
we have only been able to find two studies using objective data to evaluate actual use of such
critiquing systems. Silverman's study compares alternative designs for a critiquing system,
finding improved performance with additional levels of critique when teaching students
probability theory. The study by Guerlain is the only source of objective data contrasting the
design of decision support systems based on the automation or partial automation models vs.
the critiquing model, and it examines the processes by which actual practitioners are aided by
such a system. Guerlain's results suggest that cooperative problem-solving is superior on brittle
cases when using the critiquing model of decision support.
Potential Problems
Despite the potential value of critiquing systems, simply designing a computer to
critique human performance does not by itself satisfy the requirements of a good
cooperative problem-solving system. For example, the ATTENDING systems developed by
Miller share many of the problems identified with the consulting expert system model of
decision support. These systems are designed such that the physician is required to enter the
patient symptoms as well as his/her proposed solution and then read the relatively long output
generated by the computer. Thus, the physician is required to act as the computer's secretary,
typing in all the information that is required (similar to many diagnosis and management expert
systems).
In order for a critiquing system to be successful, it should require very little extra effort
on the part of the human to interact with it. The computer must be able to directly infer the
user's conclusions, which can only be done if the person is already using the computer as an
integral part of task performance. The critiquing version of ONCOCIN was a step in the right
direction. Physicians were already using ONCOCIN to fill out patient data forms, so the expert
system used this information as its primary source of protocol data for the patient. JANUS was
also an integrated system, allowing users to design kitchens on the computer and get feedback
in the context of their work. The designers of AIDA1 used the strategy of putting the paper
forms normally used in the blood bank lab onto the computer so that the technologist's
problem-solving strategy could be directly inferred by his/her use of the system.
Second, users of critiquing systems still need to have an adequate understanding of the
computer's reasoning, so that the person can interpret messages appropriately. Silverman, for
example, found that an "influencer" that provided before-task explanations of the model of
problem-solving employed by the computer provided the users with a better understanding of
the problem domain and significantly improved their performance when using the critiquing
system.
Finally, a true test of a cooperative problem-solving system is that users of such a
system are able to detect faulty reasoning on the part of the computer. Silverman provides
some of what little data exists in terms of empirical assessments of critiquing systems, finding
significant improvement in performance with their use. The task that he studied, however,
was artificial, with untrained students as subjects. Perhaps most importantly, the task
was one where the system's knowledge was guaranteed to be correct. Thus, if the user
understood the advice being given by the computer and heeded it, s/he would always get the
case right. The study by Guerlain (1993) is the only known study that has tested the critiquing
model in a more complex, real-world domain and included an examination of performance on
cases for which the system's knowledge was brittle. That study provided some initial evidence
that having a computer act in a critiquing role can mitigate the brittleness problem. However,
overall misdiagnosis rates in that study were still poor (averaging 19% across all cases). Thus,
some of the users' process errors (related to ruling out antibodies) may have been reduced, but
the computer support was not enough to reduce outcome errors to an acceptable level.
Potential Benefits
Many of the problems with critiquing systems that were identified above have to do
with interface issues that can be resolved through better interface design and integration with
existing representations and data structures. Thus, although some critiquing systems (i.e., those
developed by Miller) and many traditional decision support systems have a common problem
(i.e., too much data entry required by the practitioner), these issues can be resolved for both
kinds of systems with a better interface (as was done by the designers of ONCOCIN, JANUS,
and AIDA1).
Furthermore, issues that cannot be so easily resolved with systems that are based on an
automation philosophy (such as having control over the automation, having an appropriate
mental model of the system, and losing practice and expertise when using the system) can
potentially be resolved by designing the system using the critiquing approach. Critiquing
systems are potentially more cooperative and informative to practitioners than automated or
partially automated systems because they structure their analysis and feedback around the
problem-solving strategies and proposed solution generated by the user. Since there are often
many ways to solve a problem, the fact that the system uses the person's initial solution or
partial solution as a basis for communication reduces the amount of redundant information that
must be discussed. This contrasts with traditional diagnosis systems, where the computer
generates the entire solution and is unaware of the conclusions drawn by the practitioner. In
such situations, it is up to the person to process the computer's output, compare what it has
proposed to what s/he thinks s/he would have done, and then think about any differences that
were detected between the machine- and human-generated solution.
With the critiquing approach, the burden of making the initial comparison and deciding
what needs to be discussed further is placed on the computer (or, more accurately, on the
computer system designer). Furthermore, the feedback focuses on the particular aspects of the
solution that are in question. The feedback is therefore more likely to be pertinent to the user,
and in turn more understandable and hopefully more acceptable (Langlotz and Shortliffe, 1983;
Miller, 1986). In addition, partial or intermediate conclusions proposed by the user can be
critiqued immediately (instead of waiting until a complete answer is formulated by the person),
providing feedback in a more timely and potentially more effective context.
Furthermore, users of a critiquing system are doing the task themselves, and thus are
still able to apply their own skills and strategies. This is important for many reasons. First,
practitioners will not lose skill because of the introduction of the decision support system. In
fact, they may become more skilled because of the feedback provided by the decision aid
(Fischer, Lemke, and Mastaglio, 1990). Second, users of a critiquing system have the potential to
build up a better mental model of the decision support system's knowledge because they will be
reminded of the computer's view of the problem-solving each time they do something that the
computer thinks is wrong (Guerlain, 1993; Fischer, Lemke, and Mastaglio, 1990). Third, because
the system is only reactive to aspects of the task that it is knowledgeable about, the person can
still apply extra expertise and/or different strategies that may complement the computer's
knowledge. For example, because practitioners are doing the task themselves, they are more
likely to detect anomalous situations because they may encounter event- or data-driven
triggering factors that call to mind relevant expertise during their problem-solving. Thus, there
is the potential for better overall performance than either the computer working alone (as in the
automation mode) or the person working alone (as in the pre-automation mode).
Other potential benefits of critiquing systems are the following:
• Critiquing systems are flexible - they can work in conjunction with other decision support
techniques, such as good representations of the problem, and can be seamlessly integrated
with information that is already online (Console, Conto, Molino, Ripa di Meana, and
Torasso, 1991; Guerlain, 1993; Fischer, Lemke, and Mastaglio, 1990; van der Lei, Musen, van
der Does, Man in 't Veld, and Bemmel, 1991).
• Critiquing systems can be designed to detect and correct common human weaknesses - such
as slips, mistakes, process errors, biased reviewing, hypothesis fixation, etc. (e.g., Silverman,
1992c).
• Critiquing systems are less likely to trigger human cognitive problem-solving biases.
• Critiquing systems can work in the context of the task.
• Critiquing systems can be used not only as an on-line decision aiding system (Lepage,
Gardner, Laub and Golubjatnikov, 1992), but also to give experts practice on rare and
difficult cases, as a testing device to give feedback to supervisors and regulatory agencies
(van der Lei, Musen, van der Does et al., 1991), and to train new practitioners (Console,
Conto, Molino, Ripa di Meana, and Torasso, 1991; Fischer, Lemke, and Mastaglio, 1990;
Smith, Miller, Fraser et al., 1990; Smith, Miller, Gross et al., 1991; Smith, Miller, Gross et al.,
1992).
Finally, there is evidence that designing a decision support system as a critiquing
system may be a strategy to mitigate the brittleness problem of expert systems. First, critiquing
systems that are acting on an incomplete knowledge base can still be helpful, whereas an
automated expert system cannot generate a solution if a problem is ill-specified (Fischer, Lemke
and Mastaglio, 1990). Second, evidence from the AIDA study showed that designing a system
as a critiquing system rather than a partially automated system reduced error rates on a case for
which the system's knowledge was incompetent by 29 percentage points (from 72% to 43%).
This suggests that humans are better
able to judge flaws in the computer's reasoning when interacting with a critiquing system than
when interacting with a traditional expert system that leaves the human out of the decision
process until the computer has completed its inferences.
The Next Step: Designing a Proof-Of-Concept Cooperative Critiquing
System
Critiquing is proposed to be a good model for studying the design of effective
cooperative problem-solving computer systems. Although many aspects of critiquing systems
have been identified as potentially good ways to promote cooperative problem-solving, very
little research has been done to test these claims. The focus of the research
conducted here was to try to develop a proof-of-concept critiquing system that would
successfully aid practitioners on a wide range of difficult problems. The design strategy used
was to conduct an in-depth cognitive task analysis of the domain of interest (antibody
identification) and design a system that addressed the domain-specific problems identified, as
well as the general problems with many decision support systems that were identified in this
chapter. The next chapter (Chapter III) details the results of the cognitive task analysis of
antibody identification while Chapter IV discusses the design concepts used to develop an
integrated cooperative problem-solving system that revolves around the critiquing model of
decision support.
Chapter III
Antibody Identification as a Testbed
One domain that we have found to be highly suitable for studying the use of computer
aiding is that of antibody identification. This is a laboratory workup task, where medical
technologists must run a series of tests to detect antibodies in a patient's blood. Antibody
identification satisfies all of the requirements outlined by Silverman and Miller. It is a
sufficiently constrained domain, and the task is frequently performed but difficult for people to do well. It
requires analyzing a large amount of data and deciding which tests to run to yield the most
information. There is large variation in practice as to how to solve antibody identification cases,
and technologists have been documented to make errors in transcribing and interpreting the
data (Smith et al., 1991; Strohm et al., 1991). Furthermore, it has the classical characteristics of
an abduction task, including masking and problems with noisy data.
The Practitioners, a.k.a., "Medical Technologists" or "Blood Bankers"
Blood bank practitioners are trained by going to medical technology school. At a
minimum, students must complete a two-year program past high school to be certified as a
Medical Lab Technologist (MLT). During this time, an MLT learns not only blood banking, but
many other medical technology areas such as hematology and chemistry. A more advanced,
four-year bachelor's degree leads to certification as a Medical Technologist (MT), which is also a
general program involving many areas besides blood banking. After becoming an MT, one can
enroll in a Specialist in Blood Banking (SBB) program. The SBB program requires two years of
clinical experience and a baccalaureate for entry. The program is one to two years in length and
may lead to a master's degree in addition to certification as an SBB. For simplicity, the terms
"Medical Technologist", "MT", and "blood banker" will be used interchangeably to describe
practitioners who work in the blood bank, but keep in mind that the discussion applies to
practitioners at all levels of certification.
The Goal of Antibody Identification: Finding Compatible Donor Blood
The blood banker's goal is to make sure that a patient who needs a blood transfusion
does not have a transfusion reaction. One type of transfusion reaction takes place when the
patient's immune system "recognizes" the donor blood as being foreign and attacks it. This can
happen because antigens, which are chemical structures on red blood cells, can elicit an immune
response in the form of antibodies. Antibodies can form against any foreign antigens that are
detected, i.e., those antigens that are present in the donated blood but not present in the
patient's blood.
Since there are over 400 known human blood antigens, the potential for incompatibility
between donor and recipient blood is quite high. When identifying compatible blood for a
patient, blood bankers do not try to find donor blood that exactly matches the antigenic
characteristics of the patient's blood because of the effort and cost involved. Rather, they try to
determine which antibodies the patient has at the time of the transfusion, and then give donor
blood lacking the antigens that those antibodies will recognize and attack.
Antibodies can form whenever foreign antigens are introduced into the human blood
stream (including past blood transfusions). Therefore, it is possible that the donor blood that
was compatible for transfusion one month will no longer be compatible the next month, because
in that time period, the patient may have formed antibodies against some of the antigens
present in the previously donated blood. For this reason, it is necessary to re-test patients for
antibodies each time they are to receive a transfusion.
The Antibody Identification Procedure
While the blood banker's goal is to determine which antibodies are present in the
patient's blood, the only direct information conveyed by running a blood sample test is whether
or not agglutination has occurred. Agglutination is a clumping of blood cells which indicates
that antibodies in the patient's serum have bound to antigens on the foreign red blood cells.
Agglutination in the test tube can be seen with the naked eye in most cases, but sometimes must
be confirmed via a microscopic examination.
Given that agglutination has occurred in a set of tests, blood bankers must then make a
series of inferences to determine what antibody-antigen reactions must have occurred to have
caused the agglutination. In going from raw data to a diagnostic conclusion, blood bankers
must call upon a large body of factual knowledge, apply strategies that have either been taught
or derived from past experience, and make hypotheses and predictions to help them through
the problem-solving process. The more advanced knowledge and strategies used by expert
blood bankers may take years for practitioners to learn. Indeed, these more advanced problem-
solving skills are never mastered by some practitioners.
The basic procedure for typing a patient's blood for antibodies involves combining test
cells (red blood cells), which contain known antigens, with the patient's serum, which may
contain antibodies. The test cells have been carefully typed for the presence or absence of
antigens by the commercial supplier of the cells. When the test cells and the patient's serum are
combined, the blood banker looks for agglutination, which indicates that antibodies from the
patient's serum have bound to some of the antigens contained in the test cells. The amount of
agglutination is rated on a scale from 0 (no clumping) to 4+ (very strong; one big clump).
Initially, this process is performed with two or three different test cells that cover all of
the major antibodies likely to be formed. This is called the antibody screen test (see Figure 1). If
there is a positive reaction with any of these screening cells, then the process is repeated with
many more test cells to allow the blood banker to determine which antibodies the patient must
have to be causing the reactions.
Figure 1. The Antibody Screen
There are several different reagents which, when added to the patient serum/test cell
combinations, will enhance or diminish the reactions of certain antibodies. For example, adding
enzymes to the test tubes will have the effect of enhancing the antibody reactions to the antigens
in the Rh system of antigens and eliminating the antibody reactions to the M, N, S, s, Fya, and
Fyb antigens.
The test tubes can also be heated or cooled. At certain temperatures, some antibodies
are more likely to agglutinate than others. Because antibodies react differently depending on
the process of testing, it is not enough to mix each patient serum/test cell combination in just
one way. At a minimum, three tests are generally performed: Immediate Spin (mixed and read
immediately), 37° (heated for 30 minutes at 37°C, then mixed and read), and AHG (also known
as the "Coombs" phase, where anti-human globulin reagent is added to the vials, then mixed,
washed, and read).
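The testing setup just described can be captured in a small data structure. The sketch below is illustrative only: the cell names, antigen typings, and reaction grades are hypothetical, not taken from any actual panel.

```python
# Hypothetical representation of a test panel. Antigen typings come from
# the commercial supplier; reactions are graded 0 (no clumping) through
# 4 (one big clump) at each of the three standard phases of testing.

# Which antigens each test cell carries (illustrative values only)
cell_antigens = {
    "cell_1": {"D", "C", "K"},
    "cell_2": {"E", "Fya"},
    "cell_3": {"D", "Jka", "M"},
}

# Graded reactions per cell across Immediate Spin, 37 degrees C, and AHG
reactions = {
    "cell_1": {"IS": 0, "37C": 0, "AHG": 3},
    "cell_2": {"IS": 0, "37C": 0, "AHG": 0},
    "cell_3": {"IS": 0, "37C": 1, "AHG": 3},
}

def is_positive(cell):
    """A cell is positive if it agglutinates in any phase of testing."""
    return any(grade > 0 for grade in reactions[cell].values())

positive = [c for c in cell_antigens if is_positive(c)]
```

A representation like this makes the inference strategies described later in the chapter (pattern grouping, pattern matching, ruling out) straightforward to state mechanically.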
Types of Knowledge Needed
Knowing a patient's antibody history helps blood bankers begin blood typing
because once an antibody has formed, the immune system remains sensitive to the
corresponding antigen and will quickly form antibodies against that antigen if it is seen again.
Although specific antibody formation histories are often unavailable, there are general
clues that give blood bankers a sense of the likelihood of antibodies being present. For example,
the more times a patient has had foreign red cells introduced into their blood stream (from past
transfusions or past pregnancies), the more likely it is that the patient has formed antibodies.
Knowing a patient's ethnicity can also help blood bankers to make blood typing
inferences. For example, people of Caucasian background rarely display the V antigen. This
antigen is much more common among people of African descent.
Even without any information about a patient, blood bankers can draw on knowledge
about the formation of antibodies in general to aid them in their diagnosis. Some antigens are
more common and elicit a greater degree of antibody formation than others. For example, it is
likely that anti-D (the antibody formed against the D antigen) will form before anti-C in a
patient who has been given previous transfusions containing the C and D antigens, assuming
that the patient lacks both those antigens.
Blood bankers can also use knowledge about the distribution of antigens in the
population to help determine the likelihood of various antibodies being present. Two
conditions must normally be met before an antibody will be formed: 1) the patient's blood must
lack the antigen and 2) the patient must have been exposed to that antigen. Antibodies against
high-incidence antigens are therefore rare because almost all patients carry these antigens and
would not normally form antibodies against them. Antibodies against low-incidence antigens
are also rare even though almost all patients lack these antigens, because exposure to them is
unlikely.
Blood bankers sometimes experience difficulty when interpreting a set of agglutination
reactions. This difficulty arises because there is not a definite one-to-one mapping between a
positive reaction and a particular antibody. On the contrary, a positive reaction means only that
one or more antibodies have reacted with one or more of the many antigens present on a given
test cell. To determine which antibodies are present in a patient�s blood requires a process of
elimination similar to the exercises found on a logic exam:
If the C antigen is present on all of the positively reacting cells and not present on all of
the negatively reacting cells and the patient lacks the C antigen, then anti-C could be
one of the antibodies present.
The use of such rules is not necessarily straightforward. Belief in hypotheses must be tempered
by the fact that there could be noisy data, weak expressions of antibodies, or multiple antibodies
together, overlapping and potentially masking the presence of others.
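The elimination rule quoted above can be sketched in a few lines of code. The antigen names, panel cells, and patient typings below are hypothetical, and as the text notes, a real workup must also weigh noisy data, weak expressions, and masking before accepting any candidate.

```python
def candidate_antibodies(cell_antigens, is_positive, patient_antigens):
    """Return antibodies consistent with the stated rule: the antigen must
    appear on every positively reacting cell, on no negatively reacting
    cell, and must be absent from the patient's own blood."""
    all_antigens = set().union(*cell_antigens.values())
    candidates = set()
    for antigen in all_antigens:
        on_all_positives = all(antigen in antigens
                               for cell, antigens in cell_antigens.items()
                               if is_positive[cell])
        on_no_negatives = all(antigen not in antigens
                              for cell, antigens in cell_antigens.items()
                              if not is_positive[cell])
        if on_all_positives and on_no_negatives and antigen not in patient_antigens:
            candidates.add("anti-" + antigen)
    return candidates

# Hypothetical three-cell panel: only the C antigen tracks the reactions
cells = {"c1": {"C", "D"}, "c2": {"C", "K"}, "c3": {"D", "E"}}
positive = {"c1": True, "c2": True, "c3": False}
print(candidate_antibodies(cells, positive, patient_antigens={"D", "E", "K"}))
```

In this toy panel only anti-C survives the rule; with realistic data, several antibodies typically remain, which is why the converging-evidence strategies discussed below are needed.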
Characteristics that make antibody identification difficult
Antibody identification is only one of the many tasks performed by medical
technologists when they work in the blood bank. There are many factors about this task that
make it difficult to perform well, some of which are listed below.
Medical Technologists (MTs) arrive in the blood bank with minimal practice
Many Medical Technologists arrive in the blood bank lab out of school with very little
practice in solving antibody identification cases. Consequently, they have not yet adequately
learned how to apply the facts taught in school, nor have they formed the effective strategies
necessary to solve the wide range of antibody identification cases that they might encounter.
Quite often, it is the responsibility of the blood bank supervisor to oversee and train newly
arrived graduates on the procedures for solving a case. This means that the extent to which the
trainee learns the task is dependent on the local instruction received within the lab. Such local
instruction is neither required nor formalized, and it may or may not include antibody
identification problems. As a result, the skill and problem-solving strategies used by
practitioners can vary to a great extent.
Most MTs rotate
Since Medical Technologists are trained in many areas, such as Chemistry, Hematology,
and Blood Bank, they may rotate, working in one area for some time period before moving on to
the next area. Thus, they may work in Blood Bank just a few months out of the year. Because
blood banking requires so much knowledge and skill, it is difficult for a rotating practitioner to
develop and maintain expertise in that area.
Infrequent encounters with difficult cases
Most of the cases a blood banker encounters require only an initial test called an
antibody screen test. If the results of this initial test are negative, then there is no need to run
the further tests required for a full workup, since no antibodies are likely to be present.
Therefore, depending on what shift the blood banker works and the size of the hospital, s/he
may seldom encounter a case with positive results from the initial screening test.
With so few encounters of cases requiring a full workup, there is less of a chance for blood
bankers to develop a sense of the probabilities of various antibody combinations and to build up
the pattern recognition and problem-solving skills that can aid in this kind of diagnosis.
Very little feedback on performance
Unless blood bankers ask for assistance on a case, they work alone to determine the
answer. In most labs, no one checks their procedure or their reasoning for coming up with an
answer. Furthermore, once a diagnosis is made, blood bankers never really know whether they
were "right" or not. Based on the antibodies that they identify, blood lacking the corresponding
antigens is dispensed to the patient. If the diagnosis was wrong, there is a chance that the blood
that is dispensed would still be compatible because it may also happen to lack the antigens
against which the patient actually has antibodies. Thus, an incorrect diagnosis can go
undiscovered, and the blood banker continues to think that s/he is performing adequately.
Even if the patient does get incompatible blood, the ensuing transfusion reaction may
not be evident as such to the administering doctor. The patient is already sick, possibly
receiving other medication and treatments, so if s/he gets sicker one day, it may not be
recognized as having been caused by the blood transfusion just received. Rather, it could be
attributed to some other procedure or precondition. Again, the blood banker may not get any
feedback and may assume that "no news is good news", making it difficult to accurately gauge
his/her performance.
Expert strategies
Although there is complexity in identifying antibodies, expert blood bankers perform in
this domain quite well. The expert blood banker tries to sort out which antibodies are causing
the reactions by recognizing reaction patterns and making early hypotheses upon which to base
further analyses. In order to minimize the chance for an incomplete or incorrect diagnosis, the
expert blood banker tries to collect independent, converging evidence to "rule-in" the
hypothesized antibodies and to rule out all other possible contenders. Thus, there is a high-
level skill involved in knowing how to combine various problem-solving strategies such that the
overall protocol is likely to succeed. Following is a list of "middle-level" expert strategies that
have been identified, which can be combined to form a good protocol. In general, each of these
strategies is good, so each is numbered with a '+' sign, but it must be remembered that any one
of them applied in isolation, without other strategies to collect converging evidence (to guard
against the fallibility of the heuristics), can still lead to poor performance. Later, known poor
strategies will be identified with a '-' sign, to indicate that they are usually poor strategies to
apply.
1+. Forming early hypotheses
Each time blood bankers run tests on a patient's blood, time and money are spent.
Because there are many different tests possible, blood bankers need to know under which
conditions the various tests are diagnostic. Consequently, if they can form a hypothesis early in
the blood typing process, they can make predictions of how various tests will affect future
reactions and pick tests that are most informative for that case. For example, if the blood banker
hypothesizes that a patient has the M antibody, then s/he can combine that with the knowledge
that anti-M is most likely to agglutinate at cold temperatures and run the cells using cold
temperatures. This is diagnostic in this case because, if the reactions are enhanced, then that is
evidence that at least one cold-reactive antibody is part of the patient's blood profile. If, on the
other hand, no reactions occur, then all antigens on the test cells that are cold-reactive can be
ruled out.
1a+. Hypothesizing the number of antibodies present
Experts tend to make a very early hypothesis about the number of antibodies present by
looking at the reaction patterns across test results and hypothesizing that one antibody is
accounting for each different reaction pattern. For example, if four cells are reacting '0 0 3+'
across the three phases of testing (IS, 37°, and AHG) and three cells are reacting '2+ 2+ 1+', then
the expert will hypothesize that two different antibodies are causing the two different reaction
patterns. If all the reacting cells have the same reaction pattern, '0 0 2+' for instance, the expert
will hypothesize that there is just one antibody present. If the reaction patterns are only slightly
different, some '0 0 2+' and some '0 0 3+', for instance, then the expert might hypothesize that
there is either one antibody reacting variably or two different antibodies causing the reactions.
It is possible that two or more antibodies could be reacting with the same reaction
pattern. Therefore, the heuristic that a set of reaction patterns is caused by one antibody is a
simplifying assumption that helps expert blood bankers to start forming hypotheses. Their
hypotheses might be revised later if they encounter evidence in the analysis that suggests the
need to do so.
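This grouping heuristic is mechanical enough to sketch directly. The cell names and reaction grades below are hypothetical; the point is only the one-antibody-per-pattern simplifying assumption described above.

```python
from collections import defaultdict

def group_by_pattern(reactions):
    """Group cells by their reaction pattern across the three phases
    (IS, 37C, AHG). The expert's simplifying heuristic is to hypothesize
    one antibody per distinct pattern among the reacting cells."""
    patterns = defaultdict(list)
    for cell, grades in reactions.items():
        pattern = (grades["IS"], grades["37C"], grades["AHG"])
        if any(pattern):  # skip non-reacting cells
            patterns[pattern].append(cell)
    return dict(patterns)

# Hypothetical results: two cells react '0 0 3+', one reacts '2+ 2+ 1+',
# and one cell does not react at all
reactions = {
    "c1": {"IS": 0, "37C": 0, "AHG": 3},
    "c2": {"IS": 0, "37C": 0, "AHG": 3},
    "c3": {"IS": 2, "37C": 2, "AHG": 1},
    "c4": {"IS": 0, "37C": 0, "AHG": 0},
}
groups = group_by_pattern(reactions)
# Two distinct patterns among the reacting cells -> hypothesize two antibodies
```

As the text cautions, the hypothesis count is only a starting point: slightly different patterns might reflect one variably reacting antibody, and identical patterns might hide two antibodies.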
1b+. Hypothesizing the type of antibodies present
A second way to form early hypotheses is to look at the reaction patterns and
hypothesize the type of antibodies that are present. Each antibody tends to react in a certain
pattern, depending on which genetic system the antibody belongs to, among other factors. Blood
bankers can use this information to help them identify the subset of antibodies that are likely to
react with the given reaction pattern. For example, reactions that are negative in the Immediate
Spin and 37° phases but strong in the AHG phase are most likely exhibited by antibodies
belonging to the Kell, Duffy, Kidd, and Rh systems. Therefore, they will hypothesize this and
look only in those systems for the specific antibody that matches.
1c+. Hypothesizing a specific antibody by finding a pattern match
Once the blood banker has narrowed down the possible set of antibodies and perhaps
run some further tests, then s/he will look for the specific antibodies causing the reactions. One
technique for doing so is to find an antigen that is present on all of the test cells that are reacting
with the same pattern and not present on all of the cells that are not reacting with that pattern.
If, for instance, the blood banker has hypothesized that the '0 0 3+' reactions on cells 1, 6, and 9
are all caused by the same Rh antibody, then s/he will look at the test cells for the presence of
an Rh antigen on just cells 1, 6, and 9. If one cannot be found, then s/he may look in other
systems that may react with that pattern, hypothesize that two antibodies are reacting together
to form the three reactions, or hypothesize that one antibody is reacting variably, accounting for
these reactions plus other, different ones. This is one strategy in particular that, if applied alone
without alternative strategies for converging evidence, can lead to a premature conclusion that
may be wrong.
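Strategy 1c+ is essentially a set comparison: an antigen matches if the cells carrying it are exactly the reacting cells. A hypothetical Python sketch follows (the data encoding is invented for this example):

```python
def pattern_match(antigen_profiles, reacting_cells, all_cells):
    """Find antigens whose distribution exactly matches the reacting
    cells (strategy 1c+): present on every reacting cell and absent
    from every non-reacting cell.

    `antigen_profiles` maps an antigen name to the set of cells
    carrying it.  An empty result suggests multiple antibodies or a
    variably reacting one, and the strategy must be abandoned.
    """
    reacting = set(reacting_cells)
    return [antigen for antigen, cells in antigen_profiles.items()
            if cells & set(all_cells) == reacting]
```

As the text warns, a non-empty result here is only a hypothesis; it must still be checked against the converging-evidence strategies below.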
2+. Ruling out
Even though an antibody may seem very probable, it is important to rule out all other
frequent, clinically significant antibodies to be sure that other antibodies are not being masked
by the reactions of the first one. This converging evidence is protection against slips (Norman,
1981; Norman, 1989) and the fallibility of the heuristics used (Smith et al., 1991). If a set of
antibodies cannot be ruled out, it may become evident that the group of them together or some
subset therein could account for the reactions as well as those originally recognized as being
possible.
For example, when first looking at the antigram panel for the case shown in Figure 2, it
looks as if anti-Fyb is a very likely candidate because the Fyb antigen is present on all cells
where there is a positive reaction (strategy 1c+). However, after ruling out on this panel, four
other antibodies still remain as possibilities (see Figure 3). In looking at the remaining set, two
subsets could account for the positive reactions. Anti-E and anti-K together could account for
the reactions because the E or the K antigen is present on all reacting cells. Or, anti-E, anti-K, and
anti-Fyb could all be reacting together. At this point, it is necessary to run further tests that will
discriminate between these three sets of answers. It turns out that, for this case, anti-E and anti-
K together form the answer, not anti-Fyb as originally hypothesized. This case clearly
demonstrates the importance of ruling out other contenders even though one answer may at
first seem very likely.
Figure 2. Anti-Fyb looks likely
Figure 3. Anti-E and anti-K can also account for the reactions, however.
2a+. Ruling out using homozygous, non-reacting cells
The heuristic that most experts will use for ruling out is to look at test cells that have no
reactions ('0 0 0') and to rule out antigens that are present and homozygous on those test cells.
A homozygous antigen is one that is present on the cell without its corresponding genetic allele.
For example, Fya is normally homozygous (double dose) on a cell if its genetically paired
antigen, Fyb, is not present on the cell. An antigen is said to be heterozygous (single dose) if
both antigens in the pair are present on the test cell. An antibody tends to react more strongly
with an antigen that is homozygous than with one that is heterozygous. Therefore, if the test
cell is not reacting, it is safest to rule out using homozygous antigens, since they have a double
dose of antigen and would be most likely to cause a reaction if the antibody really was present.
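Strategy 2a+ can be sketched as a simple elimination loop. In this hypothetical encoding, each test cell is represented by its reaction string and the set of antigens it carries in double dose:

```python
def rule_out(panel, candidates):
    """Rule out on homozygous, non-reacting cells (strategy 2a+).

    Each test cell is a dict with a 'reaction' string and a
    'homozygous' set: the antigens carried in double dose.  Antibodies
    whose antigen is homozygous on a non-reacting cell are eliminated,
    since a double dose would most likely have reacted if the antibody
    were really present.
    """
    remaining = set(candidates)
    for cell in panel:
        if cell['reaction'] == '0 0 0':
            remaining -= cell['homozygous']
    return remaining
```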
2b+. Ruling out using additional cells
After running an antigram panel that has anywhere from ten to twenty test cells, there
may be some antibodies that still cannot be ruled out. One strategy for ruling these out is to
selectively pick additional cells from other panels. Efficiently picking additional cells requires
having formed a hypothesis about which antibodies are present. That way, test cells can be
chosen that are negative for the antigens of the antibodies considered to be present and homozygously positive
for the antigens of the antibodies to be ruled out. For example, if anti-C is hypothesized as being present and
anti-Fyb still needs to be ruled out, then the practitioner will look on other panels for a test cell
that is negative for the C antigen and homozygously positive for the Fyb antigen. If the
practitioner's hypothesis is correct, then there will be no reaction and anti-Fyb can be ruled out
according to the strategy of ruling out on homozygous, non-reacting cells.
2c+. Ruling out masked antigens by inhibiting the positive reactions
If some antibodies cannot be ruled out because there are not enough non-reacting cells,
then, depending on the antibody that is causing the reactions, the blood banker can use a
procedure that inhibits those reactions (i.e., if Fya is hypothesized as being present, running the
cells at enzymes will inhibit the Fya reactions). Therefore, if the reactions are negative, then all
those antibodies that would react at that test phase with that test cell can be ruled out.
2d+. Ruling out only those antibodies that will react on the current panel
The heuristic of ruling out on non-reacting cells will only work for antigens that are
going to react given the current testing procedure. If the current procedure inhibits some
antibodies from reacting (i.e., running the cells at enzymes will inhibit the Fya reactions, as
explained above), then only those antigens that would not be destroyed can be safely ruled out
with that panel. The practitioner, therefore, must know how the various test procedures will
affect all of the antibodies and know when to refrain from using the normal rule-out heuristic.
2e+. Ruling out the corresponding antibody if the antigen typing is positive
Running antigram panels will show the presence or absence of antibodies in the
patient's blood. A completely different kind of test can be run to determine which antigens the
patient possesses. This test can be used to help rule out antibodies because if the patient
possesses an antigen in his/her own blood, s/he will not form the corresponding antibody
(barring some auto-immune disorder). Therefore, the blood banker can type the patient's red
blood cells for an antigen. If the results are positive, the corresponding antibody can be ruled
out.
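The logic of strategy 2e+ can be made explicit in a short sketch. The encoding of typing results is invented for this illustration; note that the direction matters, since only a positive typing rules an antibody out:

```python
def typing_rule_out(typing_results, candidates):
    """Rule out antibodies whose antigen typing on the patient's own
    cells is positive (strategy 2e+): barring an auto-immune disorder,
    a patient who carries an antigen will not form the corresponding
    antibody.  A negative typing rules out nothing.
    """
    return {ab for ab in candidates if typing_results.get(ab) != 'positive'}
```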
3+. Collecting independent, converging evidence
It is not a good idea to make a diagnosis based on just one type of test, or by using one
problem-solving heuristic. For example, the strategy of hypothesizing a specific antibody by
finding a pattern match (1c+) can actually lead to an erroneous diagnosis if the hypothesized
antibody is masking the presence of other antibodies. A general way to minimize the chance of
a misdiagnosis is to collect converging evidence. In other words, it is wise to use a set of
strategies and test results that independently point to the same answer before accepting it as conclusive.
This section lists a number of meta-level strategies used by expert practitioners to help them
catch their own errors and increase the likelihood of correctly solving a case.
3a+. Making sure the patient is capable of forming the hypothesized antibodies
As a reminder to the reader, antigen typing is a different kind of test than combining
test cells with the patient's serum. Antigen typing is used to test the patient's red blood cells for
antigens. Positive results can be used to rule out an antibody as described above. This test can
also be used as a final check on the answer. If a patient is said to have an antibody, typing the
patient's cells for the corresponding antigen should have a negative result. Thus, if the results
are negative, that is converging evidence that the corresponding antibody could be in the
patient's blood. If, on the other hand, the results are positive, the antibody should be ruled out
and another answer must be found for the case. For example, in the case described above,
where anti-Fyb looks very likely based on the antigram panel alone, it turns out that an antigen
typing test shows the patient to possess the Fyb antigen and to lack both the E and the K
antigens. This information rules out anti-Fyb, which looked very likely at first, and provides
more evidence for the possibility of anti-E and anti-K.
3b+. Using a test procedure that is known to change the reactivity of an antibody
Another way to get converging evidence for the presence of an antibody is to run the
test cells at a different phase (i.e., using different reagents, changing the temperature of the cells,
using a longer incubation time, etc.) and see if the results change as would be predicted for that
antibody (either enhanced or inhibited). If such a change takes place, that is more evidence for
the presence of that antibody. For example, since anti-Fyb is destroyed by the use of enzymes,
negative reactions with enzymes provide more evidence that anti-Fyb could be a
contender. (In fact, in the case described above, the reactions were NOT eliminated when
enzymes were added, providing evidence that anti-Fyb was not likely to be the only antibody
causing the reactions).
3c+. Asking, “Is this an unlikely combination of antibodies?”
Expert blood bankers will check their answer for plausibility given the normal
formation patterns of antibodies. Due to the way antibodies are formed, some antibody
combinations are extremely unlikely. For instance, anti-D will almost always form before anti-C
in a patient that lacks both the D and C antigens. Therefore, if such is the case, and the patient is
diagnosed as having anti-C alone, then that should stand out as "a unicorn", i.e., an extremely
unlikely event. It does not matter how well the normal antibody identification procedure points
to anti-C alone, such a rare finding should prompt the blood banker to rethink the case and
examine it more closely.
3d+. Making sure there are no unexplained positive reactions
Expert blood bankers will review their cases to be sure that all the data is accounted for
by their interpretations. They make sure that all positive reactions are accounted for by the
antibodies chosen to be the answer (i.e., that there are no unexplained positive reactions).
3e+. Making sure there are no unexplained negative reactions
Similarly, experts will check to be sure that each non-reacting test cell does not have any
of the hypothesized antibodies present.
3f+. Making sure that all remaining antibodies are ruled out
Good practitioners will rule out all remaining, clinically significant antibodies, to make
sure that there are no underlying antibodies.
3g+. Using the "3+/3-" rule
A final way to have converging evidence for the presence of each hypothesized
antibody is to make sure that there are at least three test cells which are reacting to just one of
the hypothesized antibodies and three test cells that are negative for all of the antigens that
would cause reactions. In other words, if anti-c and anti-Jka are the hypothesized antibodies,
then the practitioner tries to have at least three cells that are positive for the c antigen and
negative for the Jka antigen, three cells that are negative for the c antigen and positive for the
Jka antigen, and three cells that are negative for both antigens. Fulfilling this condition often
requires finding additional cells on other panels that have the right characteristics. This
heuristic seems to be especially useful in avoiding the presence of extraneous antibodies in an
answer set.
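The "3+/3-" counts can be checked mechanically. The following simplified sketch (with an invented encoding) ignores zygosity and reaction strength, which real practice would also weigh:

```python
def satisfies_three_rule(cells, hypothesized):
    """Check the '3+/3-' converging-evidence rule (strategy 3g+).

    `cells` maps a cell id to the subset of hypothesized antigens it
    carries.  Each hypothesized antibody needs at least three cells
    positive for its antigen alone, plus at least three cells negative
    for every hypothesized antigen.
    """
    alone_ok = all(
        sum(1 for antigens in cells.values() if antigens == {ab}) >= 3
        for ab in hypothesized
    )
    negatives = sum(1 for antigens in cells.values()
                    if not antigens & set(hypothesized))
    return alone_ok and negatives >= 3
```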
4+. Solving cases in an efficient manner
Solving antibody identification cases is time consuming and expensive. For each
additional test that the blood banker runs, more time and money are spent. Therefore, blood
bankers need to solve cases as efficiently as possible. In order to do so, they need to form early
hypotheses and know which tests will most efficiently distinguish between the various
hypotheses. One way to do this is to pick additional cells effectively.
4a+. Picking additional cells efficiently
Selecting additional cells can aid in ruling out and confirming hypotheses. The best
way to do so is to find additional cells that will yield the most information. For example, if anti-
E is hypothesized as being present and anti-C, anti-Lea, and anti-M still need to be ruled out,
the blood banker can either find three cells, each ruling out one of the antibodies (i.e., one cell
that is negative for the E, M, and Lea antigens and positive for the C antigen; another cell that is
negative for the E, C, and M antigens and positive for the Lea antigen; and a third cell that is
negative for the E, C, and Lea antigens and positive for the M antigen), or try to find just one
cell that is negative for the E antigen and positive for the other three antigens, allowing all
three antibodies to be ruled out at once. Obviously, if such a cell
can be found, it is much more efficient to run that one cell rather than three separate cells.
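Picking the most informative additional cell is a small optimization problem: among cells that should not react given the confirmed antibodies, choose the one carrying the most antigens still needing rule-out. A hypothetical sketch (the encoding is invented for this example):

```python
def best_rule_out_cell(cells, confirmed, to_rule_out):
    """Pick the additional cell ruling out the most antibodies at once
    (strategy 4a+).  `cells` maps a cell id to the set of antigens it
    carries homozygously.

    A usable cell must lack the antigens of the confirmed antibodies,
    so a negative reaction is attributable only to the absence of the
    remaining candidates; among usable cells, prefer the one carrying
    the most antigens still needing rule-out.
    """
    usable = {cid: ags for cid, ags in cells.items()
              if not ags & set(confirmed)}
    if not usable:
        return None
    return max(usable, key=lambda cid: len(usable[cid] & set(to_rule_out)))
```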
Poor problem-solving strategies
Blood bankers vary to the extent that they understand and use all the knowledge that
they need to solve a case. Poor performance can result from failing to use a good strategy or
from using incorrect strategies (or from making slips). Since the good strategies are already
outlined above, failure to use those strategies will be identified by the number followed by a '-'
sign. The following section lists known incorrect strategies that have been observed in use.
2-. Ruling out incorrectly
2f-. Ruling out using reacting cells
Some blood bankers will rule out antibodies whose antigens are not present on reacting cells.
This strategy will only work if there is just one antibody present. As soon as there is more than
one antibody, such a strategy might cause the correct answer to be ruled out. A reacting cell
indicates only that one or more of the antigens on that cell is causing the reaction; it does
not indicate which antibodies can be ruled out.
2g-. Ruling out regardless of zygosity
Some antibodies will react more strongly with a homozygous antigen than a
heterozygous antigen. For these antibodies, if the reactions are not very strong, then the
difference might be enough for the homozygous antigen to react but for the heterozygous
antigen to not react. For this reason, it is not a good idea to rule out these antibodies using a
heterozygous cell. Many blood bankers, however, do not take zygosity into account at all when
they rule out. Those who do take zygosity into account may not remember which
antibodies are affected by zygosity and which are not.
2h-. Ruling out the corresponding antibody if the antigen typing is negative
Some blood bankers misunderstand the results from the antigen typing test, and will
rule out the presence of an antibody if the results from the corresponding antigen typing are
negative. This is the opposite of what should be done, which is to rule out the corresponding
antibody if the antigen typing results are positive.
4b-. Not using all information provided by a test result
Often, practitioners will not make all of the inferences possible given a test result. As an
example of this problem, subjects may not use the information from a positive antigen typing on
the patient's cells to rule out the corresponding antibodies (2e-). As another example, when a
subject runs an additional cell so that s/he can rule out a particular antibody, s/he may fail
to notice that other antibodies can be ruled out on that cell as well.
Kinds of Cases
Since most of the strategies used in solving antibody identification cases are heuristic in
nature, there will be instances where the strategies may be less useful or even detrimental to
finding the correct answer. A list of the kinds of cases encountered and the effectiveness of
various strategies follows.
One antibody reacting strongly
The simplest case a blood banker will encounter is a case where the patient has just one
antibody reacting strongly in its expected phases. Since the antibody is fairly common, the
blood banker has seen similar cases in the past. Since there is only one antibody, it is easy to pick
out because all reacting cells contain the antigen and all non-reacting cells do not contain the
antigen (strategy 1c+).
Antibody to a high-incidence antigen
There are instances where a one-antibody case is more difficult to solve. If the antibody
is one that reacts against a high-incidence antigen, i.e., an antigen that almost all the test cells
contain, then the antibody will react with all of those cells. So, even though the antibody is
easily diagnosed (strategy 1c+), it is difficult to rule-out all other contenders because there are
no non-reacting cells. A good strategy in this case is to run the cells using a phase of testing that
is known to inhibit the reactions of the hypothesized antibody, allowing the practitioner to get
data for those antibodies that will not be destroyed by the same procedure (strategy 2c+).
Antigen typing can also help to rule out antibodies (strategy 2e+), although this is a more
expensive type of test.
Weak antibody
Another difficult one-antibody case occurs if an antibody is not reacting strongly. This
can happen for a number of reasons.
1) The patient may have been transfused long ago and may have formed an antibody that is
no longer present. However, the patient is still highly sensitive to that antigen and will
form an antibody quickly if exposed to such an antigen again.
2) The patient has been transfused recently and is currently forming antibodies.
3) Exposure to certain drugs may cause weak reactions when testing for antibodies.
4) Pregnant women who are Rh negative will often be given Rho Gam, a drug that causes
weak reactions with the D antigen.
In these cases, there may be very few positive reactions because the antibody is reacting to only
some of the test cells that contain the antigen. The result is that there is no good match between
the reactions exhibited and any one contender, so looking for a pattern match will not be an
effective strategy (1c+). Furthermore, the strategy of ruling out on homozygous, non-reacting
cells (2a+) may cause the correct answer to be ruled out. Ruling out using reacting cells (2f-),
which is normally considered to be a poor strategy, is actually one of the few effective rule-out
strategies in a weak antibody case, because it will not rule out a weak antibody, so long as it is
the only antibody present. If more than one antibody is present, then this strategy is likely to
rule them both out (which is why it is normally not a good strategy to apply).
The most effective strategy in this case is to run the cells in a test phase that will
enhance the reactions (3b+). Of course, one needs to be able to hypothesize which type of
antibody is reacting (1b+) to know which type of test is likely to enhance the reactions.
Knowing which antibodies are likely to react strongest in which phases will help to narrow
down the answer.
Multiple antibodies
As soon as there are two or more antibodies in a case, the task of identifying them
becomes more difficult.
On separate cells with differing reaction patterns
The simplest multi-antibody case is when the antibodies react in very different patterns
and the cells reacting to the two or more antibodies do not overlap. In other words, any cell that
contains an antigen that is causing a reaction does not contain any of the other antigens that are
causing reactions. Since the antibodies are reacting in different patterns, it is possible to
decompose the problem into several, simpler, single antibody cases (1b+).
Reacting in the same pattern as each other
If the multiple antibodies react at the same phase and temperature, such as two Rh
antibodies, then both antibodies may be reacting exactly the same. Therefore, to the
practitioner, it may appear that there is only one antibody present (1a+). Here is where the
strategy of grouping all like reactions and assuming that only one antibody is causing them
(1a+) will fail. If the practitioner cannot find one antibody that will account for the reactions,
s/he needs to start looking for groups of antibodies that together could be causing the reactions.
On overlapping cells
Things get more complicated as soon as some of the test cells possess two or more
antigens that are causing reactions. The antibodies reacting together usually do not react in an
additive fashion, so that an antibody that reacts '1+ 1+ 0' by itself combined with an antibody
that reacts '0 1+ 2+' by itself may produce a '1+ 1+ 2+' reaction when the antibodies occur
together. Thus, there may be three different patterns of reactions accounted for by two
antibodies: one pattern of reaction occurs when only one antibody is reacting with some of the
test cells, a second kind of reaction occurs when the other antibody is reacting alone with
different test cells, and a third kind of reaction occurs when both antibodies are reacting
together on the test cells that contain both antigens.
Masking
Masking is a special case of the multiple antibody scenario in which one or more
antibodies completely cover up the presence of another antibody. This can happen when all of
the test cells that contain the antigen corresponding to the masked antibody also have antigens
that are reacting with other antibodies at least as strongly in every phase that the masked
antibody reacts. Thus, since there are no cells for which the covered antigen is present without
other reacting antigens also being present, and since there are no noticeable differences in the
reactions, it appears to the practitioner that only one antibody is present (1a+). Here is a case
where making sure that all remaining antibodies can be ruled out (3f+) is necessary to correctly
solve the case. In trying to rule-out the masked antibody, the practitioner can run the cells at a
phase which destroys the dominant antibody and not the underlying one (2c+). Alternatively,
one can try to find a test cell from another panel that is positive for the antibodies still not ruled
out and negative for the antibody that has been confirmed (2b+, 4a+). Finally, one can type the
patient for those antigens. If the results are positive, those antibodies can be ruled out (2e+).
Variable reactions
Some antibodies are likely to have more variable reactions than others. For example,
the P1 antibody can react weakly with some cells and more strongly with others. A panel that
has 2+, 3+, and 4+ reactions might suggest to the blood banker that there are multiple antibodies
present since the reactions are so different (1a+). It is possible, however, for one antibody to
react variably like that. Here is a case where one antibody may look like many. In addition,
because of this variability in reactivity, an antibody may not always show its normal pattern of
reactivity across different test phases (IS, 37°, AHG). Here is a case where the simplifying
assumption that one type of reaction is caused by one antibody (1a+) is going to fail. The
practitioner needs to be able to give up that strategy if it is not helping him/her to find an
answer for a given case.
Antibodies showing dosage
A special case of the variable reactions, and one that is easier to spot, is for an antibody
to react more strongly with a homozygous antigen than with a heterozygous antigen. Some
antigens are genetically paired with others, such as M and N. A test cell that possesses just one
antigen of the pair is normally homozygous (double dose) for that antigen. It will probably
react more strongly than a cell that is heterozygous (one dose) for the antigen. Here, the blood
banker needs to recognize that the test cells are reacting in the same pattern, but in different
strengths. Grouping them all together and comparing the zygosity of antigens with the test cells
will help the practitioner to find the correct answer (1c+).
Recent transfusion
If a patient has been transfused recently, then there may still be the previous donor's
red blood cells in the patient's system. Thus, in antigen typing a sample of the patient's red
blood cells, some of the cells tested will be the patient's and some will be from the transfusion
donor. Because it is difficult to determine which of the antigens detected belong to the patient
and which belong to the previous donor, the blood banker cannot interpret the antigen typing
tests reliably. A further problem with diagnosing a recently transfused patient is that the
patient's immune system may currently be forming antibodies against some of the antigens
found in the transfusion donor's blood. The newly forming antibodies may not yet show up as
positive reactions when being tested in vitro (i.e., in the test tube), but could be strong enough
to cause a reaction if the patient received that kind of blood for their next transfusion. The
blood banker, therefore, must know when the antigen typing test is valid and when it is not. If
an antigen typing test is used when the patient has recently been transfused, there may be false
positive results (2e+).
Drug interactions
Similar to the problems that can occur from a recent transfusion, certain drugs in a
patient's system can both cause the formation of antibodies and interfere with the interpretation
of various test results. Thus, blood bankers must be aware of the kinds of medication received
by each patient and know how those medications will affect their ability to interpret test results.
Auto-immune disorders
If a patient has an auto-immune disorder, s/he can form antibodies against his/her own
antigens. This special case is not treated in this dissertation.
Chapter IV
The Design of the Antibody Identification Assistant
(AIDA3)
The ultimate goal of this research is to develop a computer system that improves the
antibody identification procedure, both by making the task simpler and by efficiently bringing
more knowledge to the blood banker. Based on studies of the expert strategies and
erroneous/inefficient strategies found to be used in this domain, a number of opportunities for
a computer to aid the blood banking practitioner were identified. In the short term, the
computer can help practitioners on specific cases by checking for slips and the use of inadequate
strategies. In the long term, a well-designed system can also help the practitioner to learn the
problem-solving strategies and extensive knowledge necessary to become a true expert.
The research presented here is an extension of previous research focusing on the design
of decision support tools for certified medical technologists as they perform the task of antibody
identification. Several additions were made to the previous version of the critiquing system,
as well as a change in the way users are trained to use the system. These changes were intended
to reduce the error rates prevalent in this community (Smith et al., 1991a; Smith et al., 1991b;
Strohm et al., 1991; Guerlain, 1993a) and to provide a demonstration of how to design an
effective cooperative problem solving system for this task and tasks with similar characteristics.
In order for this research to generalize as desired, it is necessary to form a mapping between the
characteristics of this domain, this class of users and this aiding strategy, so that results from
this study can transfer to other domains and classes of users with similar characteristics.
The Task
The task, as indicated, is a medical diagnosis task that can be characterized abstractly as
an abductive reasoning task (Chandrasekaran, 19xx; Josephson and Josephson, 1994; Pople,
1973), so characteristics such as masking and noisy data are factors that are known to cause
problems for users (Fraser, Strohm, Smith, et al., 1989). It is a high-consequence task, since an
incorrect diagnosis can lead to transfusion reactions and possibly even death. Usually there is
not significant time pressure to complete the task, although time pressure is a factor in
emergency (STAT) situations. In addition, there are financial pressures to limit costs.
The Practitioners
The practitioners are certified medical technologists, who have been documented to
make a significant number of errors on this task (Smith, Miller, Fraser, et al., 1991; Guerlain,
1993a). These errors include slips, failures to form appropriate hypotheses to guide problem
solving, failure to rule out alternative hypotheses, failure to collect independent, converging
evidence for the answer, failure to use as much information as possible from a test result,
ignoring base rates, biased assimilation, and biased reviewing. Many of these errors are due to
a lack of training and practice with the task, since a given practitioner may only
perform this task occasionally (especially if the hospital is small or if the technologist rotates
through other labs besides blood bank).
The Problem-Solving Tool
The computer support system used for this study is called the Antibody IDentification
Assistant 3 (AIDA3) to distinguish it from the previous versions of the system that were studied
earlier. The system was developed on the Macintosh using Symantec's Think C programming
language. With all error checking turned off, the system can be used as an information display
tool that allows practitioners to request and interpret the various tests used for antibody
identification similar to the way they normally would using paper and pencil. With error
checking turned on, the system monitors the practitioner's procedure for errors and provides
feedback if errors are detected. Both modes have the same set of test cases built into them.
These cases were either designed by an expert blood banker or were taken from real patient
data to ensure validity. The cases that were used for testing were carefully selected to have
certain characteristics (weak antibodies, multiple antibodies, etc.) and predictions were made as
to how a practitioner's performance would change depending on the case characteristics, the
practitioner's strategy, and the type of system enhancements the practitioner was using. Three
design principles were used to guide the design of this problem-solving tool.
Design Principle 1: Use a Direct Manipulation Problem Representation as the
Basis for Communication
First, the interface is designed not only to be helpful and easy to use, but also to provide
data for the computer to diagnose errors in the user's problem solving. The technologist can
request test forms and mark hypotheses on those forms, so the computer is able to watch the
person's problem-solving process, potentially detecting errors in the subject's procedure. Thus,
no extra work is required on the user's part to feed information to the computer. Practitioners
just work as they naturally would and, because of the interface design, the data on the user's
problem-solving activities is rich enough for the computer to detect problems and provide
feedback. A description of how the user interacts with the system follows.
For each of the cases built into the system, the practitioner performs the antibody
identification process by asking the computer (via a pull-down menu option) to show test
results and other pertinent information, such as relevant facts about the patient's medical
history. The computer has data stored in it for the results of every possible test so the user can
choose to look at those tests that are deemed pertinent to the particular case.
Users can make markings on the data sheets as they would on paper by selecting from a
set of color-coded "markers", available as buttons along the top of the screen, and clicking on
cells of interest using the mouse. Rows, columns, and cells can be highlighted using a yellow
pen. Antibodies can be marked as either 'ruled-out', 'unlikely', 'possible', 'likely' or 'confirmed',
using pens ranging from green for ruled-out to red for confirmed. The colors are chosen to
correspond somewhat with the danger of introducing that antigen into the patient's blood
stream. A ruled-out (green) antibody indicates that it is safe to use blood for a transfusion that
contains the corresponding antigen, while a confirmed (red) antibody indicates that a
dangerous transfusion reaction could occur if blood containing that antigen is given to the
patient.
The data sheets used in AIDA3 are very similar to those currently used in paper form in
labs. The organization of the display is that shown in Figure 4. The only difference between the
paper version and the one on the computer is that the background grid has been made less
salient than the data contained inside. This follows the principles of Tufte (1990) of reducing the
amount of "chartjunk", or background display information, so that the important data is
enhanced.
Figure 4. Sample Screen.
As the user selects tests, any antigens that have been marked as ruled-out, possible, etc.
on previous panels carry over to the current panel. This aids the user in remembering where
s/he is in a case. Using paper forms, blood bankers must copy over their markings from panel
to panel. Here, the computer performs that subtask for them, reducing the potential for slips
and saving them time.
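The carry-over subtask lends itself to a small sketch. The following is illustrative only, with invented names and data structures rather than AIDA3's actual code:

```python
# Illustrative sketch (not AIDA3's actual code) of the marking carry-over:
# all panels read from one shared marking store, so markings made on an
# earlier panel automatically appear on any panel opened later.

# Shared store: antibody name -> current marking.
case_markings = {}

def mark(antibody, status):
    """Record a user marking, e.g. 'ruled-out', 'possible', 'confirmed'."""
    case_markings[antibody] = status

def open_panel(antibodies_on_panel):
    """Build a new panel's display; prior markings carry over."""
    return {ab: case_markings.get(ab, "unmarked") for ab in antibodies_on_panel}

# The user rules out anti-C and flags anti-Fya on the screening cells...
mark("C", "ruled-out")
mark("Fya", "possible")

# ...and a subsequently opened panel shows those markings automatically.
panel = open_panel(["C", "E", "Fya", "Jkb"])
```

Because the markings live in one store rather than on each panel, the user never recopies them by hand, which is exactly the slip-reduction benefit described above.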
Design Principle 2: Use Critiquing to Enhance Cooperative Problem-Solving
The second design principle followed was to use a critiquing approach to decision
support, because of the previous benefits found with designing the system as a cooperative aid.
Based on our studies of human experts, the AIDA3 system was designed around a broad
strategy of collecting converging evidence before completing a case. This global strategy
provides protection against the fallibility of the heuristic methods underlying strategies applied
at different points in the case (i.e., individual steps on the checklist). To help ensure use of this
strategy, AIDA3 monitors for both errors of commission and errors of omission. The types of
knowledge encoded into the second version of the system include detecting:
1) Errors of commission (due to slips or mistakes):
• Errors in ruling out antibodies (same as in the previous study).
2) Errors of omission (due to slips or mistakes):
• Failure to rule out an antibody for which there was evidence to do so.
• Failure to rule out all clinically significant antibodies besides the antibodies included in the answer set.
• Failure to confirm that the patient did not have an auto-immune disorder (i.e., antibodies directed against the antigens present on their own red blood cells).
• Failure to confirm that the patient was capable of forming the antibodies in the answer set (i.e., that the patient's blood was negative for the corresponding antigens, a requirement for forming antibodies in the first place if the possibility of an auto-immune disorder has been ruled out).
3) Errors due to masking:
• Failure to detect and consider potentially masked antibodies.
4) Errors due to noisy data:
• Failure to detect situations where the quality of the data was questionable.
5) Answers unlikely given the data (low probability of data given hypothesis):
• Failure to account for all reactions.
• Inconsistency between the answers given and the types of reactions usually exhibited by those antibodies (e.g., a warm-temperature antibody accounting for reactions at cold temperatures).
6) Unlikely answers according to prior probabilities (regardless of the available evidence):
• Antibody combinations that are extremely unlikely due to the way the human immune system works.
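The commission checks above center on the rule-out heuristic. A minimal sketch of such a check follows; the cell encoding and function names are assumptions for illustration, while the homozygous-only antibody list is taken from the checklist in Figure 5:

```python
# Hedged sketch of the rule-out commission check. The cell encoding and
# function names are assumptions for illustration; the homozygous-only
# antibody list comes from the checklist in Figure 5.

HOMOZYGOUS_ONLY = {"C", "E", "c", "e", "M", "N", "S", "s",
                   "Lea", "Leb", "Fya", "Fyb", "Jka", "Jkb"}

def rule_out_is_valid(antibody, cell):
    """cell: dict with 'reacting' (bool) and 'antigens' mapping each antigen
    present on the cell to 'homozygous' or 'heterozygous'."""
    if cell["reacting"]:
        return False              # never rule out on a reacting cell
    dose = cell["antigens"].get(antibody)
    if dose is None:
        return False              # the cell does not carry this antigen
    if antibody in HOMOZYGOUS_ONLY and dose != "homozygous":
        return False              # dosage: a homozygous cell is required
    return True

def critique_rule_out(antibody, cell):
    """Return a critique string for an unjustified rule-out, else None."""
    if not rule_out_is_valid(antibody, cell):
        return f"anti-{antibody} cannot be ruled out on this cell"
    return None

# A non-reacting cell homozygous for Fya justifies ruling out anti-Fya,
# while ruling out anti-S on the same cell would draw a critique.
cell = {"reacting": False,
        "antigens": {"Fya": "homozygous", "K": "heterozygous"}}
```

A check of this kind fires immediately when the user makes a rule-out marking, which is what lets the system give context-sensitive feedback at the moment of the slip.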
Design Principle 3: Represent the Computer's Knowledge to the Operator to
Establish a Common Frame of Reference
Third, a checklist was designed that enumerates the subgoals the computer considers
necessary to adequately solve a case (Figure 5 shows the checklist). This checklist provides an
explicit, high-level representation of the computer's goal hierarchy. The design of the system is
such that users can apply additional strategies without interference from the computer, and can
override a critique from the computer, but the checklist makes it clear what steps the computer
expects the person to have done before completing a case. The computer also allows the user
flexibility in deciding what order to use in completing the subgoals listed on the checklist (i.e.,
the computer does not monitor for the ordering of the steps listed in the checklist except when
that ordering is critical to successful problem-solving).
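Order-flexible monitoring of this kind might be sketched as follows; the subgoal names, the single critical ordering shown, and the function names are all invented for illustration:

```python
# Illustrative sketch of order-flexible subgoal monitoring: completion of
# every checklist subgoal is tracked, but ordering is enforced only for the
# few pairs where it is critical. The subgoal names and the single critical
# ordering shown are invented for this example.

CRITICAL_ORDER = [("ABO/Rh typing", "check screen cells")]

completed = []

def complete_subgoal(subgoal):
    """Record a subgoal; return a critique only if a critical ordering
    constraint is violated (any other order is accepted silently)."""
    for first, second in CRITICAL_ORDER:
        if subgoal == second and first not in completed:
            return f"'{first}' should be completed before '{second}'"
    completed.append(subgoal)
    return None

def missing_before_done(required):
    """When the user declares a case done, list unmet subgoals."""
    return [g for g in required if g not in completed]
```

Monitoring only the critical orderings is what lets users interleave subgoals in whatever order suits the case, while the `missing_before_done` check backs the errors-of-omission critiques described earlier.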
Name: __________________ Phone Number: _____________ Hospital: _____________
Checklist for Alloantibody Identification
Case: ____________
Step 1. Complete ABO and Rh typing.
Step 2. Check screen cells.
   a. Mark the unlikely antibodies (usually f, V, Cw, Lua, Kpa, Jsa).
   b. Rule out antibodies.
      • Homozygous: C, E, c, e, M, N, S, s, Lea, Leb, Fya, Fyb, Jka, and Jkb
      • Homozygous or Heterozygous: D, P1, Lub, K, k, and Xga (as well as the six unlikely antibodies: f, V, Cw, Lua, Kpa, Jsa)
Step 3. Check patient history if available.
Step 4. Check auto control on the Poly Panel.
Step 5. Check the Polyspecific Panel. (If necessary, use another panel to enhance reactions.)
   a. Rule out antibodies. Antibody reactions that could be weakened in certain test conditions:
      Enzyme: M, N, S, s, Fya, Fyb, Xga
      Prewarm: M, N, P1, Lea, Leb, Lua
      Eluate: M, N, P1, Lea, Leb
      Room Temperature: D, C, E, c, e, f, V, Cw, s, Lub, K, k, Kpa, Jsa, Fya, Fyb, Jka, Jkb, Xga
      Cold 4° C: D, C, E, c, e, f, Cw, S, s, P1, Lub, K, k, Kpa, Jsa, Fya, Fyb, Jka, Jkb, Xga
   b. Mark likely antibodies.
Step 6. If necessary, use additional cells to rule out the remaining antibodies, and to help you confirm your answer.
Step 7. If necessary, use antigen typings to rule out the remaining antibodies.
Step 8. Use antigen typings to help confirm your answer.
Step 9. Make sure that all antibodies that have not been confirmed or marked unlikely (usually f, V, Cw, Lua, Kpa, Jsa) have been ruled out.
Step 10. Make sure the confirmed antibodies are not on any non-reacting cells.
Step 11. Make sure that at least one confirmed antibody is on every reacting cell.
Step 12. Look at your answer and ask whether it is plausible (or is it a "unicorn"?).
Figure 5. Sample Checklist.
Although a checklist is not the only way that such information could be conveyed or
represented, it was hypothesized that the checklist would work as an effective aid for a number
of reasons. First, the checklist serves as an external memory aid, reminding users of certain
types of knowledge related to antibody identification, such as: 1) Factual information (i.e., what
antibodies are destroyed in certain phases of testing) and 2) Procedural information (i.e., what
constitutes a complete protocol).
Second, since the checklist is a representation of the kinds of knowledge the computer is
expecting the user to apply in solving a case, the introduction of the checklist provides the user
with an appropriate frame of reference for interpreting any feedback given by the computer. In
other words, the checklist helps to ensure that the user has an appropriate mental model of the
problem-solving strategies understood to be correct by the computer. Use of the checklist is a
way for designers to ensure that both the computer system and the practitioners using it have a
common frame of reference for communication and understanding.
Finally, the checklist provides an alternative form of aid to practitioners in situations for
which the critiquing system is not helpful. For example, if a practitioner gets stuck during a
case, s/he can review the checklist to see if there are any other tests or knowledge that may be
applicable, since the checklist lists the goals that should be completed before finishing a case.
Chapter V
Experimental Procedure
An earlier study with the AIDA system had shown that if the computer is
knowledgeable about one aspect of the antibody identification procedure (how to rule out
antibodies), then critiquing the users' application of that strategy may be more appropriate than
automating the task in cases where the computer's knowledge is not fully competent. The goal
of this second study was to see if misdiagnosis rates could be reduced or eliminated with the
design of a more complete critiquing system, and to explore its effects on cooperative
performance.
Subjects
Two subject pools were used to test AIDA3. The first was a group of four "experts"
(certified Specialists in Blood Bank, or SBBs) who were tested with the system as a pilot group.
These subjects came from three different hospitals. (The objective of this preliminary study was
to make sure AIDA3 did not interfere with the performances of skilled practitioners.)
Subsequently, thirty-two blood bankers from seven different hospitals were tested. All
of these technologists were identified by their supervisors as "actually performing the task of
antibody identification as part of their job but who would benefit from additional experience
and training". Their years of experience ranged from 1 to 35 years (with a mean of 10 years).
Experimental Design
Half of the subjects in each group were randomly assigned to be in the Control Group
and the other half were randomly assigned to be in the Treatment Group. All of the subjects
were tested on the same six cases. The first case was used to give both groups the same initial
training on how to use the system. Subjects were shown how to use the pull-down menus to
select test results (see Figure 6) and how to interpret the test results on each screen. After
walking the subject through the layout of each type of test and explaining how to interact with
the system (i.e., how to see the results for a particular test cell, how to mark an antibody as ruled
out, how to mark an answer for the case, etc.) the subject was asked if s/he understood how to
use the system and if s/he was ready to continue. During this initial training, no knowledge
specific to blood banking was discussed, except in relation to how the computer displayed test
results and how the user interacted with the computer. Furthermore, subjects were not asked to
solve the first case, but just used it to practice selecting and marking individual test results.
Figure 6. Test Results Available.
All subjects (in both the Treatment and Control Group) solved the second case (herein
referred to as the "Pre-Test Case") without any aid from the computer. Thus both groups were
using the control version of the system. The purpose of this Pre-Test Case was to get a
benchmark on the practitioners' current performance strategies (i.e., did they rule out, did they
do antigen typing, did they seem to notice that there were two reactions present) against which
to compare the Treatment Group's strategies when using the experimental system. The Pre-Test
Case was one of two matched cases, and it was randomly determined at run-time which of the
two cases a particular subject solved as a Pre-Test Case and which as a Post-Test Case. With
this design, a within-subjects comparison could be made for the Treatment Group. After
solving the first Post-Test Case that was matched in characteristics to the Pre-Test Case, both
groups solved three more cases, with the Treatment Group using the critiquing and the checklist
and the Control Group solving the cases on their own. Performance on these cases could be
examined for differences in a between-subjects manner.
Figure 7 shows the experimental design for this study. It was hypothesized a priori that:
1) There would be a within-subjects reduction in misdiagnosis rates for the Treatment
Group as they went from solving the first of the two matched cases without any
critiquing to solving the other matched case after the checklist and critiquing were
introduced, but that the Control Group would not show such improvement,
2) There would be a between-subjects improvement, such that the Treatment Group
would have a significantly lower misdiagnosis rate than the Control Group for all of the
Post-Test Cases, and
3) The critiquing system would influence practitioners' cognitive problem-solving
processes, promoting effective use of strategies for solving cases, and effective
cooperative problem-solving between the human and the computer (i.e., the system
would detect errors in the human's problem-solving, the practitioners would find the
system helpful and beneficial, and the practitioners would be able to detect and recover
from errors generated by the computer's faulty reasoning).
The order for these two cases was randomly decided at run-time.
Test Cases (solved by both the Control Group and the Treatment Group):
   Pre-test Case: 2 antibodies looking like 1 (matched to Case 1)
   Case 1: 2 antibodies looking like 1 (matched to Pre-test case)
   Case 2: Weak antibody (the "brittle" case)
   Case 3: 1 antibody masking another
   Case 4: 3 antibodies reacting on all cells (from another lab)
(The checklist and training cases were introduced before the Post-Test Cases for the Treatment Group only.)
Figure 7. Experimental Design.
Cases used to test AIDA
A crucial part of testing a cooperative problem solving system like AIDA is to use a set
of tasks that test the range of scenarios that might be encountered in practice. If such a range of
tasks is not used, then results may not be representative of actual performance with the system.
Thus, it was important to include cases where the computer's support tools fail, and to see
whether the technologists would detect and cope with such failures. Picking scenarios where
one's design might fail is imperative for understanding how the introduction of tools may
influence or change performance in potentially dangerous ways. Issues such as loss of skill,
fixation on one set of solutions (Fraser et al., 1992), and how the dynamics of the system and
characteristics of a case can combine to cause problems, are all central to our understanding of
how humans interact with complex systems. Analyzing how users cope with such system
failures may lead to better solutions, or more generally, to the identification of principles
describing how people interact with decision support systems. Therefore, in testing AIDA, a
case was used for which the computer's rule-out knowledge was not fully competent. (This
case was the same previously tested weak anti-D which did not react with all of the test cells.)
There were four Post-Test Cases. The first Post-Test Case was randomly selected from
one of two matched cases, the other of which was the Pre-Test Case. Both of these cases had the
characteristic that the original testing panel seems to indicate that only one antibody is present,
but in actuality, two different antibodies are together accounting for the reactions. The second
Post-Test Case was the weak antibody case, for which the computer's knowledge was not fully
competent. Case 3 was a masking case (where one antibody masks the presence of another).
Case 4 was sent to us by a blood bank lab that had no knowledge of our work. It turned out to
be a difficult three antibody case. The following section describes each of these cases in more
detail.
Pre-Test Case: Two antibodies looking like one
Both the Pre-Test Case and the first Post-Test Case had the same characteristics (two
antibodies looking like one) and it was randomly determined at run-time which case a
particular subject would get as the Pre-Test Case and which as the first Post-Test Case. The
pattern of reactions is somewhat consistent with the pattern for one antibody (either anti-Fyb
or anti-S) showing dosage. There are '0 0 2+' and '0 0 3+' reactions in the AHG phase on all cells
for which this single antigen is present. Although this pattern does not follow dosage well
(some of the '0 0 2+' reactions are on cells where the single antigen is homozygous and some of
the '0 0 3+' reactions are on cells where it is heterozygous), past experiments have shown that
practitioners often fail to note this inconsistency in the dosage pattern and hypothesize that
single antibody as the answer (Rudmann et al., 1992). Those practitioners who do not follow
through with ruling out and antigen typing could conclude that the single antibody is the
answer when in fact two different antibodies (either anti-E and anti-K or anti-c and anti-K) are
causing the reactions. The single antibody can be ruled out by running additional cells.
For this case, it was predicted that many practitioners would originally hypothesize the
single antibody as the answer. Those who followed through with ruling out and antigen typing
were predicted to be able to eventually rule out that antibody and find the correct set of
antibodies.
Post-Test Case 1: Two antibodies looking like one
Case 1 had the same characteristics as the Pre-Test Case, as described above. It was
predicted that those subjects using the critiquing system would eventually get the right answer
because they would follow a complete protocol, as prescribed by the checklist.
Post-Test Case 2: One antibody reacting weakly, right answer can be ruled out
(because system is not fully competent).
Case 2 was a real patient case used to test how practitioners cope with the brittleness of
an expert system. On this case, there are very weak reactions caused by a newly forming anti-
D. The case history tells us that the patient is a pregnant woman in her 36th week of gestation.
She is also Rh negative. Such information should prompt the blood banker to hypothesize that
she is probably getting the drug Rho Gam to counteract the possible negative effects if her baby
is Rh positive. Rho Gam will cause reactions against cells that contain the D antigen. Therefore,
anti-D is most likely to be detected from antibody testing. The reactions for this case, however,
are very weak. Only two cells are reacting '0 0 +/-'. Three of the test cells that contain the D
antigen are not reacting. Therefore, the normal rule-out heuristics fail: anti-D can be ruled out
according to the heuristics when anti-D is actually the answer.
The blood banker needs to recognize that the reactions are very weak and try to
enhance the reactions somehow. Running enzymes will enhance the reactions of all Rh
antibodies, including anti-D. If the blood banker does not recognize the need to enhance the
reactions, s/he may try to determine the answer to the case just by looking at the Polyspecific
reactions. On that panel, all antibodies can be ruled out except anti-E. Anti-E is heterozygous
on some of the cells that are non-reacting and homozygous on the two cells that are reacting. It
appears that anti-E could be accounting for the reactions, showing the classic pattern of dosage.
However, anti-E alone in an Rh negative patient is extremely unusual and should be recognized
as a "unicorn" (an unusual answer). This is because the normal formation of antibodies is such
that anti-D will almost always form before another Rh antibody if the patient lacks both the
antigens.
The clues that should prompt the blood banker to find the correct answer in this case
are the following: the reactions are very weak, so the blood banker should try to enhance them
by running the cells at a different temperature or with a different technique (such as running
the cells with enzymes). The patient is Rh negative, so any Rh antibodies that form should
include anti-D. The patient is in her 36th week of gestation, so it is likely that she is receiving
Rho Gam, a drug that will induce the formation of anti-D. Finally, running additional cells,
even at Polyspecific, allows anti-E to be ruled out on a homozygous cell.
Two features of the critiquing system could help practitioners on this case. The first is
that the system has some meta-knowledge embedded in it for detecting that some of its own
rules may not be valid. In particular, when there are weak reactions on a panel, the system
warns the user that rule-out may not be appropriate and that enhancing the reactions would be
a good idea.
Although the system does not require practitioners to rule out antibodies on this case
(as it does on cases for which there are no weak reactions), it also does not prevent users from
ruling out antibodies if they choose to do so. In essence, the system gives the user a warning
about the appropriateness of ruling out and leaves it up to the practitioners to decide
whether or not to rule out. If users do choose to rule out antibodies despite the system's
warning, they may end up with the anti-E as the answer (since anti-E matches the pattern of
reactions fairly well).
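The weak-reaction meta-check just described might be sketched as follows, assuming a simple reaction-grade scale; the grades counted as "weak" and the function name are illustrative assumptions, not AIDA3's actual rule:

```python
# Minimal sketch of the weak-reaction meta-check. The grades counted as
# "weak" and the function name are illustrative assumptions.

WEAK_GRADES = {"+/-", "1+"}

def weak_reaction_warning(reactions):
    """reactions: list of grades such as '0', '+/-', '1+', '2+', '3+'.
    Warn (rather than require rule-out) when every reaction is weak."""
    reacting = [r for r in reactions if r != "0"]
    if reacting and all(r in WEAK_GRADES for r in reacting):
        return ("Reactions on this panel are very weak; ruling out may not "
                "be appropriate. Consider enhancing the reactions.")
    return None
```

On the weak anti-D case, only two cells react '0 0 +/-', so a check of this kind would fire; the user may still rule out, but is warned first.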
The second feature is a set of detectors that check for unlikely
antibody combinations. These unicorn detectors will detect that anti-E alone in an Rh negative
patient is an extremely unlikely event, given the way the immune system works. Thus, if users
mark anti-E as their answer, the system suggests enhancing the reactions with enzymes to see if
the presence of anti-D (which is likely to have formed before anti-E in this Rh negative patient)
becomes more obvious through the enhancement.
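A unicorn detector of this kind might be sketched as follows. The one rule shown (anti-E alone in an Rh-negative patient) comes from the text; its encoding here is an assumption for illustration:

```python
# Hedged sketch of a "unicorn" detector based on prior probabilities. The
# single rule encoded (anti-E without anti-D in an Rh-negative patient) is
# described in the text; the encoding itself is an assumption.

def unicorn_check(answer, patient_rh_negative):
    """answer: set of antibodies the user marked as the answer.
    Return a critique for an implausible combination, else None."""
    if patient_rh_negative and "E" in answer and "D" not in answer:
        return ("Anti-E without anti-D is extremely unlikely in an "
                "Rh-negative patient; consider enhancing the reactions "
                "(e.g., with enzymes) to check for anti-D.")
    return None
```

Note that the check fires regardless of how well the answer fits the panel data, which is precisely what distinguishes prior-probability critiques from the likelihood-based ones.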
For this case, it was predicted that some practitioners would heed the system's weak-reaction
warning, enhance the reactions, and most likely detect the anti-D. For those who did not heed
the warning, it was predicted that many would conclude that anti-E was present and then
receive the system's warning about anti-E alone in an Rh negative patient, prompting them to
re-evaluate their answer and probably discover anti-D. If a user concluded something other
than anti-E or anti-D, it was unclear whether s/he would be able to recover and detect the
correct answer.
Post-Test Case 3: Two antibodies, one masking the other
Case 3 was used to test for completeness of rule-out. For this case, reactions are all '0 0
3+' on the main panel at the AHG phase. This suggests the presence of only one antibody.
Anti-Fya immediately looks like a good candidate, since all reacting cells contain the Fya
antigen and all non-reacting cells are negative for the Fya antigen. It was predicted that most
practitioners would see this and mark anti-Fya as a likely candidate. For those who did not
rule out all other possible antibodies, the case would likely be considered done at this
point when, in fact, a second antibody, anti-E, is present as well. This antibody is
completely masked by the anti-Fya reactions on the main panel. The typical way for the
practitioner to discover that anti-E is present is in the process of trying to rule it out. To rule it
out, the practitioner would run additional cells that are positive for E but negative for Fya. In
running such an additional cell, the reactions are positive, rather than the expected negative,
indicating that there is something else besides anti-Fya causing reactions. With a little looking,
anti-E can quickly be determined to be that antibody.
For this case, it was predicted that many practitioners would originally hypothesize the
single antibody as the answer. Those who followed through with ruling out and antigen typing
were predicted to be able to eventually detect the second antibody and find the correct set of
antibodies. It was predicted that those subjects using the critiquing system would eventually
get the right answer because they would follow a complete protocol, as prescribed by the
checklist.
Post-Test Case 4: Three antibodies, on overlapping cells, reacting on all cells of
the main panel
Case 4 came from a blood bank lab that had no knowledge of our work. We asked this
lab to supply us with "a case that many blood bankers would have difficulty solving." On this
case, three antibodies, anti-E, anti-c, and anti-Jkb were reacting together, such that all of the cells
on the main panel had positive reactions. A knowledgeable blood banker would notice that
there were three different patterns of reactions ('0 0 2+', '0 0 3+' and '0 2+ 3+') and hypothesize
that at least two antibodies were present.
Anti-E is fairly easy to recognize as causing the stronger, '0 2+ 3+' reactions, but it is
very difficult to detect what could be causing the other reactions. This is because the anti-c and
anti-Jkb are overlapping with each other and with the anti-E. Furthermore, anti-c and anti-Jkb
are both showing dosage, so the reactions accounted for by their presence are variable.
In addition, no antibodies can be ruled out using this panel because there are no non-reactive
cells. Finally, just two antibodies, anti-E and anti-c can account for all of the reactions on the
main panel, so practitioners might abandon anti-Jkb as part of their hypothesis set when looking
at that set of reactions. On the other hand, anti-E and anti-Jkb account for all of the reactions on
the additional cells panel, so practitioners might abandon anti-c as part of their hypothesis set
when looking at that set of reactions.
The only way to rule out antibodies is to: 1) run the cells at Room Temperature, in
which case none of the cells will react and all those antibodies that normally react at Room
Temperature can be ruled out (anti-M, -N, -S, -s, etc.); 2) run additional cells, in which case
there will be two non-reacting cells (because they happen to be negative for the E, c, and Jkb
antigens), which can be used to rule out some of the antibodies; or 3) use antigen typings to
type the patient for the presence of certain antigens, which will allow more antibodies to be
ruled out.
For this case, it was predicted that many practitioners would not know how to proceed
because all of the cells are reacting. Some might predict a high-frequency antibody, such as
anti-k or anti-Lub, as the answer, since these antigens are present on all of the donor cells.
These high-frequency antibodies can be ruled out, however, either by running additional cells
or by doing antigen typing. Those who used additional cells and antigen typing would be able to rule
out all but the three antibodies present (and two low-frequency antibodies). Those who
followed through with ruling out and antigen typing were predicted to be able to eventually
solve the case. It was predicted that those subjects using the critiquing system would eventually
get the right answer because they would follow a complete protocol, as prescribed by the
checklist. Furthermore, if they hypothesized just a subset of the antibodies present, the system
would flag them and tell them that reactions were present for which no confirmed antibody was
present.
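That final flag corresponds to the completeness checks at the end of the checklist (Steps 10 and 11), which might be sketched as follows; the cell representation is an illustrative assumption:

```python
# Sketch of the completeness checks in checklist Steps 10 and 11: every
# reacting cell must carry the antigen of some confirmed antibody, and no
# confirmed antibody's antigen may sit on a non-reacting cell. The cell
# representation is an illustrative assumption.

def coverage_critiques(cells, confirmed):
    """cells: list of dicts with 'reacting' (bool) and 'antigens' (set of
    antigen names). confirmed: set of confirmed antibodies."""
    critiques = []
    for i, cell in enumerate(cells):
        present = confirmed & cell["antigens"]
        if cell["reacting"] and not present:
            critiques.append(f"cell {i}: reactions with no confirmed antibody")
        if not cell["reacting"] and present:
            critiques.append(f"cell {i}: confirmed antibody on non-reacting cell")
    return critiques
```

On Case 4, a subject who confirmed only anti-E and anti-c would be flagged for any additional-cell reaction explained only by anti-Jkb.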
Procedure
There were six phases to the experiment. During Phase 1, the experimenter collected
demographic data from the subject being tested. Phase 2 introduced subjects to the basic AIDA
interface. Phase 3 was used to test subjects on a Pre-Test Case without any aiding for either
group. Phase 4 was used to introduce subjects in the Treatment Group to the checklist and the
critiquing version of the system. Subjects were asked to solve partial cases with the checklist
and critiquing system. The Control Group also solved these same partial cases, but without any
aiding by the computer or use of the checklist. Phase 5 was used to test all subjects on four Post-
Test Cases. Phase 6 was a debriefing stage where treatment subjects were asked to fill out a
questionnaire rating the utility and usability of the critiquing system.
Phase 1. Subject Demographic Data
In Phase 1, subjects were briefly told what the experiment would involve and asked to
sign a consent form. Their name, hospital, certification level, and years of blood bank
experience were logged. Furthermore, subjects were asked with what frequency (number of
times per month) they normally encountered antibody identification cases.
Phase 2. Introduction to the Interface
In Phase 2 of the experiment, all subjects underwent the same training and testing.
Training on the system was done with the first two cases. This involved initially showing the
subjects how to use the mouse to select test results from the pull-down menus provided (with
all of the critiquing messages turned off). Each subject was then asked to select one by one each
of the test results.
For each kind of test result screen, the experimenter told the subjects what kinds of
operations could be performed and asked them to try each function as it was being described.
For instance, on the ABO/Rh panel, subjects were asked to select their ABO/Rh interpretation
using the drop-down menus provided. On the antibody screening panel, subjects were asked to
highlight rows and columns, mark individual antibodies as ruled out, unlikely, possible, likely,
and confirmed. They were then told to undo some of those actions. As subjects worked
through each panel, it was pointed out to them how their markings were automatically carried
over from screen to screen. On screens where test results were not automatically given to them,
subjects learned how they could simulate running a test by highlighting the appropriate row or
column.
It was explained to the subjects that a case was not considered complete until they hit
the Done button, so that it was perfectly acceptable to undo or re-mark antibodies as much as
they liked until they considered themselves done with the case.
During training, subjects could ask as many questions as they liked, and all questions
having to do with the interface were answered. However, questions specific to blood banking,
such as "What's the rule-out rule again?" were not answered. All interface features were taught
on the first case, and the experimenter refrained from talking on subsequent cases unless the
user was still having trouble or did not remember how to perform an action (such as how to get
an antigen typing result).
Phase 3. The Pretest Case
Phase 3 involved testing subjects on a Pre-Test Case. This case was randomly selected
at run-time from one of two matched cases, the other of which became the first Post-Test Case
for that subject. Both groups used the control version of the system, without any critiquing or
feedback. They were asked to solve the case as they normally would and to mark their answer
on an answer sheet (see Appendix A for a sample answer sheet).
Phase 4. Training and Introduction to the Checklist and Critiquing
In Phase 4, the Treatment Group was given a checklist like the one shown in Figure 5. It
was explained that the computer would now be monitoring them, and that it would check that
each of the steps in the checklist was followed. It was explained that certain steps in the
checklist would be practiced with some training cases. During all of these training sessions, the
critiquing system was monitoring for errors according to the lesson being followed.
The first training case was an ABO/Rh screen, corresponding to Step 1 in the checklist.
Subjects were told that the computer expected this test to be run first for every case. They were
then asked to purposely misinterpret the test, so that they would encounter an error message
from the computer (see Figure 8). Subjects were shown how they could either undo their action
by clicking on the "Undo Marking" button, or override the computer by clicking on the "Leave
as Is" button. It was explained to them that overriding the computer was always an option, but
we asked that they fill out a form explaining their reason for disagreeing with the computer.
Figure 8. Sample ABO/Rh Error Message.
The second type of training case was an Antibody Screen. Subjects were told to follow
Step 2 in the checklist, namely to mark the six low frequency antibodies (anti-f, anti-V, anti-Cw,
anti-Lua, anti-Kpa, and anti-Jsa) as Unlikely, using the Unlikely button. Subjects were then told
to rule out antibodies according to the strategy described in the checklist, namely, to only rule
out on non-reacting cells, taking into account zygosity. Subjects practiced this step with three
Antibody Screen practice cases, one of which had no non-reacting cells, and thus could not be
used to rule out any antibodies.
The third type of training case was ruling out on full panels, according to Step 5 in the
checklist. Subjects had to follow the same steps as outlined in Step 2 (marking Unlikely
antibodies and Ruled Out antibodies) with the exception that if the panel results were run at
certain temperatures or phases of testing (e.g., Enzymes, Prewarm, Room Temperature, etc.)
then a corresponding set of antibodies could not be ruled out, since those antibody reactions are
normally weakened or destroyed in those phases of testing. Subjects also had to mark whether
or not the patient had an auto antibody present (Step 4 in the checklist), based on the
information located at the bottom of the panel in the Auto Control section of the panel. Subjects
practiced ruling out in this manner on four panels, two of which had no exceptions to the rule-
out procedure, and two of which had exceptions (both being due to an Enzyme panel).
The fourth type of training case corresponded to Step 7 in the checklist. Subjects were
given three antigen typing screens and asked to rule out antibodies based on the presence of
antigens in the patient's blood. (If a patient possesses an antigen in their own blood, then there
should not be an antibody formed against it.)
Finally, subjects were asked to solve two entire, single-antibody cases using the
critiquing system and following the whole checklist. For each case, subjects were asked to fill
out an answer sheet. The first case was a straightforward anti-C and the second was a
straightforward anti-K.
After completing each lesson, the computer gave the subjects a summary of their
performance, showing the correct answer for the lesson, vs. their answer. Furthermore, a list of
error messages showed what procedural errors had been made during the problem-solving (see
Figure 9 for a sample summary screen).
During Phase 4, the Control Group was asked to solve the two entire cases solved by
the Treatment Group during training. However, the Control Group did not receive any aiding
or feedback from the computer or experimenter. For each case, subjects were asked to fill out an
answer sheet.
Phase 5. Post-Test Cases
During Phase 5, both groups solved the four Post-Test Cases described in the previous
section. The experimenter refrained from answering any questions from either subject group.
The Control Group filled out answer sheets for each case and got no feedback from the
computer. The Treatment Group was asked to follow the checklist and fill out an answer sheet
for each case. The critiquing system monitored for errors and gave error messages if any
were detected.
Phase 6. Debriefing
In the final phase of the experiment, subjects were told how they did on the cases and, if
in the Treatment Group, were asked to fill out a questionnaire. If there were any questions
about the cases or the experiment in general, they were answered. When there were no more
questions, subjects were thanked and paid $50.00 for their time.
Figure 9. Sample Summary Screen.
Data collection
As subjects worked on the test cases, the AIDA system logged all of their actions (such as
which buttons were selected, what answers were typed in, etc.) in such a way that all of the
actions could be reproduced by running the program again with the log data as input. Besides
user actions, user performance measures were also logged, i.e., the time taken to complete each
case, the number of incorrect rule-outs, and incorrect or incomplete answers. The computer
logged all errors that it detected whether or not it displayed them to the
user. Thus, errors for both the Control Group and Treatment Group were logged. The system
automatically coded the data, counting types of errors, time to complete cases, and misdiagnosis
rates. Appendix C shows a sample of the coding categories detected and logged by the
computer. An actual sample of this log data is shown in Appendices D and E. These logs, along
with the questionnaire results, were the primary sources of data for the experiment.
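The logging scheme described above can be sketched as an append-only action log that can re-drive the program. This is a minimal illustration; the class and method names are hypothetical, not AIDA's.

```python
# Minimal sketch of log-and-replay: every user action is appended with
# enough detail that re-running the program with the log as input
# reproduces the session.
import time

class ActionLog:
    def __init__(self):
        self.entries = []

    def record(self, action, **details):
        # Timestamp each action so durations (e.g., time per case) can be derived.
        self.entries.append({"t": time.time(), "action": action, **details})

    def replay(self, handlers):
        """Re-drive the application by dispatching each logged action
        to the handler registered for its action type."""
        for entry in self.entries:
            handlers[entry["action"]](entry)

log = ActionLog()
log.record("click", button="Rule Out", antigen="K")
log.record("answer", text="anti-C")

seen = []
log.replay({"click": lambda e: seen.append(e["button"]),
            "answer": lambda e: seen.append(e["text"])})
print(seen)  # ['Rule Out', 'anti-C']
```

The same log doubles as the source for performance measures, since error events and timestamps are recorded alongside the raw actions.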
Data Analysis
Three types of analyses were made: 1) An analysis of outcome performance, as
measured by misdiagnosis rates, 2) A behavioral protocol analysis to examine subjects'
strategies, behaviors, and process errors, and 3) A questionnaire to get subjective reactions to
using the critiquing system. A number of statistical tests were run to measure differences in
misdiagnosis rates. McNemar's Chi Square test was used to test the hypothesis that subjects in
the Treatment Group improved in performance from the Pre-Test Case to the matched Post-Test
Case. Fisher's exact test was used to test the hypothesis that the Treatment Group had better
performance on each of the Post-Test Cases than the Control Group. Finally, a test for
difference between the two groups was conducted using a log-linear analysis that takes into
account performance on the Pre-Test Case (see Appendix B for a sample of the statistical
calculations used). A behavioral protocol analysis of subjects' performances was also conducted
to study differences in strategies used by subjects, how those strategies were influenced by the
system design and how they led to good or bad outcomes (see Appendices C through G for
sample behavioral protocols and error logs that were used for this analysis). Finally,
questionnaire results were examined to determine the perceived usability and utility of the
critiquing system.
Chapter VI
Results and Discussion
As discussed earlier, two populations were studied. The first was a set of four highly
proficient technologists, the second a set of thirty-two practitioners whom their supervisors
identified as "actually performing the task of antibody identification as part of their job but who
would benefit from additional experience and training." In this section, data is first presented
describing unaided performance for the population needing additional experience and training.
To give a flavor for the interaction with the critiquing system, a few sample interactions are then
given. Comparisons of misdiagnosis rates for the Treatment vs. Control Groups are then given,
as well as results from the Questionnaire that was administered to the Treatment Group.
Finally, a detailed protocol analysis highlights the important behaviors that were exhibited with
use of the system.
Unaided Subject Performance
In order to understand unaided performance on this task, we can look at the problem-
solving strategies exhibited by the subjects in the Control Group on all of the cases and the
Treatment Group on the Pre-Test Case.
For example, one Control Group subject made many process errors that led to an
incorrect solution on all of the cases. On Case 1, for instance, this subject ruled out by using a
strategy that will fail in multiple antibody cases (ruling out using reacting cells, strategy 2f-),
which caused her to rule out both of the right answers (anti-c and anti-K). She also marked
anti-S as the answer, even though it accounted for most, but not all, of the reactions exhibited.
Thus, she violated strategy 3d+ (making sure there are no unexplained positive reactions).
Furthermore, she made other procedural errors such as ruling out heterozygously (2g-), ruling
out antibodies using results from test procedures that usually inhibit those reactions (2d-), and
failing to do antigen typing for the antibody marked as the answer (3a-).
Multiple process errors were made by unaided subjects. Table 1 shows the number of
subjects (Treatment and Control groups combined) who made a particular kind of error at least
once on the Pre-Test Case. Four subjects' data were not counted in this analysis because the
data was invalid. (Due to a bug in the program, the reactions for a particular test panel that
these subjects used were incorrect. The other subjects never accessed this test panel.)
Thus, we have evidence consistent with previous studies that practicing medical
technologists make a significant number of process errors and outcome errors when solving
antibody identification cases.
Table 1. Process errors made by Treatment and Control Group subjects on the Pre-Test Case.

Error                                              Number of Subjects (out of 28) Who Committed
                                                   that Error at Least Once on the Pre-Test Case
1. Ruling Out Hypotheses Incorrectly               20
2. Failing to Rule Out When Appropriate            14
3. Failure to Collect Converging Evidence          26
4. Data Implausible Given Answer                   11
5. Answer Implausible Given Prior Probabilities    11
Example Subject Interactions
As a comparison to unaided performance, this section gives the reader an idea of how
the critiquing system interacted with a sample subject, detecting errors in performance and
steering the subject towards a successful solution path. This subject got the Pre-Test Case
wrong, but then got the rest of the cases right with the aid of the critiquing system. On two of
those Post-Test Cases, she initially had an incorrect solution set, but changed her answer in
response to the critiques she received.
On the Pre-Test Case, this subject correctly reviewed initial data about the patient, such
as the ABO/Rh and the Case History. After seeing that the initial Antibody Screen results were
positive, she selected a full panel for interpreting test results. There, she ruled out using
homozygous, non-reacting cells (a good strategy) and selected additional cells for further
analysis. At this point, she confirmed anti-Fyb and continued on to the next case. The correct
answer for the case was anti-E plus anti-K. Anti-Fyb accounted for most of the reactions, but it
did not fit the pattern of dosage (strength of reaction depending on the strength of the antigen)
and did not account for two of the reacting cells (one on the initial Antibody Screen test and
one on the Additional Cells panel). This subject's erroneous conclusion stemmed from
following an incomplete protocol. She did not try to rule out all remaining antibodies, and did
not run an Antigen Typing test as independent evidence leading towards her answer.
Furthermore, her answer did not account for two of the reactions seen, nor the strength of
reactions on the reacting cells. On the matched Post-Test Case, however, this subject correctly
followed a complete protocol, being sure to rule out all remaining antibodies besides the ones
marked as Confirmed, and successfully solved the case.
On Post-Test Case 2, the weak antibody case, the system alerted the subject to the fact
that since some of the reactions were weak, rule-out might not be an appropriate strategy. The
subject heeded this warning and enhanced the reactions before proceeding with rule-out. When
looking at the reactions on the Additional Cells, this subject tried to run another test, but the
system warned her that she could have ruled out more antibodies on that panel. Because of this
message, she continued to rule out on the Additional Cells panel that she was looking at, and
was able to finish ruling out all remaining antibodies besides anti-D. Thus, she confirmed anti-
D and continued with the next case. Here, the system aided her by suggesting that she enhance
reactions before ruling out and, once at such an enhanced phase of testing, checking to be sure
that she ruled out all of the antibodies possible. In this way, the system helped her to avoid
running extra tests which were not necessary.
On Post-Test Case 3, this subject solved the case to the point where anti-Fya and anti-E
were the only remaining antibodies. At this point, she confirmed anti-Fya (which accounts for
all of the reactions) and marked anti-E likely. The system reminded her that she had not ruled
out all antibodies besides anti-Fya and warned her that anti-E was confounded with anti-Fya.
This message prompted her to run the cells at Enzymes (a technique that will destroy Duffy
antibodies, including Fya, and enhance Rh antibodies, including anti-E). Thus, she was able to
expose the presence of anti-E and correctly add anti-E to her answer set.
On Post-Test Case 4, this subject proceeded to the point where all antibodies but
anti-Jkb, anti-c, and anti-E were ruled out. Again, she marked one of them as Confirmed (anti-Jkb),
but did not confirm or rule-out either of the other two remaining antibodies. The system
warned her that 1) she had not ruled out all remaining antibodies, 2) the confirmed antibody
did not account for many of the reactions exhibited, 3) it is rare to see anti-Jkb as the only
antibody, and 4) antibodies tend to form in a certain order, and that anti-c and anti-E would be
more likely to form before anti-Jkb. Thus, the system used knowledge about prior probabilities
as well as data specific to that case to warn the subject that her answer was implausible. Plus,
the system made the general remark that her protocol was incomplete (i.e., that she had not
ruled out all remaining antibodies). In response to these messages, the subject further examined
the case and included anti-E and anti-c in her answer set, thus getting the case right.
Gross Performance Measures
The following sections give the results for the overall misdiagnosis rates, a comparison
of the mistakes and slips made by the two groups, and a discussion of the questionnaire
results.
Statistical Comparison of Misdiagnosis Rates
This section gives the misdiagnosis rates of the two subject populations (expert and less-
skilled practitioners).
Expert Subjects
The group of four experts was tested prior to the group of thirty-two less-skilled
practitioners, primarily as a check to evaluate AIDA for usability and to make sure AIDA did
not create difficulties or induce new errors for skilled technologists. Briefly, two of these four
technologists were tested as the Control Group and two as the Treatment Group. All four
subjects got all of the cases (Pre-Test and Post-Test) correct on the first try. Thus, there is no
evidence from this data to suggest that the system interfered with expert problem-solving
performance. No further analyses were made regarding the expert subjects' performance with
the system, although the experts' responses to the questionnaire will be included in the section
discussing questionnaire results, since a couple of their suggestions merit consideration.
Less Skilled Subjects
Of the thirty-two less skilled blood bankers tested in the actual evaluation study, sixteen
were randomly assigned to the Control Group and sixteen to the Treatment Group. In
analyzing the data, it was discovered that on one of the two matched cases, the reactions to one
set of test results that was requested by four of the subjects were incorrect. Thus, the data from
those four subjects was discarded from the analyses that follow.
As would be expected, the results showed that there was no significant difference in
performances on the Pre-Test Case for the Control and Treatment Groups (using Fisher's exact
Test, see Table 2). Misdiagnoses were eliminated for the Treatment Group, dropping from 4/15
wrong on the Pre-Test Case to 0/15 wrong on the matched Post-Test Case 1, although this
difference failed to reach significance (using McNemar's Chi Square for dependent samples, χ2
= 2.25, p = 0.133, 1 df). The Control Group also failed to show a significant improvement in
performance from the Pre-Test Case to Case 1, as would be expected.
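The within-group statistic reported above can be reproduced from the discordant counts: four Treatment subjects went from wrong on the Pre-Test Case to right on Case 1, and none went the other way. A minimal sketch of McNemar's test with Yates' continuity correction:

```python
# McNemar's chi-square for dependent samples, with continuity correction.
# b and c are the counts of the two kinds of discordant pairs
# (here: wrong->right = 4, right->wrong = 0 for the Treatment Group).
def mcnemar_chi2(b, c):
    return (abs(b - c) - 1) ** 2 / (b + c)

print(mcnemar_chi2(4, 0))  # 2.25, which with 1 df gives p ~ 0.13
```

This matches the reported chi-square of 2.25 (p = 0.133, 1 df), i.e., the improvement falls short of significance despite the misdiagnoses being eliminated.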
Table 2. Pre-test/Post-test comparison of misdiagnosis rates.

Pre-Test Case: 2 antibodies looking like 1 (randomly chosen from one of two matched cases,
the other of which was Case 1)
Case 1: 2 antibodies looking like 1 (randomly chosen from one of two matched cases, the
other of which was the Pre-Test Case)

                   Pre-Test Case    Case 1        Significance
Control Group      6/14 wrong       5/14 wrong    NS
Treatment Group    4/15 wrong       0/15 wrong    NS
The between-subject comparisons showed marked differences in performance across
the two groups (see Table 3). On Cases 1, 3, and 4, all subjects in the Treatment Group solved
the cases correctly, while 5/15 of the subjects in the Control Group misdiagnosed Case 1, 6/16
misdiagnosed Case 3, and 10/16 misdiagnosed Case 4. Using Fisher's exact test, each of these
differences is statistically significant (p < 0.05). For the case that the system was not designed to
completely handle (Case 2), 8/16 subjects in the Control Group misdiagnosed the case
compared to 3/16 in the Critiquing Group. This improvement in performance is marginally
significant (p = 0.072). Thus, with the design of a critiquing system and checklist, we were able
to eliminate misdiagnoses on cases for which the system was designed (Cases 1 and 3) and on a
case for which the system was not explicitly designed but for which the system's knowledge
was appropriate (Case 4). Finally, misdiagnosis rates were reduced on a case for which the
system's knowledge was not fully competent (Case 2), but not significantly so.
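The per-case comparisons above use Fisher's exact test. As an illustrative recomputation (a pure-Python, one-sided version built from the hypergeometric distribution), the misdiagnosis counts for Cases 1 and 4 reproduce the significance levels quoted:

```python
# One-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]]:
# a = Control wrong, b = Control right, c = Treatment wrong, d = Treatment right.
# Sums hypergeometric probabilities over tables in which the Control group
# does as badly or worse, holding the margins fixed.
from math import comb

def fisher_one_sided(a, b, c, d):
    n, row1, col1 = a + b + c + d, a + b, a + c  # total, Control size, total wrong
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        p += comb(row1, k) * comb(n - row1, col1 - k) / comb(n, col1)
    return p

print(round(fisher_one_sided(5, 10, 0, 16), 4))  # Case 1: 0.0177 (< 0.05)
print(round(fisher_one_sided(10, 6, 0, 16), 6))  # Case 4: 0.000124 (< 0.001)
```

With all errors concentrated in the Control Group, the one-sided tail reduces to the single most extreme table, which is why the Case 4 comparison (10/16 vs. 0/16) is so strongly significant.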
Table 3. Post-Test Case results.

Case 1: 2 antibodies looking like 1 (randomly chosen from one of two matched cases, the
other of which was the Pre-Test Case)
Case 2: weak antibody (which the system was not designed to adequately handle)
Case 3: 1 antibody masking another
Case 4: 3 antibodies reacting on all cells (a case for which the system was not explicitly
designed, sent by another blood bank lab)

                    Case 1           Case 2           Case 3           Case 4
Control Group       5/15 (33.3%)     8/16 (50.0%)     6/16 (37.5%)     10/16 (62.5%)
Critiquing Group    0/16 (0.0%)      3/16 (18.75%)    0/16 (0.0%)      0/16 (0.0%)
Significance        p < 0.05         p = 0.072        p < 0.01         p < 0.001

(Entries are misdiagnosis rates: number wrong out of group size.)
Besides individual comparisons using Fisher's exact test, a log-linear analysis was run to
take into account the difference in performance on the Pre-Test Case. Both Treatment and
Control Groups were subdivided into whether or not the Pre-Test Case was correct. This
analysis gave very similar results on individual cases and gave a combined significance level
(Weiner, 1971) of p ≤ 0.000005 favoring performance for the Treatment group (see Table 4).
Table 4. Combining p-values given by the Log-Linear analysis of misdiagnosis rates on the
Post-Test Cases, taking into account performance on the Pre-Test Case.

Case    p-value    -ln(p)
1       0.0234     3.76
2       0.0957     2.35
3       0.0078     4.85
4       0.0002     8.52
                   Total = 19.47

χ2 = 2(19.47) = 38.94, df = 8, p = 0.000005
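The combined significance level above follows Fisher's method for combining independent p-values: chi-square equals minus twice the sum of the log p-values, with two degrees of freedom per test. A short recomputation from the Table 4 values:

```python
# Fisher's method: chi2 = -2 * sum(ln p_i), df = 2k for k independent tests.
from math import log

p_values = [0.0234, 0.0957, 0.0078, 0.0002]  # Cases 1-4 from Table 4
chi2 = -2 * sum(log(p) for p in p_values)
df = 2 * len(p_values)
print(round(chi2, 2), df)  # 38.94 8
```

This reproduces the reported chi-square of 38.94 on 8 df; the same method underlies Table 8's combination of p-values across the five error types.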
Tables 5 and 6 give a subject by subject breakdown of misdiagnoses per case. These
tables show whether a subject got a case right (represented by a 1) or wrong (represented by a 0) or, in
the case of the Treatment Group on the Post-Test Cases, got some feedback from the computer
regarding the plausibility of the answer. In this case, the table may show a series of answers
(such as 0-0-1), the last one indicating the correctness of the final answer given by the subject.
Table 5. Correctness of Answers, Treatment Group.
(0 = wrong, 1 = right. A series of numbers indicates the subject marked an answer more than
once, in response to critiques given by the computer.)

Subject   Pre-Test Case   Case 1     Case 2           Case 3    Case 4
T1        1               1-1-1-1    1                0-1       0-1
T2        0               1          0-0              1         1
T3        1               1          0-1              1         0-1
T4        1               1          1                1         0-0-0-0-1
T5        0               1          1-1              1         1
T6        1               1          0-0-0-0-0-0-1    0-0-0-1   1
T7        1               1          0-1              1         1
T8        1               1          0-1-1            1         1
T9        invalid data    1          0-0              1         1
T10       1               1          0-0-0            1         1
T11       1               1          1                1         1
T12       1               1          0-1              1         0-1
T13       0               1          1                0-1       0-1
T14       1               1          1                1         1
T15       0               1          1                1         1
T16       1               1          1                1         1
Table 6. Correctness of Answers, Control Group.
(0 = wrong, 1 = right.)

Subject   Pre-Test Case   Case 1         Case 2   Case 3   Case 4
C1        0               0              0        0        0
C2        0               1              0        1        0
C3        1               1              1        1        1
C4        0               1              1        0        0
C5        1               invalid data   1        1        0
C6        1               1              1        1        1
C7        0               0              1        1        0
C8        1               1              1        1        1
C9        invalid data    0              0        0        0
C10       1               1              1        1        0
C11       1               1              0        0        1
C12       0               0              0        0        0
C13       0               0              0        0        0
C14       1               1              0        1        1
C15       1               1              1        1        1
C16       invalid data    1              0        1        0
Slips vs. Mistakes
The behavioral data logs were examined for evidence that the errors detected by the
critiquing system were either mistakes (from subjects having either missing or incorrect
knowledge) or slips (from subjects either making unintentional actions or oversights according
to their current goals and knowledge). The identification of slips in the behavioral protocol data
was operationally defined as follows: the user provided evidence of forming the goal relating
to an action but either failed to carry it out (or carry it out completely) or carried it out in a way
that did not correspond to his/her current or known strategy. Any error messages relating to
the plausibility of the answer were considered to be mistakes. Thus, the subjects' data was
analyzed to develop a model of what inferences and procedures they knew how to perform
correctly, and that model was used to determine if a particular error was a slip or a mistake.
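The operational definition above can be expressed as a simple classification rule. The encoding below is illustrative only (the labels and function are hypothetical, not AIDA's code): plausibility critiques always count as mistakes, and any other error counts as a slip only when the subject had already demonstrated the corresponding goal or strategy elsewhere in the session.

```python
# Illustrative slip-vs-mistake classifier following the operational
# definition: plausibility errors -> mistakes; other errors -> slips only
# if the subject previously demonstrated the relevant goal/strategy.
def classify_error(error_type, demonstrated_goals):
    """error_type: symbolic label of the detected error;
    demonstrated_goals: set of strategies the subject has shown
    they know how to perform correctly."""
    if error_type.startswith("plausibility"):
        return "mistake"
    return "slip" if error_type in demonstrated_goals else "mistake"

known = {"rule_out_on_panel"}
print(classify_error("rule_out_on_panel", known))            # slip
print(classify_error("collect_converging_evidence", known))  # mistake
print(classify_error("plausibility:rare_antibody", known))   # mistake
```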
Using this classification scheme, the number of subjects in each group making a
particular kind of mistake or slip is shown in Table 7 (More detailed tables are shown in
Appendices F and G). On the Pre-Test Case, a comparable number of subjects in each group
made mistakes. On the Post-Test Cases, the number of subjects making process errors in
the Treatment Group is consistently lower. It is difficult to make comparisons across the Post-Test
Cases, since each case has different characteristics and thus different kinds of process errors are
likely to be made. It is appropriate, however, to compare performance from the Pre-Test Case
to the matched Post-Test Case, to see if fewer subjects in the Treatment Group make process
errors after receiving the training and the checklist. Individual comparisons made for each of
the five error types show that only Error Type 3 (Failure to Collect Converging Evidence) is
significantly reduced from the Pre-Test Case to the Post-Test Case (p < 0.01, McNemar's Chi
Square for dependent samples, df = 1). When the p-values are combined across the five Error
Types, however (see Table 8), there is an overall significance level of p < 0.01, showing that the
training and the checklist indeed reduced the number of subjects making process errors from
the Pre-Test Case to the Post-Test Case. In terms of slips made, there does appear to be an
increase in the number of slips made by the Treatment Group on the Post-Test Cases. This
difference is significant for Error Type 2 (Failure to Rule Out When Appropriate), (p < 0.05,
McNemar's Chi Square for Dependent Samples, df = 1). This significant increase may be due to
the fact that the protocol followed by members of the Treatment group was to initially mark low
frequency antibodies as Unlikely. Many subjects in the Treatment Group subsequently failed to
notice that a previously marked Unlikely antibody could be changed to Ruled Out, and this
would be detected by the system as an error of Type 2 and classified as a slip, since subjects had
previously demonstrated the intention and ability to rule out all antibodies on a panel.
Table 7. Number of Subjects Committing Each Type of Error at Least Once Per Case
(Mistakes are the first number shown and slips are shown in parentheses)

Error Type                    Group       Pre-Test   Case 1     Case 2   Case 3   Case 4
                                          Case
1. Rule Out Hypothesis        Control     10 (3)     13 (5)     11 (3)   12 (3)   11 (3)
   Incorrectly                Treatment    8 (5)      3 (7)      6 (8)    2 (4)    8 (6)
2. Failure to Rule Out        Control      7 (2)     10 (4)      5 (1)    9 (2)   11 (1)
   When Appropriate           Treatment    7 (2)      5 (10)*    4 (4)    3 (8)    5 (6)
3. Failure to Collect         Control     10 (0)     12 (0)     11 (0)   10 (0)   11 (0)
   Converging Evidence        Treatment   10 (0)      1** (0)    3 (0)    3 (0)    4 (0)
4. Data Implausible           Control      7 (0)      5 (0)      7 (0)    1 (0)    7 (0)
   Given Answer               Treatment    4 (0)      0 (0)      5 (0)    2 (0)    5 (0)
5. Answer Implausible Given   Control      5 (0)      7 (0)      5 (0)    4 (0)   13 (0)
   Prior Probabilities        Treatment    6 (0)      1 (0)      8 (0)    3 (0)    4 (0)

* Significant increase in number of slips made for process error Type 2 for the Treatment
Group from the Pre-Test Case to the Matched Post-Test Case (McNemar's χ2 = 4.9, p < 0.05, 1 df)
** Significant reduction in process error Type 3 for the Treatment Group from the Pre-Test
Case to the Matched Post-Test Case (McNemar's χ2 = 7.11, p < 0.01, 1 df)
Table 8. Combining p-values across the five error types.

Error Type    p-value      -ln(p)
1             0.2285279    1.48
2             0.5051982    0.68
3             0.0076654    4.87
4             0.1336145    2.01
5             0.0736382    2.61
                           Total = 11.65

χ2 = 2(11.65) = 23.30, df = 10, p < 0.01
Questionnaire Results
The questionnaire was administered to all members of the Treatment Group to get a sense
of how they viewed the software in terms of usability and utility. Two subjects did not fill out
the questionnaire. Responses to the questionnaire by the two expert subjects from the pilot
study are also included in the following section, because their comments were consistent
with those made by the less-skilled subjects. Four open-ended questions were asked:
1) How would you rate this software in terms of ease-of-use?
In response to this question, nine of the less-skilled subjects wrote that the software was
very easy to use and three said it was easy to use after initial explanation and practice. One
subject mentioned that she did not like the software: "I did not like it but I'm not a computer
person and I tend to want to do things concretely", and one mentioned that she, "had some
trouble with getting on the wrong line on panels". The two expert subjects from the pilot study
said, "I found it easy to use, but it seemed to take me longer to do the identification." and the
other said it was "quite easy after you get the hang of it."
2) Would you find this software useful for your job?
When asked, "Would you find this software useful for your job?", all of the less-skilled
subjects either said, "Yes" (8 Subjects), "Absolutely"(1 Subject), "Very Useful" (1 Subject), or
mentioned particular aspects of the software that would be useful such as: "... to show how... to
solve antibody problems" (1 Subject), "...it was useful in telling you when you made the wrong
assumptions" (1 Subject), "to maintain proficiency...", (1 Subject), and "...in teaching new
employees and students..."(1 Subject). The two expert subjects both concurred on this last point,
one saying that it would be "great for new employees and students," and the other saying it
would be "useful in teaching students." This subject also mentioned that, "If put into use in the
Blood Bank, all panel results would be saved on computer disc."
3) What did you like most about this software?
In response to this question, five of the less-skilled subjects mentioned ease of use
and/or interface features, such as: "[it was] user friendly", "It was fun to use!", "It made it very
easy to rule out antibodies.", and "The highlighting was very helpful when I started making use
of it." One of the expert subjects said, "I liked highlighting positive reactions and doing the rule
outs and the screen showing what was still left as possible antibodies."
Three of the less-skilled subjects mentioned aspects related to teaching/enforcing a
logical protocol, such as: "It takes you through the antibody identification step by step.", "The
format was very logical.", and "[it was useful] to update my thinking and have a set format to
help rule out and ID."
Finally, nine of the less-skilled subjects and one of the expert subjects specifically
mentioned the critiquing/error checking as useful. The less skilled subjects liked, "The ability to
view your answers and the computer interaction allowing you to see what the problems with
your answers are.", "The "beeps" to warn you of a possible mistake were helpful.", "That it alerts
you to things you may have missed.", "The thoroughness of the antibody testing -- you can't
complete the panels if you have not exhausted all possibilities.", "The explanations of problems
when you could not go to the next step.", "It finds things that I may have skipped or missed and
then makes good suggestions.", "The capability of letting the tech know about the wrong or
inconsistent results by audible alarm.", and "The error check is great, particularly in multiple
antibodies ID with additional alleles." The expert subject said she liked, "The corrections if you
made an incorrect assumption."
4) What would you suggest to improve this software (including additional functions that you
would like it to perform)?
In response to this question, eight subjects had no suggestions. Two of the less-skilled
subjects and both of the expert subjects mentioned aspects related to the interaction. The less-
skilled subjects said in turn that they would like, "A chance to skip steps if you knew the
answer but the need to confirm that you knew it even though you didn't complete all steps.",
and "To be able to look at a screen (say a panel) and be able to go back out without it telling you
to put in answers." One of the experts said, "I felt I was flipping back and forth from screens.
Especially the weak reaction that was stronger with increased serum/cell and enzyme." The
other expert subject said, "I would like it to give you a chance to correct yourself if you hit the
wrong button when you rule out."
Two of the less-skilled subjects mentioned interface issues as being difficult, specifically
that they had difficulty manipulating the hierarchical menu and trackball.
Finally, two of the less-skilled subjects mentioned information needs, such as: "The
ability to view the correct answer at the end of each case." and "Maybe [having] another set of
additional cells for ruling out, otherwise great software!"
Thus, overall, the results from the questionnaire were very positive. All of the subjects
had good comments and almost all of them seemed to welcome the feedback and training
provided by the critiquing aspects of the software. All of the subjects also thought that the
software would be useful for their job and an overwhelming majority mentioned specific
features that they liked, such as the ability to see things better on the screen, the logical format
of the protocol enforced, and the error checking to aid them if they made mistakes or were
stuck.
In terms of suggested improvements, it may be necessary to re-design the menu system
so that no hierarchical menu selections are necessary and to always provide a mouse to subjects
as an alternative to the trackball for input, since these were mentioned as being difficult to use
by a few of the subjects. One subject mentioned navigation issues, finding it difficult to
compare results across screens. This problem merits further consideration, with perhaps a re-
design of the way information is displayed so that values that need to be compared are
visible at the same time, in a way that supports the users' goals.
Two of the other suggestions merit consideration and are difficult design challenges.
The subject who wanted, "a chance to skip steps if you knew the
confirm that you knew it even though you didn't complete all steps" was asking for a computer
that was smart enough to know when a person was able to skip some of the steps in a
procedure. This is a very difficult design dilemma, and goes against our philosophy of the
design of this critiquing system. As will be argued later, the success of this critiquing system in
reducing misdiagnoses relied heavily on its enforcement of the protocol (although
subjects could override any critiques), reminding subjects to consider all of the data that they
had encountered in a case, and to collect sufficient data to confirm an answer.
The subject who wanted, "to be able to look at a screen (say a panel) and be able to go
back out without it telling you to put in answers" was referring to a similar problem, but
perhaps on a smaller scale. This subject wanted the system to refrain from critiquing when new
test results were requested. The design of this system is such that it gives three types of
critiques: 1) it checks for errors of commission as soon as the subject performs an action while
viewing a panel of test results (such as ruling out an antibody); 2) it checks for errors of
omission related to that particular panel (such as failing to rule out an antibody) as soon as the
subject wants to leave a screen to view a different test result; and 3) it checks the plausibility of
the answer when the subject says they are done with the case, along with "overall" errors of
omission, i.e., a failure to complete the full protocol and to collect converging evidence. Thus,
one could re-design the system such that it did not immediately check for errors of omission
when the subject left a panel. However, there is a tradeoff: a message that is displayed at a
later time may be out of context when the error is finally pointed out. Furthermore, if the
omission was important to the problem-solving, fixing it may eliminate the subject's original
reason for viewing the new test result.
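The three trigger points can be sketched as follows. The function names and data representation here are illustrative assumptions for exposition, not the system's actual implementation:

```python
def check_commission(ruled_out, justified):
    """Type 1: immediately flag any rule-out not justified by the
    reactions on the current panel."""
    return sorted(ab for ab in ruled_out if ab not in justified)

def check_omission(justified, ruled_out):
    """Type 2: when the user leaves a panel, flag rule-outs the panel
    supported but the user did not make."""
    return sorted(justified - ruled_out)

def end_checkers(answer, all_reactions_explained, protocol_complete):
    """Type 3: end-of-case checks for overall omissions and an
    answer inconsistent with the data."""
    msgs = []
    if not protocol_complete:
        msgs.append("incomplete protocol: collect converging evidence")
    if not all_reactions_explained:
        msgs.append(f"answer {answer} does not account for all reactions")
    return msgs

# A user rules out E and K on a panel whose reactions justify only K and c:
print(check_commission({"E", "K"}, {"K", "c"}))  # ['E'] flagged at once
print(check_omission({"K", "c"}, {"K"}))         # ['c'] flagged on leaving
print(end_checkers("anti-E", False, True))       # inconsistency at the end
```

The re-design discussed above amounts to moving some type-2 checks into the type-3 stage, at the cost of the message appearing out of its panel context.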
This problem is worth thinking about some more, since an error message at this
juncture may interrupt the users' current thought processes and lead them on an error-recovery
track that makes them forget their original intention of selecting the new test. It might be
possible to push some of the less important checks to the end of the case. For example, the
system currently checks that a person has marked whether or not an auto-antibody is present
when the Polyspecific Albumin panel is viewed (since that is where this information is located),
but this information is not usually directly relevant to the rest of the problem-solving, so
perhaps this kind of a check could be done later, thereby reducing the number of times that this
message might interrupt practitioners when it is not currently relevant to them.
The subject who said, "I would like it to give you a chance to correct yourself if you hit
the wrong button when you rule out" gave a good suggestion but also one that would be
difficult to implement. Perhaps a small delay could be introduced, to allow the subject a short
amount of time to undo an action without a critique from the computer. For subjects who are
very fast at ruling out, though, the system may end up giving an error message after they have
ruled out another antibody, and then the message would be out of context. This is a design
change that would need to be tested to see whether it works well. Delaying the check for
errors of commission until a user selects another test panel would only exacerbate the problem
of the system interrupting the user's thought process out of context.
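One way to prototype the suggested grace period is to queue each commission critique and fire it only if the action survives for a short undo window. The sketch below is a hypothetical design, not part of the tested system; the two-second delay is an assumed value, and timing is modeled with a logical clock so the idea can be exercised directly:

```python
DELAY = 2.0  # assumed: seconds a user has to undo before a critique fires

class PendingCritique:
    def __init__(self):
        self.pending = {}  # action -> time at which it was performed

    def on_action(self, action, now):
        """Queue a commission check instead of firing it immediately."""
        self.pending[action] = now

    def on_undo(self, action):
        """Action undone within the window: suppress the critique."""
        self.pending.pop(action, None)

    def due(self, now):
        """Return critiques whose undo window has expired."""
        return [a for a, t in self.pending.items() if now - t >= DELAY]

pc = PendingCritique()
pc.on_action("rule out anti-K", now=0.0)
pc.on_undo("rule out anti-K")        # user catches the slip in time
pc.on_action("rule out anti-E", now=1.0)
print(pc.due(now=3.5))               # ['rule out anti-E']
```

As noted above, the risk is that a fast user has already moved on by the time the window expires, so any such delay would need empirical testing.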
The next suggestion that was made for improving the software was: "The ability to view
the correct answer at the end of each case." The system does not know the answer to any of the
cases, so it would not be able to tell the subjects the answer (unless the answer was programmed
in ahead of time, a possibility in a test situation but not in real lab settings). The final
suggestion made was "Maybe another set of additional cells for ruling out", meaning that the
subject wanted more test cells available. The number of additional cells that we provided,
however, was comparable to the number of additional cells that most labs have to work with
and was certainly sufficient for solving these cases. Thus, this suggestion is not realistic in a lab
setting and would encourage inefficient diagnosis practices.
Detailed Analyses
Besides summary statistics and questionnaire results, more detailed analyses of the
less-skilled subjects' behavior were conducted from the behavioral protocol logs that were
automatically generated by the computer, to determine whether important behaviors with the
system, whether beneficial or detrimental, could be identified.
Proactive Training vs. Reactive Feedback (Critiquing)
One interesting question is to what extent the improvement in performance of the
Treatment Group is due to the initial training and use of the checklist (proactive training), and
to what extent it is due to the presence of the critiquing system monitoring their performance
(reactive feedback). Clearly, a large improvement is seen from the Pre-Test Case to the
matched Post-Test Case in terms of outcome errors (which were eliminated) and process errors.
This indicates that the proactive training with the checklist was immediately helpful to subjects
and helped to significantly improve their procedural performance.
It is also interesting to note, however, that subjects in the Critiquing Group did not
always get a case right immediately (see Table 5). Even though they eventually got the right
answer on almost all of the Post-Test Cases, this was not without assistance from the computer.
For example, in 18 instances, subjects indicated that they were done with a case and the
computer detected one or more errors (see Table 5). Fourteen of the detected errors were
errors of omission in the procedure (an incomplete protocol), 17 were inconsistencies between
the marked answer and the observed reactions, and 16 were answers implausible given prior
probabilities, for a total of 47 errors detected. On 16 out of the 18 cases, the subjects' answers
were wrong and of those 16, 13 subjects subsequently changed their answer to the correct one
because they were prompted by the critiques to re-examine the case (remember that the
computer does NOT know any of the answers and is merely checking for particular kinds of
process and intermediate inference errors). Thus, besides having evidence of the benefits of a
proactive training approach, we also have evidence that the presence of the critiquing system,
giving context-specific critiques in response to the current case situation and the person's state
of problem-solving, was beneficial in improving overall performance.
The Timing of the Critiques
In order to examine in more detail the timing of the critiques (i.e., should there be
immediate feedback about errors of omission on the current test panel when the person selects a
new test panel?), an analysis was made of the number of times that a person selected a new test
panel, got an error of omission for the current test panel, and then subsequently failed to re-
select the same new test panel. In order to determine whether the critiquing was interrupting
the person's thought process, each situation was examined to determine if 1) the subsequent
change in selection was due to a slip in selecting the test, 2) whether the new test was no longer
needed after fixing the critique, 3) the change in selection appeared not to matter significantly or
4) it appeared that the critique clearly interrupted the subject's thought process. Further, it was
noted which message was being displayed, either A) the person did not mark the auto control,
R) the person did not do all of the rule-outs possible, or U) the person did not mark all of the
unlikely antibodies as Unlikely.
The most common scenario is that the person did not complete all of the rule-outs
possible on a test panel and, after adhering to the message and finishing the rule-outs, no longer
needed to view the subsequent test (this happened to ten subjects). This suggests that the
critique for the rule-out error of omission is appropriate to display at that time since, in fixing
the errors detected by this message, the need for selecting the subsequent test was eliminated.
The Auto-Control message may or may not have interrupted the users' thought processes since,
in two instances subjects changed their subsequent test selection but it was impossible to tell
from the context if that change in selection was important to the overall problem-solving. The
Unlikely message clearly interrupted one subject in two instances. In both those instances, she
selected the Case History menu item, but received a message saying that she had forgotten to
mark all of the unlikely antibodies. After remedying the situation, she then failed to select the
Case History again, and thus missed viewing possibly important information.
These scenarios illustrate that messages that are not directly relevant to a specific panel
should perhaps be displayed at a later time (i.e., when the person has indicated that s/he is
done), to avoid interruption of users' thought processes as much as possible. Alternative
solutions to this problem may also be found, such as reminding the person where they had
intended to go next so that they can decide whether or not to pursue the same course of action
after fixing a problem. In trying to decide when to display a particular critique, one question
that designers must ask themselves is the following: "Will the immediate remediation of the
problem being pointed out possibly preclude the reason for selecting further tests?" If the
answer to that question is yes, then the message should be displayed immediately. If no, then
the message should perhaps be displayed at a later time.
The problem of when to display a critiquing message deserves some higher-level
thought, particularly if critiquing is going to be used as a decision support strategy in a higher-
tempo, increased workload situation. Studies of multi-person teams working together in a high-
stress situation show that a person will judge the "interruptability" of another person before
making a suggestion to a co-worker (Johannesen, Cook and Woods, 1995). Designers of
critiquing systems should further consider how to give critiques in context while minimizing
the extent to which those critiques interrupt users' thought processes.
Subjects Overriding the Critiques
In 10 out of 249 instances, subjects chose to override a critiquing message. One
message was a rule-out error of omission when leaving the initial Antibody Screen panel (the
subject had ruled out most of the antibodies possible but had missed one). In the other nine
instances, the subject overrode the computer in response to one of the "end-checkers" (the
messages generated at the end of a case when the subject has confirmed a set of antibodies and
the computer detects errors of omission in their protocol, or a problem with the plausibility of
the answer given prior probability information or given data in the case). These end-checkers
fire one at a time in a particular order.
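A chain of end-checkers that fire one at a time can be sketched as an ordered list of predicate/message pairs, with only the first failing check reported. The structure and names below are illustrative, not the system's actual code:

```python
def first_failing(checks, state):
    """Return the message of the first check that fails, or None."""
    for passes, message in checks:
        if not passes(state):
            return message
    return None

# Checkers in firing order: protocol completeness, then data consistency,
# then plausibility given prior probabilities.
checks = [
    (lambda s: s["rule_outs_complete"],
     "Complete remaining rule-outs before finishing the case"),
    (lambda s: s["answer_accounts_for_data"],
     "The marked answer does not account for all reactions"),
    (lambda s: s["answer_plausible"],
     "The answer is implausible given prior probabilities"),
]

state = {"rule_outs_complete": False,
         "answer_accounts_for_data": True,
         "answer_plausible": False}
print(first_failing(checks, state))
# -> Complete remaining rule-outs before finishing the case

# An override can be modeled by dropping the waived check and re-running:
print(first_failing(checks[1:], state))
# -> The answer is implausible given prior probabilities
```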
If the subject has not ruled out all remaining clinically significant antibodies, then a
general message is displayed, saying that it is a good idea to complete rule-outs before finishing
a case. One subject overrode this message twice and another subject overrode this message
three times. In all five of these instances, the next "end-checker" message that was generated,
pertaining to the plausibility of the answer, was adhered to by the subject.
In two instances, subjects on the weak D case had marked anti-D plus anti-E as their
answer and, due to a bug in the program, received the message that anti-E as the only Rh
antibody in an Rh negative patient is rare. This message should not have fired, since the
subjects had marked more than one Rh antibody, and both subjects receiving it ignored it. This
gives some evidence that buggy knowledge in the system will be detected by subjects and
appropriately ignored.
In one instance, a subject had marked anti-E alone as the answer on the weak D case
and correctly received the end-checker message that says that anti-E alone in an Rh negative
patient is rare. This subject overrode the computer and got the case wrong. In another
instance, a subject marked anti-K as the answer on the weak D case and got another "end-
checker" message, namely that Kell antibodies (including anti-K) do not normally react in the
pattern seen on the case. This subject overrode the computer and got the case wrong. Further
analysis of performance on this case is given in the next section.
Analysis of the Weak D Case
It is worth examining the Weak D case in more detail, because this case is one where the
knowledge in the system is not fully competent for solving the case. Thus, there is the potential
for "brittleness" in the computer's reasoning. The question to consider is if the critiquing system
is still helpful to subjects solving such a case and if so, what are the mechanisms that are
contributing to this improvement? One measure of performance is the misdiagnosis rate for the
two groups. The Control Group had a 50% misdiagnosis rate, as compared to the Treatment
Group's 18.75% (p = 0.072). This difference is marginally significant,
suggesting a trend towards improved performance with the critiquing system. Thus, one may
ask what aspects of the critiquing system are contributing to this possible improved
performance.
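The reported p-value comes from the Eberhardt and Fligner (1977) test for two proportions. As an illustrative cross-check (not the analysis actually used), a one-sided Fisher exact test on the same counts, 8 of 16 misdiagnoses in the Control Group versus 3 of 16 in the Treatment Group, also lands in the marginally significant range:

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """P(X >= a) for a 2x2 table [[a, b], [c, d]], where X is the
    top-left cell under the hypergeometric null."""
    row1, col1, n = a + b, a + c, a + b + c + d
    return sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, min(row1, col1) + 1)) / comb(n, row1)

# Control: 8/16 misdiagnosed; Treatment: 3/16 misdiagnosed
p = fisher_one_sided(8, 8, 3, 13)
print(round(p, 3))  # ~0.068, consistent with the reported marginal trend
```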
Three design features in particular had the potential to aid subjects. The first was the
application of some "meta-knowledge" such that the critiquing system is aware that its rule-out
strategy is fallible in the case of weak reactions. The system's "solution" in this case is to warn
the user that ruling out when there are weak reactions is dangerous, and the system suggests
trying to enhance the reactions first. The second possibly helpful design feature is the use of
prior probability information when examining the plausibility of an answer. In particular, one
common misdiagnosis on this case (based on the case characteristics and previous testing of this
case on practitioners) is anti-E, since anti-E accounts for the weak reactions on the initially
displayed test reactions. As part of its check for the plausibility of answers, the system "knows"
that anti-E is a rare finding when the patient is Rh negative and anti-D has not been confirmed
(as is the case here). Thus, the system displays the following message: "Anti-E as the only Rh
antibody is uncommon in an Rh negative person. Normally anti-D would form first. It would
be better to double check and ensure that anti-D is not present by doing an enzyme panel, by
increasing the serum:cell ratio or by using some other enhancement technique. (In addition, if
this patient is a pregnant woman, check to see if she has been administered RhIG.)" Such a
message may prompt the subject who marks
this answer to re-consider the answer to the case. Other "prior probability" messages could
potentially be instantiated if the person marks an answer other than anti-E. Furthermore, an
answer besides anti-E will not account for all of the exhibited reactions and would thus cause
the system to warn the user that the answer given does not account for all of the data seen on
the case.
Figures 10 and 11 show the paths taken when solving the Weak D Case for the Control
and Treatment Groups, respectively. (One subject's protocol data was lost due to a computer
error; thus it is not clear how he arrived at the correct answer, and his path is shown as a
question mark in Figure 11.) In comparing these two figures, we see that a comparable number
of subjects in each group successfully solved the case "on their own" by either waiting to rule out
until the reactions were enhanced (5 subjects in each group) or by enhancing the reactions after
having ruled out D and subsequently confirming D (3 subjects in the Control Group and 2
subjects in the Treatment Group), such that 8 subjects in each group initially solved the case and
8 misdiagnosed it. However, the Treatment Group had the benefit of the critiquing messages
that check the plausibility of a solution. Five of the subjects receiving such a message
subsequently changed their answer to correctly include anti-D as part of their answer. In
conclusion, it is unclear if the "warning" at the beginning of the case aided subjects at all, but the
end-of-case error checking (that checks the plausibility of an answer) was clearly beneficial. The
benefit of these end-checkers is evident on other cases as well, causing subjects to correct their
answers in three instances on Case 3 and in five instances on Case 4 (see Table 5).
Figure 10. Paths taken to solve the Weak D Case, Control Group
Figure 11. Paths taken to solve the Weak D Case, Treatment Group
When to Use Critiquing vs. Some Other Form of Decision Support
Critiquing appears to be a decision support strategy very well matched to the domain
of antibody identification. To what extent can these results be generalized to other domains? In
general, the approach that has been taken in the design of this critiquing system is to examine
the constraints on the task and check to be sure that the person's solution to the problem does
not violate any of those constraints (by checking that the solution takes into account all of the
data that is currently available, as well as general domain knowledge that also should be taken
into account when generating a solution). These two techniques are applicable in almost any
domain and should definitely be considered in the design of any decision support system.
There are many other benefits of using the critiquing approach to decision support.
First, practitioners will not lose skill on a task (a typical problem with the introduction of
automation). In fact, users of a critiquing system have the potential to learn from the computer's
critiques. Thus, the same system that is used for on-the-job decision support can also be used
for training or for maintaining job proficiency.
Second, users of a critiquing system are likely to build up a better mental model of this
type of decision support system than of one that generates a solution for the user, because the
system provides feedback in the context of the user's formulation of the problem, and only
when a discrepancy is detected.
Third, there is a growing body of evidence that practitioners are more likely to question
a garden-path result and/or explore more alternatives if doing the task themselves. For
example, results from a study by Layton, Smith, and McCoy (1994) showed that much of what
triggers practitioners to apply their expertise at the appropriate time when solving a difficult
problem is data-driven. Critiquing systems can support users as they solve a problem by
providing context-specific feedback. More often than not, tasks are such that a problem-solver
is confronted with too much information, or information that is not "pre-packaged", so that
much seemingly disparate information has to be accounted for when generating a solution.
Once a solution has been generated, a typical "mistake" made by a problem-solver is that
flawed aspects of the plan or solution are overlooked (because of time constraints, a failure to
understand how the solution has violated some of the constraints, or biases/memory
distortions such as biased assimilation). A critiquing approach to decision support is such that
the person generates a solution or plan, and the computer system analyzes the plan based on
pre-defined constraints.
Thus, in any domain where a computer system can take into account case-specific
parameters when analyzing the plan and give context-sensitive feedback, a critiquing system is
a viable form of decision support. Finally, besides considering critiquing as the final and only
form of decision support, one could also use critiquing as a stepping stone to automation or in
conjunction with automation. For example, a computer system which generates alternative
solutions to a problem could also use its knowledge to critique any additional plans generated
by a human.
Chapter VII
Conclusion
This study focused on how to design a cooperative decision support system. By
cooperative, it is meant that the human problem-solver and the computerized support system,
working interactively in partnership, perform better than either would working alone.
Thus, if the design is effective, the computer should be able to detect and correct errors made by
the human and the human should be able to detect and correct errors made by the computer,
and the design of the support system should help to enhance the user's performance by helping
to trigger or stimulate the application of relevant expertise. This contrasts with the automation
model of decision support, which tries to reduce human error by replacing the fallible human
with an automated decision aid. This latter philosophy breaks down if the automated decision
aid is also fallible. The form of cooperative support studied here was to develop a
representation (in the form of a checklist) to provide guidance in the form of a high level goal
structure, and to have the computer act in a critiquing role, monitoring the human's problem-
solving process for faulty reasoning steps.
The study presented here builds upon previous work examining the effectiveness of
critiquing as a form of decision support. In particular, one previous study provided initial
empirical evidence that critiquing is an effective approach to support cooperative problem-
solving even on a case where the computerized decision aid is fallible, whereas an automated
system was shown in that study to significantly worsen performance on this case by 29%. In the
present study, the Treatment Group (supported by the written checklist and a more complete
critiquing system than the one tested previously) correctly identified this same case 32% more
often than the Control Group, even though the critiquing system was still not fully competent
on that case. These results provide further evidence that a critiquing system does not make
performance any worse than a person working alone when the computer's reasoning is faulty.
Furthermore, on cases where the computer's reasoning was fully competent, misdiagnoses
were eliminated for subjects using the critiquing/checklist system, whereas subjects with no
decision support were misdiagnosing cases 33% to 64% of the time.
Critiquing, although not explored to date by very many researchers as a form of
decision support, seems to be a viable solution for greatly improving performance on certain
kinds of tasks, including the important, real world medical diagnosis task of antibody
identification. Clearly, this is a task that medical technologists find difficult, since many of them
are getting moderately difficult, yet realistic, patient cases consistently wrong when unassisted.
A well-designed critiquing/checklist system has proven to be a method for virtually eliminating
the errors that it was designed to catch, and for aiding on cases for which its knowledge is
incomplete.
A systems approach was taken in the design of this decision support system. This
systems approach led us to design a computer system that revolved around the application of a
complete protocol, using a number of complementary problem-solving strategies to
independently converge on an answer. The critiquing model of interaction was employed so
that the human practitioners could stay involved in the task, apply their own expertise, learn
from the computer, and judge the computer's feedback in a context-sensitive manner. There
was evidence that the critiquing system aided subjects by catching slips and mistakes and
helping users to recover from these errors, employing five different types of error-checking
mechanisms (checking for errors of commission, checking for errors of omission, checking for
an incomplete protocol, checking that the data were consistent with the answer, and checking
that the answer was plausible given prior probabilities). The use of a checklist was beneficial in
quickly training subjects on the high-level goal structure implicit in the computer's knowledge
base, and served as a reminder to subjects of the steps necessary to successfully solve a case.
Finally, the success of the system's interaction with the user relied on its unobtrusive interface
that allowed subjects to naturally solve antibody identification cases as they normally would
using paper and pencil, while providing the computer with a rich set of data regarding the
characteristics of the case and the user's problem-solving steps without requiring the
practitioner to enter information that was outside of normal task requirements.
Subjects in this study reported that the system was easy and fun to use, and that it
would be helpful on the job for identifying antibodies. Subjects also noted that the same system
could be used for maintaining proficiency and/or training new hires on the procedures for
solving antibody identification cases. Thus, a correctly designed critiquing system can not only
immediately improve overall performance by catching slips and mistakes in a more cooperative
and less obtrusive manner than many automated systems, but it also has the potential to
transfer much of its knowledge and strategies to the person by the nature of the interactions.
One issue that remains to be explored is the extent to which the results from this study
can be generalized to other domains. Prototype critiquing systems have been developed for
medical diagnosis tasks, design tasks, and training tasks, but were not extensively tested in
realistic domain settings. An important characteristic to consider when trying to decide
whether critiquing is a viable alternative is the extent to which the computer can
unobtrusively gather data about the task situation and the person's reasoning process, so as to
have the necessary information to generate timely and appropriate critiques.
Several guiding design concepts were successful in the design of this critiquing system,
and may be generalized to other domains and other types of decision support:
1) Design the interface so the human naturally uses the computer system as an integrated
information-based tool, while at the same time providing the computer with adequate
information about the task situation and the human's reasoning process.
2) Develop a knowledge base that checks for context-specific errors in reasoning including:
a) errors of commission,
b) errors of omission,
c) a solution that is not based on converging evidence,
d) a solution that is not consistent with all of the available data, and
e) a solution that is not plausible given prior probabilities.
3) Develop a representation (such as a written checklist) that outlines the higher-level goal
structure expected by the computer system. This can have several benefits:
a) the checklist serves as an external memory aid, reminding users of certain types of
knowledge related to the task, such as factual and procedural information,
b) the checklist provides the user with an appropriate frame of reference for
interpreting any feedback given by the computer. In other words, use of the
checklist is a way for designers to ensure that both the computer system and the
practitioners using it have a common frame of reference for communication and
understanding, and
c) the checklist provides an alternative form of aid to practitioners in situations for
which the critiquing system is not helpful. For example, if a practitioner gets stuck
during a case, s/he can review the checklist to see if there are any other tests or
knowledge that may be applicable, since the checklist lists the goals that should be
completed before finishing a case.
This study goes well beyond previous analyses of critiquing systems for a number of
reasons:
1) A complete, usable system with a direct manipulation interface was developed, which
allowed practitioners to work normally and provided detailed information to the critiquing
system about the task context and the user's problem-solving behavior.
2) The critiquing system contains error-checking knowledge that is based on a detailed
cognitive task analysis of the strategies and kinds of errors made by practitioners.
3) The system was tested with certified practitioners on realistic, difficult patient cases. Both
highly experienced and less experienced subject populations were studied and it was shown
that the critiquing system did not interfere with the former group and significantly helped
the latter group.
4) A behavioral protocol analysis showed how and why the critiquing system was helpful,
pointing to error checking mechanisms that were particularly beneficial (such as the "end-
checkers" that check the answer for plausibility and check for errors in the person's overall
protocol that led to the answer). This protocol analysis also led to the conclusion that both
the initial training with the checklist and the presence of the critiquing system contributed
to improvements in performance.
5) The protocol analysis uncovered areas for further study, such as the issue of when to
display a critiquing message so that it is in context but not interrupting the user's thought
processes.
6) The experimental design allowed us to make both within-subjects and between-subjects
comparisons of performance, showing that the critiquing system eliminated misdiagnoses
on three difficult cases and also reduced the misdiagnosis rate on a fourth case for which
the system's knowledge base was not fully competent.
7) We had evidence from the questionnaire that subjects enjoyed using the system and would
find it useful on the job.
Finally, it should be noted that the data collection technique used for this study was
quite successful for examining behaviors and strategies employed by subjects. The computer
automatically generated a time-stamped behavioral protocol log, based on the actions
performed when using the computer. The computer also generated an error report, logging all
of the types of errors made by each subject on each case. These data logs were relatively easy to
analyze for evidence of different kinds of problem-solving behaviors and could be re-analyzed
as new important characteristics were identified. Furthermore, these computer-generated data
logs and error reports will allow for long-term and remote data collection for future studies
when the system is introduced into widespread use.
List of References
Aikins, J., Kunz, J., & Shortliffe, E. (1983). PUFF: An expert system for interpretation of pulmonary function data. Computers and Biomedical Research, 16, 199-208.
Bernard, J. A. (1989). Applications of artificial intelligence to reactor and plant control. Nuclear Engineering and Design, 113, 219-227.
Berner, E., Webster, G., Shugerman, A., Jackson, J., Algina, J., Baker, A., Ball, E., Cobbs, C., Dennis, V., Frenkel, E., Hudson, L., Mancall, E., Rackley, C., & Taunton, O. (1994). Performance of four computer-based diagnostic systems. The New England Journal of Medicine, 330(25), 1792-1796.
Bernelot Moens, H. J. (1992). Validation of the AI/RHEUM knowledge base with data from consecutive rheumatological outpatients. Methods of Information in Medicine, 31, 175-181.
Billings, C. E. (1991). Human-Centered Aircraft Automation Philosophy: Concepts and Guidelines. NASA Ames Research Center, NASA Technical Memorandum 103885.
Brannigan, V. M. (1991). Software quality regulation under the Safe Medical Devices Act of 1990: Hospitals are now the canaries in the software mine. Proceedings of the Fifteenth Annual Symposium on Computer Applications in Medical Care, American Medical Informatics Association, November 17-20, Washington D.C. p. 238-242.
Chandrasekaran, B. (19xx). Generic tasks as building blocks for knowledge-based systems: The diagnosis and routine design examples. The Knowledge Engineering Review.
Console, L., Conto, R., Molino, G., Ripa di Meana, V., & Torasso, P. (1991). CAP: A critiquing expert system for medical education. In M. Stefanelli, A. Hasman, M. Fieschi, & J. Talmon (Ed.), AIME 91 Proceedings of the Third Conference on Artificial Intelligence in Medicine, 44 (pp. 317-327). Maastricht: Springer-Verlag.
Eberhardt, K. R., and Fligner, M. A. (1977). A comparison of two tests for equality of two proportions. American Statistician, 31(4), 151-155.
Fischer, G., Lemke, A., Mastaglio, T., and Morch, A. (1991). Critics: an emerging approach to knowledge-based human-computer interaction, International Journal of Man-Machine Studies, 35, 695-721.
Fischer, G., Lemke, A., & Mastaglio, T. (1990). Using critics to empower users. In CHI '90 Human Factors in Computing Systems Conference Proceedings (pp. 337-347). New York: Association for Computing Machinery.
François, P., Robert, C., Astruc, J., Begue, P., Borderon, J., Floret, D., Lagardere, B., Mallet, E., Pautard, J., & Demongeot, J. (1993). Comparative study of human expertise and an expert system: Application to the diagnosis of child's meningitis. Computers and Biomedical Research, 26, 383-392.
Fraser, J. M., Strohm, P., Smith, J. W. J., Galdes, D., Svirbely, J. R., Rudmann, S., Miller, T. E., Blazina, J., Kennedy, M., & Smith, P. J. (1989). Errors in abductive reasoning. Proceedings of the 1989 IEEE International Conference on Systems, Man, and Cybernetics, 1136-1141.
Fraser, J. M., Smith, P. J., & Smith, J. W. (1992). A catalog of errors. International Journal of Man-Machine Systems, 37, 265-307.
Galdes, D. (1990). An Empirical Study of Human Tutors: The Implications for Intelligent Tutoring Systems. Ph.D. Dissertation, The Ohio State University.
Gamerman, G. E. (1992). FDA regulation of biomedical software. Proceedings of the Sixteenth Annual Symposium on Computer Applications in Medical Care, American Medical Informatics Association, p. 745-749.
Giboin, A. (1988). The process of intention communication in advisory interaction. IFAC Man-Machine Systems, 365-370.
Gregory, D. (1986). Delimiting expert systems. IEEE Transactions on Systems, Man, and Cybernetics, SMC-16(6), 834-843.
Guerlain, S., Smith, P. J., Miller, T., Gross, S., Smith, J., & Rudmann, S. (1991). A testbed for teaching problem-solving skills in an interactive learning environment. Proceedings of the Human Factors Society, 1408.
Guerlain, S. (1993a) Designing and Evaluating Computer Tools to Assist Blood Bankers in Identifying Antibodies. Master's Thesis, The Ohio State University.
Guerlain, S. (1993b). Factors influencing the cooperative problem-solving of people and computers. In Proceedings of the Human Factors and Ergonomics Society 37th Annual Meeting, 1 (pp. 387-391). Seattle, WA.
Guerlain, S., Smith, P. J., Gross, S. M., Miller, T. E., Smith, J. W., Svirbely, J. R., Rudmann, S., & Strohm, P. (1994). Critiquing vs. partial automation: How the role of the computer affects human-computer cooperative problem solving. In M. Mouloua & R. Parasuraman (Eds.), Human Performance in Automated Systems: Current Research and Trends (pp. 73-80). Hillsdale, New Jersey: Lawrence Erlbaum Associates.
Harris, S. D., & Owens, J. M. (1986). Some critical factors that limit the effectiveness of machine intelligence technology in military systems applications. Journal of Computer-Based Instruction, 13(2), 30-34.
Hickam, D., Shortliffe, E., Bischoff, M., Scott, A., & Jacobs, C. (1985). The treatment advice of a computer-based cancer chemotherapy protocol advisor. Annals of Internal Medicine, 103(6 pt 1), 928-936.
Jones, P. and Mitchell, C. (1995). Human-computer cooperative problem solving: Theory, design, and evaluation of an intelligent associate system, IEEE Transactions on Systems, Man, and Cybernetics, 25(7), 1039-1053.
Johannesen, L., Cook, R., & Woods, D. (1995). Grounding Explanations in Evolving, Diagnostic Situations (Center for Cognitive Science Technical Report No. 14). Columbus, OH: The Ohio State University.
Josephson, J. and Josephson, S. (1994). Abductive Inference: Computation, Philosophy, Technology. Cambridge University Press.
Langlotz, C. P., & Shortliffe, E. H. (1983). Adapting a consultation system to critique user plans. International Journal of Man-Machine Studies, 19, 479-496.
Layton, C., Smith, P. J., & McCoy, E. (1994). Design of a cooperative problem-solving system for enroute flight planning: An empirical evaluation. Human Factors, 36(1), 94-119.
Lehner, P. E., & Zirk, D. A. (1987). Cognitive factors in user/expert-system interaction. Human Factors, 29(1), 97-109.
Lepage, E. F., Gardner, R. M., Laub, R. M., & Golubjatnikov, O. K. (1992). Improving blood transfusion practice: Role of a computerized hospital information system. Transfusion, 32, 253-259.
Linnarsson, R. (1993). Decision support for drug prescription integrated with computer-based patient records in primary care. Medical Informatics, 18(2), 131-142.
Malin, J., Schreckenghost, D., Woods, D., Potter, S., Johannesen, L., Holloway, M., and Forbus, K. (1991). Making Intelligent Systems Team Players: Case Studies and Design Issues. Volume 1: Human-Computer Interaction Design, NASA Technical Memorandum 104738, Houston, TX: NASA Johnson Space Center.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81-97.
Miller, P. (1986). Expert Critiquing Systems: Practice-Based Medical Consultation by Computer. New York: Springer-Verlag.
Miller, R., & Masarie, F. (1990). The demise of the "Greek Oracle" model for medical diagnostic systems. Methods of Information in Medicine, 29, 1-2.
Miller, R. A. (1984). INTERNIST-1/CADUCEUS: Problems facing expert consultant programs. Methods of Information in Medicine, 23, 9-14.
Miller, R. A., Pople, H. E., & Myers, J. D. (1982). INTERNIST-1, an experimental computer-based diagnostic consultant for general internal medicine. The New England Journal of Medicine, 307, 468-476.
Muir, B. (1987). Trust between humans and machines, and the design of decision aids. International Journal of Man-Machine Studies, 27, 527-539.
Nelson, S. J., Blois, M. S., Tuttle, M. S., Erlbaum, M., Harrison, P., Kim, H., Winkelmann, B., & Yamashita, D. (1985). Evaluating RECONSIDER, a computer program for diagnostic prompting. Journal of Medical Systems, 9(5/6), 379-388.
Newell, A., & Simon, H. A. (1972). Human Problem Solving. Englewood Cliffs, N. J.: Prentice-Hall Inc.
Norman, D. (1981). Categorization of action slips. Psychological Review, 88, 1-15.
Norman, D. (1988). The Psychology of Everyday Things, Basic Books, Inc., Publishers New York.
Norman, D. A. (1990). The 'problem' with automation: Inappropriate feedback and interaction, not 'overautomation'. In D. E. Broadbent, A. Baddeley, & J. J. Reason (Eds.), Human Factors in Hazardous Situations (pp. 569-576). Oxford, England: Clarendon Press.
Parasuraman, R., Molloy, R., & Singh, I. (1993). Performance consequences of automation-induced "complacency". International Journal of Aviation Psychology, 3(1), 1-23.
Parasuraman, R., Mouloua, M., & Molloy, R. (1994). Monitoring automation failures in human-machine systems. In M. Mouloua & R. Parasuraman (Eds.), Human Performance in Automated Systems: Current Research and Trends (pp. 45-49). Hillsdale, NJ: Lawrence Erlbaum Associates.
Pea, R. (1985). Beyond amplification: Using the computer to reorganize mental functioning. Educational Psychologist, 20(4), 167-182.
Plugge, L., Verhey, F., & Jolles, J. (1990). A desktop expert system for the differential diagnosis of dementia. International Journal of Technology Assessment in Health Care, 6, 147-156.
Pryor, T. A. (1994). Development of decision support systems. In M. Shabot and R. Gardner (Eds.) Decision Support Systems in Critical Care (pp. 61-73). New York: Springer-Verlag.
Pryor, T. A. (1983). The HELP system. Journal of Medical Systems, 7, 87-101.
Rasmussen, J. (1983). Skills, rules, and knowledge: Signals, signs, and symbols, and other distinctions in human performance models. IEEE Transactions on Systems, Man and Cybernetics, SMA-13(3), 257-266.
Roth, E., Bennett, K., and Woods, D. (1988). Human interaction with an "intelligent" machine, Cognitive Engineering in Complex Worlds, (pp. 23-69), London: Academic Press.
Rothschild, M. A., Swett, H. A., Fisher, P. R., Weltin, G. G., & Miller, P. L. (1990). Exploring subjective vs. objective issues in the validation of computer-based critiquing advice. Computer Methods and Programs in Biomedicine, 31, 11-18.
Rudmann, S., Miller, T., Smith, P.J., and Smith, J. W. (in press). Problem-based education for immunohematologists. Clinical Laboratory Science.
Sarter, N. B., & Woods, D. D. (1994). Decomposing automation: Autonomy, authority, observability and perceived animacy. In M. Mouloua & R. Parasuraman (Eds.), Human Performance in Automated Systems: Current Research and Trends (pp. 22-27). Hillsdale, NJ: Lawrence Erlbaum Associates.
Sassen, A., Buiël, E., & Hoegee, J. (1994). A laboratory evaluation of a human operator support system. International Journal of Human-Computer Studies, 40, 895-931.
Schewe, S., Scherrmann, W., & Gierl, L. (1988). Evaluation and measuring of benefit of an expert system for differential diagnosis in rheumatology. Expert Systems and Decision Support in Medicine, 351-354.
Serfaty, D. and Entin, E. (1995). Shared mental models and adaptive team coordination. Proceedings of the First International Symposium on Command and Control Research and Technology, National Defense University: Washington, D.C..
Shamsolmaali, A., Collinson, P., Gray, T., Carson, E., & Cramp, D. (1989). Implementation and evaluation of a knowledge-based system for the interpretation of laboratory data. In AIME '89, (pp. 167-176).
Shortliffe, E. H. (1976). Computer-Based Medical Consultations: MYCIN. New York: Elsevier.
Shortliffe, E. H., Scott, A. C., Bischoff, M., Campbell, A. B., van Melle, W. and Jacobs, C. (1981). ONCOCIN: An expert system for oncology protocol management. In Proceedings of the seventh International Joint Conference on Artificial Intelligence, Vancouver, British Columbia, pp. 815-822.
Shortliffe, E. (1990). Clinical decision-support systems. In E. Shortliffe & L. Perreault (Eds.), Medical Informatics. Computer Applications in Health Care (pp. 466-500). New York: Addison-Wesley Publishing Company.
Silverman, B. (1992a). Survey of expert critiquing systems: Practical and theoretical frontiers. Communications of the ACM, 35(4), 107-127.
Silverman, B. G. (1992b). Building a better critic. Recent empirical results. IEEE Expert, April, 18-25.
Silverman, B. G. (1992c). Modeling and critiquing the confirmation bias in human reasoning, IEEE Transactions on Systems, Man, and Cybernetics, 22(5), 972-982
Simon, H. (1969). Sciences of the Artificial. Cambridge, MA: MIT Press.
Smith, P. J., Miller, T. E., Fraser, J., Smith, J., Svirbely, J., Rudmann, S., & Strohm, P. (1990). An intelligent tutoring system for antibody identification. Proceedings of the 14th Annual Symposium on Computer Applications in Medical Care. 1032-1033.
Smith, P.J., Miller, T.E., Gross, S., Guerlain, S., Smith, J., Svirbely, J., Galdes, D., Rudmann, S., & Strohm, P. (1991). The transfusion medicine tutor: Methods and results from the development of an interactive learning environment for teaching problem-solving skills. Proceedings of the 35th Annual Meeting of the Human Factors Society, 1166-1168.
Smith, P. J., Miller, T. E., Fraser, J., Smith, J. W., Svirbely, J. R., Rudmann, S., Strohm, P. L., & Kennedy, M. (1991). An empirical evaluation of the performance of antibody identification tasks. Transfusion, 31, 313-317.
Smith, P. J., Galdes, D., Fraser, J., Miller, T., Smith, J. W., Svirbely, J. R., Blazina, J., Kennedy, M., Rudmann, S., & Thomas, D. L. (1991b). Coping with the complexities of multiple-solution problems: A case study. International Journal of Man-Machine Studies, 35, 429-453.
Smith, P. J., Miller, T., Gross, S., Guerlain, S., Smith, J., Svirbely, J., Rudmann, S., & Strohm, P. (1992). The transfusion medicine tutor: A case study in the design of an intelligent tutoring system. In Proceedings of the 1992 Annual Meeting of the IEEE Society of Systems, Man, and Cybernetics, (pp. 515-520).
Strohm, P., Smith, P. J., Fraser, J., Smith, J. W., Rudmann, S., Miller, T., & Kennedy, M. (1991). Errors in antibody identification. Immunohematology, 7, 20-20.
Sutton, G. C. (1989). How accurate is computer-aided diagnosis? The Lancet, (October 14, 1989), 905-908.
Tufte, E. R. (1990). Envisioning Information, Graphics Press, Cheshire, CT.
Tversky, A. (1972). Elimination by aspects: A theory of choice. Psychological Review, 79, 281-299.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124-1131.
van der Lei, J., Musen, M. A., van der Does, E., Man in 't Veld, A. J., & van Bemmel, J. H. (1991). Comparison of computer-aided and human review of general practitioners' management of hypertension. The Lancet, 338(Dec. 14, 1991), 1504-1508.
van der Lei, J., Westerman, R. F., & Boon, W. M. (1989). Evaluating expert critiques. Issues for the development of computer-based critiquing in primary care. In J. Talmon & J. Fox (Eds.), Lecture Notes in Medical Informatics, Proceedings of the Workshop "System Engineering in Medicine" Maastricht, March 16-18, 1989 (pp. 117-128). Springer-Verlag.
Verdaguer, A., Patak, A., Sancho, J., Sierra, C., & Sanz, F. (1992). Validation of the medical expert system PNEUMON-IA. Computers and Biomedical Research, 25, 511-526.
Winer, B. J. (1971). Statistical Principles in Experimental Design, Second Edition. NY: McGraw-Hill.
Wellwood, J., Johannessen, S., & Spiegelhalter, D. J. (1992). How does computer-aided diagnosis improve the management of acute abdominal pain? Annals of the Royal College of Surgeons of England, 74, 40-46.
Wickens, C. D. (1984). Engineering Psychology and Human Performance. Columbus, Ohio: Charles Merrill.
Wiener, E. L. (1989). Human Factors of Advanced Technology ("Glass Cockpit") Transport Aircraft (Technical Report No. 117528). NASA Ames Research Center.
Woods, D. D., (1984). Visual momentum: a concept to improve the cognitive coupling of person and computer. International Journal of Man-Machine Studies. 21, 229-244.
Woods, D. D. (1991). The cognitive engineering of problem representations, in Human-Computer Interaction and Complex Systems, edited by George S. Weir and James L. Alty, Academic Press Limited, London, 169-188.
Woods, D. D. (1992). Cognitive activities and aiding strategies in dynamic fault management. Cognitive Systems Engineering Laboratory Technical Report CSEL 92-TR-05, The Ohio State University.
Woods, D. D., Johannesen, L. J., Cook, R. I., & Sarter, N. B. (1994). Behind Human Error: Cognitive Systems, Computers, and Hindsight. Dayton, OH: CSERIAC.
Woods, D. D. (1994). Automation: Apparent simplicity, real complexity. In M. Mouloua & R. Parasuraman (Eds.), Human Performance in Automated Systems: Current Research and Trends (pp. 1-7). Hillsdale, NJ: Lawrence Erlbaum Associates.
Zachary, W. (1986). A cognitively based functional taxonomy of decision support techniques. In Human-Computer Interaction (pp. 25-63). Hillsdale, New Jersey: Lawrence Erlbaum Associates, Inc.
Zhang, J., & Norman, D. (1994). Representations in distributed cognitive tasks. Cognitive Science, 18, 87-122.
Appendix A. Sample Answer Sheet
Name: __________________________
Answer Sheet
Your answer:
ABO ________
Rh ________
Alloantibodies ___________
Certainty about alloantibodies (check one):
____Unsure _____Fairly Sure _____Certain
Comments: ____________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
Appendix B. Sample Statistical Calculations
When deciding what statistical test to run on the misdiagnosis rates recorded for this
experiment, an ideal test would be one that is non-parametric (i.e., appropriate for nominal
data), that takes into account the repeated-measures aspect of the data (since the same subject
solved five cases), and that also takes into account performance on the Pre-Test Case (to
account for any differences between the two groups of subjects before the treatment was
introduced). However, no such statistical test exists. Fisher's exact test is a good statistic for
testing the difference between two groups on nominal data, particularly when the expected cell
frequencies are small (when a Chi Square test is not valid). Thus, this test was run to test the
difference in performance between the two groups on each of the individual cases. However,
Fisher's exact test takes into account neither the repeated-measures aspect of the experimental
design nor performance on the Pre-Test. A log-linear analysis can take Pre-Test performance
into account for nominal data, so this statistic was also run, and the p-values for the individual
cases were combined to obtain an overall significance level across cases. A final test that could
be run is a Repeated Measures Analysis of Covariance, which takes into account both
performance on the Pre-Test Case and the repeated-measures aspect of the design, but which
assumes interval (not nominal) data. The results from this test are reported as well to give a
comparison.
In order to test for a difference in performance on nominal data in a within-subjects
manner (i.e., when going from the Pre-Test Case to the Post-Test Case for the Treatment Group),
McNemar's Chi Square test must be used.

McNemar's Chi Square

Assumptions. Nominal data; dependent samples.

Method.

                     Post
               Incorrect  Correct
Pre Incorrect      a         b
    Correct        c         d

χ² = (|b - c| - 1)² / (b + c)
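As a check on the hand calculations that follow, the continuity-corrected McNemar statistic above can be sketched in a few lines of Python. This is an illustrative sketch only; the original analysis was done by hand, and the function name is mine.

```python
def mcnemar_chi_square(b, c):
    """McNemar's chi-square with the continuity correction used in the text.

    b = count of Pre incorrect / Post correct (improved subjects)
    c = count of Pre correct / Post incorrect (worsened subjects)
    Only the discordant cells b and c enter the statistic.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    return (abs(b - c) - 1) ** 2 / (b + c)

# Control Group in Example 1 below has b = 2, c = 0:
print(mcnemar_chi_square(2, 0))  # 0.5, well below the 3.84 critical value
```

The statistic is then compared against the chi-square critical value with 1 degree of freedom (3.84 at α = 0.05, 6.64 at α = 0.01).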
Example 1. Difference in misdiagnosis rates from the Pre-Test Case to the Post-Test Case for the Control Group.

                     Post
               Incorrect  Correct
Pre Incorrect      4         2
    Correct        0         7

χ² = (|2 - 0| - 1)² / (2 + 0) = (2 - 1)² / 2 = 1/2 = .50 -> accept Ho (requires 3.84 to be significant at α = 0.05).

Example 2. Difference in misdiagnosis rates from the Pre-Test Case to the Post-Test Case for the Treatment Group.

                     Post
               Incorrect  Correct
Pre Incorrect      0         4
    Correct        0        11

χ² = (|4 - 0| - 1)² / (4 + 0) = (4 - 1)² / 4 = 9/4 = 2.25 -> accept Ho (requires 3.84 to be significant at α = 0.05).

Example 3. Difference in the number of Treatment subjects who made process errors from the Pre-Test Case to the Post-Test Case.

Error Type 1: Ruling Out Hypotheses Incorrectly

                     Post
               Incorrect  Correct
Pre Incorrect      0         8
    Correct        3         3

χ² = (|8 - 3| - 1)² / (8 + 3) = (5 - 1)² / 11 = 16/11 = 1.45 -> accept Ho (requires 3.84 to be significant at α = 0.05).

Error Type 2: Failing to Rule Out Hypotheses When Appropriate

                     Post
               Incorrect  Correct
Pre Incorrect      1         6
    Correct        3         4

χ² = (|6 - 3| - 1)² / (6 + 3) = (3 - 1)² / 9 = 4/9 = .44 -> accept Ho (requires 3.84 to be significant at α = 0.05).

Error Type 3: Failing to Collect Converging Evidence

                     Post
               Incorrect  Correct
Pre Incorrect      1         9
    Correct        0         4

χ² = (|9 - 0| - 1)² / (9 + 0) = (9 - 1)² / 9 = 64/9 = 7.11 -> reject Ho (requires 6.64 to be significant at α = 0.01).

Error Type 4: Implausible Answer Given Data

                     Post
               Incorrect  Correct
Pre Incorrect      0         4
    Correct        0        10

χ² = (|4 - 0| - 1)² / (4 + 0) = (4 - 1)² / 4 = 9/4 = 2.25 -> accept Ho (requires 3.84 to be significant at α = 0.05).

Error Type 5: Implausible Answer Given Prior Probabilities

                     Post
               Incorrect  Correct
Pre Incorrect      1         5
    Correct        0         8

χ² = (|5 - 0| - 1)² / (5 + 0) = (5 - 1)² / 5 = 16/5 = 3.20 -> accept Ho (requires 3.84 to be significant at α = 0.05).

Since many of these differences are "close" to significance, combine the p-values across the five kinds of process errors to obtain an overall significance level by using the following method:
Error Type    p-value     -ln(p)
    1        0.2285279     1.48
    2        0.5051982     0.68
    3        0.0076654     4.87
    4        0.1336145     2.01
    5        0.0736382     2.61
                  Total = 11.65

χ² = 2(11.65) = 23.30, df = 10, p < 0.01
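The p-value combination used here (Fisher's method: χ² = -2 Σ ln p with 2k degrees of freedom) is easy to reproduce. Because the degrees of freedom are even, the chi-square tail probability has a closed-form series, so no statistical package is needed. A sketch, with function names of my own choosing:

```python
import math

def combine_p_values(p_values):
    """Fisher's method: chi-square = -2 * sum(ln p), df = 2 * k."""
    chisq = -2.0 * sum(math.log(p) for p in p_values)
    return chisq, 2 * len(p_values)

def chi_square_sf(x, df):
    """Upper-tail chi-square probability; closed form valid for even df."""
    assert df % 2 == 0 and df > 0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= (x / 2.0) / k   # successive Poisson terms (x/2)^k / k!
        total += term
    return math.exp(-x / 2.0) * total

ps = [0.2285279, 0.5051982, 0.0076654, 0.1336145, 0.0736382]
chisq, df = combine_p_values(ps)
print(round(chisq, 2), df)       # 23.3 with df = 10, matching the table
print(chi_square_sf(chisq, df))  # ~0.0097, i.e., p < 0.01
```

The same routine applied to the four log-linear p-values reported later in this appendix reproduces the overall significance across cases.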
Thus, over all types of errors, there is a significant improvement from the Pre-Test Case to the matched Post-Test Case for the Treatment Group.

Chi Square

Assumptions. Nominal data. Most expected cell frequencies must be ≥ 5. The cell entries must be independent of each other.

Method. χ² = Σ ((fo - fe)² / fe)

where the observed frequencies (fo) are:

                   Correct  Incorrect    Totals
Control Group         a         b         a+b
Treatment Group       c         d         c+d
Totals               a+c       b+d    N = a+b+c+d

Expected cell frequencies (fe) are calculated by multiplying the row total by the column total and dividing by N:

                      Correct         Incorrect
Control Group      (a+b)(a+c)/N     (a+b)(b+d)/N
Treatment Group    (c+d)(a+c)/N     (c+d)(b+d)/N
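The expected-frequency and chi-square computation above can be sketched directly for a 2x2 table (an illustrative sketch; the function name is mine):

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table, computing expected frequencies
    from the row and column totals as described in the text."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chisq = 0.0
    for i in range(2):
        for j in range(2):
            fe = row_totals[i] * col_totals[j] / n  # expected frequency
            chisq += (observed[i][j] - fe) ** 2 / fe
    return chisq

# Pre-Test Case (Control 6 correct / 8 incorrect, Treatment 11 / 4):
print(round(chi_square_2x2(6, 8, 11, 4), 2))
# ~2.77 (the hand calculation in Example 1 gives 2.78 from rounded cells)
```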
Example 1: Difference between the two groups, Pre-Test Case.

Observed frequencies:

                   Correct  Incorrect  Totals
Control Group         6         8        14
Treatment Group      11         4        15
Totals               17        12        29

Expected cell frequencies:

                   Correct  Incorrect
Control Group       8.21      5.79
Treatment Group     8.79      6.21

All expected cell frequencies are ≥ 5, so the Chi Square test is valid. χ² = Σ (fo - fe)²/fe = 2.78 -> accept Ho; not significant (the critical value for χ², 1 d.f., α = 0.05 is 3.84).

Example 2: Difference between the two groups, Post-Test Case 1.

Observed frequencies:

                   Correct  Incorrect  Totals
Control Group        10         5        15
Treatment Group      16         0        16
Totals               26         5        31

Expected cell frequencies:

                   Correct  Incorrect
Control Group      12.58      2.42
Treatment Group    13.42      2.58

Most expected cell frequencies are not ≥ 5, so the Chi Square test is not valid. Thus, use Fisher's exact test instead.

Fisher's Exact Test

Assumptions. Nominal data. The cell entries must be independent of each other.

Method.

p = [(a+b)! (c+d)! (a+c)! (b+d)!] / [a! b! c! d! N!]

where the observed frequencies (fo) are:

                   Correct  Incorrect    Totals
Control Group         a         b         a+b
Treatment Group       c         d         c+d
Totals               a+c       b+d    N = a+b+c+d
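The factorial formula and the "add more extreme tables" procedure used in the examples below can be sketched with exact integer arithmetic. This mirrors the one-sided procedure in the text (repeatedly decrementing cell d while holding the margins fixed); the function names are mine.

```python
from math import factorial

def table_prob(a, b, c, d):
    """Probability of a single 2x2 table with fixed margins (the factorial formula)."""
    n = a + b + c + d
    num = (factorial(a + b) * factorial(c + d) *
           factorial(a + c) * factorial(b + d))
    den = (factorial(a) * factorial(b) * factorial(c) *
           factorial(d) * factorial(n))
    return num / den

def fisher_one_sided(a, b, c, d):
    """Sum the observed table and all more extreme tables in the same
    direction, keeping row and column totals constant."""
    p = 0.0
    while min(a, b, c, d) >= 0:
        p += table_prob(a, b, c, d)
        a, b, c, d = a - 1, b + 1, c + 1, d - 1  # shift toward the extreme split
    return p

# Post-Test Case 1 (Control 10/5, Treatment 16/0): d is already 0,
# so only the observed table contributes.
print(round(fisher_one_sided(10, 5, 16, 0), 3))  # 0.018
```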
If one of the observed cell frequencies is 0, then you're done. Otherwise, you must add the probabilities from more extreme tables, as shown below in Example 2.

Example 1: Difference between the two groups, Post-Test Case 1.

                   Correct  Incorrect  Totals
Control Group        10         5        15
Treatment Group      16         0        16
Totals               26         5        31

p = [15! 16! 26! 5!] / [10! 5! 16! 0! 31!] = .018 < 0.05 -> reject Ho.
Example 2: Difference between the two groups, Post-Test Case 2.

                   Correct  Incorrect  Totals
Control Group         8         8        16
Treatment Group      13         3        16
Totals               21        11        32

p = [16! 16! 21! 11!] / [8! 8! 13! 3! 32!] = .055859

Since none of the observed cell frequencies is zero, we must calculate the p-value for a more extreme ratio and add it on, until a cell frequency of 0 is reached. Reduce cell d (3) by 1 to get a more extreme split, and recalculate the other values in the table to keep the row and column totals the same:

                   Correct  Incorrect  Totals
Control Group         7         9        16
Treatment Group      14         2        16
Totals               21        11        32

p = [16! 16! 21! 11!] / [7! 9! 14! 2! 32!] = .0106

Continue to subtract 1 from cell d and recalculate the other values in the table to keep the row and column totals the same:

                   Correct  Incorrect  Totals
Control Group         6        10        16
Treatment Group      15         1        16
Totals               21        11        32

p = [16! 16! 21! 11!] / [6! 10! 15! 1! 32!] = .00099

Continue to subtract 1 from cell d and recalculate the other values in the table to keep the row and column totals the same:

                   Correct  Incorrect  Totals
Control Group         5        11        16
Treatment Group      16         0        16
Totals               21        11        32

p = [16! 16! 21! 11!] / [5! 11! 16! 0! 32!] = .000034

Add all the p-values together: p = .055859 + .0106 + .00099 + .000034 = 0.0675 > 0.05 -> accept Ho.

Example 3: Difference between the two groups, Post-Test Case 3.

                   Correct  Incorrect  Totals
Control Group        10         6        16
Treatment Group      16         0        16
Totals               26         6        32

p = [16! 16! 26! 6!] / [10! 6! 16! 0! 32!] = .0088 < 0.05 -> reject Ho.

Example 4: Difference between the two groups, Post-Test Case 4.

                   Correct  Incorrect  Totals
Control Group         6        10        16
Treatment Group      16         0        16
Totals               22        10        32

p = [16! 16! 22! 10!] / [6! 10! 16! 0! 32!] = .000124 < 0.05 -> reject Ho.

Log-linear Analysis

Assumptions. Nominal data; takes into account Pre-Test Case performance.

Method. Compare performance on each Post-Test Case to the Pre-Test Case for the two groups. Use a statistical package to calculate the p-value.

Case 1 - Data

                             Case 1
                       Correct  Incorrect  Total
Control Group
  Pre-Test Correct        7         0        7
  Pre-Test Incorrect      2         4        6
  Total                   9         4       13
Treatment Group
  Pre-Test Correct       11         0       11
  Pre-Test Incorrect      4         0        4
  Total                  15         0       15

Case 1 - Results

Source        df   Component χ²      p
Pre-Test       1       8.81       0.003
Treatment*     1       5.14       0.0234
Interaction    1       0.29       0.5925
Total          3      14.24
*after controlling for Pre-Test

Case 2 - Data

                             Case 2
                       Correct  Incorrect  Total
Control Group
  Pre-Test Correct        5         2        7
  Pre-Test Incorrect      2         4        6
  Total                   7         6       13
Treatment Group
  Pre-Test Correct       10         1       11
  Pre-Test Incorrect      3         1        4
  Total                  13         2       15

Case 2 - Results

Source        df   Component χ²      p
Pre-Test       1       3.38       0.0658
Treatment*     1       2.78       0.0957
Interaction    1       0.04       0.8403
Total          3       6.20
*after controlling for Pre-Test

Case 3 - Data

                             Case 3
                       Correct  Incorrect  Total
Control Group
  Pre-Test Correct        6         1        7
  Pre-Test Incorrect      2         4        6
  Total                   8         5       13
Treatment Group
  Pre-Test Correct       11         0       11
  Pre-Test Incorrect      4         0        4
  Total                  15         0       15

Case 3 - Results

Source        df   Component χ²      p
Pre-Test       1       4.99       0.0255
Treatment*     1       7.09       0.0078
Interaction    1       0.05       0.8316
Total          3      12.12
*after controlling for Pre-Test

Case 4 - Data

                             Case 4
                       Correct  Incorrect  Total
Control Group
  Pre-Test Correct        6         1        7
  Pre-Test Incorrect      0         6        6
  Total                   6         7       13
Treatment Group
  Pre-Test Correct       11         0       11
  Pre-Test Incorrect      4         0        4
  Total                  15         0       15

Case 4 - Results

Source        df   Component χ²      p
Pre-Test       1      10.11       0.0015
Treatment*     1      13.95       0.0002
Interaction    1       0.39       0.5315
Total          3      24.45
*after controlling for Pre-Test

An overall significance level can be obtained across cases by combining the p-values using the following method:

Case   p-value   -ln(p)
  1     0.0234    3.76
  2     0.0957    2.35
  3     0.0078    4.85
  4     0.0002    8.52
            Total = 19.47

χ² = 2(19.47) = 38.94, df = 8, p = 0.000005
Repeated Measures Analysis of Covariance

Assumptions. Interval data; repeated measures; can control for a Pre-Test Case as a covariate.

Method. Use a statistical package to enter subjects' performance on all of the cases.

Results. Four Post-Test Cases using the Pre-Test Case as a covariate:

Source        df      SS        MS        F       p
Pre-Test       1   2.80264   2.80264   14.53   0.0008
Group          1   2.80477   2.80477   14.54   0.0008
Error         25   4.82300   0.19292
Case           3   0.37033   0.12344    1.50   0.225
Interaction    3   0.22747   0.07582    0.92   0.420
Error         78   6.45138   0.08225

Thus, although no one test is ideal for analyzing the data gathered for this study, all of these tests show similar results: there is a statistically significant improvement in performance on Cases 1, 3, and 4 for the Treatment Group and a marginally significant improvement in performance on Case 2. There is also an extremely large effect over all the cases, favoring the Treatment Group. Finally, performance on the Pre-Test Case is a good indicator of performance on later cases.
Appendix C. Definition of Classes of Errors Logged by the Computer

1. Rule-out errors of commission
   1a. Antigen not present — Rule-out of an antibody on a test cell not containing the corresponding antigen.
   1b. Reactive cell — Rule-out of an antibody on a test cell that was reacting.
   1c. Heterozygous — Rule-out of an antibody on a test cell that was heterozygous for the corresponding antigen (where that antibody should only be ruled out on homozygous cells).
   1d. No cells to rule out — Rule-out (at the top of the panel) of an antibody on a panel where no cells existed for rule-out of that antibody.
   1e. Special panel error (RT, Cold, Prewarm, Enzyme, Eluate) — Rule-out of an antibody whose reactions are weakened in that test condition.
   1f. Patient lacks antigen — Rule-out of an antibody based on antigen typing showing that the patient lacks the corresponding antigen.
   1g. Typing may not be valid — Rule-out of an antibody on the antigen typing without first checking that the antigen typing test is valid.
   1h. Test not run yet — Rule-out of an antibody based on an additional cell that hasn't yet been run.

2. Rule-out errors of omission
   2a. Main panels (screen & panels at Polyspecific or IgG) — Failure to complete all possible rule-outs using main panels.
   2b. Special panels (RT, Cold, Prewarm, Enzyme) — Failure to complete all possible rule-outs using special panels.
   2c. Antigen typing — Failure to complete all possible rule-outs based on antigen typings.

3. Incomplete protocol
   3a. Unlikely Abs not marked — Failure to mark unlikely antibodies as "Unlikely".
   3b. Rule-outs not completed — Failure to rule out alternative antibodies (antibodies not marked as confirmed or unlikely).
   3c. Underlying Abs present — Failure to rule out an antibody that is covered by the answer on all test cells.
   3d. Antigen typing not done — Failure to antigen type for the antigens corresponding to antibodies marked as confirmed.
   3e. Auto Abs not marked — Failure to indicate that there are no autoantibodies present.

4. Data implausible given answer
   4a. No confirmed Ab on reacting cells — No answer(s) on a cell that is reacting.
   4b. Confirmed Abs on non-reacting cells — A homozygous donor cell is not reacting with the hypothesized antibody.
   4c. Answer has positive antigen typing — Antigen typing positive for the antigen corresponding to a confirmed antibody.

5. Answer implausible given prior probabilities
   5a. Low-frequency Ab (f, V, Cw, Lua, Kpa, Jsa) in answer set — A low-frequency antibody was included in the answer. (This could, of course, have been an unusual, but still correct, answer. The same holds true for 5b-5e.)
   5b. Abs from multiple groups in answer set — Marking an answer from three genetic systems. This would be very unusual for most patients (unless they have had separate exposures).
   5c. Hypothesis probability — Marking an unlikely combination of antibodies.
   5d. Antibody specific — Marking an answer that violates the normal pattern of reactions for that antibody.
   5e. General check — Failure to either rule out or confirm an antibody that the patient could have formed (since the patient was negative for that antigen) and that was (statistically) more likely to be formed than the answer marked confirmed.
Appendix D. Sample Behavioral Protocol Log
Subject #: E1    Intelligence: On    Degree: MT(ASCP)    Years Worked: 15    Avg Pos Screens/mo: <1
00:00:00  Loading Case: XPJR Training Case 2
00:00:00  Select Test: ABO-Rh  [ABO/Rh]
00:00:03  Mark ABO: O
00:00:05  Mark Rh: Pos
00:00:09  Select Test: Antibody Screen  [ANTIBODY SCREEN]
              You could have marked at least one more antibody as Unlikely for this patient.
00:00:18  Select Test: Case History  [CASE HISTORY]
00:00:25  Select Test: Albumin-Poly AHG  [POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG]
00:00:34  Hilite: cells 1,2,3,6,7
00:00:43  Mark Likely: K,E
00:00:53  Select Test: Additional Albumin-Poly AHG
              You did not mark auto control before leaving a main panel.
              You could have ruled out at least one more antibody using cell #'s 4, 5, 8, 9 and 10 on this panel.
              You could have marked at least one more antibody as Unlikely for this patient.
          [ADDITIONAL CELLS: POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG]
00:01:15  Run Test Cells: 1,2,3,7,10
              You could have ruled out at least one more antibody on this panel.
              You could have marked at least one more antibody as Unlikely for this patient.
00:01:45  Select Test: Antigen Typing  [ANTIGEN TYPING]
00:01:54  Run Test Cells: E,K
              You did not rule out the corresponding antibody of a positive antigen typing.
00:02:00  Select Test: Direct Antiglobulin Test  [DAT]
00:02:28  Set Auto Ctrl: Negative
00:02:29  Set Allo Anti: Positive
00:02:38  Select Test: Done with Case
              You did not mark an answer using the "Confirmed" button.
          Pressed Try Again/Continue Button
00:02:44  Select Test: Direct Antiglobulin Test  [DAT]
          Student Answer: none
00:02:53  Select Test: Albumin-Poly AHG  [POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG]
00:03:02  Mark Confirmed: K,E
              You could have ruled out at least one more antibody using cell #'s 4, 5, 8, 9 and 10 on this panel.
              You could have marked at least one more antibody as Unlikely for this patient.
00:03:08  Select Test: Done with Case
              You did not rule out all of the antibodies in this case before indicating you were done with the case.
              You could have marked at least one more antibody as Unlikely for this patient.
              C, Fyb and Jkb are confounded homozygously by your confirmed antigens.
              The antigen typing for c hasn't been checked. Since the patient would be more likely to form anti-c before K, be very careful.
          Student Answer: E and K
          Correct Answer: E and K
00:00:00 Loading Case: XSVR Evaluation Case 1
00:00:00 Select Test: ABO-Rh
ABO/Rh
00:00:03 Mark ABO: O
00:00:05 Mark Rh: Pos
00:00:08 Select Test: Antibody Screen
ANTIBODY SCREEN
00:00:14 Mark Unlikely: f,V,Cw,Lua,Kpa,Jsa
00:00:23 Select Test: Case History
CASE HISTORY
00:00:27 Select Test: Albumin-Poly AHG
POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:00:35 Hilite: cells 1,2,3,4,5,7,8,9
00:00:47 Rule Out on cell 6: Cw
00:00:50 Rule Out on cell 10: Lua
00:01:07 Rule Out on cell 6: D,C,e,N,s,P1,Leb,Lub,k,Jka,Xga
00:01:27 Select Test: Additional Albumin-Poly AHG
You did not mark auto control before leaving a main panel.
Pressed Try Again/Continue Button
00:01:36 Set Auto Ctrl: Negative
00:01:50 Select Test: Additional Albumin-Poly AHG
ADDITIONAL CELLS: POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:02:02 Hilite Antigens: K,E,c,M
00:02:11 Run Test Cells: 1,4,6,5,8
00:02:34 Mark Likely: c,K
You could have ruled out at least one more antibody using cell #'s 4, 6 and 8 on this panel.
Pressed Try Again/Continue Button
00:03:00 Select Test: Antigen Typing
00:03:08 Rule Out on cell 4: Lea
00:03:14 Rule Out on cell 8: S
00:03:26 Rule Out on cell 6: Fya
You could have ruled out at least one more antibody on this panel.
Pressed Try Again/Continue Button
00:03:41 Select Test: Antigen Typing
00:03:47 Rule Out on cell 8: E
You ruled out anti-E using a heterozygous cell.
Undid Marking/Ruleout
00:03:56 Rule Out on cell 8: M
00:04:03 Select Test: Antigen Typing
ANTIGEN TYPING
00:04:07 Run Test Cells: c,K
00:04:13 Mark Confirmed: K,c
00:04:20 Select Test: Done with Case
You did not rule out all of the antibodies in this case before indicating you were done with the case.
Pressed Try Again/Continue Button
Fyb and Jkb are confounded homozygously by your confirmed antigens
The antigen typing for E hasn't been checked. Since the patient would be more likely to form anti-E before K, be very careful.
Student Answer: c and K
00:04:36 Select Test: Albumin-Poly AHG
POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:04:49 Select Test: Additional Albumin-Poly AHG
ADDITIONAL CELLS: POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:05:02 Run Test Cells: 7
00:05:15 Rule Out on cell 7: Jkb
00:05:20 Run Test Cells: 10
00:05:26 Rule Out on cell 10: Fyb
00:05:35 Select Test: Done with Case
You did not rule out all of the antibodies in this case before indicating you were done with the case.
Pressed Leave Anyway Button
The antigen typing for E hasn't been checked. Since the patient would be more likely to form anti-E before K, be very careful.
Viewed Antigen Typing
00:06:04 Select Test: Antigen Typing
ANTIGEN TYPING
Student Answer: c and K
00:06:09 Run Test Cells: E
00:06:21 Select Test: Done with Case
You did not rule out all of the antibodies in this case before indicating you were done with the case.
Pressed Try Again/Continue Button
00:06:31 Rule Out: E
00:06:38 Select Test: Done with Case
Student Answer: c and K
Correct Answer: c and K

00:00:00 Loading Case: XPJS Evaluation Case 2
00:00:00 Select Test: ABO-Rh
ABO/Rh
00:00:04 Mark ABO: O
00:00:07 Mark Rh: Neg
00:00:11 Select Test: Antibody Screen
ANTIBODY SCREEN
Be careful ruling out since there are weak reactions on this panel.
Pressed OK Button
00:00:21 Mark Unlikely: f,V,Cw,Lua,Kpa,Jsa
00:00:30 Select Test: Albumin-Poly AHG
POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:00:39 Hilite: cells 7,10
You did not mark auto control before leaving a main panel.
Pressed Try Again/Continue Button
00:00:52 Select Test: Antibody Screen
00:00:54 Set Auto Ctrl: Negative
00:01:01 Select Test: Antibody Screen
ANTIBODY SCREEN
00:01:08 Select Test: Cold 4°C
COLD (4°C)
00:01:21 Select Test: Enzyme
You could have ruled out at least one more antibody using cell #'s 1, 2, 3 and 4 on this panel.
Pressed Try Again/Continue Button
00:02:14 Rule Out on cell 2: N
00:02:16 Rule Out on cell 3: s
You used 4 degrees C test condition to rule out anti-s.
Undid Marking/Ruleout
00:02:23 Rule Out on cell 3: P1
00:02:28 Rule Out on cell 4: M
00:02:34 Select Test: Enzyme
You could have ruled out at least one more antibody using cell #'s 1, 3 and 4 on this panel.
Pressed Try Again/Continue Button
00:02:43 Rule Out on cell 1: Lub
You used 4 degrees C test condition to rule out anti-Lub.
Undid Marking/Ruleout
00:03:02 Rule Out on cell 1: Leb
00:03:19 Select Test: Enzyme
You could have ruled out at least one more antibody using cell #'s 3 and 4 on this panel.
Pressed Try Again/Continue Button
00:03:47 Rule Out on cell 4: Lea
00:03:53 Select Test: Enzyme
You could have ruled out at least one more antibody on this panel.
Pressed Try Again/Continue Button
00:04:01 Rule Out on cell 3: Lua
00:04:06 Select Test: Enzyme
ENZYME: FICIN TREATED CELLS 37°C, IGG AHG
00:04:18 Hilite: cells 10,9,8,7,6
00:04:24 Hilite Antigens: S,s,Fya,Fyb,Xga
00:04:37 Rule Out on cell 1: c,e,f,Lub
00:04:46 Rule Out on cell 2: k,Jka
00:04:50 Rule Out on cell 3: Jkb
00:04:53 Rule Out on cell 1: K
00:05:02 Select Test: Additional Albumin-Poly AHG
ADDITIONAL CELLS: POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:05:12 Hilite Antigens: D,E
00:05:18 Hilite: cells 2,10
00:05:25 Hilite Antigens: C
00:05:29 Hilite Antigens: C
00:05:31 Hilite: cells 1
00:05:37 Hilite ** Cells on cell 4: C
00:05:41 Hilite: cells 4,8
00:06:00 Run Test Cells: 8,10
00:06:15 Rule Out on cell 8: C
00:06:16 Rule Out on cell 10: E
00:06:20 Rule Out on cell 8: S
00:06:28 Run Test Cells: 5
00:06:40 Rule Out on cell 5: s
00:06:49 Rule Out on cell 8: Fyb
00:06:50 Rule Out on cell 10: Fya
00:06:57 Mark Likely: D
You could have ruled out at least one more antibody using cell #'s 5 and 8 on this panel.
Pressed Try Again/Continue Button
00:07:19 Select Test: Antigen Typing
00:07:24 Hilite: cells 5,4,2,1,10
00:07:44 Rule Out on cell 5: Jsa
00:07:46 Rule Out on cell 8: Kpa
You could have ruled out at least one more antibody on this panel.
Pressed Try Again/Continue Button
00:08:03 Select Test: Antigen Typing
00:08:07 Rule Out on cell 5: Xga
00:08:36 Select Test: Antigen Typing
ANTIGEN TYPING
00:08:46 Select Test: Done with Case
You did not mark an answer using the "Confirmed" button.
Pressed Try Again/Continue Button
00:08:51 Select Test: Antigen Typing
ANTIGEN TYPING
Student Answer: none
00:08:55 Mark Confirmed: D
00:08:58 Select Test: Done with Case
Student Answer: D
Correct Answer: D

00:00:00 Loading Case: XPJW Evaluation Case 3
00:00:00 Select Test: ABO-Rh
ABO/Rh
00:00:04 Mark ABO: B
00:00:06 Mark Rh: Pos
00:00:09 Select Test: Antibody Screen
ANTIBODY SCREEN
00:00:15 Mark Unlikely: f,V,Cw,Lua,Kpa,Jsa
00:00:23 Hilite: cells 2
00:00:27 Rule Out on cell 1: C,D,e,M,P1,Lea,Lub,K,k,Fyb
00:00:47 Select Test: Case History
CASE HISTORY
00:00:53 Select Test: Albumin-Poly AHG
POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:01:01 Hilite: cells 1,3,5,6,8
00:01:09 Rule Out on cell 4: c,f,V,S,Leb
00:01:21 Rule Out on cell 7: Jkb,N
00:01:27 Rule Out on cell 2: Xga
00:01:31 Set Auto Ctrl: Negative
00:01:37 Select Test: Additional Albumin-Poly AHG
You could have ruled out at least one more antibody on this panel.
Pressed Try Again/Continue Button
00:01:44 Rule Out on cell 2: Cw
00:01:49 Select Test: Additional Albumin-Poly AHG
ADDITIONAL CELLS: POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:02:01 Hilite Antigens: E,s,Fya,Jka
00:02:07 Run Test Cells: 1,4
00:02:15 Rule Out on cell 1: Jka
00:02:19 Rule Out on cell 4: s
00:02:27 Run Test Cells: 2,9,10
00:02:44 Select Test: Antigen Typing
ANTIGEN TYPING
00:02:49 Run Test Cells: E,Fya
00:02:54 Select Test: Enzyme
ENZYME: FICIN TREATED CELLS 37°C, IGG AHG
00:03:06 Hilite: cells 1,3,6
00:03:13 Mark Likely: E
00:03:28 Set Allo Anti: Positive
00:03:39 Select Test: Case History
CASE HISTORY
00:03:55 Select Test: Antibody Screen
ANTIBODY SCREEN
00:04:01 Mark Confirmed: E
00:04:03 Mark Likely: Fya
00:04:10 Select Test: Done with Case
You did not rule out all of the antibodies in this case before indicating you were done with the case.
Pressed Leave Anyway Button
The antigen you confirmed was not present on reacting cell #'s 5 and 8 of the Main Albumin-Poly AHG panel.
Viewed Main Panel
The antigen you confirmed was not present on reacting cell #2 of the Additional Cells Albumin-Poly AHG panel.
The antigen typing for c hasn't been checked. Since the patient would be more likely to form anti-c before E, be very careful.
Student Answer: E
00:04:30 Select Test: Albumin-Poly AHG
POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:04:46 Mark Confirmed: Fya
00:04:52 Select Test: Done with Case
Student Answer: E and Fya
Correct Answer: E and Fya

00:00:00 Loading Case: XKL Evaluation Case 4
00:00:00 Select Test: ABO-Rh
ABO/Rh
00:00:03 Mark ABO: O
00:00:06 Mark Rh: Pos
00:00:09 Select Test: Antibody Screen
ANTIBODY SCREEN
00:00:15 Mark Unlikely: f,V,Cw,Lua,Kpa,Jsa
00:00:24 Select Test: Case History
CASE HISTORY
00:00:28 Select Test: Albumin-Poly AHG
POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:00:37 Set Auto Ctrl: Negative
00:00:46 Select Test: Enzyme
ENZYME: FICIN TREATED CELLS 37°C, IGG AHG
00:00:56 Hilite: cells 3,6
00:00:58 Hilite Antigens: M,N,S,s,Fya,Fyb,Xga
00:01:04 Rule Out on cell 1: D
You ruled out incorrectly anti-D using cell #1 (The D antigen is present on that cell).
Undid Marking/Ruleout
00:01:08 Rule Out on cell 1: C
You ruled out incorrectly anti-C using cell #1 (The C antigen is present on that cell).
Undid Marking/Ruleout
00:01:12 Rule Out on cell 1: P1
You ruled out incorrectly anti-P1 using cell #1 (The P1 antigen is present on that cell).
Undid Marking/Ruleout
00:01:24 Select Test: Prewarm
PREWARM TECHNIQUE IGG AHG
00:01:37 Select Test: Additional Eluate
ADDITIONAL CELLS: ELUATE: ORGANIC SOLVENT IGG AHG
00:01:48 Run Test Cells: 1,2,3,4,5,6,7,8,9,10,11,12
00:02:04 Select Test: Direct Antiglobulin Test
DAT
00:02:07 Set Allo Anti: Negative
00:02:11 Select Test: Done with Case
You did not confirm an antigen but cell #'s 1 and 2 of the Antibody Screen panel were reacting.
Viewed Screen Cells
00:02:27 Select Test: Antibody Screen
ANTIBODY SCREEN
You did not confirm an antigen but cell #'s 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10 of the Main Albumin-Poly AHG panel were reacting.
Student Answer: none
00:02:47 Select Test: Cold 4°C
COLD (4°C)
00:03:01 Select Test: Direct Antiglobulin Test
DAT
00:03:05 Select Test: Case History
CASE HISTORY
00:03:11 Select Test: Additional Albumin-Poly AHG
ADDITIONAL CELLS: POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:03:20 Run Test Cells: 1,2,3,4,5,6,7,8,9,11,10,12
00:03:38 Rule Out on cell 1: D,C,e,Cw,S,P1,Lub,K,Fyb,Jka
00:04:00 Rule Out on cell 3: N,Lea,k,Xga
00:04:10 Rule Out on cell 5: s,Leb,Jsa
00:04:23 Select Test: Antibody Screen
ANTIBODY SCREEN
00:04:33 Select Test: Antigen Typing
ANTIGEN TYPING
00:04:38 Run Test Cells: E,c,M,Fya,Jkb
00:04:46 Rule Out: M,Fya
00:04:50 Mark Likely: E,c,Jkb
00:04:57 Select Test: Case History
CASE HISTORY
00:05:06 Select Test: Albumin-Poly AHG
POLYSPECIFIC AHG IS, 37° ALBUMIN, AHG
00:05:16 Mark Confirmed: E
00:05:19 Hilite: cells 3,6
00:05:24 Select Test: Cold 4°C
COLD (4°C)
00:05:37 Select Test: Enzyme
ENZYME: FICIN TREATED CELLS 37°C, IGG AHG
00:05:49 Hilite Antigens: Jkb
00:05:50 Mark Confirmed: Jkb,c
00:06:00 Select Test: Done with Case
Student Answer: E, c and Jkb
Correct Answer: E, c and Jkb
Appendix E. Sample Error Log
Subject #: E1 (Treatment)   Degree: MT(ASCP)   Years Worked: 15   Avg Pos Screens/mo: <1

Case                              XPJR   XSVR   XPJS   XPJW   XKL

ABO ERRORS
  Did Not Mark ABO Interp          -      -      -      -      -
  ABO Incorrect                    -      -      -      -      -
  Rh Incorrect                     -      -      -      -      -
  Unlikely Not Marked              4      -      -      -      -
  Auto Control Not Marked          1      1      1      -      -
  Auto Control Incorrect           -      -      -      -      -

PANEL R/O ERRORS OF COMMISSION
  Antigen Not Present
    Screen                         -      -      -      -      -
    Poly (IgG)                     -      -      -      -      -
    Add'l Poly (IgG)               -      -      -      -      -
    Other Panel                    -      -      -      -      -
  Reactive Cell Error
    Screen                         -      -      -      -      -
    Poly (IgG)                     -      -      -      -      -
    Add'l Poly (IgG)               -      -      -      -      -
    Other Panel                    -      -      -      -    D,C,P1
  Heterozygous Cell Error
    Screen                         -      -      -      -      -
    Poly (IgG)                     -      -      -      -      -
    Add'l Poly (IgG)               -      E      -      -      -
    Other Panel                    -      -      -      -      -
  Test Not Run Error
    Additional Cells               -      -      -      -      -
  No Non-Reactive Cells Error
    Screen                         -      -      -      -      -
    Poly (IgG)                     -      -      -      -      -
    Add'l Poly (IgG)               -      -      -      -      -
    Other Panel                    -      -      -      -      -
  No Cells To Rule Out With Error
    Screen                         -      -      -      -      -
    Poly (IgG)                     -      -      -      -      -
    Add'l Poly (IgG)               -      -      -      -      -
    Other Panel                    -      -      -      -      -
  Panel Type Error
    Enzyme                         -      -      -      -      -
    Cold                           -      -    s,Lub    -      -
    Room Temp                      -      -      -      -      -
    Prewarm                        -      -      -      -      -
    Eluate                         -      -      -      -      -

TYPING R/O ERRORS OF COMMISSION
  Antigen Not Present On Patient   -      -      -      -      -
  Typing May Not Be Valid          -      -      -      -      -
  Test Not Run                     -      -      -      -      -

R/O ERRORS OF OMISSION
  Screen                           -      -      -      -      -
  Poly (IgG)                       2      -      -      1      -
  Add'l Poly (IgG)                 1      2      2      -      -
  Enzyme                           -      -      -      -      -
  Cold                             -      -      4      -      -
  Room Temp                        -      -      -      -      -
  Incr. Serum/Cell                 -      -      -      -      -
  Prewarm                          -      -      -      -      -
  Other Panel                      -      -      -      -      -
  Antigen Typing                   1      1      -      -      -
  Leave Anyway                     -      1      -      1      -
  R/O Right Answer                 -      -      -      -      -
  Heeded Weak Reaction Warning     -      -      -      -      -

Correct Answer     [OPos/E,K]  [OPos/c,K]  [ONeg/D]  [BPos/E,Fya]  [OPos/E,c,Jkb]
Student Answer(s)  [OPos/-]    [OPos/c,K]  [ONeg/-]  [BPos/E]      [OPos/none]
                   [OPos/E,K]  [OPos/c,K]  [ONeg/D]  [BPos/E,Fya]  [OPos/E,c,Jkb]
                               [OPos/c,K]

Case Time                       00:03:18 00:06:47 00:09:08 00:05:00 00:06:08
  No Answer Marked                 1      -      1      -      -
  ABO Not Marked                   -      -      -      -      -
  Did Not Complete R/Os            1      3      -      1      -
  Unlikelies Not Marked            1      -      -      -      -
  Auto Control Not Marked          -      -      -      -      -
  Typing Not Done                  -      -      -      -      -
  Answer Has Positive Typing       -      -      -      -      -
  No Confirmed On Reacting         -      -      -      2      2
  Confirmed On Non-Reacting        -      -      -      -      -
  Confounding                      1      1      -      -      -
  Confirmed Unlikely               -      -      -      -      -
  Multiple Group                   -      -      -      -      -
  Hypothesis Probability           -      -      -      -      -
  Antibody Specific                -      -      -      -      -
  General Check                    1      2      -      1      -
Appendix F. Number of Mistakes and Slips Made on Each Case By Each Subject in the Control Group (n = 16)
(Mistakes are shown first, slips are shown in parentheses)
Error / Case: Pretest, 1, 2, 3, 4, Subject Totals
1. Rule-out errors of commission
1a. Antigen not present 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
33 (0) 22 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0)
31 (0) 16 (2)
0 (0) 0 (2)
18 (0) 13 (0)
5 (0) 0 (6)
0 (0) 0 (0) 0 (0) 9 (0) 0 (0) 0 (0)
20 (1) 21 (1)
0 (0) 0 (0) 0 (0)
10 (0) 16 (0)
0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (1)
15 (0) 0 (0) 0 (0) 0 (0) 0 (0)
8 (0) 19 (0) 11 (0)
2 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
30 (0) 10 (0)
0 (0) 0 (1) 0 (0)
16 (6) 23 (0)
6 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
41 (0) 25 (0)
0 (0) 0 (0) 0 (0)
85 (6) 93 (0) 22 (0)
2 (7) 0 (0) 0 (0) 0 (0) 0 (0) 9 (0) 0 (0) 0 (1)
137 (1) 72 (3)
0 (0) 0 (3) 0 (0)
Total: 102 (4) 77 (8) 41 (2) 80 (1) 111 (6) 411 (21)
1b. Reactive cell
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
11 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0)
18 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 4 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0)
11 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 4 (0) 0 (0) 0 (0)
20 (1) 0 (0) 0 (0) 0 (0) 0 (0)
Total: 29 (0) 4 (0) 0 (0) 3 (1) 0 (1) 36 (2)
1c. Heterozygous 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
8 (0) 0 (0) 0 (0) 4 (0) 0 (1) 4 (0) 0 (0) 8 (0)
0 (0) 1 (0) 0 (0) 4 (0) 6 (0) 2 (2)
2 (0) 0 (0) 0 (0) 5 (0)
2 (0) 0 (0) 6 (0) 0 (0) 0 (0) 0 (0) 1 (0) 3 (0) 6 (0) 1 (1) 1 (0)
9 (0) 0 (0) 0 (0) 4 (2) 0 (0) 7 (0) 0 (0) 6 (0) 2 (0) 0 (0) 0 (1) 5 (1) 6 (0) 6 (0) 2 (0) 0 (0)
4 (0) 0 (0) 0 (0) 2 (0) 0 (0) 2 (0) 0 (0) 8 (0) 0 (0) 0 (0) 3 (0) 8 (0) 4 (0)
4 (0) 2 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 2 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 2 (0) 1 (0) 0 (0)
23 (0) 0 (0) 0 (0)
15 (2) 0 (1)
17 (0) 0 (0)
30 (0) 2 (0) 0 (0) 4 (1)
14 (1) 17 (0) 24 (0)
8 (3) 1 (0)
Total: 37 (3) 27 (1) 47 (4) 37 (0) 7 (0) 155 (8)
1d. No cells with which to rule out
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
1 (0) 0 (0) 4 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
21 (0)
1 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 4 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 4 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 8 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0) 0 (0) 4 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0)
37 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
Total: 27 (0) 4 (0) 4 (0) 8 (0) 1 (0) 44 (0)
1e. Special panel error (RT, Cold, Prewarm, Enzyme, Eluate)
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
3 (0) 0 (0) 0 (0) 1 (0)
0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0)
12 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 7 (0) 0 (0)
4 (0) 0 (0) 0 (0) 3 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0)
6 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 5 (0) 0 (0) 0 (0) 0 (0) 0 (0) 6 (0)
25 (0) 0 (0) 0 (0) 5 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 5 (0) 2 (0) 0 (0) 0 (0) 7 (0) 6 (0)
Total: 0 (0) 6 (0) 20 (0) 7 (0) 17 (0) 50 (0)
1f. Patient lacks antigen
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (13) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (14) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
Total: 0 (0) 0 (13) 0 (0) 0 (0) 0 (1) 0 (14)
1g. Typing may not be valid 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0)
0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
Total: 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 1 (0)
1h. Test not run yet
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (3)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (1) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (1)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (4) 0 (1)
Total: 0 (3) 0 (1) 0 (0) 0 (1) 0 (0) 0 (5)
2. Rule-out errors of omission
2a. Main panels (screen & panels at Polyspecific or IgG)
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 2 (0) 8 (0) 8 (0) 0 (1) 0 (0) 2 (0) 0 (0)
4 (0) 0 (0) 1 (0) 0 (0) 0 (1) 0 (0)
0 (0) 5 (0) 2 (0) 1 (1)
0 (1) 3 (0) 0 (0) 3 (0) 4 (0) 0 (7) 0 (0) 1 (0) 0 (0) 0 (0) 0 (2)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 3 (0) 5 (0) 2 (0) 3 (0) 0 (1) 5 (0) 0 (0) 5 (0) 1 (0) 1 (1) 0 (0) 0 (0) 0 (0) 1 (0) 0 (1)
0 (0) 2 (0) 1 (0) 0 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 12 (0) 16 (0) 11 (1)
3 (1) 0 (2)
11 (0) 1 (0) 8 (0)
10 (0) 1 (8) 1 (0) 1 (0) 0 (1) 1 (0) 0 (3)
Total: 25 (2) 19 (11) 0 (0) 26 (3) 6 (0) 76 (16)
2b. Special panels (RT, Cold, Prewarm, Enzyme)
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 1 (0) 4 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 1 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (4) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0)
0 (0) 0 (0) 2 (2) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 1 (1) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0) 1 (0) 1 (0) 5 (0) 1 (0) 0 (0) 1 (0) 2 (0) 0 (0) 0 (0) 3 (7) 2 (0) 0 (0) 0 (0) 1 (0) 0 (0)
Total: 5 (0) 3 (4) 2 (2) 1 (0) 6 (1) 17 (7)
2c. Antigen Typing
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 3 (0) 6 (0) 0 (0) 0 (0) 1 (0) 0 (0)
3 (0) 0 (0)
0 (0) 0 (0) 0 (0) 1 (0)
0 (0) 2 (0) 1 (0) 0 (0)
0 (0) 1 (0) 0 (0) 2 (0) 3 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 1 (0)
0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 6 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 2 (0) 0 (0) 1 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 2 (0) 1 (0) 1 (0) 5 (0) 0 (0) 2 (0) 2 (0) 1 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0) 4 (0) 8 (0) 7 (0) 6 (0) 0 (0)
12 (0) 2 (0) 4 (0) 6 (0) 0 (0) 1 (0) 1 (0) 1 (0) 2 (0) 1 (0)
Total: 14 (0) 11 (0) 9 (0) 5 (0) 16 (0) 58 (0)
3. Incomplete protocol
3a. Unlikely ab's not marked 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 1 (0) 0 (0) 1 (0) 1 (0) 1 (0) 1 (0)
0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 1 (0) 1 (0)
0 (0) 1 (0) 1 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 1 (0) 1 (0) 1 (0) 0 (0) 1 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0)
0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 1 (0) 0 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 1 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 3 (0) 3 (0) 4 (0) 1 (0) 5 (0) 4 (0) 3 (0) 1 (0) 5 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0)
Total: 6 (0) 6 (0) 8 (0) 4 (0) 6 (0) 30 (0)
3b. Rule-out's not completed
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 1 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0)
1 (0) 0 (0) 1 (0) 0 (0) 1 (0) 1 (0)
0 (0) 1 (0) 1 (0) 1 (0)
0 (0) 1 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0)
1 (0) 0 (0) 1 (0) 1 (0) 1 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 1 (0) 1 (0) 1 (0) 0 (0) 1 (0)
0 (0) 1 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 1 (0) 1 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0) 1 (0)
1 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0) 2 (0) 5 (0) 5 (0) 4 (0) 1 (0) 5 (0) 0 (0) 4 (0) 3 (0) 1 (0) 1 (0) 1 (0) 1 (0) 1 (0) 1 (0)
Total: 8 (0) 7 (0) 8 (0) 5 (0) 8 (0) 36 (0)
3c. Underlying ab's present
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 1 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 1 (0) 1 (0) 0 (0)
0 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0)
0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 1 (0) 1 (0) 0 (0) 4 (0) 0 (0) 2 (0) 0 (0) 3 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0)
Total: 2 (0) 5 (0) 1 (0) 2 (0) 2 (0) 12 (0)
3d. Antigen typing not done
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0)
1 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0) 1 (0)
1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0)
1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0)
5 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 4 (0) 3 (0) 0 (0) 1 (0) 1 (0)
Total: 3 (0) 4 (0) 1 (0) 5 (0) 2 (0) 15 (0)
3e. Auto ab's not marked
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0)
0 (0) 0 (0) 1 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (2) 0 (0) 0 (0) 2 (0) 1 (0) 0 (0) 0 (0)
Total: 2 (0) 1 (0) 1 (1) 3 (1) 1 (0) 8 (2)
4. Data Implausible Given Answer
4a. No confirmed Ab on reacting cells
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
1 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 2 (0) 0 (0)
0 (0) 0 (0) 1 (0) 2 (0) 0 (0) 0 (0)
1 (0) 0 (0) 0 (0) 0 (0)
0 (0) 2 (0) 0 (0) 1 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0) 0 (0)
2 (0) 3 (0) 0 (0) 0 (0) 0 (0) 0 (0) 3 (0) 0 (0) 1 (0) 0 (0) 0 (0) 2 (0) 3 (0) 2 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 2 (0) 0 (0) 2 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 2 (0) 0 (0) 0 (0) 2 (0)
4 (0) 6 (0) 0 (0) 4 (0) 0 (0) 0 (0) 8 (0) 0 (0) 2 (0) 0 (0) 1 (0) 4 (0) 8 (0) 2 (0) 0 (0) 2 (0)
Total: 8 (0) 6 (0) 16 (0) 1 (0) 10 (0) 41 (0)
4b. Confirmed Abs on non-reacting cells
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
Total: 2 (0) 1 (0) 0 (0) 0 (0) 0 (0) 3 (0)
4c. Answer has positive antigen typing
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0)
Total: 0 (0) 0 (0) 0 (0) 0 (0) 2 (0) 2 (0)
5. Answer implausible given prior probabilities
5a. Low frequency Ab (f, V, Cw, Lua, Kpa, Jsa) in answer set
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0)
1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 2 (0) 0 (0)
Total: 1 (0) 0 (0) 2 (0) 0 (0) 1 (0) 4 (0)
5b. Abs from multiple groups in answer set
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0)
Total: 1 (0) 0 (0) 0 (0) 0 (0) 1 (0) 2 (0)
5c. Hypothesis Probability 1 2
3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 1 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 1 (0)
Total: 0 (0) 0 (0) 1 (0) 0 (0) 2 (0) 3 (0)
5d. Antibody Specific
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0)
1 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 1 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0) 1 (0)
2 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 2 (0) 1 (0) 0 (0) 2 (0) 2 (0) 0 (0) 0 (0) 2 (0)
Total: 1 (0) 1 (0) 4 (0) 0 (0) 5 (0) 11 (0)
5e. General Check
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 1 (0) 1 (0) 1 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0)
0 (0) 1 (0) 1 (0) 1 (0)
0 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 1 (0) 0 (0) 1 (0) 0 (0)
1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 1 (0) 1 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 2 (0) 4 (0) 3 (0) 3 (0) 0 (0) 3 (0) 0 (0) 3 (0) 1 (0) 1 (0) 1 (0) 0 (0) 0 (0) 0 (0) 1 (0)
Total: 4 (0) 6 (0) 1 (0) 4 (0) 7 (0) 22 (0)
Appendix G. Number of Mistakes and Slips Made on Each Case By Each Subject in the Treatment Group (n = 16)
(Mistakes are shown first, slips are shown in parentheses)
Error / Case: Pretest, 1, 2, 3, 4, Subject Totals
1. Rule-out errors of commission
1a. Antigen not present 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 3 (2) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 3 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (6) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (1)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (2) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 3 (3) 0 (1) 0 (6) 0 (0) 0 (2) 0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 3 (0)
0 (1)
Total: 6 (2) 0 (6) 0 (3) 1 (2) 0 (1) 7 (14)
1b. Reactive cell
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0)
0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (1) 0 (0) 0 (0) 0 (1)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 3 (0) 0 (1) 0 (0) 0 (1) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (1)
0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0)
0 (0)
0 (3) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (2) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (3) 0 (0) 0 (2) 0 (0) 0 (0) 0 (1) 0 (0) 0 (2) 0 (0) 3 (0) 0 (2) 1 (0) 0 (1) 0 (1)
0 (1)
Total: 0 (3) 3 (2) 0 (2) 1 (1) 0 (5) 4 (13)
1c. Heterozygous 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 5 (0) 6 (0) 3 (0) 6 (0) 8 (0) 0 (0) 0 (0)
0 (0) 0 (0) 4 (0) 0 (0) 5 (0)
8 (0)
1 (0) 0 (0) 0 (2) 0 (0) 0 (0) 0 (2) 0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 0 (1) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (1) 0 (2) 0 (0) 0 (0) 0 (2) 0 (0) 0 (1) 0 (1) 0 (0) 0 (0)
0 (1)
0 (0) 0 (1) 0 (2) 0 (0) 0 (0) 0 (3) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0)
0 (0) 0 (0) 0 (1) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (1)
0 (0)
1 (0) 5 (1) 6 (5) 3 (1) 6 (1) 8 (7) 0 (0) 0 (0) 0 (3) 0 (0) 0 (1) 4 (2) 0 (0) 5 (1)
8 (1)
Total: 45 (0) 1 (6) 0 (8) 0 (6) 0 (3) 46 (23)
1d. No cells with which to rule out
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (2) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (2) 0 (0) 1 (0) 0 (0) 0 (0) 1 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
Total: 0 (2) 1 (0) 1 (0) 0 (0) 0 (0) 2 (2)
169
1e. Special panel error 1
(RT, Cold, Prewarm, 2 Enzyme, Eluate)
3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
1 (0) 0 (0) 0 (0) 3 (0) 2 (0) 1 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 1 (0) 3 (0) 0 (0) 1 (0) 2 (2) 1 (1) 2 (0) 0 (0) 0 (0) 2 (0) 0 (0) 3 (0)
0 (0)
1 (0) 0 (0) 1 (0) 6 (0) 2 (0) 2 (0) 2 (2) 2 (1) 3 (0) 1 (0) 0 (0) 3 (0) 0 (0) 3 (0)
0 (0)
Total: 0 (0) 1 (0) 9 (0) 1 (0) 15 (3) 26 (3) 1f. Patient lacks antigen 1
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
Total: 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
170
1g. Typing may not be valid 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
Total: 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1h. Test not run yet
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (1)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (1)
Total: 0 (0) 0 (0) 0 (0) 0 (1) 0 (0) 0 (1) 2. Rule-out errors of
omission
171
2a. Main panels(screen & 1 panels at Polyspecific 2
or IgG) 3 4 5 6 7 8 9
10 11 12 13 14 15 16
2 (1) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (4) 4 (0)
3 (0) 1 (0) 0 (0) 0 (0) 0 (0)
1 (0)
2 (0) 0 (0) 0 (1) 0 (1) 0 (1) 1 (2) 0 (2) 0 (1) 0 (1) 0 (0) 0 (2) 0 (1) 0 (0) 2 (1)
0 (0)
0 (1) 0 (0) 0 (0) 0 (0) 0 (1) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (1)
0 (0)
0 (1) 0 (0) 2 (0) 0 (5) 0 (1) 1 (1) 0 (0) 0 (0) 0 (0) 1 (1) 0 (1) 0 (1) 0 (0) 0 (1)
0 (0)
0 (0) 0 (0) 0 (1) 0 (1) 0 (2) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
4 (3) 1 (0) 2 (2) 0 (7) 0 (5) 2 (3) 0 (6) 4 (1) 0 (1) 4 (1) 1 (3) 0 (2) 1 (0) 2 (3)
0 (0)
Total: 11 (5) 5 (13) 1 (3) 4 (12) 0 (4) 21 (37) 2b. Special panels (RT, Cold, 1
Prewarm, Enzyme) 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0)
) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
4 (0) 0 (0) 0 (0) 0 (2) 0 (0) 1 (0)
0 (0) 0 (0)
0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 1 (2) 0 (0) 1 (0) 1 (1) 0 (1) 0 (0) 0 (0) 1 (1) 0 (0) 1 (0)
0 (0)
4 (0) 0 (0) 0 (0) 0 (2) 1 (2) 1 (0) 1 (0) 2 (1) 1 (1) 1 (0) 1 (0) 1 (1) 0 (0) 1 (0)
0 (0)
Total: 2 (0) 1 (0) 6 (2) 0 (0) 5 (5) 14 (7) 2c. Antigen Typing
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
1 (0) 0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (0) 2 (0)
2 (0) 0 (0)
0 (0) 0 (0) 0 (0)
0 (0)
1 (0) 0 (0) 0 (1) 0 (0) 0 (1) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 4 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
) 0 (0)
2 (0) 0 (0) 0 (1) 2 (0) 0 (1) 4 (0) 1 (0) 2 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
Total: 7 (0) 2 (2) 0 (0) 4 (0) 0 (0) 13 (2)
172
3. Incomplete protocol 3a. Unlikely ab's not marked 1
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
1 (0) 1 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0) 1 (0)
1 (0) 1 (0) 0 (0) 1 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
1 (0) 1 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0) 1 (0) 0 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0)
0 (0)
Total: 9 (0) 0 (0) 0 (0) 0 (0) 0 (0) 9 (0) 3b. Rule-out's not completed 1
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
1 (0) 1 (0) 0 (0) 1 (0) 1 (0) 0 (0) 1 (0) 1 (0)
1 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0)
3 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 6 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 3 (0) 0 (0)
0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0)
0 (0) 0 (0) 1 (0) 3 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 1 (0) 1 (0) 0 (0)
0 (0)
5 (0) 1 (0) 1 (0) 4 (0) 2 (0) 9 (0) 1 (0) 3 (0) 0 (0) 1 (0) 0 (0) 1 (0) 3 (0) 0 (0)
0 (0)
Total: 8 (0) 3 (0) 9 (0) 5 (0) 6 (0) 31 (0) 3c. Underlying ab's present 1
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
1 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0)
1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
2 (0) 1 (0) 0 (0) 0 (0) 1 (0) 2 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0) 2 (0) 0 (0)
0 (0)
173
Total: 4 (0) 1 (0) 1 (0) 3 (0) 0 (0) 9 (0) 3d. Antigen typing not done 1
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 1 (0) 1 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 1 (0)
0 (0)
Total: 3 (0) 0 (0) 0 (0) 0 (0) 0 (0) 3 (0) 3e. Auto ab's not marked 1
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 1 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
Total: 3 (0) 0 (0) 0 (0) 0 (0) 0 (0) 3 (0) 4. Data Implausible Given Answer
174
4a. No confirmed Ab on 1
reacting cells 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 2 (0) 0 (0) 0 (0) 3 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 2 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 3 (0) 1 (0) 0 (0) 0 (0)
10 (0) 0 (0) 1 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
2 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
2 (0) 0 (0) 2 (0) 8 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 2 (0) 0 (0)
0 (0)
4 (0) 5 (0) 3 (0) 8 (0) 3 (0)
10 (0) 0 (0) 1 (0) 0 (0) 2 (0) 0 (0) 1 (0) 4 (0) 0 (0)
0 (0)
Total: 7 (0) 0 (0) 17 (0) 2 (0) 15 (0) 41 (0) 4b. Confirmed Abs on non- 1 reacting cells
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0)
Total: 2 (0) 0 (0) 0 (0) 0 (0) 0 (0) 2 (0)
175
4b. Answer has positive 1
antigen typing 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 2 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
Total: 0 (0) 0 (0) 0 (0) 2 (0) 0 (0) 2 (0) 5. Answer implausible given prior probabilities
5a. Low frequency Ab (f, V, 1
Cw, Lua, Kpa, Jsa) in 2 answer set 3
4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
Total: 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
176
5b. Abs from multiple 1
groups in answer set 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0)
Total: 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 5c. Hypothesis Probability 1
2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0)
Total: 0 (0) 0 (0) 2 (0) 0 (0) 1 (0) 3 (0) 5d. Antibody Specific
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
1 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 4 (0) 1 (0) 1 (0) 2 (0) 0 (0) 0 (0) 1 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0)
0 (0) 0 (0) 2 (0) 0 (0) 0 (0) 4 (0) 1 (0) 1 (0) 2 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0)
1 (0)
177
Total: 1 (0) 0 (0) 11 (0) 0 (0) 1 (0) 13 (0) 5e. General Check
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
1 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0)
1 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0)
2 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0) 1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0)
0 (0)
1 (0) 0 (0) 0 (0) 0 (0) 0 (0) 3 (0)
0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 0 (0)
0 (0)
0 (0) 0 (0) 1 (0) 2 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 0 (0) 1 (0) 1 (0) 0 (0)
0 (0)
4 (0) 1 (0) 1 (0) 2 (0) 0 (0) 4 (0) 0 (0) 2 (0) 0 (0) 1 (0) 0 (0) 1 (0) 3 (0) 0 (0)
0 (0)
Total: 5 (0) 2 (0) 2 (0) 5 (0) 7 (0) 19 (0)