Medical Education 1987, 21, 199-206

On making laboratory report work more meaningful through criterion-based evaluation

N. NAERAA

Institute of Physiology, University of Aarhus

Summary. The purpose of this work was to encourage students to base their laboratory report work on guidelines reflecting a quality criterion set, previously derived from the functional role of the various sections in scientific papers. The materials were developed by a trial-and-error approach and comprise learning objectives, a parallel structure of manual and reports, general and specific report guidelines and a new common starting experiment.

The principal contents are presented, followed by an account of the author’s experience with them. Most of the author’s students now follow the guidelines. Their conclusions are affected by difficulties in adjusting expected results with due regard to the specific conditions of the experimental subject or to their own deviations from the experimental or analytical procedures prescribed in the manual. Also, problems in interpreting data unbiased by explicit expectations are evident, although a clear distinction between expected and actual results has been helpful for them in seeing the relationship between experiments and textbook contents more clearly, and thus in understanding the hypothetico-deductive approach.

Key words: *Teaching materials; *education, medical, undergraduate; physiology/*educ; laboratories; educational measurement; Denmark

Correspondence: Dr N. Naeraa, Institute of Physiology, 160 Ole Worms Alle, 8000 Aarhus C, Denmark.

Introduction

Within a teaching system based on non-obligatory teaching activities, laboratory courses in medical physiology can no longer rely on tradition as a means of ensuring that most students exploit fully the learning potentialities of such courses, i.e. skills resulting from the writing of reports attaining a pre-set standard of acceptance.

From an experimental period where report writing was substituted by small-group seminars in which the experimental results were discussed, it appeared that the writing of reports was the crucial activity. In order to obtain a better learning effect by improving the report quality, a criterion set was then established (Naeraa 1979). The present version of it is shown in Table 1.

However, the set proved inefficient as a learning objective for the majority of students. They were quite willing to do the experiments, but reluctant or unwilling to put sufficient time and effort into attaining the standard of the criterion-set.

As students’ resistance seemed to be caused by unclear learning objectives and by their suspicion of the training being irrelevant to their clinical education and future professional role (‘We are not going to be scientists’), the author decided to begin a systematic effort to compensate for this by improving the materials used in our laboratory courses.

The developmental process

This could possibly best be explained as a series of trial-and-error sequences, based upon copies of the written specific comments made to the student reports assigned to the author for evaluation. A set of guidelines for report-writing was first derived from the criterion-set and the effect of these guidelines was then evaluated through the comments made to the reports from the following laboratory course. This evaluation in turn led to modification of the next version of the guidelines, sometimes also of the laboratory manual text, and a few times to the production of additional material. The new versions and materials were then tried out in the same way during the next course and so on, until the present state was reached.

Table 1. Present version of criterion set

Report section: Criterion question (Does this section contain:)

Introduction:
    A statement of the learning objective/problem of this particular experiment?

Material:
    Sufficient specific information supplied about the experimental subject (so as) to evaluate and explain the results presented?

Methods:
    Answers to conditional questions free of misconceptions and errors?
    Sufficient specific information supplied for evaluation/explanation of the results presented?

Results:
    Presented and described clearly?
    Descriptions ‘clean’, i.e. without interpretations?
    Consistent with Material and Methods section?

Discussion and Conclusion:
    Expected results stated? Logically consistent with learning materials?
    Comparisons of expected and actual results valid?
    Conclusions valid?
    Final conclusion related to learning objective/problem?

Documentation:
    Consistency with reported results easily verifiable?
    Consistency actually present?

Most of the seven to nine guideline revisions have been ‘private’, i.e. used only with students allocated to the author, while a few versions were ‘official’, i.e. included in the laboratory manual. This difference developed because the manual is printed every second or third year only. The combined system of the two versions allowed faster exploitation of the feedback provided by the reports, and from discussions evoked by both the guidelines and the comments. The author believes this system to have helped to avoid an atmosphere of unrest among the teachers.

Results

These consist of modifications of teaching materials and activities. Tables 2, 3 and 4 show typical examples of the various modifications, while Table 5a and b present the specific components of a single experiment from the common starting day. This by itself represents a modification.

Learning objective

Table 2 shows the general version of the learning objective or problem faced by students carrying out one of the 11 experiments offered, three of them simple animal preparations, while the remaining experiments are performed on the students themselves.

This version of the objective is a recent development after several attempts to make the students phrase these questions themselves were unsuccessful, even when helped by various leading questions. All questions follow one of the two standard patterns shown.

The experimental problem (or its answer) is not the learning objective as such, but is only part of it. Not until this difference was fully spelt out in the manual did the contents of this section start to become meaningful for the average student. This was effected by surprisingly small changes, the essence of which was a more consistent distinction between means and ends. Such changes could be brought about by students critically pointing out logical discrepancies; here average students seemed just as helpful as those above average.

Principal sections of manual and reports

These are shown in Table 3a, while the general guideline principles are shown in Table 3b.

Table 2. ‘Standard versions’ of learning objective for typical cookbook experiments in preclinical physiology

The purpose of this experiment is to make the participants able to put forward and document (through the report) a valid conclusion on whether or not the obtained results differ significantly from those results that should be expected according to the learning materials, with regard to the following questions (e.g.):

(1) Which changes occur with respect to energy metabolism during and after steady state exercise, performed with large muscle groups?

(2) Which circulatory and respiratory changes occur during and after such exercise?

or alternatively (in other experiments):

with regard to the functions examined in this experiment?

Table 3a. Corresponding sections of laboratory manual and report-guidelines (general guidelines in Table 3b)

Corresponding sections in laboratory manual and reports

Manual sections:

BACKGROUND
    Clinical relation
    Purpose (cf. Table 2)
PREPARATORY WORK
    Conditional questions (cf. Table 4)
    Setting up the experiment (Experimental subjects or preparation, equipment)
THE EXPERIMENT/EXAMINATION (mostly divided into units, one variable being changed in each)

Report sections:

INTRODUCTION
    State learning problem/objective of the experiment.
METHODS
    State specific information not given in manual, i.e. (a) your answers for the conditional questions, (b) deviations from procedure described, (c) any equipment choices made.
MATERIAL
    State specific information about experimental subject/preparation necessary to interpret or explain your results (but not available from teaching materials).
(For each ‘unit’):
RESULTS
    Presented in figures or tables, each with a legend pointing out relevant findings without interpretation.
DISCUSSION
    First explain what results should be expected according to textbooks, then systematically compare them with the actual results just presented.
    Draw CONCLUSION whether any significant deviations are present, when measuring accuracy and biological variation have been taken into account.
FINAL CONCLUSION(S)
    Sum up in one or more conclusions explicitly related to the learning objective(s)/problem(s) stated in the introduction. Some of the conclusions of this type may cover more units and should be kept separate as the last element(s) of this section.
DOCUMENTATION (enclosure of curves, etc.)

Table 3b. General guidelines for laboratory reports

General principles stated in the report guidelines:

(1) Information explicitly stated in the laboratory manual need not be repeated in reports, but give precise references.

(2) Observations, interpretations, expectations and explanations must be kept clearly apart (otherwise it may not be possible to decide whether the report authors are able to discriminate between them).

These sections were derived as parallels to the sections used in scientific papers, because they have such clear functions determined by their practical use, i.e. to present data in such a way as to allow qualified readers to establish their own judgement of them as if they were data of their own. As this is exactly what the evaluating teacher is supposed to do, sections derived from a scientific source should function well in reports.

Some of the section titles of the manual may need explanation.

Under purpose we now explicitly state the question(s), the expected answer to which is to be tested through that particular experiment (stated as part of the learning objective [Table 2]).

The conditional questions, which number between two and six in each experiment, deal with methodological aspects only. Specific examples are given in Table 4.

Table 4. The three principal components of Report section: Methods

(1) Answers to the conditional questions asked in the manual (cf. Table 2). Examples from typical frog heart perfusion experiment:
    (a) At what levels should the inflow pressure of the liquid from the Mariotte bottle be measured?
    (b) How do you ascertain that the bottle system is functioning properly?

(2) The actual choices made, where a choice is demanded in the manual, e.g. the size of the bag used in the alveolar sampling device.

(3) Actual deviations from the instruction, whether intended or not.

As indicated by the term conditional, teachers may prevent their students from starting an experiment if they are unable to give acceptable answers to the conditional questions. Thus, these questions were originally introduced primarily to protect experimental subjects and equipment against incompetent handling by ignorant students. By including the answers to these questions as a component of the methods section, we retain their ‘protective effect’ and avoid, in the reports, lengthy citations from the manual describing what the students ought to have done but did not always do. The potential learning effect from copying the manual is also retained, and probably raised to a higher intellectual level, through the working out of the answers to the conditional questions, which generally demand understanding of principles rather than pure recall. The total contents of this methods section demonstrate for the students how redundancy may be avoided without skipping necessary information.

Report section: results, discussion and conclusions

Under ‘Results’ the students are guided to present their results in the most appropriate form (as tables or figures) and asked to describe their findings systematically, in order that the reader may decide whether the participants are able to differentiate between observation and interpretation.

After the Results section is the corresponding Discussion and Conclusion section for each ‘unit’, within which one experimental variable is manipulated. Here we ask the students first to explain what results they had expected to find and why, and then to compare the expected findings with the actual results just described, taking into account measuring accuracy and other predictable influences. In that way the section may logically end with a conclusion regarding whether a significant difference between expected and actual findings was or was not present in that particular unit. Considerable effort has been made to point out this hypothetico-deductive element; previously expected and actual results were explicitly kept apart in the discussion, to no avail.

Final conclusions may concern experimental problems reaching ‘across units’, e.g. where two or more experimental variables must be combined in order to reach a conclusion. These are predominantly found in experiments or examinations performed on the participants themselves.

Report section: documentation

This section is used to point out in the guidelines the difference between presentation of results as a communicative function on the one hand, and the documentation of them through the raw data on the other, thereby allowing the reader an independent evaluation of both the data and the experimental carefulness and skills of the author(s). However, students are permitted to use recorded curves also for presentation, but without an explicit description of such curves, they are only considered valid as documentation.

Supplementary changes of the laboratory manual

Mainly due to students pointing out smaller and smaller inconsistencies between the manual and the guidelines, some additional changes were introduced in the manual.

In the general introduction to the manual, where the relation between the objectives and the activities of the laboratory course is dealt with, the latter are now described as a way of getting intimately acquainted with the essential process behind the contents of most medical textbooks rather than getting acquainted with their contents as previously stated. Consequently, it is now considered more important that the report conclusions are valid than that the results are as expected, the more so as reliable results may demand skills that stem from repeating the same experiments a number of times. This is very seldom done in our courses, and never demanded, as we are training doctors, not laboratory technologists.

Another change in the manual has been to include, in some experiments, a very short additional section named ‘Report’ with helpful suggestions on specific issues that have caused trouble in reports on that particular experiment. For instance, in ophthalmoscopy, it has been necessary to make clear that the result here is a precise description of the observed retinal picture. The description of specific (normal) details of the vessels or optic papilla, likely to be unique for that particular person, must be included mainly in order to allow the consistency between Material, Results and Conclusion to be ascertained.

Supplementary activities and materials

Common starting day. This was planned by the staff in order to combine our efforts for making students familiar with (1) the practical laboratory conditions including self-instructional audiovisual tape-slide programmes, (2) the use of elementary statistics in data evaluation and (3) how reports are systematically evaluated. The specific components of the experiment of this day are listed in Table 5a and b.

From the top of Table 5a it appears that this experiment falls in two parts: the ‘practical’ part I, where individual results are obtained by the students performing two simple lung-function tests on themselves, and the ‘theoretical’ part II, which is based on group results (from a former class) presented in a prototype report printed in the manual, and written by teachers strictly according to our report guidelines. After all participants have first discussed their answers to the conditional questions and what the expected results will most likely be, half the class perform their measurements, while the others familiarize themselves with the laboratory environment. They then exchange activities. When all students have obtained their results (after about 1 hour), they report them in a one-page fill-in report (where they are strongly guided to reach a valid conclusion regarding their personal results, comparing them with normal-person-values calculated from regression equations). All reports are then peer-evaluated in small groups together with the prototype report. This small-group work (up to 1 hour) provides a specific basis for answering the plenary questions (Table 5b), and thus enables students to take an active part in the final plenary session (up to 1 hour), which finishes the common starting day.
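The decision asked of the students in the fill-in report can be illustrated by a minimal sketch. It assumes a linear reference equation with height and age as predictors and a known residual SD; the coefficients, the class name Reference and the function name deviates are hypothetical and are not taken from the manual. The limit of 1.96 × SD is the one named in the plenary questions.

```python
from dataclasses import dataclass


@dataclass
class Reference:
    """Hypothetical reference equation: NPV = a*height_cm + b*age_yr + c."""
    a: float
    b: float
    c: float
    sd: float  # residual SD of the reference population (assumed known)

    def npv(self, height_cm: float, age_yr: float) -> float:
        """Normal-person value for one individual."""
        return self.a * height_cm + self.b * age_yr + self.c


def deviates(measured: float, npv: float, sd: float, z: float = 1.96) -> bool:
    """True if the measured value lies outside NPV +/- z*SD (two-sided)."""
    return abs(measured - npv) > z * sd


# Illustrative coefficients only; they are NOT the equations of the manual.
vc_ref = Reference(a=0.05, b=-0.02, c=-4.0, sd=0.5)
expected = vc_ref.npv(height_cm=180, age_yr=25)

for measured in (5.0, 5.6):
    verdict = "deviates" if deviates(measured, expected, vc_ref.sd) else "no deviation"
    print(f"measured {measured:.1f} l, NPV {expected:.1f} l: {verdict}")
```

Stating the expected value and the limit before the measured value is inspected mirrors the separation of expected and actual results that the guidelines ask for.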

Ad hoc notes. For students who cannot understand why their report conclusions must state whether they found any deviations from the expected findings instead of just concluding that they supported or proved what the textbook already says, a short note explaining the difference more fully has been worked out. This note also explains how the expected findings represent a combination of the null hypothesis and specific test-hypotheses taken from the textbook.

The note also describes how different conclusions are possible for a single researcher or research-group on the one hand and a review-author on the other. The latter is in a position to sum up a larger number of independent results and is therefore, in contrast to any single researcher, able to take into account the possibility of obtaining results as expected simply by chance.

Present experiences

Learning objective. The futile efforts of students trying to deduce the learning objective or the experimental problem from the manual text were surprising, until the author tried it for himself. To do so required an intimate knowledge of the experiments, of their methods, of the expected results, and also of what results students typically obtain.

The present solution hardly makes the students aware of how difficult the task is, but its relevance for appropriate conclusions is readily acknowledged by most students. The most likely learning effect in the students from simply re-stating the learning objective from the manual in their own reports may be the habit of beginning a working process by stating the problem as a clear question which then governs the whole process.

Table 5a. Specific elements integrated in common starting experiment (for explanation see text)

Element: Learning objective
    Part I (fill-in report): Reach valid conclusion regarding personal VC- and PEF-results.
    Part II (printed prototype report): Reach valid conclusion regarding possibility of reference population not being valid for student population examined.

Element: Experimental problem
    Part I: Calculate NPV; obtain reliable VC- and PEF-results; examine for significant deviation from NPV.
    Part II: To make valid comparisons of population means.

Element: Conditional questions
    Part I: How do you find VC from a spirogram recorded with a Krogh spirometer? How do you transfer a gas volume from ATPS to BTPS conditions (by calculation)? What does your personal NPV amount to, by calculation from the regression equations? How do you decide whether your measured result deviates significantly from NPV?
    Part II: (1) How do you find out whether two means differ statistically significantly at a certain P-level?

Element: Expected results
    Part I: From students stating no evidence of pulmonary symptoms: no deviation from NPV.
    Part II: No significant differences between mean of measured results and mean of corresponding NPVs.

Element: Actual results
    Part I: Subnormal values only from students with recognized pulmonary disease. Variable proportion of healthy students significantly above their NPV, particularly with PEF results.
    Part II: On some experimental days mean result of VC is significantly higher than mean of NPVs, mostly for men, at least once for women also. Same findings with PEF results.

Element: Discussion and conclusions
    See Table 5b below.

Table 5b. Discussion and conclusion elements of common starting experiment

Plenary questions (to guide discussion):

(1) What level of probability is obtained with limit set by 1.96 × SD of fill-in report?

(2) (a) Are any subnormal values found in your group? (b) Is the number of deviations in the group as expected?

(3) What is the advantage of using individual-normal-person-values rather than the usual 95% range of results from normal persons?

(4) (a) Does the conclusion of the prototype report (PTR) pertain to the ‘objective question’ stated in the introduction? (b) Is the conclusion valid?

(5) Does the description of results in the PTR contain any interpretations?

(8) Anybody who wants to propose a likely explanation of deviations above normal?

(10) What may be concluded from a significant deviation in an apparently healthy person?

Conclusions:

Approximately 10% state non-valid deviations from NPV due to incorrect limit.

Significant differences between means of results and corresponding NPVs have been present for both VC and PEF in men several times, for women once out of eight occasions.

(Controlled investigations of these differences have been instigated from this.)
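One of the conditional questions of the common starting experiment concerns the transfer of a gas volume from ATPS to BTPS conditions. A minimal sketch of the usual textbook conversion follows; it is not quoted from the manual. It assumes a body temperature of 37 °C, a saturated water vapour pressure of 47 mmHg at that temperature, and that the saturated water vapour pressure at ambient temperature is supplied by the user (about 17.5 mmHg at 20 °C). The function name atps_to_btps and the example values are illustrative.

```python
BODY_TEMP_C = 37.0       # assumed body temperature
P_H2O_BODY_MMHG = 47.0   # saturated water vapour pressure at 37 degrees C


def atps_to_btps(v_atps: float, temp_c: float,
                 p_baro_mmhg: float, p_h2o_mmhg: float) -> float:
    """Convert a gas volume from ATPS to BTPS.

    v_atps       volume read from the spirometer (ambient temperature,
                 ambient pressure, saturated with water vapour)
    temp_c       spirometer temperature in degrees Celsius
    p_baro_mmhg  barometric pressure in mmHg
    p_h2o_mmhg   saturated water vapour pressure at temp_c in mmHg
    """
    # Charles's law: the gas expands when warmed to body temperature.
    temp_factor = (273.0 + BODY_TEMP_C) / (273.0 + temp_c)
    # Dry-gas pressure falls because more water vapour is present at 37 C.
    pressure_factor = (p_baro_mmhg - p_h2o_mmhg) / (p_baro_mmhg - P_H2O_BODY_MMHG)
    return v_atps * temp_factor * pressure_factor


# Example: 4.0 l read at 20 degrees C and 760 mmHg (illustrative values).
vc_btps = atps_to_btps(4.0, temp_c=20.0, p_baro_mmhg=760.0, p_h2o_mmhg=17.5)
print(f"VC (BTPS) = {vc_btps:.2f} l")
```

Under these assumptions the correction amounts to roughly 10%, which is large enough to produce an apparent deviation from the normal-person value if it is omitted.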

From the student reports the following impression has emerged: if students either do not state the learning objective or rephrase it in a non-questioning format, the final conclusion will either be lacking or non-valid.

Structural elements. In the section on Material it seems difficult to prevent students firstly from stating their own judgement like ‘well-trained’ without giving the basis for it, and secondly from omitting relevant negative information and suggesting this to be taken as equivalent to ‘no complaints’. Thus they do not take into account the possibility of the issue simply having been forgotten, in spite of their own frequent use of this ‘excuse’ in their sections on Method or Results. In experiments performed by the students on themselves, it might also be just a matter of students forgetting that other readers have access only to the information offered explicitly by the students concerning their own state of health, which is so evident to themselves.

The most direct effect of the Methods section has been that students now spontaneously acknowledge that answering the ‘conditional questions’ is also a useful preparatory activity for them. Some had indicated this by including the answers in their reports even before this was requested in the guidelines.

Precise and yet sufficient descriptions of actual results are still not too common. This may be partly due to the fact that this clinically very useful skill is not tested at every physiology examination. Another aspect may be that many students have great difficulties in making and describing observations unbiased by their expectations. Mostly they tend to use either diffuse or interpretative statements, and some even explain deviations they have not described. To get rid of unconscious biases causing ‘selective perceptions’ seems much more difficult than the average student expects. This appears to be a major problem in working with the combination of actual and expected results so characteristic of the hypothetico-deductive approach.

While it seems rather uncomplicated for most students to deduce acceptable expected results from textbook theory in general, the same sort of trouble with unconscious bias as mentioned above may be reflected in an unawareness in students of how expected results deduced from a general textbook hypothesis should be modified by the particular conditions of an experimental subject or by specific deviations from the procedures described in the manual.

This could be a parallel to the overworking of inappropriate hypotheses reported for clinical diagnostic work by Elstein et al. (1978). Taken together these experiences may indicate a lack of a process element in the hypothetico-deductive method, concerned with something like ‘Validation of the general hypothesis against the particular conditions of the experimental subject or patient’.

The Discussion and Conclusion sections have led to a positive change, in that only a few students now mix up expected and actual results in more than the first one or two reports. This was not the case until the two elements were made very explicit in the guidelines. Now students may even discuss whether their expectations may be ‘ailing’, rather than their results. As a whole this section seems to have transformed the general term ‘hypothetico-deductive process or method’, known to them from their course in the philosophy of sciences, into a specific and useful term.

But another problem still remains when students are confronted with the lack of logical consistency between their initial stated problem and their final conclusion, most often concerning the issues mentioned above, as to how unconscious bias may be reflected. The problem is that some students do not find it worth while to produce corrective changes in their report once they have been able to localize the source of their inconsistency, which as a rule is brought about by encouragement from the teacher into systematic use of the criterion questions in Table 1.

Apparently, students are so satisfied with their ‘Aha!’ experience that they do not find it necessary to take the trouble of phrasing a new and fully correct conclusion. As yet the author has not found a way of convincing students that the ability to acknowledge a logical inconsistency is not evidence for being able to avoid or correct such an inconsistency but is only a necessary prerequisite. And convincing students is the only solution, as we have no sanctions.

The common starting day experiment has definitely directed the attention of both teachers and students towards the need for correct decisions based on elementary statistics. In addition to instigating controlled investigations, the deviations found hitherto have led to productive discussion, particularly of plenary question 8 in Table 5b, bringing up problems of how accurate and how representative the results are, sometimes at a level which puts severe demands on the teacher.
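Whether the number of deviations in a group is ‘as expected’ can be judged with elementary probability. The sketch below assumes that each healthy student independently has a 5% chance of falling outside limits set at 1.96 × SD, and computes the binomial probability of seeing at least k such deviations among n students. The group size of 20 and the function name prob_at_least are illustrative assumptions, not figures from the course.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.05) -> float:
    """Binomial probability of k or more deviations among n independent students."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


n_students = 20                      # hypothetical group size
expected_count = n_students * 0.05   # deviations expected by chance alone

print(f"expected by chance: {expected_count:.1f}")
for k in (1, 2, 4):
    print(f"P(at least {k} deviations) = {prob_at_least(k, n_students):.3f}")
```

On these assumptions a single deviation in a group of 20 is more likely than not, whereas four or more would be hard to ascribe to chance alone; this is the kind of reasoning the plenary discussion of accuracy and representativeness calls for.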

Invariably a few students do reach non-valid conclusions in their starting experiment, mainly due to faulty use of the statistics taught during the first preclinical year (but apparently never applied in other subsequent preclinical courses). If such non-valid conclusions are not appropriately discussed with the participants at this first occasion, the error tends to repeat itself and becomes more and more difficult to correct. This may be explained by later additional problems masking the statistical aspect of the conclusions.
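The comparison of means asked for in the prototype report can be sketched as follows. The sketch assumes a paired comparison (each measured value against the same student's normal-person value) and a two-sided test at the 5% level; the test actually prescribed in the manual is not given in the text, so the choice of test, the data and the function name paired_t are assumptions made for illustration. The critical value 2.776 is the tabulated two-sided 5% value of Student's t for 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(measured, npv):
    """Return (t, degrees of freedom) for the mean of the paired differences."""
    diffs = [m - n for m, n in zip(measured, npv)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Invented vital capacity values in litres for five students.
measured = [5.1, 4.8, 5.6, 4.9, 5.3]
npv = [4.9, 4.7, 5.2, 5.0, 5.0]

t, df = paired_t(measured, npv)
T_CRIT_DF4 = 2.776  # two-sided 5% critical value of Student's t, 4 d.f.
significant = abs(t) > T_CRIT_DF4
print(f"t = {t:.2f}, d.f. = {df}, significant at the 5% level: {significant}")
```

With these invented data t is about 2.1, which is below the critical value, so the valid conclusion is that no significant difference between the means has been shown; it would not be valid to conclude that the means are equal.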

Other student reactions to the material. Early in a laboratory course, our guidelines were considered superfluous and slightly ridiculous by some students, because ‘It can’t be necessary to train us in simple things like keeping observations and interpretations apart, if one has a critical attitude; we always do that’. In some cases it has been possible to challenge such students into proving their point by following the guidelines strictly in at least one report. Afterwards some of them said they had had trouble in doing so, and were surprised to discover that critical interpretation of results was not possible without a good knowledge also of the specific methods used.

However, the overall reaction from students has been to deliver reports that follow the guidelines well enough to make the reports easier to mark and discuss with their authors, mainly because it is now faster to point out sources of non-valid conclusions. In addition, more students have been concerned with obtaining formally accepted reports than before, although this has no consequences whatsoever. Whether this is a passing phenomenon, time will show.

Reactions from teaching colleagues. Most colleagues have simply granted their acceptance by not protesting when the modifications have been subjected to the formal procedure specified by the institute board for all teaching materials to be used by all 18 teachers.

Two have expressed the belief that stressing a learning objective dealing with reports does not appeal to preclinical students as much as experimental problems do. If this is true, a change is easily made by turning the objective (in Table 2) into a criterion. This may be done by first stating the experimental problems, and then ending by stating: ‘The experiment has been successful when the report contains a valid conclusion as to whether or not the results obtained differ significantly from those that should be expected, according to the learning materials, with regard to the experimental problems.’

Useful for teachers elsewhere?

The general effect of such a set of materials is difficult to predict or test, because they act through teachers whose interest, support and competence may vary considerably from place to place and from time to time. However, the author can testify that it has been an exciting experience to pour old wine into a new set of bottles evidently more recognizable by the students.

References

Elstein A.S., Shulman L.S. & Sprafka S.A. (1978) Medical Problem Solving: An Analysis of Clinical Reasoning. Harvard University Press, Cambridge, Massachusetts.

Naeraa N. (1979) Criteria for acceptance of reports in experimental laboratory courses. Medical Education 13, 21-23.

Received 3 July 1985; editorial comments to author 29 January 1986; 11 June 1986; 29 November 1986; accepted for publication 8 January 1986