The relationship between the application of scoring rubrics and writing performance
Item Type text; Dissertation-Reproduction (electronic)
Authors MacElvee, Cameron
Publisher The University of Arizona.
Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Link to Item http://hdl.handle.net/10150/289775
INFORMATION TO USERS
This manuscript has been reproduced from the microfilm master. UMI films
the text directly from the original or copy submitted. Thus, some thesis and
dissertation copies are in typewriter face, while others may be from any type of
computer printer.
The quality of this reproduction is dependent upon the quality of the
copy submitted. Broken or indistinct print, colored or poor quality illustrations
and photographs, print bleedthrough, substandard margins, and improper
alignment can adversely affect reproduction.
In the unlikely event that the author did not send UMI a complete manuscript
and there are missing pages, these will be noted. Also, if unauthorized
copyright material had to be removed, a note will indicate the deletion.
Oversize materials (e.g., maps, drawings, charts) are reproduced by
sectioning the original, beginning at the upper left-hand corner and continuing
from left to right in equal sections with small overlaps.
Photographs included in the original manuscript have been reproduced
xerographically in this copy. Higher quality 6" x 9" black and white
photographic prints are available for any photographs or illustrations appearing
in this copy for an additional charge. Contact UMI directly to order.
ProQuest Information and Learning, 300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA
800-521-0600
THE RELATIONSHIP BETWEEN THE APPLICATION OF SCORING RUBRICS
AND WRITING PERFORMANCE
by
Cameron Ross MacElvee
Copyright © Cameron Ross MacElvee 2002
A Dissertation Submitted to the Faculty of the
DEPARTMENT OF EDUCATIONAL PSYCHOLOGY
In Partial Fulfillment of the Requirements For the Degree of
DOCTOR OF PHILOSOPHY
In the Graduate College
THE UNIVERSITY OF ARIZONA
UMI Number 3050306
Copyright 2002 by MacElvee, Cameron Ross
All rights reserved.
UMI Microform 3050306
Copyright 2002 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
ProQuest Information and Learning Company 300 North Zeeb Road
P.O. Box 1346 Ann Arbor, MI 48106-1346
THE UNIVERSITY OF ARIZONA
GRADUATE COLLEGE

As members of the Final Examination Committee, we certify that we have read the dissertation prepared by Cameron Ross MacElvee entitled The Relationship between the Application of Scoring Rubrics and Writing Performance and recommend that it be accepted as fulfilling the dissertation requirement for the Degree of Doctor of Philosophy.

Phil Callahan
Chris Johnson

Final approval and acceptance of this dissertation is contingent upon the candidate's submission of the final copy of the dissertation to the Graduate College.

I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement.

Kenneth Smith, Director
STATEMENT BY AUTHOR
This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.
Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgment of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the copyright holder.
ACKNOWLEDGEMENTS
I would like to acknowledge the important contribution of the following individuals: my advisor and director, Dr. Kenneth Smith, for his steadfast endurance with my slow progress and his consistent doses of encouragement; Dr. Barbara Smith for her wise counsel and friendship; Dr. Philip Callahan for his faithfulness and confidence in my work not only as a student, but as an educator; Dr. Christopher Johnson for overlooking my procrastination; Dr. Joni Adamson for enthusiastically taking up "my cause" in the eleventh hour; Dr. Theresa Enos for accepting me into one of the finest rhetoric departments in the country and supporting my unusual pursuits; Sheila Savadkohi and Karoleen Wilsey for always helping me make those important deadlines; the 1998 MMEP participants for their passion for learning; Dr. Linda Don for her support and enthusiasm surrounding my participation in MMEP; my friend Dr. Jerry Rintala for his encouragement; my dear friends Steve and Matt Jones, who love me with or without this degree - thanks, guys, for everything you have done for me; my friends and colleagues at Scottsdale Community College for supporting me in my educational and professional endeavors; my children, T.C., Nick and Jaxi, for reminding me what is really important in life; my mother, Berthalene Lairson, for sticking it out with me until the very end; my "bestest" friend in the whole wide world, Lynette Crockett - thanks for nagging me to get this done; my amiga bonita, Taime Bengochea - thanks for believing in me, even when I did not; and God for His amazing grace in my life.
DEDICATION
This work is dedicated to my mother, Berthalene
Lairson. Without her support and uncompromising faith in
my abilities, this work would not have been completed.
Thanks, Mom!
I would also like to dedicate the fulfillment of this
writing to the memory of Rhuby MacElvee. You will forever
be missed in my life.
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
CHAPTER I: INTRODUCTION
Statement of the Problem
Limitations of the Study
Sample
Sample size
Time
Interreader/interrater reliability
CHAPTER II: REVIEW OF RELATED LITERATURE
CHAPTER III: DESIGN OF THE STUDY
Statement of the Hypothesis
Method
Subjects
Instrument
Medical College Admission Test
Writing Sample
Writing skills assessed by the Writing Sample
Writing Sample scoring rubric
Holistic scoring procedure
Procedure
CHAPTER IV: ANALYSIS OF DATA
Analysis of the Relationship
Strength of the Relationship
Nature of Relationship
CHAPTER V: CONCLUSION, IMPLICATIONS, AND RECOMMENDATIONS FOR FURTHER RESEARCH
Conclusion
Implications
Recommendations
APPENDIX A, SCORE POINT DESCRIPTIONS
APPENDIX B, INTERPRETATION OF LETTER SCORES
REFERENCES
LIST OF FIGURES
FIGURE 1, Experimental Design
LIST OF TABLES
TABLE 1, Rank Sums and U for Experimental and Control Groups
TABLE 2, Positive and Negative Rank Sums Differences and T for Experimental and Control Groups
ABSTRACT
The purpose of this study was to determine whether a
relationship exists between the knowledge and application
of a writing scoring rubric and writing performance.
Participants in a Minority Medical Education Program were
given intense instruction in the use of the Medical
College Admission Test Writing Sample scoring rubric.
Scores from the participants' pretest and posttest were
compared.
CHAPTER I:
INTRODUCTION
In 1876, the Association of American Medical Colleges
(AAMC), a non-profit association, was founded to address
reform in medical education. In approximately the last 20
years, one of its goals has been to increase the number of
underrepresented minority students (URMS) in medical
education. Consequently, the Minority Medical Education
Program (MMEP), held each summer at one of eleven
medical school sites around the country, has been offered
to qualified participants for more than a decade
(Association of American Medical Colleges, 1995-2002a,
1995-2002e).
As an academic enrichment program, one of the many
goals of MMEP is to provide the type of experiences that
will help URMS not only to appreciate the relationship
between diverse cultures and the medical profession, but
also to help students gain admission to medical schools.
One of the focal points of MMEP then is instruction and
preparation for the Medical College Admission Test (MCAT),
which, along with undergraduate grades, is used to predict
performance in medical school courses and is used by
medical college admissions committees when evaluating
applicants for admissions (Association of American Medical
Colleges, 1995, 1995-2002f; Jones, 1983; Koenig & Mitchell,
1993; Koenig & Wiley, 1996; Lynch, 1990).
During MMEP, URMS take two full-length MCAT practice
examinations. Results of the first examination (pretest)
are returned to the URMS, and are then used by the MMEP
instructors to focus instruction in physics, chemistry,
biology, biochemistry, verbal reasoning and writing. URMS
also work on study skills, test-taking skills, time
management, and stress management. The preparation for the
MCAT concludes with the taking and analysis of the second
MCAT practice examination (posttest) (Association of
American Medical Colleges, 1995-2002e).
It has been documented that participants in MMEP
demonstrate score gains on the MCAT, but whether or not
those gains are due to the instruction alone or to a
combination of other factors has not been thoroughly
investigated (Jackson et al., 1985). Various preparation
programs for standardized tests have been shown to increase
examinees' performances (Greenlee, 1996; Seaton, 1992).
Although it has been observed that high-stakes admissions
tests such as the MCAT are less open to test preparation
than general intelligence measures, when these preparation
programs combine a pretest along with instruction, then
examinees' scores show the greatest gains (Witt, 1993). In
particular, commercial coaching courses (e.g., Princeton
Review, Columbia Review, Kaplan) have been found to have
the effect of increasing examinees' scores on the physical
and biological science sections of the MCAT (Jones, 1986).
Nevertheless, the AAMC reports that approximately 63% of
MMEP participants who successfully complete the summer
program are accepted to medical school (Association of
American Medical Colleges, 1995-2002d; Jolly, 1992).
The current MCAT contains four subtests: physical
sciences, biological sciences, verbal reasoning and writing
sample. However, the writing sample portion of the
examination did not become an integral part of the MCAT
until 1991 (Future Doctor Net, 2001).
As early as 1973, the Medical College Admission
Assessment Program recommended that a test of written
communication skills be considered along with physical and
biological science knowledge. This was in part due to the
emphasis in the 1970s on the need for better communication
between doctors and their patients. In fact, it has been
documented that writing sample scores are closely
associated with clinical competence (Hojat et al., 2000).
Early in the 1980s, medical school deans began
discussing the importance of writing and analytical skills
for applicants. In fact, many medical school deans reported
inadequate writing and communication skills among their
medical students. As a result, the medical school
application process was revised so as to require an
assessment of writing and communication skills from applicants
(Future Doctor Net, 2001; Koenig & Mitchell, 1993; Swanson
& Mitchell, 1989).
Several medical schools in the 1980s were already
collecting writing samples at the time of application;
however, there was no standardized instrument for assessing
the writing and analytical skills of medical school
applicants (Future Doctor, 2001). Consequently, the AAMC
piloted the writing sample in the spring and fall of 1985
and 1986 (Holden, 1989; Koenig & Mitchell, 1993).
The pilot was extended through 1990 in order for
developers to insure acceptable levels of reliability and
to produce a number of comparable test forms or prompts
(see Chapter III for a description of the prompts) (Koenig,
1995). Researchers experimented with various versions of
the writing sample, from two 30-minute prompts to a
combination of one 45-minute plus one 15-minute prompt.
Although the 30-minute prompt often produced responses with
unsophisticated arguments, this format was adopted
(Mitchell, 1987). The two 30-minute prompts required
examinees to develop and present ideas in a cohesive manner
in order to provide medical school admission committees
evidence of applicants' writing and analytical skills
(Future Doctor, 2001; Koenig & Mitchell, 1993).
Since the MCAT Writing Sample elicits responses from
examinees that are not connected directly to physical or
biological science content, the only focus for MMEP
instructors is on developing the URMS' test-taking
strategies (Ornstein, 1993). It has been documented that
minority students and/or students from culturally different
backgrounds benefit substantially from instruction in test-
taking strategies (Johns & VanLeirsburg, 1992; Scruggs &
Mastropieri, 1995; Thurmond, 1986).
For the writing sample, the strategies for test-taking
skills are independent of content knowledge and require
URMS to become "test-wise" (Wheeler & Haertel, 1993). The
term "test-wise" can be defined as the ability to use
knowledge about a specific test format, including the
scoring criteria, to make the most of one's performance
(Scruggs & Mastropieri, 1995). In other words, MMEP
instructors must address those instructional activities
that require URMS to evaluate their own and each other's
writing in the context of the standardized criteria set
forth by the MCAT Writing Sample.
Statement of the Problem
The University of Arizona is one of two program sites
for the Western Consortium of Medical Colleges (the
University of Washington is the other) that URMS may
select for their MMEP summer experience.
The University, located in Tucson, Arizona, began
conducting the MMEP in 1994. Since the University is the
only publicly funded medical school in Arizona, the Office
of Minority Affairs strives to serve the needs of the
State's diverse minority population, which ranges from
Native Americans to recent citizens from South American
countries (Arizona Health Sciences Center, n.d.).
The purpose of this study was to determine whether the
knowledge and application of a writing scoring rubric was
related to writing performance.
Limitations of the Study
Sample. The 1998 University MMEP participants were
pre-selected into the program based on particular criteria
set down by the AAMC. These criteria required the
participant to:
• be a U.S. citizen or hold a permanent resident visa
• show evidence of completing at least one year of
college
• have an overall grade point average (4.0 scale) of at
least 3.00, with at least a 2.75 in the sciences
• have a SAT score of at least 950 or an ACT score of at
least 20
• be a member of an underrepresented minority group
(African-American, Mexican-American, Mainland Puerto
Rican, or American Indian/Alaska Native). (Association
of American Medical Colleges, 1995-2002f)
Since the researcher had no part in the decision as to
the inclusion or exclusion of the University MMEP
participants, the 46 participants were not confirmed as a
truly random representative sample of the URMS population.
Sample size. Due to the intense nature of MMEP and
limited facilities, participants in the program are limited
to no more than 50 - 60 per summer. During the summer of
1998, when this investigation was conducted, 46
participants were selected to participate. Because the
research called for a pretest-posttest control group
design, the number of participants in the experimental
group was relatively small (n < 30).
Time. The amount of time MMEP participants were
involved in the program was limited to six weeks.
Furthermore, these six weeks were not devoted entirely to
writing instruction, but included instruction in the
physical sciences, biological sciences, and verbal
reasoning. Participants also took part in other enrichment
activities, such as clinical rotations, guest speakers, and
a number of other professional and career exploration
activities. Therefore, the amount of time on task (seven
hours total) devoted to writing instruction during the
program was extremely limited. Due to the nature of the
design, participants were exposed only for a short period
of time to the writing instruction in question - the
knowledge and application of the MCAT Writing Sample rubric
- before being reassessed.
Interreader/interrater reliability. It has been argued
that a standardized test of writing has many inherent
problems, such as the validity and reliability of the
rubric (Mabry, 1999). During the 1985 spring and fall pilot
test of the MCAT Writing Sample, interreader/interrater
reliability was estimated at .84. Of the 20 readers used
in this pilot, 16 had been trained using White's
(1994) holistic scoring method (Mitchell, 1986a, 1986b).
However, the norming sessions for the University MMEP
staff/readers, who were not experienced writing teachers,
were conducted in half the time and only on 5% of the
pre-MCAT Writing Sample responses. The degree of
interreader reliability was not calculated, which, some
reviewers may argue, limits the interpretation of the results.
CHAPTER II:
REVIEW OF RELATED LITERATURE
Providing students with background information and the
structure of the scoring of an assessment has been shown to
increase student scores in mathematics according to Fuchs
et al. (2000). The same procedure has been shown to affect
writing.
Recent composition theory and research, as reported by
Soles (2001), suggests that instructors should share
scoring rubrics with students, as there appear to be many
benefits to both teachers and students when students write
to the context of the rubrics. According to Soles (2001),
the knowledge and application of rubrics to writing results
in more active participation among students; furthermore,
when students become familiar with the scoring rubrics,
their writing anxiety often decreases and their awareness
of audience is improved (see also Wyngaard & Gehrke, 1996).
In a study of 303 8th graders, Andrade (1999) found
that the students' knowledge and application of the scoring
rubrics to their own writing actually helped them write
better. However, in a related study where students used
the rubrics as a self-assessment tool only, Andrade (1999)
did not report improvements in student writing. This would
suggest that the rubric must be used in an instructional
manner and not simply provided to students as a self-
assessment tool.
Bergdahl (1999) reports having students
collaboratively create scoring rubrics and guides for
papers in composition courses. Bergdahl (1999) argues that
students' attention to the rubrics caused them to be more
focused on the goals of the papers. In addition, it
allowed students to have a clearer notion of what areas
they needed to improve.
Skillings and Ferrell (2000) suggest that in some
cases when teachers and students work together to generate
specific criteria in the form of a rubric, it can increase
critical thinking (see also Use Rubrics and Reach All
Learners, 1998). For example, Duke and Sanchez (1994) report
how teachers and 10th grade students in Pennsylvania
developed a six-point holistic rubric to be used in grades
6 to 9. Students used the rubric to evaluate their own and
others' papers. The project recorded an improvement in
student writing.
Another study by Schirmer and Bailey (2000) reported
improved writing skills in two classes of deaf students who
participated in the creation of writing assessment rubrics
with their teachers. Students then used the rubrics to
guide them in their writing and revising.
Harrington, Holik and Hurt (1998) report an
improvement in student writing as documented in a research
project targeting 5th graders. The project first identified
poor student writing through a collection of student
writing samples. The researchers obtained the advice of
writing experts as to what instruction might be used in
order to address the students' lack of skills. One
category of intervention that the experts suggested was the
use of rubrics in the planning and revising of student
writing. The project documented an improvement in student
writing, as well as a marked increase in students'
enjoyment of writing.
It has been noted by Wilson and Onwuegbuzie (1999) that,
without an integration of a rubric into writing
instruction, writing responses have the potential of
failing to meet the expectations of instructors.
Furthermore, students may become frustrated with a lack of
clear criteria. Rubrics are beneficial, not only because
they break the writing into manageable tasks but also, as
Shaugnessy (1994) points out in reference to standardized
tests, because they provide clear criteria of writing
skills to be judged.
The use of rubrics specifies not only the criteria but
the performance levels as well. Montgomery (2000) claims
this allows the teaching of concepts rather than content,
which is particularly suited to instruction for the MCAT
Writing Sample.
Rubrics, as Andrade (2000) notes, show a gradation of
quality. The six levels of the MCAT Writing Sample scoring
rubric clearly state the expectations of the Medical
College Admission Assessment Program and the AAMC (see
Appendix A), making the rubric ideal for use in
instruction. The rubric is an equally effective form of
feedback on the URMS' writing samples.
Based on the available literature about the
instructional uses of writing scoring rubrics to improve
writing performance, it is apparent that a typical
population of MMEP participants (or other populations of
2nd-year+ college students) has not been the subject of similar
research. One reason for this may lie in the perception of
the role writing instruction has in an undergraduate
college degree program.
Two semesters of freshman composition and one
upper-division writing course are the typical requirements
for an undergraduate degree from a four-year college or
university. In addition to these courses, many
institutions require students to pass some sort of writing
proficiency exam before their junior year in a
baccalaureate program. For students who do not succeed at
freshman composition or pass proficiency exams, there are
courses meant to provide "remedial" instruction in writing.
As a result, the concern for improving writing performance
among college juniors or seniors is delegated to those
remedial courses. And, as White (1994) points out, the
label "remedial" is seldom used with these courses since
colleges and universities do not want the stigma of having
accepted and enrolled "struggling" writers at their
institution of higher education.
The participants in MMEP have not only completed at
least one year of college (including freshman composition),
but also tend to be academic achievers (see the
requirements for MMEP admissions in Chapter I). As a result,
literature addressing the use of writing scoring rubrics to
improve writing performance is limited to populations of
students who are in need of improved performance in the
classroom rather than on standardized examinations such as
the MCAT.
CHAPTER III:
DESIGN OF THE STUDY
The design used in this study was a pretest-posttest
control group design (Figure 1). An individual's
performance on the MCAT Writing Sample has the possibility
of being related to a number of variables. A "shotgun"
approach to investigating all possible and conceivable
variables, which may or may not be related to writing
performance, would be expensive, inefficient and time
consuming. Based on deductive reasoning from the
literature (Andrade, 1999, 2000; Bergdahl, 1999; Fuchs et
al., 2000; Harrington, Holik & Hurt, 1998; Schirmer &
Bailey, 2000; Soles, 2001), the knowledge and application
of the MCAT Writing Sample six-point scoring rubric within
the method described by White (1994) was considered a good
candidate for investigation as one of the many possible
variables related to writing performance.
Therefore, the design was intended to indicate the
relationship between the knowledge and application of the
MCAT Writing Sample six-point scoring rubric and MCAT
Writing Sample scores (see Figure 1).
Group            Assignment   n      Treatment         Midtest

1 Experimental   random       21     Writing           MCAT Writing
                                     Instruction       Sample

2 Control        random       23(a)  Verbal Reasoning  MCAT Writing
                                     Instruction       Sample

Figure 1. Experimental Design.
(a) Note. Two subjects in Group 2 did not participate in the midtest.
Statement of the Hypothesis
It was hypothesized that there is a relationship
between the knowledge and application of a writing scoring
rubric to writing performance.
Method
In order to determine whether or not the knowledge and
application of the MCAT Writing Sample scoring rubric to
their own and others' writing was related to their
performance on the writing sample portion of the test,
participants in the University's 1998 MMEP were introduced
to the rubric and holistic scoring procedures as described
by White (1994).
Subjects
The description of the population of the University's
1998 MMEP participants was inferred from the "Summary Data
on the April 1997 MCAT by Gender, Racial/Ethnic Group, Age,
Language Status, Undergraduate Major, Testing History,
State of Legal Residence" document prepared by the
Association of American Medical Colleges (n.d.). That
population was comprised of 46% female, 53% male, and 11%
underrepresented minority (African-American/Black, Mexican-
American/Chicano/Puerto Rico, and Native American/Native
Alaskan/Native Hawaiian) with 24% of the test takers
representing other minorities (Commonwealth Puerto Rico,
Asian, and other Hispanics). As a total, minority test
takers represented 34%. The median Writing Sample score
for the underrepresented minority group was a scaled score
of N. The median Writing Sample score for all 1997 test
takers was a scaled score of O (see Appendix B for a
description of the scaled scores).
The 46 participants in the University's 1998 MMEP were
preselected using the criteria established by the
Association of American Medical Colleges (Association of
American Medical Colleges, 1995-2002f). The group consisted
of 65% female, 35% male with 67% identified as
underrepresented minorities (African-American/Black,
Mexican-American/Chicano/Puerto Rico, and Native
American/Native Alaskan/Native Hawaiian), and 15%
representing other minorities (Commonwealth Puerto Rico,
Asian, and other Hispanics). There were a number of
participants (15%) who did not indicate a specific ethnic
group, but rather chose more than one as representative of
their ethnic identity.
These 46 participants were randomly assigned (using a
table of random numbers) to two groups of 23 each. Because
of the scheduling constraints of the six-week MMEP program,
the experimental group was further divided (randomly) into
two subgroups (subgroups 1 and 2). The experimental group
was then instructed in the use of the writing scoring
rubric on different days; nevertheless, the two subgroups
of the experimental group received the same instruction.
Before instruction began, an adjustment in group assignment
was made to accommodate two students and resulted in Group
1 with n = 21 and Group 2 with n = 25.
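The assignment procedure can be illustrated in software. The sketch below is only an analogue, assuming anonymized participant identifiers and an arbitrary seed; the study itself drew on a printed table of random numbers rather than a program.

```python
# Illustrative analogue of the random assignment described above.
# The study used a table of random numbers; the participant IDs and
# the fixed seed here are hypothetical.
import random

participants = [f"P{i:02d}" for i in range(1, 47)]  # 46 participants

rng = random.Random(1998)  # arbitrary seed, for a repeatable example
rng.shuffle(participants)

group_1 = participants[:23]  # experimental: writing sample instruction
group_2 = participants[23:]  # control: verbal reasoning instruction

# Group 1 was further split into two subgroups for scheduling; the even
# split below is illustrative (the final subgroups held 9 and 12
# participants after the enrollment adjustment noted above).
subgroup_1, subgroup_2 = group_1[:11], group_1[11:]
```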
Instrument
The University's Office of Minority Affairs received
copies of the Association of American Medical Colleges'
publications MCAT Practice Test I, MCAT Practice Test II,
and MCAT Practice Test III with Solutions Booklet. Each of
the tests contained an approved MCAT Writing Sample
portion, and these practice tests were used as the
measuring instruments.
Medical College Admission Test. Three quarters of
the MCAT is a standardized, multiple-choice examination
designed to assist admission committees in predicting which
of their applicants will perform adequately in the medical
school curriculum. The test assesses problem solving,
critical thinking, and writing skills in addition to the
examinee's knowledge of science concepts and principles
prerequisite to the study of medicine. The MCAT is scored
in each of the following areas: Verbal Reasoning, Physical
Sciences, Writing Sample, and Biological Sciences
(Association of American Medical Colleges, 1995-2002b).
Writing Sample. Each Writing Sample prompt requires an
expository response to a specific topic selected from areas
of general interest, such as business, politics, history,
art, education, or ethics. Topics do not pertain to the
content of the physical or biological sciences, to the
medical school application process, or to reasons for the
choice of medicine as a career. Specific prior knowledge
about the topic is not necessary to complete the Writing
Sample (Association of American Medical Colleges, 1995).
The prompts consist of a "topic statement" (e.g.,
"Honesty is the best policy.") followed by instructions for
three writing tasks. The first task is to explain or
interpret the topic statement. The first portion of the
instructions is the same for all prompts: "Write a unified
essay in which you perform the following tasks. Explain
what you think the above statement means." The
instructions for the second and third writing tasks vary
across prompts and are printed immediately beneath each
topic statement. In general, the second task requires
consideration of a circumstance in which the statement
might be contradicted or judged not applicable. Examinees
must present a specific example that illustrates a
viewpoint opposite to the one presented in the topic
statement. The third task requires discussion of ways in
which the conflict between the initial statement and its
opposition (expressed in the second writing task) might be
resolved. Here, examinees must reconcile the two
viewpoints. (Association of American Medical Colleges,
1995, 1995-2002c; Future Doctor Net, 2001).
Writing skills assessed by the Writing Sample. The
MCAT Writing Sample consists of two 30-minute essays
designed to assess skills in the following areas:
• Developing a central idea
• Synthesizing concepts and ideas
• Presenting ideas cohesively and logically
• Writing clearly
• Following accepted practices of grammar, syntax,
and punctuation consistent with timed, first-
draft composition. (Association of American
Medical Colleges, 1995, p. 65)
Writing Sample scoring rubric. The scoring rubric for
the MCAT Writing Sample, like any rubric, is an assessment
tool that outlines the specific expectations and criteria
examinees' samples are meant to show (Taggart, Phifer,
Nixon, & Wood, 1998). Since rubrics are used in the judgment
of quality, rather than content (Moskal, 2000), writing
samples are scored holistically. Whereas some methods for
scoring essays assign several scores to a single piece of
writing (e.g., separate scores for organization,
development, grammar and mechanics), holistic scoring
regards an essay as a whole without separable aspects. This
type of scoring is based on the assumption that the various
factors involved in writing are so closely interrelated
that an essay should be assigned a single score based on
the quality of writing as a whole. See Appendix A for the
MCAT Writing Sample scoring rubric, which has been
reproduced from the MCAT Student Manual publication.
Holistic scoring procedure. White (1994) established
this method of holistic scoring for the MCAT during the
1980s pilot (Holden, 1989; Koenig, 1995; Koenig &
Mitchell, 1993; Mitchell, 1986a, 1986b). The key to
White's method is the establishment of what he terms the
"discourse community." This community is comprised of the
readers/scorers/graders of writing. In order to attain
acceptable levels of interreader/interrater reliability, it
is important that all members of this community use common
criteria in the form of a holistic scoring rubric.
The group of readers is led by a "chief" reader or
leader. A random sample of writing products is drawn from
all available products (usually 20% of the total
available). The group of readers is then led by the chief
reader in a norming process. It is this process that
creates the discourse community, where all readers are in
agreement on the interpretation of the rubric's score point
descriptions. During this process, if a reader
consistently scores out of the norm, as represented by the
group, then the group scrutinizes his or her scoring
decisions, and the individual is persuaded to change in
order to conform to the group's interpretation of the
rubric.
Once the norming process is complete and the leader is
assured of the consistency in scoring among readers, the
remainder of the writing products is randomly distributed
to the group. Each writing product is then read twice by
two different readers. Scores from the first reading are
kept concealed from those who complete the second reading.
If a sample receives two scores with more than a one-point
span (e.g., 3 and 5), the sample is read and scored by a
third reader in order to reconcile the point discrepancy.
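The reading rules just described can be summarized in a short sketch, given below with hypothetical scores. The text does not specify how a third reading is reconciled with the first two, so discrepant pairs are merely flagged here; the 4-24 sum reflects the scoring used later in this study (see the Procedure section).

```python
# Sketch of the double-reading rule and the summed writing sample score.
# Scores are hypothetical; the reconciliation rule for a third reading
# is not detailed in the text, so discrepant pairs only raise an error.

def needs_third_reading(first: int, second: int) -> bool:
    """Two independent 1-6 readings more than one point apart
    (e.g., 3 and 5) are sent to a third reader."""
    return abs(first - second) > 1

def writing_sample_score(prompt_1, prompt_2):
    """Sum the four 1-6 readings (two per prompt) into one 4-24 score."""
    for pair in (prompt_1, prompt_2):
        if needs_third_reading(*pair):
            raise ValueError(f"readings {pair} require a third reader")
    return sum(prompt_1) + sum(prompt_2)

print(writing_sample_score((4, 5), (3, 4)))  # -> 16
```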
Procedure
During the first session of the University's 1998
MMEP, the 46 preselected participants were randomly
assigned to two groups of 23 each. The experimental group
was further divided (randomly) into two subgroups that were
assigned to meet for writing sample instruction on Monday
and Wednesday afternoons, while the other group was
assigned to meet for verbal reasoning instruction on
Tuesday and Thursday afternoons.
During the second session, all 46 participants
completed a full MCAT practice test (MCAT Practice Test I).
Each participant completed two Writing Sample prompts,
resulting in a total of 92 writing artifacts to be scored
by the trained MMEP readers. The first scoring session on
the pretest resulted in four scores for each participant
(two scores for the first sample prompt and two
scores for the second sample prompt). These scores were
then added together to give one overall writing sample
score for each participant ranging from 4-24.
The study was designed to last the full six weeks of
the MMEP. While Group 2 completed 1 hour and 45 minutes of
verbal reasoning instruction, Group 1, divided into
subgroups 1 and 2 (9 and 12 participants each),
participated in 1 hour and 20 minutes of writing sample
instruction.
The procedure of the writing sample instruction was as
follows:
1. During the first writing sample session, students were
provided with an example of the MCAT Writing Sample
and six-point scoring rubric available in the MCAT
Student Manual publication through the Association of
American Medical Colleges. The rubric and holistic
scoring procedures were discussed. Five randomly
chosen responses written by participants in the
alternate subgroup were photocopied for each subgroup
member. Participants were introduced to White's
(1994) holistic scoring method. They then read,
scored and normed one of the five responses. A
discussion followed. During the second session, the
remaining four responses were scored and normed. A
lengthy discussion followed on the interpretation of
the prompts, as well as the application of the rubric.
2. During the third session, participants were given two
writing sample prompts and 30 minutes each to complete
them. Responses were photocopied, and five randomly
chosen responses from each subgroup were given to the
counter subgroup for scoring and grading.
3. During the fourth session, participants repeated the
scoring and norming process they had learned in
sessions one and two. Again, a lengthy discussion
followed.
4. During the fifth session, participants received their
scored responses from the alternate group. This
session consisted of participants' working in groups
of three, discussing and analyzing their own and
others' responses in light of the scoring rubric.
In the middle of the third week of the MMEP, and after
the fifth session of writing sample instruction for Group
1, all 46 participants were given a partial MCAT consisting
of a Verbal Reasoning section and a Writing Sample section
(from MCAT Practice Test II); this represented the
"posttest," but is labeled as the "midtest" since it was
administered midway through the program, and a full length
MCAT would be given at the conclusion of MMEP as a
posttest. The written responses were scored using the same
group of MMEP trained readers. Data on writing sample
scores for the pretest and midtest were then compared.
CHAPTER IV:
ANALYSIS OF DATA
The first question to be considered before testing the
hypothesis was whether or not Group 1 (experimental) showed
equivalent performance to Group 2 (control) on the writing
pretest. Given that the MMEP participants were randomly
assigned to each group, there should have been no
difference in writing sample scores on the pretest between
the two groups. Then, if the instruction in the
application of the scoring rubric were related to the
performance on the MCAT Writing Sample, and Group 1 showed
significant differences in scores on the midtest, the null
hypothesis (instruction had no relationship to writing
performance) could be rejected in favor of the alternative
hypothesis (the instruction in the application of a scoring
rubric is related to writing performance).
The dependent variable, writing sample score, was
obtained by summing the four ordinal rank scores each
participant received from the holistic scoring sessions
conducted by the MMEP readers - two scores from the first
response on the pretest and two scores from the second
response on the pretest. Each of the four ordinal rank
scores ranged from 1-6, representing the six-point scoring
rubric. When all four ordinal rank scores were summed,
the scores ranged from 4-24. The independent variable,
instruction and application of the rubric, was nominal and
measured on two levels - 0 and 1.
As indicated, prior to the instruction in the
application of the MCAT Writing Sample scoring rubric to
their own and others' writing, all 46 participants
completed a pretest - a full practice MCAT, which included
two 30-minute Writing Sample prompts. A Mann-Whitney U
test was used to verify equivalency between Group 1
(experimental) and Group 2 (control). Examination of the
rank sums and results of the Mann-Whitney U test (alpha =
.05) indicated essentially no difference between the two
groups in their performance on the pretest (see Table 1).
Therefore, before any of the participants received
instruction in the application of the scoring rubrics to
their own and others' writing, the groups were established
as equivalent in their performance on the pretest.
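As an aside for readers who wish to reproduce this kind of equivalence check, a minimal sketch with SciPy follows. The score vectors are invented, since the study's raw 4-24 sums are not reported here.

```python
# Hypothetical 4-24 writing sample sums; the study's raw data are not
# reported, so these values are invented for illustration only.
from scipy.stats import mannwhitneyu

experimental = [12, 9, 14, 11, 10, 13, 8, 12, 11, 15,
                9, 10, 13, 12, 11, 10, 14, 9, 12, 11, 10]      # n = 21
control = [11, 13, 10, 12, 9, 14, 12, 11, 13, 10, 15, 12, 9,
           11, 13, 10, 12, 14, 11, 9, 13, 12, 10, 11, 12]     # n = 25

u, p = mannwhitneyu(experimental, control, alternative="two-sided")
# A p-value above alpha = .05 gives no reason to treat the groups as
# different on the pretest, i.e., they may be taken as equivalent.
print(f"U = {u}, p = {p:.3f}")
```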
Analysis of the Relationship
After Group 1 (experimental) received the writing
instruction, the participants in both groups completed
another MCAT Writing Sample examination. Like the pretest,
the responses were scored by the same MMEP readers using
Table 1

Rank Sums and U for Experimental and Control Groups

                     Group
Test           Experimental(a)   Control(b)    U
Pretest
  Rank Sums         496             585       127*

Note. Maximum score for both the pretest and midtest = 24.
(a) n = 21. (b) n = 25.
the same method as discussed earlier. A Wilcoxon Match-
Pairs Signed Ranks test was used as an alternative to a
correlated groups t test since the dependent variable,
writing sample score, for Group 1 (experimental) could form
rank differences between the pretest and midtest; the
independent variable, instruction, was within-subjects and
had only two levels (0, 1).
As Table 2 indicates, the results of the Wilcoxon
Match-Pairs Signed Ranks test indicate that the scores for
the pretest and midtest for Group 1 (experimental) were
significantly different (alpha = .05). The scores for the
pretest and midtest for the control group were not
significantly different, as would be expected, since that
group did not receive instruction in the application of the
MCAT Writing Sample scoring rubric. This would indicate
that the original hypothesis that the "instruction in the
application of a scoring rubric was related to writing
performance" was supported.
In isolation, these results suggest that the
instruction the participants in the experimental group
received is related to their writing sample
score. In fact, numerous participants in the experimental
group commented that in the end, they felt like they knew
what was expected of them and how to approach the MCAT
Writing Sample portion of the examination. It is possible
that a test-retest factor may contribute to the gains in
Group 1's midtest scores. It has been documented that even
within a year's time, nearly all of those who repeat the
MCAT show some improvement in all subtests (Land, 1976).
Strength of the Relationship
In order to determine the strength of the
relationship, a matched-pairs rank biserial correlation
coefficient was calculated. The correlation coefficient
(rc = 0.65) indicates a relationship between the
instruction in the application of the MCAT Writing Sample
scoring rubric and writing sample score. Therefore, the
original hypothesis that the "instruction in the
application of a scoring rubric was related to writing
performance" was supported.
Table 2

Positive and Negative Rank Sums Differences and T for
Experimental and Control Groups

                       Group
Test               Experimental(a)   Control(b)
Positive Rank Sum       191            185.5
Negative Rank Sum        40             92.5
T                        40.00*         92.5*

Note. Maximum score for both the pretest and midtest = 24.
(a) n = 21. (b) n = 25 (pretest), n = 23 (midtest).
Nature of Relationship
Since the rank sum for the negative differences of the
Wilcoxon Match-Pairs Signed Ranks test for Group 1
(experimental) is less than the rank sum for the positive
differences (R- = 40 < R+ = 191), it can be concluded that the
gains observed in the experimental group are related to
the instruction the participants received.
Theoretically, this study suggests, as does the other
research, that the knowledge and application of any scoring
rubric for writing may have a positive relationship to
student writing performance. Certainly, it can be argued
that the implementation of the rubrics creates a stronger
link between learning, teaching and assessment (Soles,
2001).
CHAPTER V:
CONCLUSION, IMPLICATIONS, AND
RECOMMENDATIONS FOR FURTHER RESEARCH
The following conclusion and implications are made
within the context of this particular study. Any certainty
associated with the following statements must be understood
within the limitations of this study as detailed in the
introduction.
Conclusion
This study indicates that the instruction in the
application of a writing scoring rubric to one's own and
others' writing is related to writing performance.
Implications
1. In practice, those who prepare students for standardized
writing examinations might use this knowledge to better
prepare examinees.
2. Teachers of writing should not keep their scoring
criteria from students, and they should develop those
criteria into a rubric for students to apply to their
peers' and their own writing performances.
3. Additionally, students preparing for standardized writing
examinations should familiarize themselves with the
examination's scoring rubric and scoring practices.
4. If student writers become the "scorers" of their own and
their peers' writing, it may lead to a more developed
critical stance about their own writing skills as well as
their ability to fulfill the requirements of other
writing situations.
Recommendations for Further Research
Due to the limitations of this
particular study, there is a need for a controlled
experimental study that investigates the effect of rubric
knowledge and application on writing performance. In
particular, attention needs to be focused on the holistic
scoring procedures in order to assure consistency in the
data. Furthermore, the time on task needs to be increased
to assure sufficient instruction time. Finally, other
variables (such as primary language, previous writing
instruction, and test-retest effect) must be considered and
controlled. Not only would such a study establish the
effect or non-effect of this knowledge, but it would also
test whether the knowledge and application of scoring
rubrics result in better writing performance.
APPENDIX A
SCORE POINT DESCRIPTIONS
6 = These papers show clarity, depth, and complexity of thought. The treatment of the writing assignment is focused and coherent. Major ideas are substantially developed. A facility with language is evident.
5 = These essays show clarity of thought, with some depth or complexity. The treatment of the rhetorical assignment is generally focused and coherent. Major ideas are well developed. A strong control of language is evident.
4 = These essays show clarity of thought and may show evidence of depth or complexity. The treatment of the writing assignment is coherent, with some focus. Major ideas are adequately developed. An adequate control of language is evident.
3 = These essays may show some problems with clarity or complexity of thought. The treatment of the writing assignment may show problems with integration or coherence. Major ideas may be underdeveloped. There may be numerous errors in mechanics, usage, or sentence structure.
2 = These essays may show some problems with clarity or complexity of thought. The treatment of the writing assignment may show problems with integration or coherence. Major ideas may be underdeveloped. There may be numerous errors in mechanics, usage, or sentence structure.
1 = These essays may demonstrate a lack of understanding of the writing assignment. There may be serious problems with organization. Ideas may not be developed. Or, there may be so many errors in mechanics, usage, or sentence structure that the writer's ideas are difficult to follow.
From: Association of American Medical Colleges. (1995). MCAT Student Manual. Washington, DC, pp.67-68.
APPENDIX B
INTERPRETATION OF LETTER SCORES
Below are the scaled scores (J to T) examinees receive from an official MCAT Writing Sample. Because examinees may attain a given letter score in various ways (i.e., by performing equivalently on both essays or by performing better on one essay than on the other), making inferences about the writing skills associated with a given letter score is not straightforward.
R S T Above Average
These essays respond to the topic and the three writing tasks in a superior manner. The writing demonstrates a strong control of language. The response is presented in a clear, organized, and coherent fashion. Ideas are well developed, and the topic is dealt with at a complex level.
M N O P Q Average
These essays demonstrate a degree of proficiency in discussing the topic and/or responding to the three writing tasks. Few problems in the mechanics of writing are evident, and there is demonstration of control of language. The writing tasks are addressed in a clear, organized, and coherent manner. Ideas are developed to some extent and may show evidence of depth and complexity of thought.
J K L Below Average
These essays demonstrate a degree of difficulty in discussing the topic and/or responding to the three writing tasks. They may show problems with the mechanics of writing or in addressing the topic at a complex level. The response may not be organized, and ideas may not be integrated. Although the three writing tasks may be addressed, the ideas may be underdeveloped.
From: Future Doctor Net, Writing Sample Help. (2001). How the Writing Sample is Scored. Retrieved January 5, 2002, from http://www.futuredoctor.net/
REFERENCES
Andrade, H. G. (1999, April). The Role of Instructional Rubrics and Self-Assessment in Learning to Write: A Smorgasbord of Findings. Paper presented at the Annual Meeting of the American Educational Research Association, Montreal, Quebec, Canada.
Andrade, H. G. (2000). Using Rubrics to Promote Thinking and Learning. Educational Leadership, 57, 13-18.
Arizona Health Sciences Center, Office of Minority Affairs. (n.d.). Minority Medical Education Program (MMEP) - Who Is Eligible. Retrieved September 8, 2001, from http://www.ahsc.arizona.edu/
Association of American Medical Colleges, About the AAMC. (© 1995-2002a). Retrieved September 8, 2001, from http://www.aamc.org/
Association of American Medical Colleges, Applying to Medical School. (© 1995-2002b). About the MCAT. Retrieved September 8, 2001, from http://www.aamc.org/
Association of American Medical Colleges, Applying to Medical School. (© 1995-2002c). MCAT Writing Sample Prompts. Retrieved September 8, 2001, from http://www.aamc.org/
Association of American Medical Colleges. (1995). MCAT Student Manual. Washington, DC.
Association of American Medical Colleges, Minorities in Medicine. (© 1995-2002d). Minority Medical Education Program. Retrieved September 8, 2001, from http://www.aamc.org/
Association of American Medical Colleges, Minorities in Medicine. (© 1995-2002e). MMEP Program Sites: Western Consortium University of Washington / University of Arizona. Retrieved September 8, 2001, from http://www.aamc.org/
Association of American Medical Colleges, Minorities in Medicine. (© 1995-2002f). Who Is Eligible? Retrieved September 8, 2001, from http://www.aamc.org/
Bergdahl, D. (1999). Scoring Guides in the Process-Oriented Composition Class: Having Students Write Their Own Scoring Guides. Exercise Exchange, 45, 21-24.
Duke, C.R., and Sanchez, R. (1994). Giving Students Control over Writing Assessment. English Journal, 83, 47-54.
Fuchs, L.S., Fuchs, D., Karns, K., Hamlett, C.L., Dutka, S., & Katzaroff, M. (2000). The Importance of Providing Background Information on the Structure and Scoring of Performance Assessments. Applied Measurement in Education, 13, 1-34.
Future Doctor Net, Writing Sample Help. (2001). How the Writing Sample is Scored. Retrieved January 5, 2002, from http://www.futuredoctor.net/
Greenlee, C.T. (1996). Passing the Test: Game Plan Developed to Help Athletes Beat the SAT. Black Issues in Higher Education, 13, 13-14.
Harrington, M., Holik, M., & Hurt, P. (1998). Improving Writing through the Use of Varied Strategies. Action Research Project, Saint Xavier University and IRI/Skylight. (ERIC Document Reproduction Service No. ED 420 874).
Holden, C. (1989). MCAT to Stress Thinking, Writing. Science, 243, 1431.
Hojat, M., Erdmann, J.B., Veloski, J.J., Nasca, T.J., Callahan, C.A., Julian, E., & Peck, J. (2000). A Validity Study of the Writing Sample Section of the Medical College Admission Test. Academic Medicine, 75, 25S-27S.
Jackson, E.W. (1985, March). A Comparison of Repeater Performance on the MCAT: Review Course Participants vs. Non-Participants. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL.
Johns, J., & VanLeirsburg, P. (1992). Teaching Test-Wiseness: Can Test Scores of Special Populations Be Improved? Reading Psychology, 13, 99-103.
Jolly, P. (1992). Academic Achievement and Acceptance Rates of Underrepresented-Minority Applicants to Medical School. Academic Medicine, 67, 765-769.
Jones, R.F. (1986). The Effect of Commercial Coaching Courses on Performance on the MCAT. Journal of Medical Education, 61, 273-84.
Jones, R.F. (1983). The Relationship between MCAT Science Scores and Undergraduate Science GPA. Journal of Medical Education, 58, 908-11.
Kachigan, S.K. (1986). Statistical Analysis: An Interdisciplinary Introduction to Univariate and Multivariate Methods. New York: Radius Press.
Koenig, J.A. (1995, April). Examination of the Comparability of MCAT Writing Sample Test Forms. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
Koenig, J.A., & Mitchell, K.J. (1993). Use of Medical College Admission Test Writing Sample Data in Medical School Admissions Decisions. The Advisor, 13, 13-15.
Koenig, J.A., & Wiley, A. (1996). The Validity of the Medical College Admission Test for Predicting Performance in the First Two Years of Medical School. Academic Medicine, 71, S83-S85.
Land, M.S. (1976). Cross-Year Comparison of Test-Retest MCAT Performance. Journal of Medical Education, 51, 532-86.
Lynch, K.B. (1990). The Relationship of Minority Students' MCAT Scores and Grade Point Averages to Their Acceptance into Medical School. Academic Medicine, 65, 480-82.
Mabry, L. (1999). Writing to the Rubric: Lingering Effects of Traditional Standardized Testing on Direct Writing Assessment. Phi Delta Kappan, 80, 673-80.
Mitchell, K.J. (1987, April). Estimation of Interrater and Parallel Forms Reliability for the MCAT Essay. Paper presented at the Annual Meeting of the American Educational Research Association, Washington, DC.
Mitchell, K.J. (1986a). Factor Structure of the MCAT and Pilot Essay. Educational and Psychological Measurement, 46, 1019-27.
Mitchell, K.J. (1986b, August). Reliability of Holistic Scoring for the 1985 MCAT Essays. Educational and Psychological Measurement, 46, 771-75.
Moskal, B.M. (2000). Scoring Rubrics Part I: What and When. (ERIC Document Reproduction Service No. ED 446 110).
Montgomery, K. (2000). Classroom Rubrics: Systematizing What Teachers Do Naturally. Clearing House, 76, 324-28.
Ornstein, A.C. (1993). Coaching, Testing, and College Admission Scores. NASSP Bulletin, 11, 12-19.
Schirmer, B.R., & Bailey, J. (2000). Writing Assessment Rubric: An Instructional Approach with Struggling Writers. Teaching Exceptional Children, 33, 52-58.
Scruggs, T.E., & Mastropieri, M.A. (1995). Teaching Test-Taking Skills: Helping Students Show What They Know. Cambridge, MA: Brookline Books.
Seaton, T. (1992). The Effectiveness of Test Preparation Seminars on Performance on Standardized Achievement Tests. (ERIC Document Reproduction Service No. ED 356 233).
Shaugnessy, A. (1994). Teaching to the Test: Sometimes a Good Practice. English Journal, 83, 54-56.
Skillings, M.J., & Ferrell, R. (2000). Student-Generated Rubrics: Bringing Students into the Assessment Process. Reading Teacher, 53, 452-55.
Soles, D. (2001, March). Sharing Scoring Guides. Paper presented at the Annual Meeting of the Conference on College Composition and Communication, Denver, CO.
Swanson, A., & Mitchell, J. (1989). MCAT Responds to Changes in Medical Education and Physician Practice (Medical College Admission Test). The Journal of the American Medical Association, 262, 261-64.
Taggart, G.L., Phifer, S.J., Nixon, J.A., & Wood, M. (Eds.). (1998). Rubrics: A Handbook for Construction and Use. Lancaster, PA: Technomic Publishing Company, Inc.
Thurmond, V.B. (1986). Correlations Between SAT Scores and MCAT Scores of Black Students in a Summer Program. Journal of Medical Education, 61, 640-43.
Use Rubrics and Reach All Learners. (1998). Active Learner: A Foxfire Journal for Teachers, 3, 32-33, 39.
Wheeler, P.H., & Haertel, G.D. (1993). Preparing Students for Testing: Should We Promote Test Wiseness? EREAPA Associates, Livermore, CA. (ERIC Document Reproduction Service No. ED 374 163).
White, E.M. (1994). Teaching and Assessing Writing. 2nd ed. San Francisco: Jossey-Bass.
Wilson, V.A., & Onwuegbuzie, A.J. (1999, November). Improving Achievement and Student Satisfaction through Criteria-Based Evaluation: Checklists and Rubrics in Educational Research Courses. Paper presented at the Annual Meeting of the Mid-South Educational Research Association, Point Clear, AL.
Witt, E.A. (1993, April). Meta-Analysis and the Effects of Coaching for Aptitude Tests. Paper presented at the American Educational Research Association, Atlanta, GA.
Wyngaard, S., & Gehrke, R. (1996). Responding to Audience: Using Rubrics to Teach and Assess Writing. English Journal, 85, 67-70.