The relationship between the application of scoring rubrics and writing performance
Item Type text; Dissertation-Reproduction (electronic)
Authors MacElvee, Cameron
Publisher The University of Arizona.
Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Link to Item http://hdl.handle.net/10150/289775
INFORMATION TO USERS
This manuscript has been reproduced from the microfilm master. UMI films
the text directly from the original or copy submitted. Thus, some thesis and
dissertation copies are in typewriter face, while others may be from any type of
computer printer.
The quality of this reproduction is dependent upon the quality of the
copy submitted. Broken or indistinct print, colored or poor quality illustrations
and photographs, print bleedthrough, substandard margins, and improper
alignment can adversely affect reproduction.
In the unlikely event that the author did not send UMI a complete manuscript
and there are missing pages, these will be noted. Also, if unauthorized
copyright material had to be removed, a note will indicate the deletion.
Oversize materials (e.g., maps, drawings, charts) are reproduced by
sectioning the original, beginning at the upper left-hand corner and continuing
from left to right in equal sections with small overlaps.
Photographs included in the original manuscript have been reproduced
xerographically in this copy. Higher quality 6" x 9" black and white
photographic prints are available for any photographs or illustrations appearing
in this copy for an additional charge. Contact UMI directly to order.
ProQuest Information and Learning, 300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA
800-521-0600
THE RELATIONSHIP BETWEEN THE APPLICATION OF SCORING RUBRICS
AND WRITING PERFORMANCE
by
Cameron Ross MacElvee
Copyright © Cameron Ross MacElvee 2002
A Dissertation Submitted to the Faculty of the
DEPARTMENT OF EDUCATIONAL PSYCHOLOGY
In Partial Fulfillment of the Requirements For the Degree of
DOCTOR OF PHILOSOPHY
In the Graduate College
THE UNIVERSITY OF ARIZONA
UMI Number 3050306
Copyright 2002 by MacElvee, Cameron Ross
All rights reserved.
UMI Microform 3050306
Copyright 2002 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
ProQuest Information and Learning Company 300 North Zeeb Road
P.O. Box 1346 Ann Arbor, MI 48106-1346
THE UNIVERSITY OF ARIZONA
GRADUATE COLLEGE

As members of the Final Examination Committee, we certify that we have read the dissertation prepared by Cameron Ross MacElvee entitled The Relationship between the Application of Scoring Rubrics and Writing Performance and recommend that it be accepted as fulfilling the dissertation requirement for the Degree of Doctor of Philosophy.

Phil Callahan
Chris Johnson

Final approval and acceptance of this dissertation is contingent upon the candidate's submission of the final copy of the dissertation to the Graduate College.

I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement.

Kenneth Smith, Director
STATEMENT BY AUTHOR
This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.
Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgment of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the copyright holder.
ACKNOWLEDGEMENTS
I would like to acknowledge the important contribution of the following individuals: my advisor and director, Dr. Kenneth Smith, for his steadfast endurance with my slow progress and his consistent doses of encouragement; Dr. Barbara Smith for her wise counsel and friendship; Dr. Philip Callahan for his faithfulness and confidence in my work not only as a student, but as an educator; Dr. Christopher Johnson for overlooking my procrastination; Dr. Joni Adamson for enthusiastically taking up "my cause" in the eleventh hour; Dr. Theresa Enos for accepting me into one of the finest rhetoric departments in the country and supporting my unusual pursuits; Sheila Savadkohi and Karoleen Wilsey for always helping me make those important deadlines; the 1998 MMEP participants for their passion for learning; Dr. Linda Don for her support and enthusiasm surrounding my participation in MMEP; my friend Dr. Jerry Rintala for his encouragement; my dear friends Steve and Matt Jones, who love me with or without this degree - thanks, guys, for everything you have done for me; my friends and colleagues at Scottsdale Community College for supporting me in my educational and professional endeavors; my children, T.C., Nick and Jaxi, for reminding me what is really important in life; my mother, Berthalene Lairson, for sticking it out with me until the very end; my "bestest" friend in the whole wide world, Lynette Crockett - thanks for nagging me to get this done; my amiga bonita, Taime Bengochea - thanks for believing in me, even when I did not; and God for His amazing grace in my life.
DEDICATION
This work is dedicated to my mother, Berthalene
Lairson. Without her support and uncompromising faith in
my abilities, this work would not have been completed.
Thanks, Mom!
I would also like to dedicate the fulfillment of this
writing to the memory of Rhuby MacElvee. You will forever
be missed in my life.
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
CHAPTER I: INTRODUCTION
Statement of the Problem
Limitations of the Study
Sample
Sample size
Time
Interreader/interrater reliability
CHAPTER II: REVIEW OF RELATED LITERATURE
CHAPTER III: DESIGN OF THE STUDY
Statement of the Hypothesis
Method
Subjects
Instrument
Medical College Admission Test
Writing Sample
Writing skills assessed by the Writing Sample
Writing Sample scoring rubric
Holistic scoring procedure
Procedure
CHAPTER IV: ANALYSIS OF DATA
Analysis of the Relationship
Strength of the Relationship
Nature of Relationship
CHAPTER V: CONCLUSION, IMPLICATIONS, AND RECOMMENDATIONS FOR FURTHER RESEARCH
Conclusion
Implications
Recommendations
APPENDIX A, SCORE POINT DESCRIPTIONS
APPENDIX B, INTERPRETATION OF LETTER SCORES
REFERENCES
LIST OF FIGURES
FIGURE 1, Experimental Design
LIST OF TABLES
TABLE 1, Rank Sums and U for Experimental and Control Groups
TABLE 2, Positive and Negative Rank Sums Differences and T for Experimental and Control Groups
ABSTRACT
The purpose of this study was to determine whether a
relationship exists between the knowledge and application
of a writing scoring rubric and writing performance.
Participants in a Minority Medical Education Program were
given intense instruction in the use of the Medical
College Admission Test Writing Sample scoring rubric.
Scores from the participants' pretest and posttest were
compared.
CHAPTER I:
INTRODUCTION
In 1876, the Association of American Medical Colleges
(AAMC), a non-profit association, was founded to address
reform in medical education. In approximately the last 20
years, one of its goals has been to increase the number of
underrepresented minority students (URMS) in medical
education. Consequently, the Minority Medical Education
Program (MMEP), held each summer at one of eleven
medical school sites around the country, has been offered
to qualified participants for more than a decade
(Association of American Medical Colleges, 1995-2002a,
1995-2002e).
As an academic enrichment program, one of the many
goals of MMEP is to provide the type of experiences that
will help URMS not only to appreciate the relationship
between diverse cultures and the medical profession, but
also to help students gain admission to medical schools.
One of the focal points of MMEP then is instruction and
preparation for the Medical College Admission Test (MCAT),
which, along with undergraduate grades, is used to predict
performance in medical school courses and is used by
medical college admissions committees when evaluating
applicants for admissions (Association of American Medical
Colleges, 1995, 1995-2002f; Jones, 1983; Koenig & Mitchell,
1993; Koenig & Wiley, 1996; Lynch, 1990).
During MMEP, URMS take two full-length MCAT practice
examinations. Results of the first examination (pretest)
are returned to the URMS, and are then used by the MMEP
instructors to focus instruction in physics, chemistry,
biology, biochemistry, verbal reasoning and writing. URMS
also work on study skills, test-taking skills, time
management, and stress management. The preparation for the
MCAT concludes with the taking and analysis of the second
MCAT practice examination (posttest) (Association of
American Medical Colleges, 1995-2002e).
It has been documented that participants in MMEP
demonstrate score gains on the MCAT, but whether or not
those gains are due to the instruction alone or to a
combination of other factors has not been thoroughly
investigated (Jackson et al., 1985). Various preparation
programs for standardized tests have been shown to increase
examinees' performances (Greenlee, 1996; Seaton, 1992).
Although it has been observed that high-stakes admissions
tests such as the MCAT are less open to test preparation
than general intelligence measures, when these preparation
programs combine a pretest along with instruction, then
examinees' scores show the greatest gains (Witt, 1993). In
particular, commercial coaching courses (e.g., Princeton
Review, Columbia Review, Kaplan) have been found to have
the effect of increasing examinees' scores on the physical
and biological science sections of the MCAT (Jones, 1986).
Nevertheless, the AAMC reports that approximately 63% of
MMEP participants who successfully complete the summer
program are accepted to medical school (Association of
American Medical Colleges, 1995-2002d; Jolly, 1992).
The current MCAT contains four subtests: physical
sciences, biological sciences, verbal reasoning and writing
sample. However, the writing sample portion of the
examination did not become an integral part of the MCAT
until 1991 (Future Doctor Net, 2001).
As early as 1973, the Medical College Admission
Assessment Program recommended that a test of written
communication skills be considered along with physical and
biological science knowledge. This was in part due to the
emphasis in the 1970s on the need for better communication
between doctors and their patients. In fact, it has been
documented that writing sample scores are closely
associated with clinical competence (Hojat et al., 2000).
Early in the 1980s, medical school deans began
discussing the importance of writing and analytical skills
for applicants. In fact, many medical school deans reported
inadequate writing and communication skills among their
medical students. As a result, the medical school
application process was revised so as to require an
assessment of writing and communication skills from applicants
(Future Doctor Net, 2001; Koenig & Mitchell, 1993; Swanson
& Mitchell, 1989).
Several medical schools in the 1980s were already
collecting writing samples at the time of application;
however, there was no standardized instrument for assessing
the writing and analytical skills of medical school
applicants (Future Doctor, 2001). Consequently, the AAMC
piloted the writing sample in the spring and fall of 1985
and 1986 (Holden, 1989; Koenig & Mitchell, 1993).
The pilot was extended through 1990 in order for
developers to insure acceptable levels of reliability and
to produce a number of comparable test forms or prompts
(see Chapter III for a description of the prompts) (Koenig,
1995). Researchers experimented with various versions of
the writing sample, from two 30-minute prompts to a
combination of one 45-minute plus one 15-minute prompt.
Although the 30-minute prompt often produced responses with
unsophisticated arguments, this format was adopted
(Mitchell, 1987). The two 30-minute prompts required
examinees to develop and present ideas in a cohesive manner
in order to provide medical school admission committees
evidence of applicants' writing and analytical skills
(Future Doctor, 2001; Koenig & Mitchell, 1993).
Since the MCAT Writing Sample elicits responses from
examinees that are not connected directly to physical or
biological science content, the only focus for MMEP
instructors is on developing the URMS' test-taking
strategies (Ornstein, 1993). It has been documented that
minority students and/or students from culturally different
backgrounds benefit substantially from instruction in test-
taking strategies (Johns & VanLeirsburg, 1992; Scruggs &
Mastropieri, 1995; Thurmond, 1986).
For the writing sample, the strategies for test-taking
skills are independent of content knowledge and require
URMS to become "test-wise" (Wheeler & Haertel, 1993). The
term "test-wise" can be defined as the ability to use
knowledge about a specific test format, including the
scoring criteria, to make the most of one's performance
(Scruggs & Mastropieri, 1995). In other words, MMEP
instructors must address those instructional activities
that require URMS to evaluate their own and each other's
writing in the context of the standardized criteria set
forth by the MCAT Writing Sample.
Statement of the Problem
The University of Arizona is one of two program sites
for the Western Consortium of Medical Colleges (the
University of Washington is the other) that URMS may
select for their MMEP summer experience.
The University, located in Tucson, Arizona, began
conducting the MMEP in 1994. Since the University is the
only publicly funded medical school in Arizona, the Office
of Minority Affairs strives to serve the needs of the
State's diverse minority population, which ranges from
Native Americans to recent citizens from South American
countries (Arizona Health Sciences Center, n.d.).
The purpose of this study was to determine whether the
knowledge and application of a writing scoring rubric was
related to writing performance.
Limitations of the Study
Sample. The 1998 University MMEP participants were
pre-selected into the program based on particular criteria
set down by the AAMC. These criteria required the
participant to:
• be a U.S. citizen or hold a permanent resident visa
• show evidence of completing at least one year of
college
• have an overall grade point average (4.0 scale) of at
least 3.00, with at least a 2.75 in the sciences
• have a SAT score of at least 950 or an ACT score of at
least 20
• be a member of an underrepresented minority group
(African-American, Mexican-American, Mainland Puerto
Rican, or American Indian/Alaska Native). (Association
of American Medical Colleges, 1995-2002f)
Since the researcher had no part in the decision as to
the inclusion or exclusion of the University MMEP
participants, the 46 participants were not confirmed as a
truly random representative sample of the URMS population.
Sample size. Due to the intense nature of MMEP and
limited facilities, participants in the program are limited
to no more than 50 - 60 per summer. During the summer of
1998, when this investigation was conducted, 46
participants were selected to participate. Because the
research called for a pretest-posttest control group
design, the number of participants in the experimental
group was relatively small (n < 30).
Time. The amount of time MMEP participants were
involved in the program was limited to six weeks.
Furthermore, these six weeks were not devoted entirely to
writing instruction, but included instruction in the
physical sciences, biological sciences, and verbal
reasoning. Participants also took part in other enrichment
activities, such as clinical rotations, guest speakers, and
a number of other professional and career exploration
activities. Therefore, the amount of time on task (seven
hours total) devoted to writing instruction during the
program was extremely limited. Due to the nature of the
design, participants were exposed only for a short period
of time to the writing instruction in question - the
knowledge and application of the MCAT Writing Sample rubric
- before being reassessed.
Interreader/interrater reliability. It has been argued
that a standardized test of writing has many inherent
problems, such as the validity and reliability of the
rubric (Mabry, 1999). During the 1985 spring and fall pilot
test of the MCAT Writing Sample, interreader/interrater
reliability was estimated at .84. Of the 20 readers used
in this pilot, 16 had been trained using White's
(1994) holistic scoring method (Mitchell, 1986a, 1986b).
However, the norming sessions for the University MMEP
staff/readers, who were not experienced writing teachers,
were conducted in half the time and only on 5% of the
pre-MCAT Writing Sample responses. The degree of
interreader reliability was not calculated, which, some
reviewers may argue, limits the interpretation of the results.
CHAPTER II:
REVIEW OF RELATED LITERATURE
Providing students with background information and the
structure of the scoring of an assessment has been shown to
increase student scores in mathematics according to Fuchs
et al. (2000). The same procedure has been shown to affect
writing.
Recent composition theory and research, as reported by
Soles (2001), suggests that instructors should share
scoring rubrics with students, as there appear to be many
benefits to both teachers and students when students write
to the context of the rubrics. According to Soles (2001),
the knowledge and application of rubrics to writing results
in more active participation among students; furthermore,
when students become familiar with the scoring rubrics,
their writing anxiety often decreases and their awareness
of audience is improved (see also Wyngaard & Gehrke, 1996).
In a study of 303 8th graders, Andrade (1999) found
that the students' knowledge and application of the scoring
rubrics to their own writing actually helped them write
better. However, in a related study where students used
the rubrics as a self-assessment tool only, Andrade (1999)
did not report improvements in student writing. This would
suggest that the rubric must be used in an instructional
manner and not simply provided to students as a self-
assessment tool.
Bergdahl (1999) reports having students
collaboratively create scoring rubrics and guides for
papers in composition courses. Bergdahl (1999) argues that
students' attention to the rubrics caused them to be more
focused on the goals of the papers. In addition, it
allowed students to have a clearer notion of what areas
they needed to improve.
Skillings and Ferrell (2000) suggest that in some
cases when teachers and students work together to generate
specific criteria in the form of a rubric, it can increase
critical thinking (see also Use Rubrics and Reach All
Learners, 1998). For example, Duke and Sanchez (1994) report
how teachers and 10th grade students in Pennsylvania
developed a six-point holistic rubric to be used in grades
6 to 9. Students used the rubric to evaluate their own and
others' papers. The project recorded an improvement in
student writing.
Another study by Schirmer and Bailey (2000) reported
improved writing skills in two classes of deaf students who
participated in the creation of writing assessment rubrics
with their teachers. Students then used the rubrics to
guide them in their writing and revising.
Harrington, Holik and Hurt (1998) report an
improvement in student writing as documented in a research
project targeting 5th graders. The project first identified
poor student writing through a collection of student
writing samples. The researchers obtained the advice of
writing experts as to what instruction might be used in
order to address the students' lack of skills. One
category of intervention that the experts suggested was the
use of rubrics in the planning and revising of student
writing. The project documented an improvement in student
writing, as well as a marked increase in students'
enjoyment of writing.
It has been noted by Wilson and Onwuegbuzie (1999) that,
without an integration of a rubric into writing
instruction, writing responses have the potential of
failing to meet the expectations of instructors.
Furthermore, students may become frustrated with a lack of
clear criteria. Rubrics are beneficial, not only because
they break the writing into manageable tasks but also, as
Shaugnessy (1994) points out in reference to standardized
tests, because they provide clear criteria of writing
skills to be judged.
The use of rubrics specifies not only the criteria but
the performance levels as well. Montgomery (2000) claims
this allows the teaching of concepts rather than content,
which is particularly suited to instruction for the MCAT
Writing Sample.
Rubrics, as Andrade (2000) notes, show a gradation of
quality. The six levels of the MCAT Writing Sample scoring
rubric clearly state the expectations of the Medical
College Admission Assessment Program and the AAMC (see
Appendix A), making the rubric ideal for use in
instruction. The rubric is an equally effective form of
feedback on the URMS' writing samples.
Based on the available literature about the
instructional uses of writing scoring rubrics to improve
writing performance, it is apparent that a typical
population of MMEP participants (or other populations of
2nd-year+ college students) has not been the subject of similar
research. One reason for this may lie in the perception of
the role writing instruction has in an undergraduate
college degree program.
Two semesters of freshman composition and one
upper-division writing course are the typical requirements
for an undergraduate degree from a four-year college or
university. In addition to these courses, many
institutions require students to pass some sort of writing
proficiency exam before their junior year in a
baccalaureate program. For students who do not succeed at
freshman composition or pass proficiency exams, there are
courses meant to provide "remedial" instruction in writing.
As a result, the concern for improving writing performance
among college juniors or seniors is delegated to those
remedial courses. And, as White (1994) points out, the
label "remedial" is seldom used with these courses since
colleges and universities do not want the stigma of having
accepted and enrolled "struggling" writers at their
institution of higher education.
The participants in MMEP have not only completed at
least one year of college (including freshman composition),
but also tend to be academic achievers (see the
requirements for MMEP admissions in Chapter I). As a result,
literature addressing the use of writing scoring rubrics to
improve writing performance is limited to populations of
students who are in need of improved performance in the
classroom rather than on standardized examinations such as
the MCAT.
CHAPTER III:
DESIGN OF THE STUDY
The design used in this study was a pretest-posttest
control group design (Figure 1). An individual's
performance on the MCAT Writing Sample has the possibility
of being related to a number of variables. A "shotgun"
approach to investigating all possible and conceivable
variables, which may or may not be related to writing
performance, would be expensive, inefficient and time
consuming. Based on deductive reasoning from the
literature (Andrade, 1999, 2000; Bergdahl, 1999; Fuchs et
al., 2000; Harrington, Holik & Hurt, 1998; Schirmer &
Bailey, 2000; Soles, 2001), the knowledge and application
of the MCAT Writing Sample six-point scoring rubric within
the method described by White (1994) was considered a good
candidate for investigation as one of the many possible
variables related to writing performance.
Therefore, the design was intended to indicate the
relationship between the knowledge and application of the
MCAT Writing Sample six-point scoring rubric and MCAT
Writing Sample scores (see Figure 1).
Group            Assignment   n      Treatment         Midtest

1 Experimental   random       21     Writing           MCAT Writing
                                     Instruction       Sample

2 Control        random       23(a)  Verbal Reasoning  MCAT Writing
                                     Instruction       Sample

Figure 1. Experimental Design.
(a) Note. Two subjects in Group 2 did not participate in the midtest.
Statement of the Hypothesis
It was hypothesized that there is a relationship
between the knowledge and application of a writing scoring
rubric to writing performance.
Method
In order to determine whether or not the knowledge and
application of the MCAT Writing Sample scoring rubric to
their own and others' writing was related to their
performance on the writing sample portion of the test,
participants in the University's 1998 MMEP were introduced
to the rubric and holistic scoring procedures as described
by White (1994).
Subjects
The description of the population of the University's
1998 MMEP participants was inferred from the "Summary Data
on the April 1997 MCAT by Gender, Racial/Ethnic Group, Age,
Language Status, Undergraduate Major, Testing History,
State of Legal Residence" document prepared by the
Association of American Medical Colleges (n.d.). That
population was comprised of 46% female, 53% male, and 11%
underrepresented minority (African-American/Black, Mexican-
American/Chicano/Puerto Rico, and Native American/Native
Alaskan/Native Hawaiian) with 24% of the test takers
representing other minorities (Commonwealth Puerto Rico,
Asian, and other Hispanics). As a total, minority test
takers represented 34%. The median Writing Sample score
for the underrepresented minority group was a scaled score
of N. The median Writing Sample score for all 1997 test
takers was a scaled score of O (see Appendix B for a
description of the scaled scores).
The 46 participants in the University's 1998 MMEP were
preselected using the criteria established by the
Association of American Medical Colleges (Association of
American Medical Colleges, 1995-2002f). The group consisted
of 65% female, 35% male with 67% identified as
underrepresented minorities (African-American/Black,
Mexican-American/Chicano/Puerto Rico, and Native
American/Native Alaskan/Native Hawaiian), and 15%
representing other minorities (Commonwealth Puerto Rico,
Asian, and other Hispanics). There were a number of
participants (15%) who did not indicate a specific ethnic
group, but rather chose more than one as representative of
their ethnic identity.
These 46 participants were randomly assigned (using a
table of random numbers) to two groups of 23 each. Because
of the scheduling constraints of the six-week MMEP program,
the experimental group was further divided (randomly) into
two subgroups (subgroups 1 and 2). The experimental group
was then instructed in the use of the writing scoring
rubric on different days; nevertheless, the two subgroups
of the experimental group received the same instruction.
Before instruction began, an adjustment in group assignment
was made to accommodate two students and resulted in Group
1 with n = 21 and Group 2 with n = 25.
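The assignment procedure can be illustrated in software. The sketch below is only an analogue, assuming anonymized participant identifiers and an arbitrary seed; the study itself drew on a printed table of random numbers rather than a program.

```python
# Illustrative analogue of the random assignment described above.
# The study used a table of random numbers; the participant IDs and
# the fixed seed here are hypothetical.
import random

participants = [f"P{i:02d}" for i in range(1, 47)]  # 46 participants

rng = random.Random(1998)  # arbitrary seed, for a repeatable example
rng.shuffle(participants)

group_1 = participants[:23]  # experimental: writing sample instruction
group_2 = participants[23:]  # control: verbal reasoning instruction

# Group 1 was further split into two subgroups for scheduling; the even
# split below is illustrative (the final subgroups held 9 and 12
# participants after the enrollment adjustment noted above).
subgroup_1, subgroup_2 = group_1[:11], group_1[11:]
```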
Instrument
The University's Office of Minority Affairs received
copies of the Association of American Medical Colleges'
publications MCAT Practice Test I, MCAT Practice Test II,
and MCAT Practice Test III with Solutions Booklet. Each of
the tests contained an approved MCAT Writing Sample
portion, and these practice tests were used as the
measuring instruments.
Medical College Admission Test. Three quarters of
the MCAT is a standardized, multiple-choice examination
designed to assist admission committees in predicting which
of their applicants will perform adequately in the medical
school curriculum. The test assesses problem solving,
critical thinking, and writing skills in addition to the
examinee's knowledge of science concepts and principles
prerequisite to the study of medicine. The MCAT is scored
in each of the following areas: Verbal Reasoning, Physical
Sciences, Writing Sample, and Biological Sciences
(Association of American Medical Colleges, 1995-2002b).
Writing Sample. Each Writing Sample prompt requires an
expository response to a specific topic selected from areas
of general interest, such as business, politics, history,
art, education, or ethics. Topics do not pertain to the
content of the physical or biological sciences, to the
medical school application process, or to reasons for the
choice of medicine as a career. Specific prior knowledge
about the topic is not necessary to complete the Writing
Sample (Association of American Medical Colleges, 1995).
The prompts consist of a "topic statement" (e.g.,
"Honesty is the best policy.") followed by instructions for
three writing tasks. The first task is to explain or
interpret the topic statement. The first portion of the
instructions is the same for all prompts: "Write a unified
essay in which you perform the following tasks. Explain
what you think the above statement means." The
instructions for the second and third writing tasks vary
across prompts and are printed immediately beneath each
topic statement. In general, the second task requires
consideration of a circumstance in which the statement
might be contradicted or judged not applicable. Examinees
must present a specific example that illustrates a
viewpoint opposite to the one presented in the topic
statement. The third task requires discussion of ways in
which the conflict between the initial statement and its
opposition (expressed in the second writing task) might be
resolved. Here, examinees must reconcile the two
viewpoints. (Association of American Medical Colleges,
1995, 1995-2002c; Future Doctor Net, 2001).
Writing skills assessed by the Writing Sample. The
MCAT Writing Sample consists of two 30-minute essays
designed to assess skills in the following areas:
• Developing a central idea
• Synthesizing concepts and ideas
• Presenting ideas cohesively and logically
• Writing clearly
• Following accepted practices of grammar, syntax,
and punctuation consistent with timed, first-
draft composition. (Association of American
Medical Colleges, 1995, p. 65)
Writing Sample scoring rubric. The scoring rubric for
the MCAT Writing Sample, like any rubric, is an assessment
tool that outlines the specific expectations and criteria
examinees' samples are meant to show (Taggart, Phifer,
Nixon, & Wood, 1998). Since rubrics are used in the judgment
of quality, rather than content (Moskal, 2000), writing
samples are scored holistically. Whereas some methods for
scoring essays assign several scores to a single piece of
writing (e.g., separate scores for organization,
development, grammar and mechanics), holistic scoring
regards an essay as a whole without separable aspects. This
type of scoring is based on the assumption that the various
factors involved in writing are so closely interrelated
that an essay should be assigned a single score based on
the quality of writing as a whole. See Appendix A for the
MCAT Writing Sample scoring rubric, which has been
reproduced from the MCAT Student Manual publication.
Holistic scoring procedure. White (1994) established
this method of holistic scoring for the MCAT during the
1980s pilot (Holden, 1989; Koenig, 1995; Koenig &
Mitchell, 1993; Mitchell, 1986a, 1986b). The key to
White's method is the establishment of what he terms the
"discourse community." This community is comprised of the
readers/scorers/graders of writing. In order to attain
acceptable levels of interreader/interrater reliability, it
is important that all members of this community use common
criteria in the form of a holistic scoring rubric.
The group of readers is led by a "chief" reader or
leader. A random sample of writing products is drawn from
all available products (usually 20% of the total
available). The group of readers is then led by the chief
reader in a norming process. It is this process that
creates the discourse community, where all readers are in
agreement on the interpretation of the rubric's score point
descriptions. During this process, if a reader
consistently scores out of the norm, as represented by the
group, then the group scrutinizes his or her scoring
decisions, and the individual is persuaded to change in
order to conform to the group's interpretation of the
rubric.
Once the norming process is complete and the leader is
assured of the consistency in scoring among readers, the
remainder of the writing products is randomly distributed
to the group. Each writing product is then read twice by
two different readers. Scores from the first reading are
kept concealed from those who complete the second reading.
If a sample receives two scores with more than a one-point
span (e.g., 3 and 5), the sample is read and scored by a
third reader in order to reconcile the point discrepancy.
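The reading rules just described can be summarized in a short sketch, given below with hypothetical scores. The text does not specify how a third reading is reconciled with the first two, so discrepant pairs are merely flagged here; the 4-24 sum reflects the scoring used later in this study (see the Procedure section).

```python
# Sketch of the double-reading rule and the summed writing sample score.
# Scores are hypothetical; the reconciliation rule for a third reading
# is not detailed in the text, so discrepant pairs only raise an error.

def needs_third_reading(first: int, second: int) -> bool:
    """Two independent 1-6 readings more than one point apart
    (e.g., 3 and 5) are sent to a third reader."""
    return abs(first - second) > 1

def writing_sample_score(prompt_1, prompt_2):
    """Sum the four 1-6 readings (two per prompt) into one 4-24 score."""
    for pair in (prompt_1, prompt_2):
        if needs_third_reading(*pair):
            raise ValueError(f"readings {pair} require a third reader")
    return sum(prompt_1) + sum(prompt_2)

print(writing_sample_score((4, 5), (3, 4)))  # -> 16
```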
Procedure
During the first session of the University's 1998
MMEP, the 46 preselected participants were randomly
assigned to two groups of 23 each. The experimental group
was further divided (randomly) into two subgroups that were
assigned to meet for writing sample instruction on Monday
and Wednesday afternoons, while the other group was
assigned to meet for verbal reasoning instruction on
Tuesday and Thursday afternoons.
During the second session, all 46 participants
completed a full MCAT practice test (MCAT Practice Test I).
Each participant completed two Writing Sample prompts,
resulting in a total of 92 writing artifacts to be scored
by the trained MMEP readers. The first scoring session on
the pretest resulted in four scores for each participant
(two scores for the first sample prompt and two
scores for the second sample prompt). These scores were
then added together to give one overall writing sample
score for each participant ranging from 4-24.
The study was designed to last the full six weeks of
the MMEP. While Group 2 completed 1 hour and 45 minutes of
verbal reasoning instruction, Group 1, divided into
subgroups 1 and 2 (9 and 12 participants each),
participated in 1 hour and 20 minutes of writing sample
instruction.
The procedure of the writing sample instruction was as
follows:
1. During the first writing sample session, students were
provided with an example of the MCAT Writing Sample
and six-point scoring rubric available in the MCAT
Student Manual publication through the Association of
American Medical Colleges. The rubric and holistic
scoring procedures were discussed. Five randomly
chosen responses written by participants in the
alternate subgroup were photocopied for each subgroup
member. Participants were introduced to White's
(1994) holistic scoring method. They then read,
scored and normed one of the five responses. A
discussion followed. During the second session, the
remaining four responses were scored and normed. A
lengthy discussion followed on the interpretation of
the prompts, as well as the application of the rubric.
2. During the third session, participants were given two
writing sample prompts and 30 minutes each to complete
them. Responses were photocopied, and five randomly
chosen responses from each subgroup were given to the
counter subgroup for scoring and grading.
3. During the fourth session, participants repeated the
scoring and norming process they had learned in
sessions one and two. Again, a lengthy discussion
followed.
4. During the fifth session, participants received their
scored responses from the alternate group. This
session consisted of participants' working in groups
of three, discussing and analyzing their own and
others' responses in light of the scoring rubric.
In the middle of the third week of the MMEP, and after
the fifth session of writing sample instruction for Group
1, all 46 participants were given a partial MCAT consisting
of a Verbal Reasoning section and a Writing Sample section
(from MCAT Practice Test II); this represented the
"posttest," but is labeled as the "midtest" since it was
administered midway through the program, and a full length
MCAT would be given at the conclusion of MMEP as a
posttest. The written responses were scored using the same
group of MMEP trained readers. Data on writing sample
scores for the pretest and midtest were then compared.
CHAPTER IV:
ANALYSIS OF DATA
The first question to be considered before testing the
hypothesis was whether or not Group 1 (experimental) showed
equivalent performance to Group 2 (control) on the writing
pretest. Given that the MMEP participants were randomly
assigned to each group, there should have been no
difference in writing sample scores on the pretest between
the two groups. Then, if the instruction in the
application of the scoring rubric were related to the
performance on the MCAT Writing Sample, and Group 1 showed
significant differences in scores on the midtest, the null
hypothesis (instruction had no relationship to writing
performance) could be rejected in favor of the alternative
hypothesis (the instruction in the application of a scoring
rubric is related to writing performance).
The dependent variable, writing sample score, was
obtained by summing the four ordinal rank scores each
participant received from the holistic scoring sessions
conducted by the MMEP readers - two scores from the first
response on the pretest and two scores from the second
response on the pretest. Each of the four ordinal rank
scores ranged from 1-6, representing the six-point scoring
rubric. When all four ordinal rank scores were summed,
the scores ranged from 4-24. The independent variable,
instruction and application of the rubric, was nominal and
measured on two levels - 0 and 1.
As indicated, prior to the instruction in the
application of the MCAT Writing Sample scoring rubric to
their own and others' writing, all 46 participants
completed a pretest - a full practice MCAT, which included
two 30-minute Writing Sample prompts. A Mann-Whitney U
test was used to verify equivalency between Group 1
(experimental) and Group 2 (control). Examination of the
rank sums and results of the Mann-Whitney U test (alpha =
.05) indicated essentially no difference between the two
groups in their performance on the pretest (see Table 1).
Therefore, before any of the participants received
instruction in the application of the scoring rubrics to
their own and others' writing, the groups were established
as equivalent in their performance on the pretest.
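As an aside for readers who wish to reproduce this kind of equivalence check, a minimal sketch with SciPy follows. The score vectors are invented, since the study's raw 4-24 sums are not reported here.

```python
# Hypothetical 4-24 writing sample sums; the study's raw data are not
# reported, so these values are invented for illustration only.
from scipy.stats import mannwhitneyu

experimental = [12, 9, 14, 11, 10, 13, 8, 12, 11, 15,
                9, 10, 13, 12, 11, 10, 14, 9, 12, 11, 10]      # n = 21
control = [11, 13, 10, 12, 9, 14, 12, 11, 13, 10, 15, 12, 9,
           11, 13, 10, 12, 14, 11, 9, 13, 12, 10, 11, 12]     # n = 25

u, p = mannwhitneyu(experimental, control, alternative="two-sided")
# A p-value above alpha = .05 gives no reason to treat the groups as
# different on the pretest, i.e., they may be taken as equivalent.
print(f"U = {u}, p = {p:.3f}")
```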
Analysis of the Relationship
After Group 1 (experimental) received the writing
instruction, the participants in both groups completed
another MCAT Writing Sample examination. Like the pretest,
the responses were scored by the same MMEP readers using
Table 1

Rank Sums and U for Experimental and Control Groups

                     Group
Test           Experimental(a)   Control(b)    U
Pretest
  Rank Sums         496             585       127*

Note. Maximum score for both the pretest and midtest = 24.
(a) n = 21. (b) n = 25.
the same method as discussed earlier. A Wilcoxon Match-
Pairs Signed Ranks test was used as an alternative to a
correlated groups t test since the dependent variable,
writing sample score, for Group 1 (experimental) could form
rank differences between the pretest and midtest; the
independent variable, instruction, was within-subjects and
had only two levels (0, 1).
As Table 2 indicates, the results of the Wilcoxon
Match-Pairs Signed Ranks test indicate that the scores for
the pretest and midtest for Group 1 (experimental) were
significantly different (alpha = .05). The scores for the
pretest and midtest for the control group were not
significantly different, as would be expected, since that
group did not receive instruction in the application of the
MCAT Writing Sample scoring rubric. This would indicate
that the original hypothesis that the "instruction in the
application of a scoring rubric was related to writing
performance" was supported.
In isolation, these results suggest that the
instruction the participants in the experimental group
received is related to their writing sample
score. In fact, numerous participants in the experimental
group commented that in the end, they felt like they knew
what was expected of them and how to approach the MCAT
Writing Sample portion of the examination. It is possible
that a test-retest factor may contribute to the gains in
Group 1's midtest scores. It has been documented that even
within a year's time, nearly all of those who repeat the
MCAT show some improvement in all subtests (Land, 1976).
Strength of the Relationship
In order to determine the strength of the
relationship, a matched-pairs rank biserial correlation
coefficient was calculated. The correlation coefficient
(rc = 0.65) indicates a relationship between the
instruction in the application of the MCAT Writing Sample
scoring rubric and writing sample score. Therefore, the
original hypothesis that the "instruction in the
application of a scoring rubric was related to writing
performance" was supported.
Table 2

Positive and Negative Rank Sums Differences and T for
Experimental and Control Groups

                       Group
Test               Experimental(a)   Control(b)
Positive Rank Sum       191            185.5
Negative Rank Sum        40             92.5
T                        40.00*         92.5*

Note. Maximum score for both the pretest and midtest = 24.
(a) n = 21. (b) n = 25 (pretest), n = 23 (midtest).
Nature of Relationship
Since the rank sum for the negative differences of the
Wilcoxon Match-Pairs Signed Ranks test for Group 1
(experimental) is less than the rank sum for the positive
differences (R- = 40 < R+ = 191), it can be concluded that the
gains observed in the experimental group are related to
the instruction the participants received.
Theoretically, this study suggests, as does the other
research, that the knowledge and application of any scoring
rubric for writing may have a positive relationship to
student writing performance. Certainly, it can be argued
that the implementation of the rubrics creates a stronger
link between learning, teaching and assessment (Soles,
2001).
CHAPTER V:
CONCLUSION, IMPLICATIONS, AND
RECOMMENDATIONS FOR FURTHER RESEARCH
The following conclusion and implications are made
within the context of this particular study. Any certainty
associated with the following statements must be understood
within the limitations of this study as detailed in the
introduction.
Conclusion
This study indicates that the instruction in the
application of a writing scoring rubric to one's own and
others' writing is related to writing performance.
Implications
1. In practice, those who prepare students for standardized
writing examinations might use this knowledge to better
prepare examinees.
2. Teachers of writing should not keep their scoring
criteria from students, and they should develop those
criteria into a rubric for students to apply to their
peers' and their own writing performances.
3. Additionally, students preparing for standardized writing
examinations should familiarize themselves with the
examination's scoring rubric and scoring practices.
4. If student writers become the "scorers" of their own and
their peers' writing, it may lead to a more developed
critical stance about their own writing skills as well as
their ability to fulfill the requirements of other
writing situations.
Recommendations for Further Research
Due to the limitations of this
particular study, there is a need for a controlled
experimental study that investigates the effect of rubric
knowledge and application on writing performance. In
particular, attention needs to be focused on the holistic
scoring procedures in order to assure consistency in the
data. Furthermore, the time on task needs to be increased
to assure sufficient instruction time. Finally, other
variables (such as primary language, previous writing
instruction, and test-retest effect) must be considered and
controlled. Not only would such a study establish the
effect or non-effect of this knowledge, but it would also
test whether the knowledge and application of scoring
rubrics result in better writing performance.
APPENDIX A
SCORE POINT DESCRIPTIONS
6 = These papers show clarity, depth, and complexity of thought. The treatment of the writing assignment is focused and coherent. Major ideas are substantially developed. A facility with language is evident.
5 = These essays show clarity of thought, with some depth or complexity. The treatment of the rhetorical assignment is generally focused and coherent. Major ideas are well developed. A strong control of language is evident.
4 = These essays show clarity of thought and may show evidence of depth or complexity. The treatment of the writing assignment is coherent, with some focus. Major ideas are adequately developed. An adequate control of language is evident.
3 = These essays may show some problems with clarity or complexity of thought. The treatment of the writing assignment may show problems with integration or coherence. Major ideas may be underdeveloped. There may be numerous errors in mechanics, usage, or sentence structure.
2 = These essays may show some problems with clarity or complexity of thought. The treatment of the writing assignment may show problems with integration or coherence. Major ideas may be underdeveloped. There may be numerous errors in mechanics, usage, or sentence structure.
1 = These essays may demonstrate a lack of understanding of the writing assignment. There may be serious problems with organization. Ideas may not be developed. Or, there may be so many errors in mechanics, usage, or sentence structure that the writer's ideas are difficult to follow.
From: Association of American Medical Colleges. (1995). MCAT Student Manual. Washington, DC, pp.67-68.
APPENDIX B
INTERPRETATION OF LETTER SCORES
Below are the scaled scores (J to T) examinees receive from an official MCAT Writing Sample. Because examinees may attain a given letter score in various ways (i.e., by performing equivalently on both essays or by performing better on one essay than on the other), making inferences about the writing skills associated with a given letter score is not straightforward.
R S T Above Average
These essays respond to the topic and the three writing tasks in a superior manner. The writing demonstrates a strong control of language. The response is presented in a clear, organized, and coherent fashion. Ideas are well developed, and the topic is dealt with at a complex level.
M N O P Q Average
These essays demonstrate a degree of proficiency in discussing the topic and/or responding to the three writing tasks. Few problems in the mechanics of writing are evident, and there is demonstration of control of language. The writing tasks are addressed in a clear, organized, and coherent manner. Ideas are developed to some extent and may show evidence of depth and complexity of thought.
J K L Below Average
These essays demonstrate a degree of difficulty in discussing the topic and/or responding to the three writing tasks. They may show problems with the mechanics of writing or in addressing the topic at a complex level. The response may not be organized, and ideas may not be integrated. Although the three writing tasks may be addressed, the ideas may be underdeveloped.
From: Future Doctor Net, Writing Sample Help. (2001). How the Writing Sample is Scored. Retrieved January 5, 2002, from http://www.futuredoctor.net/
REFERENCES
Andrade, H. G. (1999, April). The Role of Instructional Rubrics and Self-Assessment in Learning to Write: A Smorgasbord of Findings. Paper presented at the Annual Meeting of the American Educational Research Association, Montreal, Quebec, Canada.
Andrade, H. G. (2000). Using Rubrics to Promote Thinking and Learning. Educational Leadership, 57, 13-18.
Arizona Health Sciences Center, Office of Minority Affairs. (n.d.). Minority Medical Education Program (MMEP) - Who Is Eligible. Retrieved September 8, 2001, from http://www.ahsc.arizona.edu/
Association of American Medical Colleges, About the AAMC. (© 1995-2002a). Retrieved September 8, 2001, from http://www.aamc.org/
Association of American Medical Colleges, Applying to Medical School. (© 1995-2002b). About the MCAT. Retrieved September 8, 2001, from http://www.aamc.org/
Association of American Medical Colleges, Applying to Medical School. (© 1995-2002c). MCAT Writing Sample Prompts. Retrieved September 8, 2001, from http://www.aamc.org/
Association of American Medical Colleges. (1995). MCAT Student Manual. Washington, DC.
Association of American Medical Colleges, Minorities in Medicine. (© 1995-2002d). Minority Medical Education Program. Retrieved September 8, 2001, from http://www.aamc.org/
Association of American Medical Colleges, Minorities in Medicine. (© 1995-2002e). MMEP Program Sites: Western Consortium University of Washington / University of Arizona. Retrieved September 8, 2001, from http://www.aamc.org/
Association of American Medical Colleges, Minorities in Medicine. (© 1995-2002f). Who Is Eligible? Retrieved September 8, 2001, from http://www.aamc.org/
Bergdahl, D. (1999). Scoring Guides in the Process-Oriented Composition Class: Having Students Write Their Own Scoring Guides. Exercise Exchange, 45, 21-24.
Duke, C.R., and Sanchez, R. (1994). Giving Students Control over Writing Assessment. English Journal, 83, 47-54.
Fuchs, L.S., Fuchs, D., Karns, K., Hamlett, C.L., Dutka, S., & Katzaroff, M. (2000). The Importance of Providing Background Information on the Structure and Scoring of Performance Assessments. Applied Measurement in Education, 13, 1-34.
Future Doctor Net, Writing Sample Help. (2001). How the Writing Sample is Scored. Retrieved January 5, 2002, from http://www.futuredoctor.net/
Greenlee, C.T. (1996). Passing the Test: Game Plan Developed to Help Athletes Beat the SAT. Black Issues in Higher Education, 13, 13-14.
Harrington, M., Holik, M., & Hurt, P. (1998). Improving Writing through the Use of Varied Strategies. Action Research Project, Saint Xavier University and IRI/Skylight. (ERIC Document Reproduction Service No. ED 420 874).
Holden, C. (1989). MCAT to Stress Thinking, Writing. Science, 243, 1431.
Hojat, M., Erdmann, J.B., Veloski, J.J., Nasca, T.J., Callahan, C.A., Julian, E., & Peck, J. (2000). A Validity Study of the Writing Sample Section of the Medical College Admission Test. Academic Medicine, 75, 25S-27S.
Jackson, E.W. (1985, March). A Comparison of Repeater Performance on the MCAT: Review Course Participants vs. Non-Participants. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL.
Johns, J., & VanLeirsburg, P. (1992). Teaching Test-Wiseness: Can Test Scores of Special Populations Be Improved? Reading Psychology, 13, 99-103.
Jolly, P. (1992). Academic Achievement and Acceptance Rates of Underrepresented-Minority Applicants to Medical School. Academic Medicine, 67, 765-769.
Jones, R.F. (1986). The Effect of Commercial Coaching Courses on Performance on the MCAT. Journal of Medical Education, 61, 273-84.
Jones, R.F. (1983). The Relationship between MCAT Science Scores and Undergraduate Science GPA. Journal of Medical Education, 58, 908-11.
Kachigan, S.K. (1986). Statistical Analysis: An Interdisciplinary Introduction to Univariate and Multivariate Methods. New York: Radius Press.
Koenig, J.A. (1995, April). Examination of the Comparability of MCAT Writing Sample Test Forms. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
Koenig, J.A., & Mitchell, K.J. (1993). Use of Medical College Admission Test Writing Sample Data in Medical School Admissions Decisions. The Advisor, 13, 13-15.
Koenig, J.A., & Wiley, A. (1996). The Validity of the Medical College Admission Test for Predicting Performance in the First Two Years of Medical School. Academic Medicine, 71, S83-S85.
Land, M.S. (1976). Cross-Year Comparison of Test-Retest MCAT Performance. Journal of Medical Education, 51, 532-86.
Lynch, K.B. (1990). The Relationship of Minority Students' MCAT Scores and Grade Point Averages to Their Acceptance into Medical School. Academic Medicine, 65, 480-82.
Mabry, L. (1999). Writing to the Rubric: Lingering Effects of Traditional Standardized Testing on Direct Writing Assessment. Phi Delta Kappan, 80, 673-80.
Mitchell, K.J. (1987, April). Estimation of Interrater and Parallel Forms Reliability for the MCAT Essay. Paper presented at the Annual Meeting of the American Educational Research Association, Washington, DC.
Mitchell, K.J. (1986a). Factor Structure of the MCAT and Pilot Essay. Educational and Psychological Measurement, 46, 1019-27.
Mitchell, K.J. (1986b, August). Reliability of Holistic Scoring for the 1985 MCAT Essays. Educational and Psychological Measurement, 46, 771-75.
Moskal, B.M. (2000). Scoring Rubrics Part I: What and When. (ERIC Document Reproduction Service No. ED 446 110).
Montgomery, K. (2000). Classroom Rubrics: Systematizing What Teachers Do Naturally. Clearing House, 76, 324-28.
Ornstein, A.C. (1993). Coaching, Testing, and College Admission Scores. NASSP Bulletin, 11, 12-19.
Schirmer, B.R., & Bailey, J. (2000). Writing Assessment Rubric: An Instructional Approach with Struggling Writers. Teaching Exceptional Children, 33, 52-58.
Scruggs, T.E., & Mastropieri, M.A. (1995). Teaching Test-Taking Skills: Helping Students Show What They Know. Cambridge, MA: Brookline Books.
Seaton, T. (1992). The Effectiveness of Test Preparation Seminars on Performance on Standardized Achievement Tests. (ERIC Document Reproduction Service No. ED 356 233).
Shaugnessy, A. (1994). Teaching to the Test: Sometimes a Good Practice. English Journal, 83, 54-56.
Skillings, M.J., & Ferrell, R. (2000). Student-Generated Rubrics: Bringing Students into the Assessment Process. Reading Teacher, 53, 452-55.
Soles, D. (2001, March). Sharing Scoring Guides. Paper presented at the Annual Meeting of the Conference on College Composition and Communication, Denver, CO.
Swanson, A., & Mitchell, J. (1989). MCAT Responds to Changes in Medical Education and Physician Practice (Medical College Admission Test). The Journal of the American Medical Association, 262, 261-64.
Taggart, G.L., Phifer, S.J., Nixon, J.A., & Wood, M. (Eds.). (1998). Rubrics: A Handbook for Construction and Use. Lancaster, PA: Technomic Publishing Company, Inc.
Thurmond, V.B. (1986). Correlations Between SAT Scores and MCAT Scores of Black Students in a Summer Program. Journal of Medical Education, 61, 640-43.
Use Rubrics and Reach All Learners. (1998). Active Learner: A Foxfire Journal for Teachers, 3, 32-33, 39.
Wheeler, P.H., & Haertel, G.D. (1993). Preparing Students for Testing: Should We Promote Test Wiseness? EREAPA Associates, Livermore, CA. (ERIC Document Reproduction Service No. ED 374 163).
White, E.M. (1994). Teaching and Assessing Writing. 2nd ed. San Francisco: Jossey-Bass.
Wilson, V.A., & Onwuegbuzie, A.J. (1999, November). Improving Achievement and Student Satisfaction through Criteria-Based Evaluation: Checklists and Rubrics in Educational Research Courses. Paper presented at the Annual Meeting of the Mid-South Educational Research Association, Point Clear, AL.
Witt, E.A. (1993, April). Meta-Analysis and the Effects of Coaching for Aptitude Tests. Paper presented at the American Educational Research Association, Atlanta, GA.
Wyngaard, S., & Gehrke, R. (1996). Responding to Audience: Using Rubrics to Teach and Assess Writing. English Journal, 85, 67-70.