Upload
e
View
217
Download
0
Embed Size (px)
Citation preview
7/27/2019 Evidence on the Effectiveness
1/41
Language Learning ISSN 0023-8333
Evidence on the Effectivenessof Comprehensive Error Correction
in Second Language Writing
Catherine G. Van Beuningen
University of Amsterdam
Nivja H. De Jong
Utrecht University
Folkert Kuiken
University of Amsterdam
This study investigated the effect of direct and indirect comprehensive corrective feed-
back (CF) on second language (L2) learners written accuracy (N= 268). The studyset out to explore the value of CF as a revising tool as well as its capacity to supportlong-term accuracy development. In addition, we tested Truscotts (e.g., 2001, 2007)
claims that (a) correction may have value for nongrammatical errors but not for errors
in grammar; (b) students are inclined to avoid more complex constructions due to error
correction; and (c) the time spent on CF may be more wisely spent on additional writing
practice. Results showed that both direct and indirect comprehensive CF led to improved
accuracy, over what is gained from self-editing without CF (control group 1) and from
sheer writing practice without CF (control group 2), and this was true not only during
revision but also in new pieces of writing (i.e., texts written during posttest and delayed
posttest sessions, 1 and 4 weeks after the delivery of CF). Furthermore, a separateanalysis of grammatical and nongrammatical error types revealed that only direct CF
resulted in grammatical accuracy gains in new writing and that pupils nongrammatical
An earlier version of this article was presented at the Annual Conference of the American Associ-
ation of Applied Linguistics in Atlanta in March 2010. The research was conducted as part of the
doctoral dissertation by the first author. We would like to thank the dissertation examiners John
Bitchener, Kees De Glopper, Jan Hulstijn, Lourdes Ortega, Gert Rijlaarsdam, and Rob Schoo-
nen for useful suggestions. We also thank the anonymous reviewers and the editor ofLanguage
Learningfor their invaluable comments on the previous versions of this article.Correspondence concerning this article should be addressed to Catherine G. Van Beuningen,
Amsterdam Center for Language and Communication, University of Amsterdam, Spuistraat 134,
1012 VB Amsterdam, the Netherlands. Internet: [email protected]
Language Learning 62:1, March 2012, pp. 141 1C2011 Language Learning Research Club, University of MichiganDOI: 10.1111/j.1467-9922.2011.00674.x
7/27/2019 Evidence on the Effectiveness
2/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
accuracy benefited most from indirect CF. Moreover, CF did not result in simplified
writing when structural complexity and lexical diversity in students new writing were
measured. Our findings suggest that comprehensive CF is a useful educational tool that
teachers can use to help L2 learners improve their written accuracy over time.
Keywords written corrective feedback; error correction; direct and indirect feedback;
second language writing; accuracy development; written complexity
Introduction
Error correction or corrective feedback (CF) is probably the most widely used
feedback form in present-day second language (L2) classrooms. Its usefulness,
however, has been fiercely debated ever since Truscotts (1996) article, in which
he claimed error correction is necessarily ineffective and potentially harmful.
In the decade that followed, he repeatedly presented objections with respect
to the use of CF in L2 writing classes (Truscott, 1996, 1999, 2004, 2007, and
elsewhere).
Truscotts (1996) statement that CF is ineffective relies on both practical
and theoretical arguments. His practical doubts pertain to teachers capacities
in providing adequate and consistent feedback and to learners ability and will-
ingness to use the feedback effectively. His theoretical argument rests on theclaim that CF overlooks important insights from second language acquisition
(SLA) theories, including (a) the gradual and complex nature of interlanguage
development, which stands in stark contrast to error correction as a simple
transfer of information; (b) the impossibility for any single form of CF to be ef-
fective across the very differently acquired domains of morphology, syntax, and
lexis, particularly with respect to grammatical features that are integral parts
of a complex system (Truscott, 2007, p. 258) and that would be impervious to
change (even if CF might be proven beneficial for spelling problems and othersuch simple, discrete errors); (c) the likelihood that any proven benefits might
be at best related to the development of explicit declarative knowledge (e.g.,
DeKeyser, 2003; Ellis, 2004), but never implicit procedural knowledge, which
is all-important for acquisition; thus, CF would promote pseudo-learning or
at best self-editing and revision skills, without fostering true accuracy develop-
ment; and (d) the impracticality of tailoring CF to each learners current level
of L2 development, in Pienemanns (1998) sense of learner readiness, given
that findings in this area to date are incomplete and insufficient to be useful
for teaching practice. In addition to arguing that error correction is ineffective,
Truscott (1996, 2004, 2007) stated that correcting students writing might be
also counterproductive. One of his arguments was that teachers run the risk
Language Learning 62:1, March 2012, pp. 141 2
7/27/2019 Evidence on the Effectiveness
3/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
of making their students avoid more complex structures when they emphasize
learners errors by providing CF. Truscott reasoned that it is the immediate
goal of error correction to make learners aware of the errors they committedand that this awareness creates a motivation for students to avoid the corrected
constructions in future writing (Truscott, 2007). Second, Truscott (1996, 2004)
claimed that CF is a waste of time and suggested that the energy spent on
dealing with correctionsboth by teachers and studentscould be allocated
more efficiently to alternative activities, such as additional writing practice.
Based on all of the above reasons, Truscott (1996) summoned the aban-
donment of CF from L2 writing classes, until its usefulness had been proven
by empirical research (Truscott, 1999, 2004, 2007). Although Ferris (1999,
2002, 2004) made a stand for the use of written CF and argued that Truscotts
conclusions were premature, she agreed that evidence from well-designed
studies was necessary before any firm conclusions could be drawn about the
(in)effectiveness of error correction. This call has resulted in an ever-expanding
body of studies exploring the effects of CF on L2 learners writing.
Framed within a cognitive perspective of L2 acquisition, the present study
reports on a tightly controlled classroom-based quasi-experiment, which in-
cluded pretest, treatment/control, posttest, and delayed posttest sessions, and
compared the effects of direct and indirect comprehensive CF on the writtenaccuracy exhibited by 268 secondary school L2 learners during revision as well
as in new writing, 1 and 4 weeks after the treatment. Two control groups that
did not receive any CF but engaged instead in self-editing and in additional
writing practice, respectively, were also included. The study also examined the
differential value of CF for grammatical and nongrammatical error types as
well as any putative complexity avoidance effects of the CF treatments.
Research into the Effectiveness of Written CF
Early empirical work on the effects of CF on L2 learners writing can be
categorized into two strands: a set of studies that focused on the role of CF
during the revision process and a second group of investigations that set out to
answer the question of whether correction yields a learning effect when new
pieces of writing are inspected. The revision studies demonstrated that language
students were able to improve the accuracy of a particular piece of writing,
based on the feedback provided (e.g., Ashwell, 2000; Fathman & Whalley,
1990; Ferris, 1997; Ferris & Roberts, 2001). These findings thus showed that
CF is a useful editing tool. However, as Truscott and Hsu (2008) have rightly
argued, these results do not constitute evidence of learning, as only two versions
3 Language Learning 62:1, March 2012, pp. 141
7/27/2019 Evidence on the Effectiveness
4/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
of the same draft were compared. It thus remained unclear if students success
in using the feedback during revision would subsequently lead to acquisition of
the corrected forms. One might also speculate that practice with autonomousself-editing of texts might be a better use of instructional time, if CF is really
only beneficial as an editing tool. If one believes that editing is the only source
of benefit of CF, teachers might simply spend more instructional time in class
encouraging the honing of self-editing skills.
The more interesting body of research, therefore, consists of studies that
have investigated the effects of CF on new pieces of writing. The first studies that
did opt to provide insights into the learning potential of CF, however, produced
mixed results (e.g., Chandler, 2003; Kepner, 1991; Polio, Fleck, & Leder, 1998;
Semke, 1984; Sheppard, 1992). Whereas Chandler (2003) concluded that CF
is an effective means of improving the accuracy of students writing over time,
Kepner (1991), Semke (1984), Polio et al. (1998), and Sheppard (1992) failed
to find an effect for error correction. All of these studies, however, suffered
from serious shortcomings in terms of their design, execution, and/or analyses
(see Guenette, 2007, or Van Beuningen, De Jong, & Kuiken, 2008, for a review
of these methodological issues). These investigations thus did not allow for any
firm conclusions on the role of CF in L2 accuracy development and led both
opponents (e.g., Truscott, 1996) and advocates (e.g., Ferris, 1999) of writtenerror correction to call for more, well-designed CF studies.
The recognition that a focus on grammar learning benefits rather than
on revision skills is needed has led the way into a growing body of tightly
controlled investigations that has begun to explore the long-term effects of CF
on L2 writing by comparing learners accuracy performance on pretests and
(delayed) posttests. Following the methodology of oral feedback research (e.g.,
Ellis, Loewen, & Erlam, 2006; Lyster, 2004), the majority of these researchers
(e.g., Bitchener, 2008; Bitchener & Knoch, 2008, 2009, 2010a, 2010b; Ellis,Sheen, Murakami, & Takashima, 2008; Sheen, 2007, 2010) have chosen to
investigate the effects of focused correction, whereby CF only targets one
persistently problematic error type at a time (e.g., errors in the use of English
articles). The rationale behind this focused approach is that learners might
be more likely to notice and understand corrections when just one feature is
targeted (Ellis et al., 2008), particularly in light of a limited processing capacity
model of L2 acquisition (Bitchener, 2008; Sheen, 2007). The extant findings
thus far consistently point at robust and durable positive effects of focused
CF on learners accuracy development (see Xu, 2009, however, for a critical
discussion of the findings by Bitchener, 2008; and see Bitchener, 2009, for a
response).
Language Learning 62:1, March 2012, pp. 141 4
7/27/2019 Evidence on the Effectiveness
5/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
By comparison, evidence on the language learning potential of unfocused
CF, which involves comprehensive correction of every error in students writ-
ing, is scarce. To the best of our knowledge, only four studies have investigatedwhether comprehensive CF yields a learning effect. Two of them (Truscott &
Hsu, 2008; Van Beuningen et al., 2008) compared comprehensive CF to no CF
groups, whereas the other two (Ellis et al., 2008; Sheen, Wright, & Moldawa,
2009) included a focused and a comprehensive CF group in addition to a control
group in their design.
Truscott and Hsu (2008) examined the writing performance of 47 English-
as-a-second-language (ESL) learners, half of whom received comprehensive
corrections and the other half functioned as a control group. Truscott and Hsu
found that although comprehensive CF enabled their learners to improve the
accuracy of a particular text during revision, it did not lead to accuracy gains
when writing a new text. However, Bruton (2009) has noted that a ceiling effect
may explain the findings, as the texts learners wrote during the pretest contained
very few errors to begin with. The second study that compared comprehensive
CF to no CF was conducted as a pilot for the present study (Van Beuningen et
al., 2008). It explored the effects of two types of comprehensive CF and two
control conditions on the writing of 66 L2 learners of Dutch. It was found that
comprehensive error correction not only led to improved accuracy in the revisedversion of a particular piece of writing but also yielded a clear learning effect;
learners who received comprehensive CF made significantly fewer errors in
newly produced texts than pupils whose errors had not been corrected.
Ellis et al. (2008) found accuracy gains for both their focused and com-
prehensive CF groups and thus concluded both are equally effective. However,
some methodological weaknesses of the study have been discussed by Xu
(2009) and, as the authors themselves acknowledged, students in the focused
CF group received more feedback on the target feature (i.e., English articles)than students in the comprehensive CF group. Sheen et al. (2009), on the other
hand, found the focused approach to be more beneficial than provision of com-
prehensive feedback when both were compared to the control group. However,
the authors pointed out that the correction received by the comprehensive CF
group was rather unsystematic in nature; although some errors were corrected,
others were ignored. It is conceivable that this unsystematic way of correcting
negatively influenced the effect of comprehensive CF in this study. More evi-
dence on the potentially differential value of focused versus comprehensive CF
would be most welcome, not only because of the discrepant results reviewed
here but also because the issue of relative effectiveness is particularly relevant
5 Language Learning 62:1, March 2012, pp. 141
7/27/2019 Evidence on the Effectiveness
6/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
for teaching practice, given that both might be valuable feedback methodologies
in different ways.
Ultimately, however, we argue that it is important to produce more evi-dence about the singular contribution of unfocused or comprehensive CF on
the accuracy of new pieces of writing. For one, and notwithstanding the sig-
nificant contribution of the focused CF work to the error correction debate,
the implications that can be drawn from focused CF studies so far are rather
limited, because the targeted linguistic features (i.e., English articles) are typi-
cally selected for maximal simplicity (Ferris, 2010; Truscott, 2010). Xu (2009)
also noted that such a clear focus on just one or two grammatical structures
may lead students to consciously monitor the use of that target feature when
performing on the posttest(s). We are sympathetic to Brutons (2009) concern
that focused CF studies might have framed CF as written grammar exercises
rather than authentic writing tasks, and we would like to heed Hartshorn et
al.s (2010) call for research on more authentic CF methodologies, which focus
on the accurate production of all aspects of writing, simultaneously (p. 89).
Most important to us is the fact that the comprehensive approach most closely
resembles the correction method used in actual teaching practice; a teachers
purpose in correcting his/her pupils written work is (among other things) to
improve accuracy in general, not just the use of one grammatical feature (Ferris,2010; Storch, 2010).
Thus, in the present study, we decided to investigate crucial theoretical
issues surrounding the CF debate by closely examining the pedagogical value
of comprehensive error correction (both during revision as well as for longer
term L2 learning as demonstrated in new writing) while adopting the tightly
controlled methodology of recent focused CF studies. Before presenting the
study, we review the extant empirical evidence bearing on the main additional
variables investigated: the relative effectiveness of direct versus indirect CF, thedifferential effects of CF on grammatical versus nongrammatical errors, and
two potentially harmful side effects of CF.
The Relative Effectiveness of Direct and Indirect CF
Corrective feedback researchers have not only shown interest in the question
of whether correction should be comprehensive or selective in nature. Many
studies have also addressed how questions, exploring the relative effective-
ness of different ways to deliver CF, or CF types. Most of these studies have
categorized their CF methodologies as either direct or indirect. Whereas direct
CF consists of an indication of the error and provision of the correspond-
ing correct L2 form, indirect CF only indicates that an error has been made.
Language Learning 62:1, March 2012, pp. 141 6
7/27/2019 Evidence on the Effectiveness
7/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
Indirect correction methods can take different forms that vary in their explicit-
ness (e.g., underlining of errors, coding of errors). What they share in common
is that instead of the teacher providing the target form, it is left to the learnerto correct his/her own errors.
Various alternative hypotheses concerning the relative effectiveness of di-
rect and indirect CF have been put forward. In support of indirect CF, it has been
suggested that learners will benefit more from it because it engages students in
a more profound form of language processing while they self-edit their writing
(e.g., Ferris, 1995; Lalande, 1982). In this view, the indirect approach requires
pupils to engage in guided learning and problem solving and, as a result, pro-
motes the type of reflection that is more likely to foster long-term acquisition
(Bitchener & Knoch, 2008, p. 415). In support of direct CF, on the other hand,
it has been claimed that the indirect approach might fail because it provides
learners with insufficient information to resolve complex linguistic errors (e.g.,
syntactic errors). Chandler (2003) furthermore argued that whereas direct CF
enables learners to instantly internalize the correct form, learners whose errors
are corrected indirectly do not know if their own hypothesized corrections are
indeed accurate. This delay in access to the target form might level out the
potential advantage of the additional cognitive effort associated with indirect
CF. Moreover, it may be that learners need a certain level of (meta)linguisticcompetence to be able to self-correct their errors using indirect CF (e.g., Ferris,
2004; Hyland & Hyland, 2006; Sheen, 2007).
Clear empirical evidence on the differential effects of direct and indirect CF
on accuracy development is lacking, as research on the issue has produced con-
flicting results. Some researchers have found no differences between the two CF
types (Frantzen, 1995; Robb, Ross, & Shortreed, 1986), others have reported
an advantage for indirect CF (Ferris, 2006; Lalande, 1982), and yet others have
found direct correction to be most effective in their comparisons (Bitchener& Knoch, 2010b; Chandler, 2003; Van Beuningen et al., 2008). Problems of
interpretation cloud these contradictory results, however. For example, some
of the studies were not designed to compare the two CF methodologies, and in
other studies, the operationalization of direct and indirect CF as distinct treat-
ments was not rigorous enough. Additionally, caution must be exercised when
results are proclaimed on descriptive findings versus statistical significance
tests and on comparisons against the other treatment versus the control group.
Thus, in the study in which we piloted the present research methodology (Van
Beuningen et al., 2008), the difference when the direct and indirect CF treat-
ments were compared against each other did not reach significance, at ap-value
of .06, but when each treatment was compared to the two control (no CF)
7 Language Learning 62:1, March 2012, pp. 141
7/27/2019 Evidence on the Effectiveness
8/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
conditions, only the learners receiving direct CF significantly outperformed
pupils in the control groups when writing a new text. Bitchener and Knoch
(2010b), who did report a statistically significant difference between direct andindirect CF, found the advantage to be in favor of the direct approach. However,
the findings between the two studies are not directly comparable, as the former
investigated (direct and indirect) comprehensive CF and the latter investigated
(direct and indirect) focused CF.
The Value of CF for Different Error Types
In his 2007 article, Truscott explained that his case against CF (Truscott, 1996,
1999) was actually a case against grammar correction. He claimed that syntactic
errors in particular might not be amenable to correction, because they are
integral parts of a complex system thatin Truscotts viewis impermeable to
CF. He furthermore suggested that morphological features are evenly unlikely to
benefit from CF because their acquisition not only depends on the understanding
of form but also of meaning and use in relation to other words and portions of
the language system. Truscott (2001, 2007) concluded that if CF has any value
for L2 development, this could only be true for errors that involve simple
problems in relatively discrete items (Truscott, 2001, p. 94)such as spelling
errorsand not for errors in grammar.A number of studies explored the effects of CF on separate error types, and
all reported differing levels of improvement for different types of errors (e.g.,
Bitchener, Young, & Cameron, 2005; Ferris, 2006; Ferris & Roberts, 2001;
Frantzen, 1995; Lalande, 1982; Sheppard, 1992). Ferris (2006), for example,
differentiated among five major error categories: verb errors, noun errors, article
errors, lexical errors, and sentence errors. She found that students receiving CF
only experienced a significant reduction from pretest to posttest in verb errors.
Lalande (1982) distinguished 12 error types and observed that correction onlyled to a significant decrease in orthographical errors. Bitchener et al. (2005)
investigated how focused CF influenced learners accuracy development on
three target structures and found that CF had a greater effect on the accuracy of
past simple tense and articles than on the correct usage of prepositions. None
of these studies, however, could test Truscotts claim that grammar correction is
ineffective, because they were not set out by design to investigate whether CF is
more beneficial for nongrammatical error types than for errors in grammar.
The Potential Harmful Side Effects of CF
Truscott has argued not only that CF is ineffective but also that it has at least
two harmful side effects. One is simplified writing, based on the assumption
Language Learning 62:1, March 2012, pp. 141 8
7/27/2019 Evidence on the Effectiveness
9/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
that CF encourages learners to avoid situations in which they make errors.
In fact, Truscott (2004, 2007) has argued that accuracy gains found in earlier
correction studies might well be attributable to such avoidance and simplifiedwriting instead of CF. His suggestions are in line with limited capacity models
of attention that also predict a trade-off between accuracy and complexity (e.g.,
Skehan, 1998). Within these models, L2 performance is expected to become
more complex when learners are willing to experiment with the target language.
A focus on accuracy, on the other hand, is seen to reflect a greater degree of
conservatism in which learners will try to achieve greater control over more
stable [interlanguage] elements while avoiding extending their L2 repertoire
(Skehan & Foster, 2001, p. 191).
Few studies have investigated the influence of written CF on linguistic
complexity, and, in our opinion, studies that did (Chandler, 2003; Robb et al.,
1986; Sheppard, 1992) could not come to any warranted conclusions. Shep-
pard (1992), for example, reported a negative effect of CF on the structural
complexity of learners writing, but in fact his finding was nonsignificant.
Robb et al. (1986) found that CF had a significant positive effect on written
complexity, but they did not include a no CF control group. The same holds for
Chandler (2003), who did not find any effect of CF on the complexity of stu-
dents writing. An additional problem with the latter study is that Chandlerbased her conclusion on holistic ratings of text quality. In our view, however,
the fact that holistic ratings did not change does not necessarily prove that the
linguistic complexity of learners writing did not change either.
The second harmful side effect of CF identified by Truscott (1996, 2004)
was the diversion of time and energy away from more productive aspects of writ-
ing instruction, such as sheer writing practice. The only study that has directly
tested this claim by comparing the effects of CF to those of writing practice is
an investigation by Sheen et al. (2009). Their results opposed Truscotts claimby showing that learners did not benefit more from writing practice than from
CF.
Research Questions
The present studyof which the setup, tasks, and procedure were piloted in Van
Beuningen et al. (2008)intended to add to the existing body of CF research
by trying to tackle some of the unsettled issues in the debates reviewed above
and by answering eight research questions.
Given the limited empirical evidence about the value of comprehensive CF,
the first two research questions are:
9 Language Learning 62:1, March 2012, pp. 141
7/27/2019 Evidence on the Effectiveness
10/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
RQ1 Is comprehensive written CF useful as an editing toolthat is, does it
enable learners to improve the accuracy of an initial text during revision?
RQ2 Does comprehensive written CF yield alearning effectthat is, does itlead to improved accuracy in new texts written 1 week and 4 weeks after
CF has been provided?
Because it seems plausible that learners benefit from taking a critical look
at their own text and revising it, even without teacher intervention, the present
study furthermore opted to test if comprehensive CF has an added value above
self-correction:
RQ3 Is comprehensive CF more beneficial to learners accuracy developmentthan having the opportunity to correct their own writing (without feed-
back)?
Given that the extant CF research to date has been unable to come to any
clear conclusions regarding the relative efficacy of direct and indirect CF, we
also asked the following:
RQ4 How effective are direct and indirect (comprehensive) CF relative to each
other and to no CF?
Moreover, as no earlier research has been designed to directly address
Truscotts (2001, 2007) claim that CF might have value for nongrammatical
errors but not for errors in grammar, we strived to test this hypothesis by
distinguishing between grammatical and nongrammatical error types in our
analyses:
RQ5 Are grammatical errors less amenable to correction than other types of
errors (i.e., nongrammatical errors)?
In addition, very few studies have explored the potential harmful effects of
CF, and we therefore asked the following:
RQ6 Is comprehensive CF more beneficial to learners accuracy development
than writing practice?
RQ7 Does error correction lead to avoidance of lexically and structurally
(more) complex utterances?
Finally, we explored the potential influence in our sample of learners
educational level on the degree to which they are able to benefit from direct
and indirect CF:
Language Learning 62:1, March 2012, pp. 141 10
7/27/2019 Evidence on the Effectiveness
11/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
RQ8 What (if any) is the influence of pupils educational level on CF efficacy?
Research question 8 holds educational and theoretical implications. In the
first place, it would be valuable for teachers to know if learners from different
educational levels are equally receptive to (direct and indirect) comprehensive
CF. Moreover, we presumed learners educational level to be indicative of
their level of metalinguistic awareness, and the suggestion has been made that
learners with lower levels of (meta)linguistic competence might be less able
to correct their own errors based on indirect CF (e.g., Ferris, 2004; Hyland &
Hyland, 2006; Sheen, 2007). This led us to expect that pupils with a higher
educational level would perhaps profit more from indirect CF than pupils from
a lower level of education.
Method
Setting and Participants
Four Dutch secondary schools with multilingual student populations partici-
pated in the study. Over 80% of those schools pupils came from non-Dutch
language backgrounds;1 although most pupils were born in the Netherlands,
many of them only started learning Dutch in school (i.e., at age 4). Our samplewas very heterogeneous with respect to language background (28 first lan-
guages [L1s]), with Moroccan Arabic (31%), Turkish (16%), and Surinamese
languages (16%: 8% Sranan Tongo, 8% Sarnami Hindustani) being the most
common L1s.
All schools that took part in our study aimed at integrating content and
(second) language instruction by adopting a language-sensitive instructional
approach (e.g., Van Eerde & Hajer, 2005). Following this approach, language
did not only play a central role in language classes but also in classes inwhich the overriding focus was on content (e.g., biology, mathematics, ge-
ography). The main aim is to cater for the special needs of L2 learners and
learners with limited language proficiency, who might experience problems
understanding and acquiring the content due to the linguistic demands of the
input. The present investigation was conducted during biology classes, and our
tasks treated biology-related topics.
The studys population consisted of six intact classes of pupils in their
second year of higher general secondary education (orhavo in Dutch) (N=
134)
and seven intact classes of pupils in their second year of secondary prevocational
education (orvmbo-tin Dutch) (N= 134). In the remainder of this article wewill use the contrast higher level and lower level, respectively, to refer
11 Language Learning 62:1, March 2012, pp. 141
7/27/2019 Evidence on the Effectiveness
12/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
to these two groups of pupils representing distinct educational levels. Pupils
mean age was 14 (minimum 14, maximum 15). Within classes, participants
were randomly assigned to the four different condition groups incorporated inthe study, to prevent a confounding interaction between condition and class.
Furthermore, the male/female and Dutch L1/L2 proportion was kept constant
across the four groups (see Note 1).
Treatments and Controls
The present study featured two experimental treatments and two control con-
ditions. They are described below.
Experimental Group I: Direct CF (DIR)Pupils in the first experimental group received comprehensive direct CF. The
researcher identified all existing linguistic errors and provided the pupils with
the corresponding target forms, as illustrated in Example 1:
Example1 :direct error correction
De lievenheersbeestje is rood. [The ladybug is red.]Het lieveheersbeestje
Experimental Group II: Indirect CF (IND)
Texts of pupils in the second experimental group were corrected indirectly.
The researcher provided an indication of each error and its category, but it was
left to the student to derive the corresponding target forms, as illustrated in
Example 2:
Example2 :indirect error correction
De lievenheersbeestje S is rood. [The ladybug is red.]( = wrong word, S = spelling error)
The error coding system comprised codes for nine different linguistic error
types, classified under three superordinate categories: (a) lexical errors: word
choice; (b) grammatical errors: word form (e.g., verb tense, singular/plural),
word order, incomplete sentences, and addition or omission of a word; and (c)
orthographical errors: spelling, punctuation, and capitalization.
Control Group I: Self-Correction (SLF)
Pupils in the first control group did not receive any feedback on their initial
text but were invited to revise their writing without any available CFthat is,
Language Learning 62:1, March 2012, pp. 141 12
7/27/2019 Evidence on the Effectiveness
13/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
to self-correct their output. Including this control condition enabled us to set
apart effects of error correction from effects of the revision process as such.
Control Group II: Additional Writing Practice (PRC)
Just as the first control group, participants in the second control group did
not receive CF. Unlike the first control group, they were not involved in any
revision activities. Instead, they performed a completely new writing task,
which provided them with the opportunity to practice their writing skills once
more. Inclusion of this second control condition assured an equal distribution of
time-on-task across experimental and control groups, because completion of a
completely new task asked (at least) as much time as revising an already written
text. Moreover, integration of the practice condition enabled us to determine ifCF is more effective than just providing pupils with more practice opportunities
(Truscott, 1996, 2004).
Procedure
The present investigation included four sessions: a pretest session (S1), a treat-
ment/control session (S2), a posttest session (S3), and a delayed-posttest session
(S4). Figure 1 presents an overview of the procedure, which is described in de-
tail below. Our tasks were designed for research purposes only; they were not
part of the standard biology curriculum. However, all tasks were administered
during class periods. The tasks and topics were introduced and explained by
the researcher, who also provided the CF treatments, and the class teacher was
present to maintain order. All tasks and tests were pen-and-paper assignments.
Session 1: Pretest (Week 1)
During the first session (S1), all pupils, irrespective of their group assignment,
were presented with a receptive vocabulary test, a questionnaire concerning
their language background, and the first writing task. Scores on the vocabularytest were taken to provide an indication of pupils overall language proficiency.
A vocabulary test was chosen because earlier research demonstrated that vo-
cabulary knowledge is a good predictor of overall language proficiency (e.g.,
Beglar & Hunt, 1999; Zareva, Schwanenflugel, & Nikolova, 2005). Moreover,
a vocabulary test seemed to be the most suitable instrument out of practical
considerations and time restrictions. We included the vocabulary test scores in
our analyses as a covariate to be able to control for individual L2 proficiency
differences. The output on the first assignment served as a baseline measure
of pupils written accuracy and structural and lexical complexity. Pupils were
given 20 minutes to complete the writing task and were instructed to write a
minimum of 15 lines.
13 Language Learning 62:1, March 2012, pp. 141
7/27/2019 Evidence on the Effectiveness
14/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
Group Condition Pretest
S1: week 1
Treatment/Control
Session
S2: week 2
Posttest
S3: week 3
Delayed
Posttest
S4: week 6
Direct CF
(DIR)
N= 32 + 35
Revision based on
comprehensive
direct CF
(Same text)
Indirect CF
(IND)
N = 29 + 33
Revision based on
comprehensive
indirect CF
(Same text)
Self-Correction
(SLF)
N = 37 + 34
Revision based on
self-correction
(no CF)
(Same text)
Practice
(PRC)
N = 36 + 32
Vocabulary
test
&
Language
background
questionnaire
Initial text
(butterflies)
Writingpractice
(no revision nor CF)
(New text: honeybees)
New text
(ladybugs)
New text
(wasps)
Figure 1 Experimental procedure. DIR= Direct CF, IND = Indirect CF, SLF = Self-correction with no CF, PRC = Additional writing practice with no CF.
Before administering the first assignment, the researcher introduced the
tasks topic by means of a 10-minute mini-lesson, to ensure a comparable
minimal amount of background knowledge on the topic among participants
and across the four conditions. Additionally, pupils were told that their main
Language Learning 62:1, March 2012, pp. 141 14
7/27/2019 Evidence on the Effectiveness
15/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
focus needed to be on communicating content but that their texts would also
be evaluated on linguistic adequacy. To give them an idea of the form-related
features they could attend to while writing, pupils were given a handout listingcommon types of errors and an example for each error category. Participants
were not aware of the fact that they could be receiving CF at a later stage. We
chose to draw learners attention to form across all four groups at this stage for
several reasons. First, we aimed at establishing the unique contribution of CF
to accuracy development. If only the attention of the experimental groups had
been directed toward linguistic form (by means of comprehensive CF provision
during session 2), it would have been unclear if it was the form-oriented focus
in general or the CF as such that led to any observed differences with the
control groups. Second, the handout used in the introduction was reusedin a
slightly adapted version including the error coding systemwhen instructing
pupils in the indirect CF group on the interpretation of the error codes [see the
Session 2: Treatment/Control (Week 2) subsection]. This setup ensured that
the indirect group did not have an advantage as compared to the other groups
in terms of language input. Finally, by directing every learners attention to
linguistic form at the start of the experiment, we strove to foster stable (i.e.,
across sessions) and equal (i.e., across the four groups) task representations.
Had the task instructions in session 1 emphasized only content issues, the wayin which pupils in the experimental (but not the control) groups interpreted
the task would have been likely to change over time because students are
known to attune their task representations toward the task demands set by the
teacher (Broekkamp, Van Hout-Wolters, Van den Bergh, & Rijlaarsdam, 2004).
Whereas pupils initial perceptions of the study tasks goal would have been to
understand and communicate content (in any scenario where there was no initial
instructional focus on language form), receiving CF could have resulted in a
shift of learners task representations toward focusing on linguistic accuracy.
Session 2: Treatment/Control (Week 2)
The treatment/control session (S2) took place 1 week after S1. Pupils in the two
CF groups received the directly or indirectly corrected versions of their initial
texts (cf. Examples 1 and 2 earlier). They were asked to copy the initial texts,
revising all errors corrected by the researcher. Before starting their revisions,
pupils in the indirect CF group were given a handout, accompanied by oral
instructions, on how to interpret and use the error codes in their texts. The two
control groups each underwent a different condition involving no CF. Pupils in
the self-correction group were handed the texts they wrote during the pretest
session without any alterations. They were invited to self-correct their pretest
15 Language Learning 62:1, March 2012, pp. 141
7/27/2019 Evidence on the Effectiveness
16/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
writing by thoroughly reading over their texts and searching for any elements
that needed revising. Even if no such utterances would be found, pupils were
still obliged to copy their initial texts. Pupils in the practice group, on theother hand, were not given the opportunity to revise their pretest texts but were
presented with a new writing task instead. The researcher shortly explained
the new tasks topic before pupils started writing. All pupils were allocated 20
minutes to finish the task they were presented with, irrespective of the condition
to which they were assigned.
Sessions 3 and 4: Posttest (Week 3) and Delayed Posttest (Week 6)
The first posttest (S3) was administered 1 week after the treatment/control
session (S2), and the delayed posttest took place 4 weeks after S2. During both
posttest sessions, all pupils were given 20 minutes to produce a text (at least
15 lines in length) on a new topic, which was again shortly introduced by the
researcher.
Writing Tasks
The tasks used throughout the study were four writing assignments on a biology-
related topicnamely, the metamorphosis of different insects: butterflies (S1),
honeybees (S2), ladybugs (S3), and wasps (S4).2 Whereas three of the taskswere performed by all pupils, the task on a wasps metamorphosis was only
performed by participants in the second control group, who practiced their
writing during the treatment/control session (S2).
All tasks within the series had a comparable presentation and design; they
invited pupils to write an email to a classmate who was absent during the
researchers explanation of the tasks topic. Pupils were asked to explain
the metamorphosis of the relevant insect, based on a series of images depicting
the metamorphosis process. To avoid individual differences in familiarity withthe relevant vocabulary, (potentially) unfamiliar words (e.g., larva, cocoon)
were glossed.
Data Processing and Reliability
All handwritten output was transcribed and coded using the CLAN program
of CHILDES (MacWhinney, 2000). Two research assistants transcribed pupils
texts, and the first author was responsible for coding them. Texts were coded
for linguistic errors and clause types (i.e., main clause and subordinate clause).
The coding procedure was entirely blind; during coding, the researcher was
unaware of the study group to which the text at hand belonged. Ten percent of
the data was also coded by one of the assistants to be able to establish interrater
Language Learning 62:1, March 2012, pp. 141 16
7/27/2019 Evidence on the Effectiveness
17/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
Table 1 Average levels of intrarater and interrater agreement for linguistic measures
Overall Grammatical Nongrammatical Structural
accuracy accuracy accuracy complexity
ICC Intrarater .998 .977 .996 .974
ICC Interrater .981 .921 .975 .946
reliability and recoded, 6 months later, by the first author to measure intrarater
reliability.
The reliability results are presented in Table 1. We calculated intraclass
correlation coefficients (ICC) to establish the average levels (over sessions) ofintrarater and interrater agreement for overall accuracy, grammatical accuracy,
nongrammatical accuracy, and structural complexity (cf. next section). The
intrarater ICCs were calculated from an ANOVA two-way mixed-effects model,
and they provide an indication of the variability due to variation within the same
rater. A two-way random-effects model was used to estimate the interrater ICCs,
reporting the proportion of the variability due to variation among raters. As
shown in Table 1, high levels of agreement within the same rater as well as
between raters were found for all four measures.
Linguistic Measures for the Dependent Variables
Every text was analyzed for linguistic accuracy and, in order to address research
question 7, also for structural complexity and lexical diversity.
As did earlier studies exploring the effectiveness of written CF (e.g.,
Chandler, 2003; Truscott & Hsu, 2008), we used an error ratio to measure
overall accuracy: [number of linguistic errors/total number of words] 10.A 10-word ratio rather than the more common 100-word ratio was used be-cause pupils texts were relatively short (i.e., around 120 words). To be able to
test Truscotts (2001, 2007) claim that nongrammatical errors might be more
amenable to correction than errors in grammar (our research question 5), we
broke down our overall accuracy measure into a measure of grammatical ac-
curacy and a measure of nongrammatical accuracy. The first measure was a
ratio calculated on the basis of the sum of the number of article errors, inflec-
tional errors, word order errors, omissions of necessary elements, additions
of nonnecessary elements, pronominal errors, and other grammatical errors
(i.e., [number of grammatical errors/total number of words] 10). Lexicalerrors, orthographical errors, appropriateness/pragmatic errors, and other non-
grammatical errors,3 on the other hand, were included in the nongrammatical
17 Language Learning 62:1, March 2012, pp. 141
7/27/2019 Evidence on the Effectiveness
18/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
accuracy ratio (i.e., [number of nongrammatical errors/total number of words]
10).4
To measure structural complexity, we used a subordination index: the num-ber of subordinate clauses as a percentage of the total number of clauses (i.e.,
[number of subclauses/total number of clauses] 100) (Norris & Ortega, 2009;Wolfe Quintero, Inagaki, & Kim, 1998). Lexical diversity was calculated using
Guirauds Index, a type-token ratio that corrects for text length (types/
tokens)
(Guiraud, 1954).
Statistical Analyses
As our participants came from different classes within different schools, our
data were structured hierarchically. In other words, the performances of pupilswithin one class or school were expected to be more similar to each other than
to performances of pupils in other classes and schools; a teacher, for example,
is likely to affect the performances of his/her pupils. This consideration was
important for the choice of appropriate statistical procedures. ANOVA, specif-
ically, relies on the independence of cases and may have been inappropriate
for our data if indeed we had found that observations within classes or schools
were dependent. We therefore applied linear multilevel analyses (e.g., Snijders
& Bosker, 1999) to explore the relationship between observations within classesand schools. The multilevel procedure enabled us to explicitly model possible
dependencies in the data by including class and school as random factors.5
These multilevel analyses showed, however, that class and school never made a
significant contribution to the model. After we had ascertained in this way that
the assumption of independence was met, we proceeded to analyze our data
using AN(C)OVAs because this procedure has proven to have more power in
absence of data dependencies (e.g., Snijders & Bosker, 1999).
We used ANCOVAs to test for between-groups differences on the dif-
ferent dependent variables (i.e., overall accuracy, grammatical accuracy, non-
grammatical accuracy, structural complexity, and lexical diversity) in the treat-
ment/control session (S2), first posttest session (S3), and delayed-posttest ses-
sion (S4). The initial ANCOVA models contained group condition and ed-
ucational level as between-subjects variables and language proficiency as a
covariate. In addition, we incorporated learners pretest (S1) performance on
the relevant dependent variable as a covariate to account for effects of initial in-
dividual differences. We started out each ANCOVA with a full model, including
all relevant factors, and all possible interactions between those factors. Factors
and interactions between factors that did not explain a significant proportion of
the variance were excluded from the final ANCOVA models.
Language Learning 62:1, March 2012, pp. 141 18
7/27/2019 Evidence on the Effectiveness
19/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
Table 2 Descriptive statistics for overall language proficiency by educational level and
group condition
Higher educational level (havo) Lower educational level (vmbo-t)Group Group
condition N M SD condition N M SD
DIR 32 76.09 10.52 DIR 34 72.24 11.08
IND 26 74.85 11.80 IND 31 67.97 10.71
SLF 33 80.45 8.93 SLF 31 70.61 11.93
PRC 33 74.09 8.46 PRC 33 71.97 8.54
Total 124 76.46 10.10 Total 129 70.75 10.63
Note. The maximum possible score on the receptive vocabulary test was 108. The sample
size for the vocabulary test is smaller than the total sample size for participants in the
study because some participants failed to complete the vocabulary test. DIR= directCF, IND = indirect CF, SLF = self-correction with no CF, PRC = additional writing
practice with no CF.
Results
We will first present some relevant descriptive and inferential results on thelanguage proficiency and writing in S1 that help ascertain the comparability
of the four study groups in terms of learners proficiency profiles and baseline
performance at the onset of the study, prior to delivery of the two treatments
and two control conditions. We will then successively present the findings
regarding the value of comprehensive CF as an editing tool (RQ1), its language
learning potential (RQ2), the effects of direct and indirect CF on grammatical
and nongrammatical accuracy (RQ4 and RQ5), and the influence of CF on the
complexity of learners writing (RQ7). The comparison between the effects of
CF and those of self-correction (RQ3) and additional writing practice (RQ 7),
and the role of pupils educational level (RQ8), will be discussed throughout
the Results section.
Comparability of Groups at the Onset of the Study
The descriptive statistics for overall language proficiency (i.e., pupils scores
on the receptive vocabulary test), itemized by educational level and group as-
signment, are included in Table 2. In Tables 3, 4, and 5 readers will find the
descriptive statistics for overall accuracy, grammatical accuracy, and nongram-
matical accuracy scores on the first writing task about butterflies, done in S1
19 Language Learning 62:1, March 2012, pp. 141
7/27/2019 Evidence on the Effectiveness
20/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
Table 3 Descriptive statistics for overall accuracya by educational level, condition, and
session
Pretest
(S1)
Treatment/control
session
(S2)
Posttest
(S3)
Delayed
posttest
(S4)Educational Group
level condition N M SD M SD M SD M SD
High (havo) DIR 32 1.52 0.75 0.37 0.29 1.08 0.66 1.32 0.89
IND 29 1.25 0.67 0.42 0.32 1.05 0.46 0.89 0.40
SLF 37 1.39 0.60 1.28 0.66 1.61 0.82 1.50 0.70
PRC 36 1.29 0.63 1.30 0.72 1.49 0.73 1.41 0.77
Total 134 1.37 0.66 0.88 0.71 1.33 0.73 1.31 0.75
Low (vmbo-t) DIR 35 1.69 0.70 0.32 0.24 1.64 0.76 1.42 0.78
IND 33 1.95 0.88 0.74 0.63 1.73 0.97 1.54 0.90
SLF 34 1.85 0.75 1.50 0.68 1.97 0.93 1.75 0.84
PRC 32 1.79 1.00 2.19 1.00 2.26 1.00 2.08 1.10
Total 134 1.82 0.83 1.17 0.99 1.90 0.94 1.70 0.93
Note.DIR=
direct CF, IND=
indirect CF, SLF=
self-correction with no CF, PRC=additional writing practice with no CF.
aNumber of linguistic errors per 10 words.
as a baseline measure of pupils written accuracy and structural and lexical
complexity, also itemized by educational level. The descriptive results for the
structural complexity and lexical diversity on S1 are shown in Tables 6 and 7,
respectively.A series of ANOVAs on pupils baseline texts (S1) with group condition
as a between-subjects variable showed that there were no initial differences
among the four groups in overall accuracy,F(3, 264)
7/27/2019 Evidence on the Effectiveness
21/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
Table 4 Descriptive statistics for grammatical accuracyaby educational level, condition,
and session
Pretest
(S1)
Treatment/control
session
(S2)
Posttest
(S3)
Delayed
posttest
(S4)Educational Group
level condition N M SD M SD M SD M SD
High (havo) DIR 32 0.43 0.27 0.09 0.10 0.32 0.22 0.39 0.34
IND 29 0.30 0.24 0.10 0.11 0.36 0.27 0.37 0.22
SLF 37 0.37 0.25 0.37 0.23 0.42 0.30 0.46 0.33
PRC 36 0.37 0.27 0.36 0.27 0.44 0.23 0.45 0.23Total 134 0.37 0.26 0.24 0.24 0.39 0.26 0.42 0.29
Low (vmbo-t) DIR 35 0.55 0.31 0.10 0.11 0.49 0.33 0.39 0.26
IND 33 0.47 0.33 0.21 0.23 0.51 0.27 0.49 0.32
SLF 34 0.53 0.32 0.46 0.29 0.64 0.41 0.60 0.39
PRC 32 0.52 0.36 0.59 0.41 0.69 0.44 0.72 0.46
Total 134 0.52 0.33 0.34 034 0.58 0.38 0.55 0.38
Note.DIR
=direct CF, IND
=indirect CF, SLF
=self-correction with no CF, PRC
=additional writing practice with no CF.aNumber of grammatical errors per 10 words.
language proficiency, F(3, 253) = 1.96, p = .120, p 2 = .02. These findingsensured that our four groups were comparable at the onset of the study.
On the other hand, however, a series of ANCOVAs with educational level
as a between-subjects variable and language proficiency as a covariate revealed
that these two factors did influence learners pretest writing. We found thatboth educational level,F(1, 250) = 14.20,p
7/27/2019 Evidence on the Effectiveness
22/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
Table 5 Descriptive statistics for nongrammatical accuracya by educational level, con-
dition, and session
Pretest
(S1)
Treatment/control
session
(S2)
Posttest
(S3)
Delayed
posttest
(S4)Educational Group
level condition N M SD M SD M SD M SD
High (havo) DIR 32 1.09 0.63 0.28 0.27 0.77 0.56 0.93 0.82
IND 29 0.95 0.52 0.32 0.26 0.69 0.39 0.52 0.29
SLF 37 1.01 0.55 0.92 0.56 1.20 0.69 1.04 0.59
PRC 36 0.93 0.54 0.94 0.63 1.05 0.66 0.97 0.69
Total 134 1.00 0.56 0.64 0.56 0.95 0.63 0.89 0.66
Low (vmbo-t) DIR 35 1.14 0.56 0.23 0.17 1.15 0.65 1.03 0.75
IND 33 1.48 0.72 0.52 0.47 1.22 0.82 1.05 0.72
SLF 34 1.32 0.63 1.05 0.64 1.33 0.69 1.15 0.70
PRC 32 1.28 0.82 1.60 0.85 1.56 0.89 1.36 0.95
Total 134 1.30 0.69 0.83 0.78 1.32 0.78 1.15 0.79
Note.DIR
=direct CF, IND
=indirect CF, SLF
=self-correction with no CF, PRC
=additional writing practice with no CF.aNumber of nongrammatical errors per 10 words.
were well guided in incorporating the factors educational level and language
proficiency in later analyses when answering our research questions.
Effects of Comprehensive CF on Overall Written Accuracy
Table 3 includes the descriptive statistics for overall accuracy scores for allfour group conditions, itemized per educational level and session (i.e., treat-
ment/control session, posttest, and delayed posttest). Figure 2 is a graphic illus-
tration of the descriptive results in Table 3, collapsed over educational level. It
shows the accuracy development of the four study groups over time. The graph
provides a descriptive preview of this studys main findings, which will be
presented in this section. As can be seen from the graph, all groups performed
similarly at the pretest (S1). In the treatment/control session (S2), however, the
error rate of the CF groups and the self-correction group decreased, whereas
the number of errors committed by the practice group increased. On the two
posttests (S3 and S4), we still see an error rate difference between the control
and experimental groups, in favor of the latter.
Language Learning 62:1, March 2012, pp. 141 22
7/27/2019 Evidence on the Effectiveness
23/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
Table 6 Descriptive statistics for structural complexityaby educational level, condition,
and session
Pretest
(S1)
Treatment/contro
lsession
(S2)
Posttest
(S3)
Delayed
posttest
(S4)Educational Group
level condition N M SD M SD M SD M SD
High DIR 32 16.31 10.46 17.30 10.87 14.67 9.14 10.27 8.13
(havo) IND 29 16.30 8.98 18.07 10.20 14.36 10.38 10.98 8.25
SLF 37 14.74 8.94 14.39 9.30 15.65 10.92 10.94 10.62
PRC 36 19.86 10.02 16.13 8.73 15.75 9.32 15.80 8.02Total 134 16.83 9.71 16.35 9.74 15.16 9.88 12.06 9.10
Low DIR 35 20.78 8.17 21.66 8.27 18.65 10.77 15.17 9.64
(vmbo-t) IND 33 19.41 11.66 21.04 10.57 16.70 10.98 10.84 8.95
SLF 34 15.03 8.95 16.19 8.62 14.57 11.00 11.25 7.67
PRC 32 18.86 8.00 15.19 8.23 13.66 8.65 13.17 8.30
Total 134 18.52 9.45 18.57 9.32 15.89 10.45 12.72 8.73
Note.DIR
=direct CF, IND
=indirect CF, SLF
=self-correction with no CF, PRC
=additional writing practice with no CF.aSubordination Index (i.e., [number of subclauses/total number of clauses] 100).
Analyses of covariance were used to test for between-groups accuracy dif-
ferences in the treatment/control session (S2), posttest (S3), and delayed posttest
(S4). The initial ANCOVA models contained group condition and educational
level as between-subjects variables and language proficiency and overall ac-
curacy S1 as covariates. Nonsignificant factors and nonsignificant interactions
between them were excluded from the final models. Table 8 offers a previewof the significant differences observed between groups in post hoc pairwise
comparisons (to be reported in detail below) in overall, grammatical, and non-
grammatical accuracy across the three sessions, together with the associated
Cohensdeffect sizes.
Revision Effects (S2)
During revision, pupils language proficiency, F(1, 240)= 2.63, p= .106,
p
2
=.01, and the interaction between condition and language proficiency,
F(3, 240) = 2.40,p = .069, p 2 = .03, did not prove to play a significant role.These factors were therefore discarded from the final ANCOVA model. The
23 Language Learning 62:1, March 2012, pp. 141
7/27/2019 Evidence on the Effectiveness
24/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
Table 7 Descriptive statistics for lexical diversitya by educational level, condition, and
session
Pretest
(S1)
Treatment/control
session
(S2)
Posttest
(S3)
Delayed
posttest
(S4)Educational Group
level condition N M SD M SD M SD M SD
High (havo) DIR 32 7.13 0.85 7.28 0.91 6.87 0.90 6.51 0.80
IND 29 7.10 0.77 7.10 0.73 6.69 0.68 6.61 0.63
SLF 37 7.09 0.81 7.06 0.78 6.70 0.90 6.31 0.98
PRC 36 7.31 0.50 6.72 0.72 6.88 0.71 6.63 0.83Total 134 7.16 0.74 7.03 0.81 6.79 0.80 6.51 0.84
Low (vmbo-t) DIR 35 7.00 0.81 7.26 0.70 6.51 0.73 6.58 0.69
IND 33 6.77 1.06 7.03 1.00 6.64 1.00 6.47 0.89
SLF 34 7.08 0.75 7.12 0.82 6.46 0.70 6.37 0.75
PRC 32 6.96 0.87 6.53 0.77 6.59 0.87 6.57 0.70
Total 134 6.95 0.88 7.00 0.87 6.55 0.74 6.50 0.75
Note.DIR
=direct CF, IND
=indirect CF, SLF
=self-correction with no CF, PRC
=additional writing practice with no CF.aGuirauds Index (types/
tokens).
definitive analysis revealed significant effects of group condition, F(3, 259)
= 111.24,p
7/27/2019 Evidence on the Effectiveness
25/41
Van Beuningen, De Jong, and Kuiken Effectiveness of Comprehensive CF
Table8S
ummaryofsignificantcontrastsamongfourgroupswithass
ociatedCohensdeffectsizes
Overallaccuracy
d
Gra
mmaticalaccuracy
d
Nongrammaticalaccuracy
d
Treatment/controlsession(S2)
D
IR>
SLF
1.9
9
DIR>
SLF
1.5
2
DIR>
SLF
1.6
1
D
IR>
PRC
2.7
4
DIR>
PRC
1.8
5
DIR>
PRC
2.5
3
I
ND>
SLF
1.5
5
IND>
SLF
1.0
5
IND>
SLF
1.3
3
I
ND>
PRC
2.2
9
IND>
PRC
1.3
7
IND>
PRC
2.0
6
Posttest(S
3)
D
IR>
SLF
0.6
7
DIR>
PRC
0.5
5
DIR>
SLF
0.5
3
D
IR>
PRC
0.8
1
DIR>
PRC
0.6
5
I
ND>
SLF
0.7
0
IND>
SLF
0.6
8
I
ND>
PRC
0.8
4
IND>
PRC
0.8
0
Delayedposttest(S4)
D
IR>
PRC
0.6
3
DIR>
PRC
0.6
5
IND>
SLF
0.6
6
IND>
SLF
0.6
6
IND>
PRC
0.8
2
I
ND>
PRC
0.9
4
Note.
DIR
=
directCF,
IND=
indirectC
F,
SLF=
self-correctionwith
noCF,
PRC=
additionalwritingpracticewithnoCF.
p