This article was downloaded by: [North Carolina State University] On: 30 April 2013, At: 08:13 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Research Quarterly for Exercise and Sport Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/urqe20 The Effects of Repeated Retention Tests Can Benefit as Well as Degrade Timing Performance Jeffrey T. Fairbrother a & Joao Augusto de Camargo Barros a a Department of Exercise, Sport, and Leisure Studies, University of Tennessee Published online: 23 Jan 2013. To cite this article: Jeffrey T. Fairbrother & Joao Augusto de Camargo Barros (2010): The Effects of Repeated Retention Tests Can Benefit as Well as Degrade Timing Performance, Research Quarterly for Exercise and Sport, 81:2, 171-179 To link to this article: http://dx.doi.org/10.1080/02701367.2010.10599664 PLEASE SCROLL DOWN FOR ARTICLE Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.

The Effects of Repeated Retention Tests Can Benefit as Well as Degrade Timing Performance


RQES: June 2010 171

Fairbrother and Barros

Key words: forgetting, memory updating, motor learning, testing effect

In the motor domain, some theorists have argued for using multiple retention tests to assess learning (Christina & Shea, 1988, 1993), while others have suggested that administering two tests might lead to performance changes on the second test compared to a single test (Schmidt & Lee, 1999). Christina and Shea (1988, 1993) argued that immediate retention tests can be influenced by temporary performance factors, while delayed tests necessarily reflect both the dissipation of these effects and forgetting across the retention delay. Consequently, they advocated that using multiple tests would provide a clearer picture of short- and long-term motor skill retention. In contrast, Schmidt and Lee (1999) noted that when two tests are administered to the same individuals, the experience gained from the first test might logically influence performance during the second. To prevent such potential testing effects, they recommended using split-group designs that assign half the participants in each condition to an immediate test and half to a delayed test.

Despite the interest in repeated testing, little attention has been devoted to directly investigating the effects of repeated retention tests on motor learning. Magnuson, Shea, and Fairbrother (2004) administered retention tests to five experimental groups that took either one or two tests at different delays following practice. The top portion of Table 1 displays the retention schedules for the groups important to the present study. The D1-D1 group¹ received tests at 10 min and 20 min after acquisition. The D1-D2 group received tests at 10 min and 24 hr after acquisition. The D2-D2 group received tests at 23 hr 50 min and 24 hr after acquisition. A control group (D2-Cont) received only one test administered at 24 hr. The Magnuson et al. (2004) results demonstrated that administering two retention tests degraded timing accuracy during the 24-hr retention test but only for the group that received the first test at 23 hr 50 min after acquisition. This decrement was seen in two ways. First, there was a significant increase in constant error (CE) and absolute constant error (ACE) from the first to second tests for the D2-D2 group. In contrast, the groups that took the first test on the same day as acquisition (D1-D1 and D1-D2) showed no increases in CE or ACE. Second, performance during the 24-hr retention test was less accurate for the D2-D2 group compared to the D1-D2 and D2-Cont groups.

Jeffrey T. Fairbrother and Joao Augusto de Camargo Barros

Submitted: April 24, 2008. Accepted: March 22, 2009. Jeffrey T. Fairbrother and Joao Augusto de Camargo Barros are with the Department of Exercise, Sport, and Leisure Studies at the University of Tennessee.

In this study, we examined the effects of interference and repeated retention tests by comparing groups that performed (a) one or two tests, or (b) two tests separated by interpolated tasks. The task involved pressing five keys in 925 ms. Constant error increased after Block 1 of the second test for the group completing the interpolated tasks. Variable error decreased across retention tests and was smaller for the two-test groups compared to the one-test control. Results differed from previous reports of degraded timing accuracy (Magnuson, Shea, & Fairbrother, 2004), suggesting the present results may have been related to highly accurate performance during the first retention test that reflected successful initial encoding of task information.

Research Quarterly for Exercise and Sport, ©2010 by the American Alliance for Health, Physical Education, Recreation and Dance. Vol. 81, No. 2, pp. 171–179

Motor Behavior

Magnuson et al. (2004) argued that the effects of the repeated retention tests were isolated in the D2-D2 group because of the relatively long delay until taking their first retention test at 23 hr 50 min after acquisition. This delay diminished the availability of accurate task information, which led to the re-encoding of errors that occurred during the test. Magnuson et al. (2004) also suggested that the loss of context information from acquisition undermined re-encoding such that the re-encoded task information was less accessible for recall during the subsequent 24-hr retention test. In contrast, accurate information related to both the task and practice context was available for the D1-D1 and D1-D2 groups, because they received their first test after only a 10-min delay. The authors also concluded that re-encoding occurred after the first test for all the two-test groups, because there were no block effects in either CE or ACE during the tests and no differences between groups during their respective first tests (i.e., the 10-min test for D1-D1 and D1-D2, and the 23-hr 50-min test for D2-D2). This argument was consistent with views of memory as reconstructive (Bartlett, 1932; Bergman & Roediger, 1999) as well as the proposal that memory representations do not change until reactivated by recall, at which point they are subsequently updated and re-encoded (Estes, 1997). Consequently, Magnuson et al. (2004) proposed that the effects of repeated retention tests on CE and ACE resulted from a combination of decay (i.e., forgetting that occurred during the 23 hr 50 min following acquisition) and interference. We refer to this argument as the faulty re-encoding hypothesis.

Magnuson et al. (2004) speculated that forgetting caused by intertask interference might accentuate² repeated retention effects in multiple task learning situations (i.e., further degrade performance). They did not specify the precise time at which intertask interference might interact with the effects of repeated retention tests; however, they indicated at least two distinct possibilities. First, they referred to multiple task learning settings that typically involve intertask interference during acquisition and retention. Fairbrother, Shea, and Marzilli (2007) examined this possibility and found no evidence indicating any interactions between the effects of the interference and testing manipulations. Second, they also referred to forgetting due to intertask interference, which presumably occurs when the interference is experienced during memory re-encoding (i.e., sometime during the retention phase).

The purpose of the present study was to examine the possibility that forgetting due to intertask interference during the retention phase would magnify the negative effects of repeated retention tests. Therefore, the study was designed to (a) replicate the conditions under which Magnuson et al. (2004) found effects of repeated retention tests and (b) directly test the proposition that intertask interference during retention would magnify these effects. Interference was introduced by a requirement to complete several trials of two interpolated tasks between the two retention tests, a manipulation in keeping with interference theory (Underwood, 1957). We assumed that using interpolated tasks similar to the practiced task would result in so-called "intertask" processing related to between-task comparisons (Magnuson et al., 2004; Shea & Zimny, 1983, 1988).

The bottom portion of Table 1 shows the testing schedule used in the present experiment. Two groups received two retention tests administered 23 hr 50 min and 24 hr after acquisition to match the testing schedule of the D2-D2 group that showed accuracy decrements in the Magnuson et al. (2004) study. According to the faulty re-encoding hypothesis, the initial delay of 23 hr 50 min should increase the likelihood of re-encoding errors for the task and reduce accessibility of the representation. Furthermore, the requirement to perform interpolated tasks between the first and second retention tests should cause forgetting due to intertask interference during re-encoding processes. Such interference should degrade the memory representation for the practiced task in a number of ways (see Chandler & Fisher, 1996). For example, contextual information, such as the "get ready" prompt, might be remapped to the interpolated tasks, or the new timing goals for the same movement sequence might lead to a recombination of task features in memory. Therefore, we expected that: (a) the groups taking two retention tests would show increased CE and ACE (i.e., degraded timing accuracy) compared to the one-test control group during the 24-hr retention test; (b) CE and ACE would increase across successive tests for the groups taking two tests; and (c) increased CE and ACE across retention tests would be more pronounced for the group completing interpolated tasks between tests than for the group that did not.

Table 1. Schedule of retention tests for groups in Magnuson et al. (2004) and the current experiment

Group      10 min   20 min   23 hr 50 min   INT trials   24 hr

Magnuson, Shea, & Fairbrother (2004)
D1-D1      T1       T2       -              -            -
D1-D2      T1       -        -              -            T2
D2-D2      -        -        T1             -            T2
D2-Cont    -        -        -              -            T1

Current experiment
No-INT     -        -        Ret 1          -            Ret 2
INT        -        -        Ret 1          YES          Ret 2
CTRL       -        -        -              -            Ret 2

Note. INT = interpolated; T1 = first test; T2 = second test administered to groups that took two tests in Magnuson et al. (2004); Ret 1 = test given at 23 hr 50 min; Ret 2 = test given at 24 hr in current experiment; control groups (D2-Cont and CTRL) received only one retention test administered 24 hr after acquisition; the INT group completed interpolated trials between tests while the No-INT group did not.


Method

Participants

Fifty-one students (M age = 21.4 years, SD = 4.5) volunteered to participate in the experiment. All participants read and signed an informed consent form notifying them of their right to participate or withdraw from the experiment at any time. Participants had no prior experience with the task and were unaware of the study’s purpose.

Apparatus and Task

The apparatus consisted of a personal computer, color monitor, and keyboard positioned on a standard table. Participants sat in front of the computer so they could comfortably access the keys using the index finger of their right hand. The 925-ms task used during acquisition and retention was identical to the one used by Magnuson et al. (2004). The task required participants to complete a four-segment movement pattern by sequentially depressing five keys on the keypad in a prescribed order (i.e., 0-2-6-4-9). Figure 1 shows the diagram displayed on the monitor during acquisition and retention phases. The overall timing goal for the acquisition and retention tasks was 925 ms, while the overall timing goals for the two interpolated tasks were 725 ms and 1,125 ms, respectively. Both interpolated tasks required the same movement pattern as the one used during acquisition and retention. The keys involved in the movement pattern were covered by yellow tape, and the remaining keys were covered by black tape. A custom software program written in E-Prime 1.2 (Psychology Software Tools, Inc., Pittsburgh, PA) presented task cues, recorded data, and administered feedback.

Procedure

The procedures matched those used by Magnuson et al. (2004) with two exceptions. First, the present experiment replicated only the group that showed negative consequences of repeated testing (i.e., the D2-D2 group) and the matching control group (i.e., the D2-Cont group). Second, the present experiment included a third group that experienced intertask interference by completing interpolated tasks between retention tests. Table 1 shows the experimental groups and testing schedules used in the present experiment and Magnuson et al. (2004).

Participants were randomly assigned to one of the three experimental groups. The No Interference group (No-INT) took one retention test at 23 hr 50 min (Ret 1) and one at 24 hr (Ret 2) after acquisition. The Interference group (INT) also took Ret 1 and Ret 2 and performed two interpolated tasks between the tests. The control group (CTRL) took only Ret 2. Each participant received written instructions, which the experimenter also read aloud. A 2-s display of the words “Get Ready” preceded each trial. Following this warning, the monitor displayed the 925-ms goal as a cue to respond. After a trial, knowledge of results (KR) in terms of constant error was displayed for 2 s. KR was withheld following the last practice trial to prevent its use in performing the first retention test trial (Salmoni, Schmidt, & Walter, 1984). All three groups completed 40 trials of the 925-ms task during acquisition and 15 trials of the 925-ms task during retention. Following Ret 1, the INT group completed 15 trials each of the 725-ms and 1,125-ms interpolated tasks presented in an alternating fashion. Retention test procedures were the same as those during acquisition, with the exception that KR was not administered. For the 725-ms and 1,125-ms interpolated tasks, the monitor displayed the respective overall goal movement time as the cue to respond. No KR was given for the interpolated tasks.
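
The trial counts described in the Procedure can be laid out as a per-group schedule. The following Python sketch is only an illustration and is not the authors' E-Prime program: the function name `build_schedule`, the tuple layout, and the choice of which interpolated task comes first are assumptions, since the article states only that the two interpolated tasks alternated.

```python
GOAL_MS = 925                        # timing goal for acquisition and retention
INTERPOLATED_GOALS_MS = (725, 1125)  # timing goals for the two interpolated tasks


def build_schedule(group, first_interpolated_goal=725):
    """Return a list of (phase, goal_ms, kr_given) tuples for one participant.

    `first_interpolated_goal` is a hypothetical parameter: the article does not
    say which interpolated task started the alternation.
    """
    if group not in ("No-INT", "INT", "CTRL"):
        raise ValueError(f"unknown group: {group}")
    if first_interpolated_goal not in INTERPOLATED_GOALS_MS:
        raise ValueError("first_interpolated_goal must be 725 or 1125")

    # 40 acquisition trials; KR is withheld after the last practice trial.
    schedule = [("Acquisition", GOAL_MS, trial < 39) for trial in range(40)]

    # Ret 1 (23 hr 50 min): 15 no-KR trials for the two-test groups only.
    if group != "CTRL":
        schedule += [("Ret 1", GOAL_MS, False)] * 15

    # INT group only: 15 trials of each interpolated task, alternating, no KR.
    if group == "INT":
        second = 1125 if first_interpolated_goal == 725 else 725
        for _ in range(15):
            schedule.append(("Interpolated", first_interpolated_goal, False))
            schedule.append(("Interpolated", second, False))

    # Ret 2 (24 hr): 15 no-KR trials for every group.
    schedule += [("Ret 2", GOAL_MS, False)] * 15
    return schedule


for name in ("No-INT", "INT", "CTRL"):
    print(name, len(build_schedule(name)))
```

Under these assumptions the CTRL, No-INT, and INT groups complete 55, 70, and 100 trials in total, respectively.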

Data Treatment and Analysis

Figure 1. Diagram of the key-pressing task displayed to the participants. The arrows indicate the sequence in which the keys were pressed.

Movement time (MT) was recorded for each trial during acquisition and retention testing. MT was the time elapsed from the depression of the first key until depression of the final key in the sequence. CE was calculated for each trial and then averaged across blocks of five trials. ACE and variable error (VE) were calculated for each block of trials. CE was the difference between actual MT and the goal movement time. ACE was the absolute value of CE for each participant. VE was the population standard deviation³ for the trials in a block. CE, ACE, and VE were considered to represent timing error, timing error without regard to direction, and timing consistency, respectively (Schmidt & Lee, 1999, pp. 21–23). Spatial errors were recorded when a participant did not depress the correct key sequence.
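
The three error measures defined under Data Treatment and Analysis can be illustrated with a short Python sketch for a single five-trial block. The movement times below are hypothetical values chosen for the example, not data from the study, and the function name `block_errors` is an assumption introduced here.

```python
from statistics import mean, pstdev

GOAL_MS = 925  # overall timing goal for the practiced task


def block_errors(movement_times_ms, goal_ms=GOAL_MS):
    """Return (CE, ACE, VE) in ms for one block of trials.

    CE  : mean signed difference between movement time and the goal.
    ACE : absolute value of the block CE.
    VE  : population standard deviation of the trials in the block.
    """
    trial_ce = [mt - goal_ms for mt in movement_times_ms]
    ce = mean(trial_ce)
    ace = abs(ce)
    ve = pstdev(trial_ce)  # same value as pstdev of the raw movement times
    return ce, ace, ve


# Hypothetical movement times (ms) for one block of five trials.
block = [940, 960, 930, 980, 890]
ce, ace, ve = block_errors(block)
print(f"CE = {ce} ms, ACE = {ace} ms, VE = {ve:.2f} ms")
```

In this example the block CE and ACE are both 15 ms because the participant was slow on average, while VE (about 30 ms) reflects trial-to-trial spread regardless of that bias.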

For acquisition, CE, ACE, and VE data were averaged into eight blocks of five trials and analyzed using separate 3 (group: No-INT, INT, CTRL) x 8 (block) analyses of variance (ANOVAs), with the last factor as a repeated measure. For retention tests, data were averaged into three blocks of five trials. Performance during Ret 2 was analyzed using separate 3 (group: No-INT, INT, CTRL) x 3 (block) ANOVAs with repeated measures on the last factor. Performance across the two retention tests was analyzed using separate 2 (group: No-INT, INT) x 2 (test: Ret 1, Ret 2) x 3 (block) ANOVAs with repeated measures on the last two factors. When appropriate, F ratios involving repeated measures factors were reported with the Greenhouse-Geisser df adjustment. Partial eta-squared values (η²) were reported to indicate effect sizes for significant results. Follow-up testing was conducted using Sidak post hoc procedures. Spatial errors during acquisition and the second retention test were analyzed using chi-square procedures to compare the total number of errors for each group. Spatial errors across the repeated retention tests were analyzed using a 2 (group: No-INT, INT) x 2 (test: Ret 1, Ret 2) chi-square to compare the total number of errors each group committed during each test. For all analyses, alpha was set at .05. For the interpolated tasks performed by the INT group, descriptive statistics were calculated for CE, ACE, VE, and spatial errors.
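
As a reading aid, partial eta-squared can be recovered from a reported F ratio and its degrees of freedom through the standard identity F × df_effect / (F × df_effect + df_error). The sketch below applies that identity to one F ratio reported in the Results. It is a consistency check written for this illustration, not the authors' analysis code, and the function name is an assumption.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Block effect for CE during acquisition, reported as F(7, 336) = 9.88.
effect_size = partial_eta_squared(9.88, 7, 336)
print(round(effect_size, 2))  # prints 0.17
```

The result agrees with the effect size of .17 reported for that test.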

Results

Acquisition

As seen in the left panel of Figure 2, all three groups performed similarly during acquisition in terms of mean CE scores. In addition, all groups showed a decrease in CE from Block 1 to subsequent blocks. These observations accounted for the significant main effect for block, F(7, 336) = 9.88, p < .001, η² = .17. Post hoc comparisons indicated that Block 1 differed significantly from Blocks 4–8 (all p < .008), which did not differ from one another (all p > .737). No other post hoc comparisons were significant (all p > .05). Neither the main effect for group, F(2, 48) = 0.76, p = .474, nor the Group x Block interaction, F(14, 336) = 0.73, p = .659, was significant.

Figure 2. Mean CE during acquisition and retention for the No-INT, INT, and CTRL groups. [Plot not reproduced. Y-axis: CE (ms), -150 to 350. X-axis: Blocks 1–8 (Acquisition), Blocks 1–3 (Ret 1), Blocks 1–3 (Ret 2). Series: CTRL, No-INT, INT.]

For ACE during acquisition, the three groups appeared to perform with slight differences in accuracy during early blocks. For example, the CTRL group produced a lower mean ACE score than the No-INT and INT groups during Block 1. ACE scores rapidly converged, however, so that all three groups performed similarly by Block 4. All groups also showed a decrease in ACE from early to later blocks. These observations accounted for the significant effect for block, F(7, 336) = 13.97, p < .001, η² = .225. Post hoc comparisons indicated that Block 1 differed significantly from all subsequent blocks (all p < .003), which did not differ from one another (all p > .586). The ANOVA also revealed a main effect for group, F(2, 48) = 3.22, p = .049, η² = .12. However, post hoc comparisons revealed no significant differences between groups (all p > .078). The Group x Block interaction, F(14, 336) = 0.60, p = .759, was not significant. A post hoc analysis of variance conducted on Blocks 2–8 revealed no significant effects for block, group, or the interaction between the two (all p > .085).

As seen in the left panel of Figure 3, mean VE scores during acquisition were similar for all three groups. In addition, all groups showed decreased VE across blocks. These observations accounted for the significant main effect for block, F(7, 336) = 14.39, p < .001, η² = .23. Post hoc comparisons indicated that Block 1 differed significantly from Blocks 2–8 (all p < .036) and that Block 2 differed significantly from Block 5 (p = .025). No other post hoc comparisons were significant (all p > .056). Neither the main effect for group, F(2, 48) = 0.25, p = .780, nor the Group x Block interaction, F(14, 336) = 1.30, p = .240, was significant. The total number of errors committed by each group was similar during acquisition (M = 20; SD = 3.61). The analysis of errors did not reveal a significant effect for group, χ².05 (2) = 1.30, p > .05.

Retention

The middle and right panels of Figure 2 show mean CE scores during Ret 1 and Ret 2, respectively. During Ret 2, the INT group (filled circles) showed degraded accuracy across blocks, while the No-INT (open circles) and CTRL (open squares) groups showed relatively stable performance. The Group x Test x Block analysis conducted on both retention tests revealed significant Test x Block, F(2, 64) = 8.59, p < .001, η² = .21, and Group x Test x Block, F(2, 64) = 5.14, p = .008, η² = .14, interactions. Post hoc comparisons following the three-way interaction revealed a significant increase in mean CE from Block 1 to Blocks 2–3 during Ret 2 for the INT group (both p < .025). Thus, the CE profile across blocks differed on the two tests due to the interference manipulation. That is, degraded accuracy across blocks during Ret 2 occurred only when participants performed interpolated tasks between tests. The main effects and all other interactions were not significant (all p > .154). The Group x Block analysis conducted on Ret 2 revealed a significant main effect for block, F(2, 96) = 5.74, p = .008, η² = .11. Post hoc comparisons revealed that CE scores during Block 1 were significantly lower than during Block 3 (p = .023). Block 2 did not differ significantly from Block 1 (p = .102) or Block 3 (p = .272). The main effect for group, F(2, 48) = 1.42, p = .252, and the Group x Block interaction, F(4, 96) = 1.10, p = .356, were not significant. For ACE, the Group x Test x Block analysis on both retention tests revealed no significant main effects or interactions (all p > .153). The Group x Block analysis conducted on Ret 2 revealed no significant main effects or interaction (all p > .512).

Figure 3. Mean VE during acquisition and retention for the No-INT, INT, and CTRL groups. [Plot not reproduced. Y-axis: VE (ms), 0 to 140. X-axis: Blocks 1–8 (Acquisition), Blocks 1–3 (Ret 1), Blocks 1–3 (Ret 2). Series: CTRL, No-INT, INT.]

The middle and right panels of Figure 3 show mean VE scores during Ret 1 and Ret 2, respectively. The Group x Test x Block analysis on both retention tests revealed a significant main effect for test, F(1, 32) = 9.79, p = .004, η² = .23. Performance during Ret 1 was more variable than during Ret 2. Other main effects and interactions were not significant (all p > .188). The Group x Block analysis of Ret 2 revealed a significant main effect for group, F(2, 48) = 5.24, p = .009, η² = .18. Post hoc comparisons revealed that the CTRL group (open squares) produced significantly higher VE scores than the No-INT (open circles; p = .046) and INT (filled circles; p = .012) groups, which did not differ from one another (p = .941).

The total number of errors committed by each group was similar during retention testing (M = 2.50; SD = 2.65). The Group x Test chi-square analysis conducted on both retention tests revealed no significant effects, χ².05 (1) = 1.67, p > .05. The analysis of errors on Ret 2 also revealed no significant effect, χ².05 (2) = 2.67, p > .05. Because the number of errors committed in the present study was higher than that reported by Magnuson et al. (2004), a post hoc correlation analysis was conducted to examine the relationships between errors and performance measures during retention. Results revealed a significant correlation between the number of acquisition errors and VE during Ret 2 (r = .42; p = .002). There was a moderate relationship such that VE during Ret 2 tended to be greater for participants who committed more errors during acquisition. In addition, the number of errors during Ret 2 was positively correlated to CE (r = .30; p = .032), ACE (r = .40; p = .004), and VE (r = .35; p = .013) during Ret 2. Each correlation indicated moderate relationships such that higher CE, ACE, and VE scores were associated with a greater number of total errors during Ret 2.
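
For readers who want to check how a Pearson correlation maps onto a significance test, the sketch below converts r to a t statistic with n - 2 degrees of freedom. It assumes the correlations were computed across all 51 participants, which the article does not state explicitly, and the function name `t_from_r` is introduced here for illustration only.

```python
import math


def t_from_r(r, n):
    """Convert a Pearson r from n paired observations to (t, df)."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return t, df


# Acquisition errors vs. VE during Ret 2: r = .42.
# n = 51 is an assumption (all participants included in the correlation).
t, df = t_from_r(0.42, 51)
print(f"t({df}) = {t:.2f}")
```

Under that assumption the statistic is a little over 3.2 on 49 degrees of freedom, well above the two-tailed critical value of roughly 2.01 at alpha = .05, which is in line with the significant result reported.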

Interpolated Tasks

Mean performance scores indicated that participants performed both interpolated tasks with less accuracy than the 925-ms task. CE scores indicated that participants underestimated the time required to complete the 725-ms task (M = -106.67 ms; SD = 85.29 ms) and overestimated the time required to complete the 1,125-ms task (M = +279.45 ms; SD = 222.47 ms). ACE scores showed a relatively high degree of inaccuracy for both the 725-ms (M = 120.98 ms; SD = 62.87 ms) and the 1,125-ms tasks (M = 286.35 ms; SD = 213.33 ms). VE scores indicated that participants also performed the 725-ms and 1,125-ms interpolated tasks with less consistency than the 925-ms task (725 ms: M = 60.81 ms, SD = 97.58 ms; 1,125 ms: M = 94.92 ms, SD = 70.00 ms). There were only four errors on the interpolated tasks, and no participant committed more than one error.

Discussion

The aim of the present study was to (a) replicate the procedures used by Magnuson et al. (2004) in demonstrating the effects of repeated retention tests on timing accuracy and (b) investigate the potential interaction between intertask interference and repeated testing. The most important contribution of this study was that, although there were reliable effects involving the test factor, the results did not match those reported by Magnuson et al. (2004), and they did not fit the expectations from their faulty re-encoding hypothesis and speculation regarding the influence of intertask interference. Indeed, the different effects of the repeated testing manipulation across two procedurally similar studies suggest there is no “repeated retention testing effect” per se that uniformly changes performance in one direction. At the same time, the evidence supports the idea that administering a retention test is not a neutral event but instead can influence performance on a second test. Thus, the present results give additional weight to Schmidt and Lee’s (1999) recommendation that researchers adopt a split-group design when assessing learning at more than one retention delay.

We made three predictions based on the faulty re-encoding hypothesis and the proposition that intertask interference would influence the effects of repeated retention tests. We predicted that (a) timing accuracy (CE and ACE) would be degraded during Ret 2 for the No-INT and INT groups compared to the CTRL group; (b) timing accuracy would also be degraded across successive tests for the No-INT and INT groups; and (c) the decrements in timing accuracy would be more pronounced for the INT group than the No-INT group. The results for CE partially supported the second and third predictions. A Group x Test x Block interaction indicated that the interference manipulation influenced performance across successive tests. Specifically, there was a significant decrease in timing accuracy (increase in CE) after the first block of trials during Ret 2 for the INT group but not for the No-INT group. This result showed that timing accuracy in terms of CE was degraded across successive tests for the INT group (Prediction 2) and the INT group showed greater changes in timing accuracy across successive tests than the No-INT group (Prediction 3). The first prediction was not supported. Analyses of Ret 2 performance showed no significant differences between the No-INT, INT, and CTRL groups. Evidence from the current study suggests any faulty re-encoding that might have occurred was due to the combined effects of repeated testing and the interference manipulation; however, repeated testing alone did not lead to faulty re-encoding. Thus, the results for timing accuracy provided little support for the faulty re-encoding hypothesis.

In contrast to the results for timing accuracy, the significant effects of repeated retention tests seen in VE scores clearly demonstrated that repeated testing can influence timing consistency. Interestingly, these effects showed that repeated testing actually facilitated timing consistency as opposed to degrading it. Both the No-INT and INT groups had significantly lower VE scores than the CTRL group during Ret 2, and both groups showed a significant decrease in VE from Ret 1 to Ret 2. The interference manipulation had no effect on VE. These findings were in direct opposition to the predictions for timing accuracy that were based on the faulty re-encoding hypothesis. Magnuson et al. (2004) did not discuss timing consistency in their discussion (presumably because they did not find significant test effects in VE), but the memory updating and re-encoding processes they forwarded should logically apply to this aspect of performance. If re-encoding operates similarly for both timing accuracy and consistency, the overall results of the current study are inconsistent with the faulty re-encoding hypothesis. Despite the fact that the current findings conflict with the notion of faulty re-encoding, the effects in VE were consistent with a more general mechanism of re-encoding that offers a way to explain the effects of repeated retention tests in both this study and in Magnuson et al. (2004).

It is important to note that although they directed much of their discussion toward explaining the degraded timing accuracy, Magnuson et al. (2004) acknowledged their results may have been a specific instance of a more general memory updating phenomenon that would also include cases in which performance improved as a result of repeated testing. Such improvement is consistent with the VE findings in the present study and previous research showing enhanced performance resulting from repeated testing in the verbal domain (for a review, see Roediger & Karpicke, 2006). It is possible that updating a memory representation is a general process depending on the quality of the re-encoded information. If so, it would be expected that re-encoding information containing errors would undermine subsequent performance as seen in Magnuson et al. (2004) and other research (e.g., Bergman & Roediger, 1999; Estes, 1997). On the other hand, if re-encoding incorporates accurate task information, then subsequent performance might be stabilized or even enhanced, as in the current experiment. From this perspective, it might be important that performance during Ret 1 in the current study was accurate when compared to that of the D2-D2 group during the corresponding test in Magnuson et al. (2004). If the strong performance during Ret 1 in the current study resulted in re-encoding of accurate task information, then a strengthened memory representation for the task might have stabilized timing accuracy and enhanced timing consistency during Ret 2. In contrast, the relatively inaccurate performance by the D2-D2 group during the first retention test in Magnuson et al. (2004) may have resulted in re-encoding that incorporated inaccurate task information and degraded subsequent performance.

The possibility that the quality of memory updating depends on accuracy during an initial retention test is intriguing, as it suggests the effects of certain instructional approaches (e.g., repeated testing) may interact with individual differences that influence initial learning. For example, Chiviacowsky and Wulf (2002, 2005) argued that allowing participants to control certain aspects of the instructional setting enhanced learning because it created a practice setting that better met the learner’s needs and preferences. Perhaps long-term retention can be facilitated by procedures that ensure training to a point at which the first test performance will be accurate enough to support successful re-encoding. Thus, it may be that participants in Magnuson et al. (2004) did not learn the task well enough to facilitate successful re-encoding, whereas participants in the present study did. This issue might be addressed by letting participants decide how much practice they want to complete before taking a first retention test or by adopting a trials-to-criterion procedure to ensure all participants reach a certain level of task proficiency during acquisition.

Processes related to memory consolidation and enhancement may also have influenced the results of the current study (for a review, see Walker, 2005, and corresponding commentaries). Memory consolidation studies have shown that a 24-hr delay with sleep following initial practice of a procedural task can lead to memory consolidation or enhancement for the task as indicated by faster and more accurate performance (e.g., Walker, Brakefield, Hobson, & Stickgold, 2003; Walker, Brakefield, Morgan, Hobson, & Stickgold, 2002). VE results in the present experiment were consistent with this perspective. It is possible that biochemical processes supporting memory consolidation and enhancement occurred during the overnight retention delay. Memory consolidation and enhancement supporting timing consistency would have yielded the observed pattern of results in VE for the No-INT and INT groups. Initial performance during Ret 1 after an overnight delay (23 hr 50 min) showed VE scores similar to those at the end of acquisition. Moreover, timing consistency continued to improve throughout retention (Ret 1 and 2). In addition, memory consolidation research has shown that tasks can be relatively immune to interference once they are consolidated. This would explain why the INT group was not different from the No-INT group in consistency during Ret 2. Memory consolidation may have also played a role in the high degree of timing accuracy (CE) seen in both the No-INT and INT groups during Ret 1. Such consolidation may have partially offset timing accuracy decrements produced by the interference and repeated retention manipulations (i.e., resulting in the Group × Test × Block interaction in CE that showed changes during Ret 2 for the INT group but not the No-INT group).

Results from the current study combined with the two previous studies (Magnuson et al., 2004; Fairbrother et al., 2007) illustrate that repeated testing is a complex phenomenon. In one case, repeated testing degraded timing accuracy (Magnuson et al., 2004). In another, it slowed total time for speeded-response tasks only with repeated error trials and had no impact on accuracy for timing tasks (Fairbrother et al., 2007). In the current study, repeated testing improved timing consistency. Two conclusions can be drawn from these studies: (a) under certain conditions an initial test can significantly influence performance on a later test, and (b) further research is needed to fully understand the divergent findings reported thus far. Interestingly, this divergence in findings across motor studies parallels the literature on repeated testing in the verbal domain, with some results showing memory distortions (e.g., Estes, 1997) and some showing enhancements (e.g., Carpenter & DeLosh, 2006).

Memory enhancements in the verbal domain have been attributed to the idea that the initial test provides opportunities for retrieval practice (Butler & Roediger, 2007). In the current study, retrieval practice during Ret 1 might have benefited timing consistency if participants in the No-INT and INT groups learned the task well enough to subjectively evaluate that their retrieval effort was successful in meeting the task goal. Schmidt and Lee (2005) argued that practice develops error-detection capabilities, and previous research has shown that error estimation can benefit learning (e.g., Hogan & Yanowitz, 1978). It is possible that successful error estimation combined with retrieval practice during Ret 1 acted to reinforce the schema for the response. According to schema theory (Schmidt, 1975), a movement response is based on an abstract memory representation (i.e., a schema) that includes information about the relationships between initial conditions, response specifications, sensory consequences, and response outcomes. In other words, if participants in the present study developed a schema strong enough to result in accurate performance during Ret 1, the additional retrieval practice would further strengthen the schema and enhance subsequent performance. In contrast, if the participants in Magnuson et al. (2004) had a poorly developed schema and were unable to accurately evaluate sensory consequences, retrieval during the first test would lead to poor performance during that test and might also weaken the schema, resulting in the degraded performance during the second test.

The results of this study present at least three important implications. First, these findings reinforce the evidence that retention testing is not a neutral event, and due consideration should be given to using repeated testing in experimental designs. Unfortunately, the present state of knowledge offers only the vague advice to “handle with care” without providing a full understanding of the other factors to consider when deciding whether repeated testing is truly problematic. Second, researchers should be aware of the potential impact repeated retention procedures might logically have on interpreting experimental results. Although Fairbrother et al. (2007) found no interactions between repeated testing and practice schedule effects, the fairly widespread practice of administering two or more retention tests to the same individuals warrants further research. Third, retention testing might offer a viable technique to enhance motor learning in practical settings. Practitioners who adopt testing as a learning strategy should use it only when conditions indicate it will likely reinforce correct responses, so as to avoid the possible negative consequences associated with poor initial test performance (as seen in Magnuson et al., 2004). For example, during basketball practice of a new defensive set, a coach might want to allow players to “walk through” the drill more than once before executing it at full speed. The additional “walk-throughs” should increase the likelihood of correct execution when players are subsequently “tested” in a full-speed run-through and, thereby, reinforce the correct response.

References

Bartlett, F. C. (1932). Remembering: A study in experimental and social psychology. London: Cambridge University Press.

Bergman, E. T., & Roediger, H. L., III. (1999). Can Bartlett’s repeated reproduction experiments be replicated? Memory & Cognition, 27, 937–947.

Butler, A. C., & Roediger, H. L., III. (2007). Testing improves long-term retention in a simulated classroom setting. European Journal of Cognitive Psychology, 19, 514–527.

Carpenter, S. K., & DeLosh, E. L. (2006). Impoverished cue support enhances subsequent retention: Support for the elaborative retrieval explanation of the testing effect. Memory & Cognition, 34, 268–276.

Chandler, C. C., & Fisher, R. P. (1996). Retrieval processes in witness memory. In E. L. Bjork & R. S. Bjork (Eds.), Memory (pp. 493–524). San Diego, CA: Academic Press.

Chiviacowsky, S., & Wulf, G. (2002). Self-controlled feedback: Does it enhance learning because performers get feedback when they need it? Research Quarterly for Exercise and Sport, 73, 408–415.

Chiviacowsky, S., & Wulf, G. (2005). Self-controlled feedback is effective if it is based on the learner’s performance. Research Quarterly for Exercise and Sport, 76, 42–48.

Christina, R. W., & Shea, J. B. (1988). The limitations of generalization based on restricted information. Research Quarterly for Exercise and Sport, 59, 291–297.

Christina, R. W., & Shea, J. B. (1993). More on assessing the retention of motor learning based on restricted information. Research Quarterly for Exercise and Sport, 64, 217–222.

Estes, W. K. (1997). Processes of memory loss, recovery, and distortion. Psychological Review, 104, 148–169.

Fairbrother, J. T., Shea, J. B., & Marzilli, T. S. (2007). Repeated retention testing effects do not generalize to a contextual interference protocol. Research Quarterly for Exercise and Sport, 78, 465–475.

Hogan, J. C., & Yanowitz, B. A. (1978). The role of verbal estimates of movement error in ballistic skill acquisition. Journal of Motor Behavior, 10, 133–138.

Magnuson, C. E., Shea, J. B., & Fairbrother, J. T. (2004). Effects of repeated retention tests on learning a single timing task. Research Quarterly for Exercise and Sport, 75, 39–46.

Roediger, H. L., & Karpicke, J. D. (2006). Test-enhanced learning: Taking memory tests improves long-term retention. Psychological Science, 17, 249–255.

Salmoni, A. W., Schmidt, R. A., & Walter, C. B. (1984). Knowledge of results and motor learning: A review and critical reappraisal. Psychological Bulletin, 95, 355–386.

Schmidt, R. A. (1975). A schema theory of discrete motor skill learning. Psychological Review, 82, 225–260.

Schmidt, R. A., & Lee, T. D. (1999). Motor control and learning: A behavioral emphasis (3rd ed.). Champaign, IL: Human Kinetics.

Schmidt, R. A., & Lee, T. D. (2005). Motor control and learning: A behavioral emphasis (4th ed.). Champaign, IL: Human Kinetics.

Shea, J. B., & Zimny, S. T. (1983). Context effects in memory and learning movement information. In R. A. Magill (Ed.), Memory and control of action (pp. 345–366). Amsterdam: North-Holland Company.

Shea, J. B., & Zimny, S. T. (1988). Knowledge incorporation in motor representation. In O. G. Meijer & K. Roth (Eds.), Complex movement behavior: The motor-action controversy (pp. 289–314). Amsterdam: Elsevier.

Underwood, B. J. (1957). Interference and forgetting. Psychological Review, 64, 49–60.

Walker, M. P. (2005). A refined model of sleep and the time course of memory formation. Behavioral and Brain Sciences, 28, 51–104.

Walker, M. P., Brakefield, T., Hobson, J. A., & Stickgold, R. (2003). Dissociable stages of human memory consolidation and reconsolidation. Nature, 425, 616–620.

Walker, M. P., Brakefield, T., Morgan, A., Hobson, J. A., & Stickgold, R. (2002). Practice with sleep makes perfect: Sleep-dependent motor skill learning. Neuron, 35, 1–20.

Notes

1. According to the conventions used by Magnuson et al. (2004), the letter “D” refers to the day on which a test was administered. Thus, the D1-D1 group took both tests on Day 1 (the same day as acquisition), the D1-D2 group took one test on Day 1 and one on Day 2, and the D2-D2 group took both tests on Day 2.

2. Because Magnuson et al. (2004) found a decrement in accuracy (i.e., increases in CE and ACE scores) due to repeated testing, we assumed their use of the term accentuate referred to a magnification of this decrement.

3. The equation for the population standard deviation is similar to the one for the sample, except the denominator is n instead of n - 1. The equation used to calculate VE for a block of trials is the same as the equation for population standard deviation for the trials in that block.
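
The calculation described in Note 3 can be illustrated with a short sketch. This is not the authors' analysis code: the function names and the four trial scores are hypothetical and chosen only to show how the population form (denominator n) differs from the sample form (denominator n - 1) for a single block.

```python
import math


def variable_error(trial_scores):
    """VE for one block: population standard deviation (denominator n)."""
    n = len(trial_scores)
    if n == 0:
        raise ValueError("a block needs at least one trial")
    mean = sum(trial_scores) / n
    return math.sqrt(sum((x - mean) ** 2 for x in trial_scores) / n)


def sample_sd(trial_scores):
    """Sample standard deviation (denominator n - 1), shown for comparison."""
    n = len(trial_scores)
    if n < 2:
        raise ValueError("the sample form needs at least two trials")
    mean = sum(trial_scores) / n
    return math.sqrt(sum((x - mean) ** 2 for x in trial_scores) / (n - 1))


# Hypothetical signed timing errors (ms) for one four-trial block.
block = [10, -10, 10, -10]

ve = variable_error(block)   # squared deviations sum to 400; 400 / 4 = 100
sd = sample_sd(block)        # 400 / 3, so slightly larger than VE
print(f"VE = {ve:.3f} ms, sample SD = {sd:.3f} ms")
```

For any block with more than one trial and nonzero spread, the sample form is larger than VE by a factor of sqrt(n / (n - 1)), so the gap shrinks as the number of trials per block grows.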

Authors’ Notes

The authors thank Craig Wrisberg for his helpful comments during the preparation of this manuscript.

Please address correspondence concerning this article to Jeffrey T. Fairbrother, Department of Exercise, Sport, and Leisure Studies, University of Tennessee, 1914 Andy Holt Avenue, 322 HPER Building, Knoxville, TN 37996-2700.

E-mail: [email protected]
