
EDUCATION

An Online Spaced-Education Game to Teach and Assess Residents: A Multi-Institutional Prospective Trial
B Price Kerfoot, MD, EdM, Harley Baker, EdD

BACKGROUND: While games are frequently used in resident education, there is little evidence supporting their efficacy. We investigated whether a spaced-education (SE) game can be both a reliable and valid method of assessing residents' knowledge and an effective means of teaching core content.

STUDY DESIGN: The SE game consisted of 100 validated multiple-choice questions and explanations on core urology content. Residents were sent 2 questions each day via email. Adaptive game mechanics re-sent the questions in 2 or 6 weeks if answered incorrectly and correctly, respectively. Questions expired if not answered on time (appointment dynamic). Residents retired questions by answering each correctly twice in a row (progression dynamic). Competition was fostered by posting relative performance among residents. Main outcomes measures were baseline scores (percentage of questions answered correctly on initial presentation) and completion scores (percentage of questions retired).

RESULTS: Nine hundred thirty-one US and Canadian residents enrolled in the 45-week trial. Cronbach alpha reliability for the SE baseline scores was 0.87. Baseline scores (median 62%, interquartile range [IQR] 17%) correlated with scores on the 2008 American Urological Association in-service examination (ISE08), 2009 American Board of Urology qualifying examination (QE09), and ISE09 (r = 0.76, 0.46, and 0.64, respectively; all p < 0.001). Baseline scores varied by sex, country, medical degree, and year of training (all p < 0.001). Completion scores (median 100%, IQR 2%) correlated with ISE08 and ISE09 scores (r = 0.35, p < 0.001 for both). Seventy-two percent of enrollees (667 of 931) requested to participate in future SE games.

CONCLUSIONS: An SE game is a reliable and valid means to assess residents' knowledge and is a well-accepted method by which residents can master core content. (J Am Coll Surg 2012;214:367–373. © 2012 by the American College of Surgeons)


In spite of a paucity of evidence supporting their efficacy, games are frequently used for graduate medical education. In a recent survey, 80% of family medicine and internal medicine residency program directors in the United States reported using games in their residency programs.1 Games can be defined as outcomes-oriented activities that proceed according to a set of rules, often involve focused decision-making, and can range from elaborate war games involving thousands of military personnel to simple Jeopardy-like games often used in medicine residencies.1,2 The 2011 Horizon Report identifies "game-based learning" as 1 of 6 emerging technologies likely to have a large impact on education over the coming 5 years.3 Although educational games have been used effectively to improve patients' knowledge and behaviors,4-7 little research has been conducted to demonstrate the effectiveness of educational games in graduate or postgraduate medical education. A recent Cochrane review on the use of games for health professional education was unable to conduct any meta-analyses because the authors could identify only 1 study eligible for analysis.8

Disclosure Information: Dr Kerfoot owns equity in and is a board member of Qstream Inc. and has authored courses on Qstream's websites, but received no compensation from the courses or the company. Mr Baker has nothing to disclose.
Supported in part by the American Urological Association (Linthicum, MD), the American Urological Association Foundation (Linthicum, MD), Astellas Pharma US, Inc, and the United States Agency for Healthcare Research and Quality. The views expressed in this article are those of the authors and do not necessarily reflect the position and policy of the United States Federal Government or the Department of Veterans Affairs. No official endorsement should be inferred.

Received September 20, 2011; Revised November 19, 2011; Accepted November 21, 2011.
From the Surgical Service (Urology Section), Veterans Affairs Boston Healthcare System (Kerfoot) and Harvard Medical School (Kerfoot), Boston, MA; and California State University, Channel Islands, Camarillo, CA (Baker).
Correspondence address: B Price Kerfoot, MD, EdM, VA Boston Healthcare System, 150 South Huntington Ave, 151DIA, Jamaica Plain, MA 02130; email: [email protected]

We created a novel online educational game by incorporating adaptive game mechanics into an evidence-based form of online education, termed "spaced education" (SE).

ISSN 1072-7515/12/$36.00
doi:10.1016/j.jamcollsurg.2011.11.009


Based on 2 psychology research findings (the spacing and testing effects), SE has been shown in randomized trials to improve knowledge acquisition, boost learning retention for up to 2 years, and durably improve clinical behavior.9-12 SE is currently delivered using periodic emails that contain clinical case scenarios and multiple-choice questions. On submitting an answer, the clinician is immediately presented with the correct answer and an explanation of the topic. The material is then re-presented in a cycled pattern over 8 to 42 days to reinforce the content.

We introduced adaptive game mechanics to SE to individualize the pattern of SE reinforcement and content for each clinician based on his or her performance on the SE questions. For example, a question is repeated in 2 weeks if answered incorrectly, repeated in 6 weeks if answered correctly, and retired (no longer repeated) once answered correctly twice in a row. Additional game mechanics include an appointment dynamic (questions expire if not answered on time) and a progression dynamic (players work toward a goal by retiring questions; new questions are introduced as older questions are retired).13 The SE game also fosters competition between residents by displaying how others have answered each question and how many other residents have already retired that question. A recent randomized trial showed that these adaptive game mechanics can boost learning efficiency by more than 35%.14

It is not known whether this adaptive SE game is effective and well accepted as a method of graduate medical education. Using graduate urology education as an experimental system, we conducted a prospective multi-institutional trial to determine if an SE game can be an effective method to teach and assess residents.

Abbreviations and Acronyms
ABU = American Board of Urology
AUA = American Urological Association
IQR = interquartile range
ISE = In-Service Examination
QE = Qualifying Examination (Part 1 of the ABU Certification Process)
SASP = Self-Assessment Study Program
SE = spaced education

METHODS

Study participants
All urology residents in the US and Canada were eligible to participate. Participants were recruited via email. There were no exclusion criteria. Institutional review board approval was obtained to perform this study.

Development of content
The SE game intervention consisted of 100 multiple-choice questions and explanations. Ninety items from the 2005 to 2007 Self-Assessment Study Programs (SASP) were selected across all urology topic areas for content validity and for their educational value for residents. These SASP items are developed annually through an iterative process of content validation by a panel of 10 to 12 urologists on the American Urological Association (AUA)/American Board of Urology (ABU) Examination Committee. A draft question is written by a panel member, the question is critiqued by the urologist panel, and the question is altered based on these comments. Most questions are field tested on the ABU Qualifying Examination (QE). Based on the psychometric performance of the item, the panel then decides whether to include that question on the QE, the AUA In-Service Examination (ISE), or the AUA SASP. Some SASP questions are also retired questions from the QE or ISE. Each SASP question contains several elements, including a multiple choice question on a core urology topic, the correct answer to the question, a detailed explanation of the correct and incorrect answers, and a reference to pertinent data in the medical literature. Although developed through a similar process by the AUA/ABU Examination Committee, ISE and QE questions are not identical to SASP questions. Unlike the ISE and QE, the SASP includes new questions that have not yet been psychometrically tested, contains items that have insufficient difficulty for the ISE or QE, and includes questions whose primary focus is to introduce new concepts to the urologic community. Ten SE questions focused on genitourinary histopathology (a topic in the urology board examinations) and had been constructed and validated for a previous trial.15

Structure of the game
The game used an automated, interactive email system developed at Harvard Medical School. Upon clicking a hyperlink in an email, a Web page opened that allowed enrollees to submit an answer to a multiple-choice question. The order of possible answer choices was randomized at each presentation. The answer was downloaded to a central server, and residents were immediately presented with a Web page displaying the correct answer and a detailed explanation of the content. The adaptive game mechanics would repeat questions in 2 weeks if answered incorrectly and 6 weeks if answered correctly. Spacing intervals between repetitions were established based on psychology research findings to optimize long-term retention of learning.16,17 If a question was not answered within 2 weeks of its arrival, it expired, was marked as answered incorrectly, and was cycled back to the resident again (appointment dynamic).13 If a question was answered correctly twice consecutively, it was retired and not repeated again (progression dynamic).13 The goal was to retire all 100 questions. The length of the SE game therefore varied based on each resident's baseline knowledge and his or her ability to learn and retain knowledge from the SE questions. To foster a sense of competition and community, residents were shown how other enrollees answered a given question and how many other residents had already retired that question. On retiring ≥80% of the items and completing an end-of-program survey, residents received a $25 gift certificate to an online bookstore.
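The adaptive mechanics described above reduce to a small per-question scheduling rule. The following is our illustrative sketch only, not the trial's actual Harvard Medical School system; the class and function names are hypothetical, while the intervals and rules come directly from the text (repeat in 2 weeks if wrong, 6 weeks if right, retire after 2 consecutive correct answers, treat an expired question as answered incorrectly):

```python
# Sketch of the per-question adaptive game mechanics described in the text.
from dataclasses import dataclass

RETRY_WRONG_DAYS = 14  # repeat interval after an incorrect (or expired) answer
RETRY_RIGHT_DAYS = 42  # repeat interval after a single correct answer
EXPIRE_DAYS = 14       # appointment dynamic: unanswered questions expire

@dataclass
class QuestionState:
    streak: int = 0        # consecutive correct answers so far
    retired: bool = False  # progression dynamic: retired questions stop cycling
    next_due: int = 0      # trial day on which the question is next sent

def record_answer(state: QuestionState, day: int, correct: bool) -> QuestionState:
    """Update one question's schedule after an answer (expiry counts as wrong)."""
    if correct:
        state.streak += 1
        if state.streak >= 2:      # answered correctly twice in a row: retire it
            state.retired = True
        else:
            state.next_due = day + RETRY_RIGHT_DAYS
    else:
        state.streak = 0           # a wrong or expired answer resets the streak
        state.next_due = day + RETRY_WRONG_DAYS
    return state
```

Under this rule the fastest possible retirement of a question is two correct answers 6 weeks apart, which is why game length varied with each resident's baseline knowledge and retention.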

Study design
This multi-institutional prospective trial was conducted over 45 weeks, from August 2008 to June 2009. Residents were sent 2 SE questions-explanations every day via the automated email delivery system. Server errors occurred during 10 days of the trial (days 73 to 77, 99, 136, 138 to 139, 156); these errors caused duplicate questions to be sent to residents who had fallen behind and allowed these questions to expire. On the end-of-program online survey, residents were asked if they would want to participate in additional SE games.

Scoring and outcomes measures
Performance on the SE game was calculated by 2 methods: baseline scores and completion scores. Baseline scores measured residents' pregame knowledge of guidelines content and were calculated as the percentage of questions answered correctly on initial presentation. Completion scores were calculated as the percentage of SE questions retired by residents. Completion scores reflect residents' ability to master the content by answering the questions correctly twice in a row, separated by a 6-week interval. Validity was assessed relative to residents' ISE08, ISE09, and QE09 scores.
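Concretely, both scores reduce to simple percentages over a per-question log. The sketch below is our own illustration of that arithmetic; the field names `first_correct` and `retired` are hypothetical, not the study database's:

```python
# Baseline score: % of questions answered correctly on initial presentation.
# Completion score: % of questions eventually retired (mastered).
def baseline_score(log):
    return 100.0 * sum(q["first_correct"] for q in log) / len(log)

def completion_score(log):
    return 100.0 * sum(q["retired"] for q in log) / len(log)

# Toy log for 4 questions of one resident.
log = [
    {"first_correct": True,  "retired": True},
    {"first_correct": False, "retired": True},
    {"first_correct": False, "retired": False},
    {"first_correct": True,  "retired": True},
]
```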

Data sources
The AUA In-Service Examination (ISE) is an annual proctored examination administered contemporaneously to residents across the US and Canada via booklet examination on a single Saturday in mid-November. The ISE scores and within-group percentile rankings from 2008 and 2009 (ISE08 and ISE09, respectively) were obtained from the AUA. The scored ISE08 and ISE09 examinations consisted of 163 and 164 multiple-choice questions, respectively. The ABU Qualifying Examination (QE) is the first part of the urology board certification process and is given annually in a proctored setting contemporaneously via computer in the late summer. Applicants must pass the QE to be eligible to sit for the oral Certifying Examination. QE scores and percentile rankings from 2009 (QE09) were obtained from the ABU. The scored QE09 examination consisted of 191 multiple-choice questions. The process of question development of the ISE and QE is similar to that for the SASP questions outlined above.

Statistical analysis
Power was estimated to be >0.9 for all planned analyses if 450 residents completed the trial, assuming a 0.2 effect size and an alpha of 0.05. Reliability (internal consistency) of the ISE, QE, and the 100 SE questions on initial presentation (baseline) was estimated with Cronbach's alpha.18,19

Cohen's d provided the intervention effect sizes, with 0.2 generally considered a small effect, 0.5 a moderate effect, and 0.8 a large effect.20,21 Residents who submitted at least 1 answer to all 100 questions and did not receive any duplicate questions were included in score analysis. Evidence for construct validity was obtained by assessing baseline score performance by year of training.

Due to the non-normal nature of the completion scores, univariate analyses of both baseline and completion scores were performed with Mann Whitney U and Kruskal-Wallis tests. For consistency, all scores are reported as median and interquartile range (IQR). Chi-square was used to assess for cohort-level differences in demographic characteristics. After determining that the baseline score data fit the model and statistical assumptions, we performed an analysis of baseline scores using the main effects ANOVA model. This model, which is insensitive to violations of normality,21 adjusts for the simultaneous influence of all of the independent variables, providing a more reliable method of determining the relationship between baseline scores and resident characteristics.21 In the analysis of completion scores, a strong ceiling effect was found: three-quarters (73.1%) of the participants retired either 99% or 100% of the questions, and only 3.3% retired less than 90% of the questions. Under these conditions, multivariate statistical analyses of differences are not possible or useful. Statistical analyses were performed with SPSS 19.0 (Chicago, IL).
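For readers unfamiliar with the two statistics that anchor this analysis, here is a minimal pure-Python sketch (ours, with made-up data, not the study's SPSS workflow) of how Cronbach's alpha and Cohen's d are defined:

```python
# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals).
# Cohen's d: mean difference divided by the pooled standard deviation.
from math import sqrt
from statistics import mean, pvariance, variance

def cronbach_alpha(item_scores):
    """item_scores[j][i] = score of respondent i on item j (e.g., 1/0 correct)."""
    k = len(item_scores)
    n = len(item_scores[0])
    totals = [sum(item[i] for item in item_scores) for i in range(n)]
    return (k / (k - 1)) * (1 - sum(pvariance(it) for it in item_scores)
                            / pvariance(totals))

def cohens_d(a, b):
    """Effect size for two groups; 0.2 small, 0.5 moderate, 0.8 large."""
    pooled = sqrt(((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b))
                  / (len(a) + len(b) - 2))
    return (mean(a) - mean(b)) / pooled
```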

RESULTS

Of approximately 1,100 eligible residents ranging in training level from internship (pre-urology) to chief resident year (uro-4), 931 (85%) enrolled in the trial (Table 1). Seventy-nine percent of enrollees (739 of 931) submitted at least 1 answer to all 100 questions. Among this group, 278 residents received duplicate questions due to server error and were excluded from analysis. Score analyses were therefore performed on data from the remaining 461 residents (50% of all 931 enrollees). Among these residents,

matched scores on the ISE08, ISE09, and QE09 were available for 438 (95%), 326 (71%), and 98 (21%) of residents, respectively. Attrition did not vary significantly by country, degree, or sex, but did vary significantly by year of training (69% for pre-urology and 61%, 47%, 45%, and 48% for urology years 1 to 4, respectively; p < 0.001, Table 1).

Cronbach alpha reliabilities for ISE08, ISE09, QE09, and SE baseline scores were 0.91, 0.89, 0.88, and 0.87, respectively. ISE08, ISE09, baseline, and completion scores all rose consistently with year of urology training (p < 0.001, Table 2). Distinct differences in SE game progress could be detected between years of urology training as soon as 40 to 80 days after the start of the game (Fig. 1).

Table 1. Baseline Demographic Characteristics of Residents

Characteristic             Enrolled      Included in analysis
Total, n                   931           461
Country, n (%)
  United States            873 (94)      432 (94)
  Canada                   58 (6)        29 (6)
Year of training, n (%)
  Pre-urology              52 (6)        16 (4)
  Uro-1                    158 (17)      62 (13)
  Uro-2                    223 (24)      118 (26)
  Uro-3                    251 (27)      138 (30)
  Uro-4                    247 (27)      127 (28)
Degree, n (%)
  MD                       848 (91)      414 (90)
  MD, PhD                  28 (3)        15 (3)
  DO                       34 (4)        20 (4)
  Other                    21 (2)        12 (3)
Sex, n (%)
  Male                     732 (79)      358 (78)
  Female                   199 (21)      103 (22)
Age, y, mean (SD)          30.4 (3.0)    30.4 (2.7)

Percentages may not add to 100% due to rounding.

Table 2. Median Score Distribution with Interquartile Range by Year of Training

Year of training     Baseline scores, %   Completion scores, %   ISE08 scores, %   ISE09 scores, %
(in August 2008)     Median    IQR        Median    IQR          Median    IQR     Median    IQR
Pre-urology          39        9          99        4            47        3       49        11
Uro-1                48        10         99        4            56        12      61        8
Uro-2                58        12         100       2            65        11      68        11
Uro-3                65        12         100       1            71        8       69        10
Uro-4                69        14         100       1            74        11      —         —

ISE08, ISE09, and baseline scores are reported as the median percentage of questions answered correctly. Completion scores are the median percentage of questions retired (answered correctly twice in a row 6 weeks apart).
IQR, interquartile range; ISE, In-Service Examination.

Overall, median baseline score was 62% (IQR 17%). In univariate analysis, baseline scores varied significantly by year of training: 39% (IQR 19%), 48% (IQR 10%), 58% (IQR 12%), 65% (IQR 12%), and 69% (IQR 14%) for residents in pre-urology and years 1 to 4 of training, respectively (p < 0.001; Tables 2, 3; Fig. 2). Baseline scores also varied significantly by country (US 63% [IQR 17%] vs Canada 53% [IQR 15%]; p < 0.001), by sex (male 63% [IQR 16%] vs female 59% [IQR 17%]; p < 0.001), and by degree (ranging from a median 50% [IQR 27%] among osteopathy residents to 64% [IQR 15%] among MD-PhD residents; p = 0.005, Table 3). In multivariate analysis with main effects ANOVA, baseline scores varied significantly by year of training (p < 0.001, dmax = 3.39 between pre-urology residents and year 4 residents), by country (p < 0.001, d = 1.00), by sex (p < 0.001, d = 0.38), and by degree (p < 0.001, dmax = 1.11 between osteopathic and MD-PhD residents). Post-hoc trend analysis was performed on training year to determine more clearly how the scores differed by year. Both the linear (p < 0.001, d = 1.71) and quadratic (p < 0.001, d = 0.30) trends were significant, indicating that although scores increased significantly each year (linear trend), the actual magnitude of the increase diminished from year to year (quadratic trend). Baseline scores correlated significantly with ISE08, QE09, and ISE09 scores (r = 0.76, 0.46, and 0.64, respectively; p < 0.001 for all).

Median completion score was 100% (IQR 2%). In univariate analysis, completion scores varied significantly by year of training, ranging from a median 99% (IQR 4%) among pre-urology residents to 100% (IQR 1%) among year 4 residents (p < 0.001; Tables 2, 3; Fig. 1). Completion scores did not vary significantly by country, medical degree, or sex (Table 3). Multivariate analyses of completion scores with main effects ANOVA were invalidated by the extremely strong ceiling effect (median 100%) and are not reported. Completion scores correlated with ISE08, QE09, and ISE09 scores (r = 0.35 [p < 0.001], 0.12 [p = 0.24], and 0.35 [p < 0.001], respectively).

Six hundred seventy (72.0%) enrolled residents completed the end-of-program survey. Eighty-two respondents (12.2%) reported ever looking up the answers to the SE questions before submitting their answer; they did so for a mean 12.6% of the questions (SD 14.8%). When asked if they would want to participate in further AUA programs


using the adaptive SE game format, 667 (99.6%) survey respondents (71.6% of all enrollees) said "yes."

Figure 1. Residents' progress on the Spaced Education (SE) game over the course of the trial. Plots represent the median percentage of questions retired over the duration of the game. The SE game consisted of 100 multiple-choice questions-explanations on urology clinical and basic science topics. Residents were sent 2 questions-explanations every day via an automated email system. Questions were repeated in 2 or 6 weeks if answered incorrectly and correctly, respectively. A question was retired once answered correctly twice in a row.

Table 3. Baseline and Completion Scores

                       Baseline scores, %              Completion scores, %
Characteristic         Median    IQR    p Value        Median    IQR    p Value
Country                                 <0.001*                         0.083
  United States        63        17                    100       2
  Canada               53        15                    99        3
Year of training                        <0.001*                         <0.001
  Pre-urology          39        9                     99        4
  Uro-1                48        10                    99        4
  Uro-2                58        12                    100       2
  Uro-3                65        12                    100       1
  Uro-4                69        14                    100       1
Degree                                  0.005*                          0.099
  MD                   62        17                    100       2
  MD, PhD              64        15                    100       1
  DO                   50        27                    99        5
  Other                55        19                    100       2
Sex                                     <0.001*                         0.195
  Male                 63        16                    100       2
  Female               59        17                    99        2

The p values listed reflect univariate analyses.
*Significant results from multivariate analyses.
IQR, interquartile range.

DISCUSSION

Our study demonstrates that an SE game is an effective means of teaching core content to residents and is a reliable and valid method of assessing residents' core knowledge. The SE game's strong reliability (alpha 0.87) is similar to that of the ISE and QE, and it enables moderate-to-high stakes decisions to be made from baseline score results. Evidence for validity of the SE game is demonstrated by the progressive increase in scores from pre-urology training to chief resident year (uro-4) and by the correlation between residents' SE game scores and their ISE and QE performances. Importantly, the SE game was also shown to be well accepted by residents, with 72% of all enrollees requesting to participate in future SE games. Ideally, an SE game would be used as one part of an overall resident evaluation program to prospectively identify lower-performing residents who could benefit from additional educational support. Though performance characteristics of the SE game may change when used as a summative rather than formative evaluation and as a compulsory rather than voluntary program, our results indicate that the 100-question game could be used at the very least for moderate-stakes decisions for individual residents.

Our SE game results with residents are consistent with those from other SE game trials involving physicians and medical students. Among 1,470 physicians in 63 countries, a 40-question SE game on clinical practice guidelines was able to overcome regional differences in physicians' baseline knowledge to substantially improve guidelines knowledge.22 Among 731 students at 3 US medical schools, a 100-question SE game was found to


be a reliable (alpha 0.83) and valid method to assess student knowledge and a well accepted means of teaching core content.23 Taken together, the results of these 3 studies argue strongly that SE games can be valuable tools to improve education and assessment across the spectrum of medical training.

There are several other interesting findings from our study. First, attrition varied significantly by year of training, from a high of 69% for pre-urology and low of 45% in urology year 3. A similar decreasing pattern of attrition by year of training was found in our SE game study among medical students.23 Such a pattern of attrition is not unexpected given that the material in both trials was targeted on content that would be more challenging for students and residents early in their training. Future SE games may benefit by tailoring at least some of the content to participants' year of training. Second, male residents had significantly higher baseline scores than females (small-moderate effect size d = 0.38), a finding that was similarly found among medical students.23 The reason for this sex-related difference is not clear and merits further study. Finally, our results demonstrate that, with deliberate practice and spaced reinforcement, residents of all levels of training are able to master core urology content. We define mastery of content as the retirement of questions that have been answered correctly twice in a row, separated by a 6-week interval. Although such a definition can be debated, multiple sources of evidence indicate that the iterative answering of questions generates deep learning of content, not just memorization of the correct answers.9,12,15,24 This mastery of content across all years of training caused a ceiling effect that stymied statistical analysis of completion scores, but this result should be welcomed because it demonstrates that residents across the spectrum of training were able to gain substantial educational value from the SE game.

There are several limitations to our study, including the SE game's focus on medical knowledge, the inclusion of residents from a single surgical subspecialty, and the server error that occurred during 10 days of the trial, which required us to discard data from 278 residents (30% of enrollees) who received duplicate questions. Caution must be taken in directly comparing the correlations between SE game scores and ISE08, ISE09, and QE09 because the number of residents in the study who took each examination varied substantially (95%, 71%, and 21%, respectively). This variation is not unexpected because ISE09 could not be taken by chief residents graduating in June 2009 and QE09 could only be taken by these graduating chief residents. In addition, our study was not designed to assess whether the SE game can improve ISE and QE scores. The best study design to answer this question would be a randomized controlled trial. Although completion score measures a resident's ability to master the game content, completion score also reflects several other factors, including the relevance of the content to residents' clinical practice, their baseline knowledge of the content, and the acceptability of the game mechanics and question-answer format. There are also many strengths to the study, including the novelty of the educational intervention, the use of validated content for the game, and the increased generalizability (external validity) of our findings due to the multi-institutional study design.

Figure 2. Diagram of study timeline. Timeline is not drawn to scale. ISE, In-Service Examination; QE, Qualifying Examination; SE, spaced education.

CONCLUSIONS

In summary, our study demonstrates that an SE game is a reliable and valid method to assess residents' knowledge and is an effective and well-accepted means of teaching core content. An SE game can be a valuable tool to identify lower-performing residents who might benefit from additional educational support. Additional research is needed to determine if educational games for residents can improve their practice patterns and patient outcomes, not just their knowledge.

Author Contributions
Study conception and design: Kerfoot
Acquisition of data: Kerfoot
Analysis and interpretation of data: Kerfoot, Baker


Drafting of manuscript: Kerfoot, Baker
Critical revision: Kerfoot, Baker
Final approval of document: Kerfoot, Baker

Acknowledgment: We recognize the invaluable work of Ronald Rouse, Jason Alvarez, and David Bozzi of the Harvard Medical School Center for Educational Technology for their development of the online delivery platform used in this trial.

REFERENCES

1. Akl EA, Gunukula S, Mustafa R, et al. Support for and aspects of use of educational games in family medicine and internal medicine residency programs in the US: a survey. BMC Med Educ 2010;10:26.
2. Salen K, Zimmerman E. Rules of Play: Game Design Fundamentals. Cambridge, MA: The MIT Press; 2004.
3. Johnson LA, Smith R, Willis H, et al. The 2011 Horizon Report. Austin, TX: The New Media Consortium; 2011.
4. Volpp KG, Troxel AB, Pauly MV, et al. A randomized, controlled trial of financial incentives for smoking cessation. N Engl J Med 2009;360:699–709.
5. Volpp KG, Loewenstein G, Troxel AB, et al. A test of financial incentives to improve warfarin adherence. BMC Health Serv Res 2008;8:272.
6. Volpp KG, John LK, Troxel AB, et al. Financial incentive-based approaches for weight loss: a randomized trial. JAMA 2008;300:2631–2637.
7. Kato PM, Cole SW, Bradlyn AS, Pollock BH. A video game improves behavioral outcomes in adolescents and young adults with cancer: a randomized trial. Pediatrics 2008;122:e305–317.
8. Akl EA, Sackett K, Pretorius R, et al. Educational games for health professionals. Cochrane Database Syst Rev 2008:CD006411.
9. Kerfoot BP. Learning benefits of on-line spaced education persist for 2 years. J Urol 2009;181:2671–2673.
10. Kerfoot BP, Armstrong EG, O'Sullivan PN. Interactive spaced-education to teach the physical examination: a randomized controlled trial. J Gen Intern Med 2008;23:973–978.
11. Kerfoot BP, Kearney MC, Connelly D, Ritchey ML. Interactive spaced education to assess and improve knowledge of clinical practice guidelines: a randomized controlled trial. Ann Surg 2009;249:744–749.
12. Kerfoot BP, Lawler EV, Sokolovskaya G, et al. Durable improvements in prostate cancer screening from online spaced education: a randomized controlled trial. Am J Prev Med 2010;39:472–478.
13. SCVNGR's Secret Game Mechanics Playdeck (posted Aug 25, 2010 by Erick Schonfeld). TechCrunch. Available at: http://techcrunch.com/2010/08/25/scvngr-game-mechanics/. Accessed February 27, 2011.
14. Kerfoot BP. Adaptive spaced education improves learning efficiency: a randomized controlled trial. J Urol 2010;183:678–681.
15. Kerfoot BP, Fu Y, Baker H, et al. Online spaced education generates transfer and improves long-term retention of diagnostic skills: a randomized controlled trial. J Am Coll Surg 2010;211:331–337.e1.
16. Pashler H, Rohrer D, Cepeda NJ, Carpenter SK. Enhancing learning and retarding forgetting: choices and consequences. Psychon Bull Rev 2007;14:187–193.
17. Cepeda NJ, Vul E, Rohrer D, et al. Spacing effects in learning: a temporal ridgeline of optimal retention. Psychol Sci 2008;19:1095–1102.
18. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297–334.
19. Pedhazur EJ, Schmelkin LP. Measurement, Design, and Analysis: An Integrated Approach. Hillsdale, NJ: Lawrence Erlbaum; 1991.
20. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Hillsdale, NJ: Erlbaum; 1988.
21. Maxwell SE, Delaney HD. Designing Experiments and Analyzing Data: A Model Comparison Approach. Belmont, CA: Wadsworth; 1990.
22. Kerfoot BP, Baker H. An online spaced-education game for global continuing medical education: a randomized trial (unpublished data).
23. Kerfoot BP, Baker H, Pangaro L, et al. An online spaced-education game to teach and assess medical students: a multi-institutional study (unpublished data).
24. Matzie KA, Kerfoot BP, Hafler JP, Breen EM. Spaced education improves the feedback that surgical residents give to medical students: a randomized trial. Am J Surg 2009;197:252–257.