Adapting Designs
Professor David Torgerson, University of York
Professor Carole Torgerson, Durham University
Trial design
• Numerous trial designs are available to answer different questions. Sometimes the same question could be answered using different designs.
• Trade-off between:
» Statistical efficiency (including contamination);
» Post-randomisation bias;
» Generalisability;
» Cost.
Numerous trial designs
• Individual randomisation;
• Cluster randomisation.
Individual allocation
• “Standard” RCT (Summer schools);
• Waiting list RCT:
» Within school year waiting list (ECC);
» Outside school year waiting list;
• Factorial;
• Combined with regression discontinuity (SHINE);
• Incomplete block design.
Cluster randomisation
• School cluster (Calderdale);
• Class cluster (Grammar for writing);
• Year cluster (Third space);
• Waiting list (Third space outside school);
• Stepped wedge;
• Partial split plot (Grammar for writing);
• Full split plot.
ECC & Online Maths
• In this session we will discuss two RCTs and their designs:
» Every Child Counts (ECC) evaluation;
» Third space (online maths) evaluation (EEF funded study).
Independent evaluation of Every Child Counts intervention ‘Numbers Count’
• Effectiveness research question: Is the ECC numeracy intervention ‘Numbers Count’ better at improving mathematics achievement than normal classroom teaching in numeracy?
• Year 2 pupils at risk in numeracy
• Intervention: one to one teaching, focus on number, every day for 12 weeks
• Control: usual classroom teaching in number and other mathematical concepts
Torgerson, C.J., Wiggins, A., Torgerson, D.J., Ainsworth, H., Hewitt, C., Testing policy effectiveness using a randomized controlled trial, designed, conducted and reported to CONSORT standards, Journal of Research in Mathematics Education, March 2013
Funded by Dept. for Education, £305K, 2009-11
Design of experiment
• 12 children in each of 44 schools selected as eligible for ‘Numbers Count’ intervention
• Maths test (Sandwell test) (pre-test) at beginning of autumn term (administered by teachers)
• Random allocation of 12 children to term of delivery: autumn, spring or summer: ‘waiting list’ design
• Intervention group: autumn children
• Control group: spring and summer children
• Maths test (Progress in Maths test) after 12 weeks (administered by independent testers) (post-test)
• Simple analysis: compare the mean maths post-test score of intervention children with the mean maths post-test score of control children and conclude whether ‘Numbers Count’ is more effective than usual teaching
• Rigorous design: excludes some alternative explanations for results
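The within-school random allocation to term of delivery can be sketched as follows. This is a minimal illustration, not the trial's actual procedure: the equal split of four children per term, the pupil identifiers and the function name `allocate_waiting_list` are all assumptions made for the example.

```python
import random

TERMS = ["autumn", "spring", "summer"]


def allocate_waiting_list(pupil_ids, seed=None):
    """Randomly allocate one school's eligible pupils to a term of delivery.

    Assumes an equal split across the three terms (hypothetical; the
    slides do not state the allocation ratio).
    """
    if len(pupil_ids) % len(TERMS) != 0:
        raise ValueError("number of pupils must divide equally across terms")
    rng = random.Random(seed)  # seeded so the allocation can be reproduced
    shuffled = list(pupil_ids)
    rng.shuffle(shuffled)
    per_term = len(shuffled) // len(TERMS)
    return {
        term: sorted(shuffled[i * per_term:(i + 1) * per_term])
        for i, term in enumerate(TERMS)
    }


# Hypothetical identifiers for the 12 eligible children in one school.
school_pupils = [f"pupil_{i:02d}" for i in range(1, 13)]
allocation = allocate_waiting_list(school_pupils, seed=2009)

# Autumn children form the intervention group; spring and summer
# children act as controls until their own term of delivery.
intervention = allocation["autumn"]
control = allocation["spring"] + allocation["summer"]
```

Allocating within each school means every school contributes children to both groups, and every eligible child still receives the intervention during the year.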
Design features that increased internal validity and acceptability
• Randomisation: intervention and control groups are equivalent at start so design controls for history, maturation, regression to the mean, selection bias
• Large sample size: reduces the likelihood of a chance finding
• Intervention and control conditions are both numeracy interventions and both last for 30 mins. per day for 12 weeks: the comparison is a ‘fair’ one
• Independent ‘blinded’ testing: eliminates possibility of tester bias
• ‘Waiting list’ design so all eligible pupils received intervention
• Small number of ‘wild cards’ allowed
Results
Outcome | Intervention Group | Control Group | Effect Size (95% Confidence Interval)
PIM 6 (0-30) | 15.8 (4.9), N = 144 | 14.0 (4.5), N = 440 | 0.33 (0.12 to 0.53)
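As a rough illustration of how a standardised effect size relates to group summaries, the sketch below computes an unadjusted standardised mean difference using a pooled standard deviation. It assumes the bracketed figures in the results (4.9 and 4.5) are standard deviations, which the slide does not state. The slide also does not say how the reported 0.33 was derived, so this crude calculation is not expected to reproduce it exactly; it comes out somewhat larger.

```python
import math


def standardised_mean_difference(mean_i, sd_i, n_i, mean_c, sd_c, n_c):
    """Unadjusted standardised mean difference using a pooled SD."""
    pooled_var = ((n_i - 1) * sd_i ** 2 + (n_c - 1) * sd_c ** 2) / (n_i + n_c - 2)
    return (mean_i - mean_c) / math.sqrt(pooled_var)


# Summary figures from the results slide; treating the bracketed
# values as standard deviations is an assumption.
d_unadjusted = standardised_mean_difference(15.8, 4.9, 144, 14.0, 4.5, 440)
print(round(d_unadjusted, 2))
```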
Design limitations: Generalisability
• ECC schools were identified by the policy-makers/funders of the programme as part of an education policy ‘roll out’ in England, i.e., schools in disadvantaged areas
• Ideally, a random sample of all primary schools in England should have been approached and asked to take part
Design limitations: Intervention
• One to one teaching with intervention children being withdrawn from classroom
• Problem of attribution: was the effect due to the NC intervention itself, or to one to one teaching?
• Design could have included additional one to one arm
Design limitations: ‘Contamination’/‘spill over’ effects
• Children withdrawn from usual classroom teaching – may have benefited remaining children; teachers using programme may have applied it to some control children.
• Instead of randomising individual children design could have randomised by school (cluster randomisation, where school is the cluster) to avoid these problems.
Design limitations: Long term effects
• Wait list design prevented long term follow-up; effects may have ‘washed out’ soon after intervention was finished.
• Could have used cluster randomisation;
• Could have recruited 3 additional children above threshold and randomised these to intervention or control for long term follow-up;
• All options (above) rejected by funder.
Conclusions
• Design and conduct warranted the conclusion that NC (as delivered) was more effective than usual classroom teaching, BUT because of design limitations the trial couldn’t answer some really important questions
• These questions could have been answered if a different experimental design had been used: cluster randomisation (randomisation of schools), long-term follow-up (control group that didn’t receive intervention); one to one control group (literacy or other numeracy)
Online maths evaluation
• EEF have funded Third Space to deliver one school year of face to face online maths tuition to 600 children, delivered by tutors based in India;
• York Trials Unit with Durham University have designed a trial to evaluate this intervention;
• Several design options are possible.
Individual randomisation
• 600 children randomised to tuition and 600 allocated to nothing would give 80% power to show 0.11 ES difference (pre-post correlation 0.70);
• Unequal allocation (600 to tuition, 1200 to control) would increase efficiency, allowing a 0.10 ES difference to be shown;
• Problems:
» Resentful demoralisation from control children;
» Difficulty in getting schools to take part.
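The detectable effect sizes quoted for individual randomisation can be approximated with a standard normal-approximation formula. The sketch below assumes a two-sided 5% significance level and that adjusting for the pre-test reduces the outcome variance by a factor of (1 - r²). Neither assumption is stated on the slide, and the function name `mdes_individual` is made up for this example.

```python
import math
from statistics import NormalDist


def mdes_individual(n_intervention, n_control, pre_post_corr,
                    alpha=0.05, power=0.80):
    """Approximate minimum detectable effect size (in SD units) for an
    individually randomised two-arm trial with a pre-test covariate."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Share of outcome variance left after adjusting for the pre-test.
    residual_var = 1 - pre_post_corr ** 2
    se = math.sqrt(residual_var * (1 / n_intervention + 1 / n_control))
    return multiplier * se


equal_allocation = mdes_individual(600, 600, 0.70)     # 600 vs 600
unequal_allocation = mdes_individual(600, 1200, 0.70)  # 600 vs 1200
print(round(equal_allocation, 3), round(unequal_allocation, 3))
```

Under these assumptions the two results fall close to the 0.11 and 0.10 quoted above: adding control children shrinks the standard error even though the number of tutored children stays at 600.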
Waiting list
• We could instead randomise 600 children such that all could receive the intervention;
• 300 in term one and 300 in term two (similar to ECC evaluation);
• Power: 80% to show 0.16 ES;
• Problems:
» Lack of long term follow-up; don’t know if intervention’s effects will be sustained.
Cluster trial
• We could randomise schools which would avoid resentful demoralisation at the child level;
• 600 children (assuming 10 per school; ICC 0.19; pre/post 0.70), would give us 80% power to show 0.19 ES difference;
• Problem:
» Schools in the control group may be more likely to drop-out introducing attrition bias.
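The loss of efficiency from randomising schools rather than children can be illustrated with the usual design effect, 1 + (m - 1) × ICC, where m is the number of pupils per school. The sketch below is an approximation under stated assumptions: a two-sided 5% significance level, the pre-test adjustment applied as a simple (1 - r²) variance reduction, and 600 children per arm. The last point is one reading of the slide, chosen because it approximately reproduces the quoted 0.19. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def design_effect(cluster_size, icc):
    """Variance inflation from randomising whole clusters."""
    return 1 + (cluster_size - 1) * icc


def mdes_cluster(n_per_arm, cluster_size, icc, pre_post_corr,
                 alpha=0.05, power=0.80):
    """Approximate minimum detectable effect size for a two-arm cluster
    trial with equal arms (simplified normal-approximation formula)."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    variance = (
        (2 / n_per_arm)
        * design_effect(cluster_size, icc)
        * (1 - pre_post_corr ** 2)
    )
    return multiplier * math.sqrt(variance)


deff = design_effect(10, 0.19)
# Assumption: 600 children in each arm, 10 per school.
cluster_mdes = mdes_cluster(600, 10, 0.19, 0.70)
print(round(deff, 2), round(cluster_mdes, 2))
```

With an ICC of 0.19 and 10 pupils per school the design effect is about 2.7, so each arm behaves roughly like an individually randomised arm a little over a third of its size.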
Cluster/wait list design
• We could randomise schools to offer the intervention to children in year 6, with the waitlist schools getting the intervention for their next year’s year 6 pupils;
• Prevent school level drop-out;
• Allow long term follow-up;
• Problem:
» Lower efficiency than previous design (0.26 ES detectable), but lower risk of bias.
What has actually happened?
• Aimed to recruit 60 schools with an average of 10 pupils per school;
• However, we over-recruited (72 schools), so we are recruiting 8 pupils per school;
• This improves our efficiency so that we now can detect an effect size of 0.25 rather than 0.26.
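Why more schools with fewer pupils each is more efficient can be seen with a simplified calculation. The sketch assumes equal numbers of schools in each arm, a two-sided 5% significance level, 80% power, and the ICC (0.19) and pre/post correlation (0.70) quoted for the cluster design. The function name is hypothetical, and the formula is an approximation rather than the trial team's actual calculation.

```python
import math
from statistics import NormalDist


def mdes_for_recruitment(n_schools, pupils_per_school, icc=0.19,
                         pre_post_corr=0.70, alpha=0.05, power=0.80):
    """Approximate detectable effect size when schools are split
    equally between two arms (simplified formula, see assumptions)."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    n_per_arm = (n_schools // 2) * pupils_per_school
    deff = 1 + (pupils_per_school - 1) * icc  # design effect
    variance = (2 / n_per_arm) * deff * (1 - pre_post_corr ** 2)
    return multiplier * math.sqrt(variance)


planned = mdes_for_recruitment(60, 10)  # original target: 600 pupils
actual = mdes_for_recruitment(72, 8)    # what happened: 576 pupils
print(round(planned, 2), round(actual, 2))
```

Under these assumptions the calculation gives roughly 0.27 for the planned design and 0.25 for the achieved one, close to the figures quoted. Although the total number of pupils is slightly lower, the smaller cluster size reduces the design effect by more than enough to compensate.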
Activity
• In small groups discuss your EEF trials where the trial design has been adapted to increase: acceptability or implementation of the intervention; internal validity; or external validity;
• Select the most interesting/significant example for feedback to whole group.