
Statistical Randomization Tests: Issues and Applications

Randomization Tests versus Permutation Tests

Test Statistic Choice

Complete versus Sampled Randomization Distribution

Randomization Test

1. Random assignment of 3 blocks to each condition

2. Compute observed test statistic

3. Create randomization distribution by computing test statistic for each of the 20 possible randomizations

4. Compare the obtained test statistic to the randomization distribution
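The four steps above can be sketched as follows. This is a minimal illustration, not the software used in the studies cited later: it assumes six blocks summarized by their means, three blocks assigned to each condition, and the mean difference (B minus A) as the test statistic. The function name `complete_randomization_test` and the data values are hypothetical.

```python
from itertools import combinations
from statistics import mean

def complete_randomization_test(block_means, treated_blocks):
    """One-sided p-value from the complete randomization distribution.

    block_means: one summary value per block (hypothetical data).
    treated_blocks: indices of the blocks actually assigned to condition B.
    """
    n_blocks = len(block_means)
    k = len(treated_blocks)

    def statistic(b_blocks):
        # Mean difference, B minus A, for one possible assignment.
        b = [block_means[i] for i in b_blocks]
        a = [block_means[i] for i in range(n_blocks) if i not in b_blocks]
        return mean(b) - mean(a)

    observed = statistic(treated_blocks)
    # Enumerate every way of assigning k of the blocks to condition B.
    distribution = [statistic(c) for c in combinations(range(n_blocks), k)]
    # Proportion of assignments with a statistic at least as large as observed.
    p_value = sum(t >= observed for t in distribution) / len(distribution)
    return observed, distribution, p_value

# Hypothetical block means; blocks 1, 3, and 5 received condition B.
block_means = [2.0, 5.1, 2.4, 5.6, 1.9, 5.3]
observed, distribution, p = complete_randomization_test(block_means, (1, 3, 5))
print(len(distribution), p)
```

With these made-up values the observed assignment yields the largest of the 20 statistics, so the p-value is 1/20.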

Permutation Test

1. No random assignment

2. Compute observed test statistic

3. Create permutation distribution by computing test statistic for data permutations

- 20 permutations of blocks of 5 observations?

- 155,117,520 permutations of individual observations?

4. Compare the obtained test statistic to the permutation distribution
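The two counts in step 3 can be checked directly. The sketch below assumes 30 observations split evenly between conditions, treated either as 6 blocks of 5 (3 blocks per condition) or as 30 individual observations (15 per condition); the variable names are illustrative.

```python
from math import comb

# Ways to label 3 of 6 blocks as condition B.
n_block_permutations = comb(6, 3)

# Ways to label 15 of 30 individual observations as condition B.
n_observation_permutations = comb(30, 15)

print(n_block_permutations, n_observation_permutations)
```

The choice of what to permute changes the reference distribution by several orders of magnitude, which is why the question on the slide matters when no random assignment defines the answer.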

Type I error control of randomization tests

Type I error is controlled as long as:

1. Randomization is used in some form

2. Randomization distribution mirrors the randomization used

3. The test statistic choice is not influenced by knowledge of treatment assignments

Type I error control of permutation tests

Type I error may or may not be controlled

Type I error can be controlled if one can assume exchangeability, but this is generally difficult to assume with time series data.

Consequently, it is statistically preferable to use some form of random assignment and a randomization test rather than a permutation test.

What if you can’t randomize?

Some permutation tests work better than others to control the Type I error rate.

Design: ABABABABABABABABABABABABABABAB

Type of Permutation: individual observations

Sample Permutations:

ABABABABABABABABABABABABABABAB

BBAAABBAABABABABABABABABABBAAB

AAABABABABABABABBBABABABABABAB

This test will become conservative with positive autocorrelation


Design: AAAAABBBBBAAAAABBBBBAAAAABBBBB

Type of Permutation: blocks of 5 observations

Sample Permutations: AAAAABBBBBAAAAABBBBBAAAAABBBBB

AAAAAAAAAABBBBBBBBBBAAAAABBBBB

BBBBBAAAAAAAAAABBBBBBBBBBAAAAA

This test will become conservative with positive autocorrelation


Design: AAAAAAABBBBBBBAAAAAAABBBBBBB

Type of Permutation: start points (say minimum phase length = 5)

Sample Permutations: AAAAAAABBBBBBBAAAAAAABBBBBBB

AAAAAAAABBBBBAAAAAAAAABBBBBB

AAAAABBBBBBBBAAAAAAABBBBBBBB

This test will become conservative with positive autocorrelation
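The start-point permutations for the last design can be enumerated as below. This is a sketch under stated assumptions: 28 observations, four phases in ABAB order, and a minimum length of 5 applied to every phase. The function name `start_point_designs` is hypothetical.

```python
from itertools import combinations

def start_point_designs(n_obs, n_phases, min_len):
    """List every phase layout whose phases all have at least min_len observations."""
    designs = []
    # Choose the observation index at which each later phase starts.
    for starts in combinations(range(1, n_obs), n_phases - 1):
        bounds = (0, *starts, n_obs)
        lengths = [end - begin for begin, end in zip(bounds, bounds[1:])]
        if min(lengths) >= min_len:
            # Phases alternate A, B, A, B, ...
            labels = "".join("AB"[i % 2] * n for i, n in enumerate(lengths))
            designs.append(labels)
    return designs

designs = start_point_designs(n_obs=28, n_phases=4, min_len=5)
print(len(designs))
print(designs[0])
```

Under these assumptions the set of admissible layouts is far smaller than the set of permutations of individual observations, and it includes the three sample permutations shown above.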

Choice of Test Statistic

Anticipated increase in level:

$T = M_B - M_A$

$T = \mathrm{Median}_B - \mathrm{Median}_A$

[Figure: example plot, axis scaled from 0 to 1]


Delayed increase in level:

$T = M_B^* - M_A$, where $M_B^*$ is the mean of the last n observations in B

[Figure: example plot, axis scaled from 0 to 1]


Change in slope:

$T = b_B - b_A$

[Figure: example plot, axis scaled from 0 to 1]


Change in variation:

$T = s_B / s_A$

[Figure: example plot, axis scaled from 0 to 1]
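The test statistics in this section can be written as small functions of the phase A and phase B observations. This is an illustrative sketch: the function names are hypothetical, the slope is taken to be the ordinary least-squares slope against observation order, and the variation statistic is assumed to be a ratio of standard deviations.

```python
from statistics import mean, median, stdev

def slope(y):
    """Least-squares slope of y regressed on time points 0, 1, ..., n-1."""
    n = len(y)
    t_bar = (n - 1) / 2
    y_bar = mean(y)
    numerator = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y))
    denominator = sum((t - t_bar) ** 2 for t in range(n))
    return numerator / denominator

def level_mean(a, b):
    return mean(b) - mean(a)                # anticipated increase in level

def level_median(a, b):
    return median(b) - median(a)            # outlier-resistant alternative

def delayed_level(a, b, n_last):
    return mean(b[-n_last:]) - mean(a)      # uses only the last n_last B observations

def slope_change(a, b):
    return slope(b) - slope(a)              # change in slope

def variation_ratio(a, b):
    return stdev(b) / stdev(a)              # change in variation (assumed ratio form)

# Hypothetical data: flat baseline, then a steadily rising treatment phase.
a = [1, 2, 1, 2, 1]
b = [2, 3, 4, 5, 6]
print(level_mean(a, b), level_median(a, b), delayed_level(a, b, 2), slope_change(a, b))
```

Whichever statistic is used, it should be chosen from the anticipated effect before the data are examined, in line with the Type I error conditions listed earlier.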

Complete versus Sampled Randomization Distribution

Complete – Randomization distribution is constructed by systematically enumerating all possible random assignments and computing the test statistic for each.

Sampling – Randomization distribution is constructed by randomly sampling with replacement some large number (e.g., 1000) of the possible assignments, and then computing the test statistic for each sampled assignment.

Suppose you have 20 possible randomizations and the observed test statistic is the largest of the 20.

Complete: the p-value = .05

Sampling: the p-value will be approximately .05, falling at or below .05 in some runs and above it in others. Therefore, you will get a statistically significant (p ≤ .05) result only about half the time you run the test.
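A small simulation illustrates the sampling case. It assumes the sampled p-value is the simple proportion of 1000 draws (with replacement) whose statistic is at least as large as the observed one, and that the observed statistic is the unique largest of 20 equally likely assignments. The function name and the seed are arbitrary.

```python
import random

def sampled_p_value(rng, n_assignments=20, n_samples=1000):
    """Sampled p-value when the observed statistic is the largest of n_assignments."""
    # A draw matches or exceeds the observed statistic only when it is the
    # observed assignment itself, which happens with probability 1/n_assignments.
    hits = sum(rng.random() < 1 / n_assignments for _ in range(n_samples))
    return hits / n_samples

rng = random.Random(2024)  # fixed seed so the sketch is repeatable
p_values = [sampled_p_value(rng) for _ in range(1000)]

# Share of repeated tests that reach p <= .05.
share_significant = sum(p <= 0.05 for p in p_values) / len(p_values)
print(round(share_significant, 2))
```

The printed share should land near one half, because the sampled p-values scatter on both sides of .05.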

Research Applications

The Impact of a Computer Network on Pediatric Pain and Anxiety: A Randomized Control Clinical Trial

Authors: Holden, G., Bearison, D. J., Rode, D. C., Kapiloff, M. F., Rosenberg, G., & Rosenzweig, J.

Publication Date: 2002

Journal: Social Work in Health Care, 36, 21-33

Treatment: Starbright World – a private computer network that allows hospitalized children to interact with other hospitalized children

Outcomes: Pain intensity, pain aversiveness, and anxiety (self report)

Design: Alternating treatment design replicated across participants with random assignment of conditions (B=SBW, A=Control) to observation periods with the restriction that there couldn’t be more than 6 consecutive periods of the same condition.

Sample: 39 Hospitalized Children 7-18 years old that had at least 8 observations and went through at least one reversal (e.g., ABA)

Randomization Test: separate test on each child for each outcome using the mean difference as the test statistic and Single Case Randomization Test (SCRT) software

Promoting Expressive Writing Among Students with Emotional and Behavioral Disturbance Via Dialogue Journals

Authors: Regan, K. S., Mastropieri, M. A., & Scruggs, T. E.


Journal: Behavioral Disorders, 31, 33-50

Treatment: individualized dialogue journals in which students and teachers communicated daily about observed social and behavioral issues

Outcomes: Time on task, number of words written, and writing quality

Design: Multiple-baseline design across participants with randomly ordered intervention onset. Baseline lengths of 4, 8, 12, 16, or 20 observations in a series of 26 observations

Sample: 5 students in 6th grade identified as having EBD

Randomization Test: “These differences were also statistically significant from baseline to intervention phases according to a randomization test (p < .001).” How many randomizations were possible? Minimum p-value?
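One way to work through the two questions is sketched below. It assumes that the randomization consisted of assigning the five baseline lengths to the five students, one each, in random order; that reading of the design is an assumption, and the variable names are illustrative.

```python
from math import factorial

n_students = 5

# Number of ways to assign 5 distinct baseline lengths to 5 students.
n_randomizations = factorial(n_students)

# Smallest p-value available from the complete randomization distribution.
min_p_value = 1 / n_randomizations

print(n_randomizations, round(min_p_value, 4))
```

Under this assumption the smallest attainable p-value exceeds .001, so a reported p < .001 could not come from a distribution that mirrors the randomization actually used.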

Extending Classwide Social Skills Interventions to At-Risk Minority Students: A Preliminary Application of Randomization Tests Combined with Single-Subject Methodology

Authors: Bardon, L. A., Dona, D. P., & Symons, F. J.


Journal: Behavioral Disorders, 33, 141-152

Treatment: PATHS a violence prevention program

Outcome: % of intervals cooperative play during classroom play time

Design: Replicated AB design across participants with randomly selected intervention start times. For each participant, onset of the intervention was selected randomly from the 6th to 15th observation in a planned 20 observation sequence. Baseline lengths were 6, 8, and 8. Is this random assignment ideal?

Sample: 3 elementary students with at least 3 office behavioral referrals

Randomization Test: “…the proportion of 1,000 randomly sampled data divisions producing a mean difference in the predicted direction at least as large as the experimentally obtained difference was 0.036”

How many randomizations were possible?
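A possible count is sketched below. It assumes each of the 3 participants had an intervention start point drawn independently from the 10 candidate observations (6th through 15th); independence across participants is an assumption, and the variable names are illustrative.

```python
n_participants = 3

# Candidate start points: the 6th through the 15th observation.
candidate_starts = range(6, 16)

# Independent choices multiply across participants.
n_randomizations = len(candidate_starts) ** n_participants

# Smallest p-value available from the complete randomization distribution.
min_p_value = 1 / n_randomizations

print(n_randomizations, min_p_value)
```

If that count is right, the complete distribution is the same size as the 1,000 sampled data divisions, so systematic enumeration would have been just as feasible as sampling.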
