

Global test statistics for treatment effect of stroke and traumatic brain injury in rats with administration of bone marrow stromal cells

Mei Lu a,*, Jieli Chen b, Dunyue Lu c, Li Yi b, Asim Mahmood c, Michael Chopp d

a Department of Biostatistics and Research Epidemiology, Henry Ford Health Sciences Center, One Ford Place, 3E, Detroit, MI 48202, USA

b Department of Neurology, Henry Ford Health System, Detroit, MI 48202, USA

c Department of Neurosurgery, Henry Ford Health System, Detroit, MI 48202, USA

d Oakland University, Department of Physics, Rochester, MI 48309, USA

Received 20 February 2003; received in revised form 12 June 2003; accepted 12 June 2003

Journal of Neuroscience Methods 128 (2003) 183-190

www.elsevier.com/locate/jneumeth

Abstract

Because no single test measures disability in rats with middle cerebral artery occlusion/traumatic brain injury (MCAo/TBI), multiple tests are needed to assess the effect of bone marrow stromal cells (MSCs) on functional recovery. Testing the treatment effect on each outcome at the 0.05 level without adjusting for multiple outcomes can increase type I error. Therefore, we applied the global test to evaluate a common MSC dose effect on multiple outcomes in two applications: (i) MCAo rats with the MSC dose of zero (PBS), 1 × 10^6 and 3 × 10^6, and (ii) TBI rats with the MSC dose of zero, 1 × 10^6, 2 × 10^6 and 4 × 10^6, administered intravenously at 1 day after injury. For the MCAo rats, 3 × 10^6 MSCs improved the 14-day functional recovery (P < 0.05) compared to the controls. TBI rats with the MSC dose 4 × 10^6 were improved significantly at 1 month compared to controls and to rats with 1 × 10^6 or 2 × 10^6 MSCs (P < 0.05). The global test on multiple outcomes is more efficient than a single outcome when treatment effects are consistent. The less correlation among the outcomes, the more power and, therefore, the higher efficiency of the global test. We demonstrated that the global test for continuous outcomes could be implemented under careful statistical modeling and proper data transformation.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Stroke; Traumatic brain injury; Rat; Data analysis; Global test

1. Introduction

Stroke is the third leading cause of death and the leading cause of serious long-term disability, and treatment of stroke is restricted to thrombolysis (rt-PA) within a 3-h window after symptom onset. Traumatic brain injury (TBI) is an important cause of human morbidity, and as many as 50 000 Americans are killed and an equal number disabled by head trauma each year. Currently, treatment of TBI consists of evacuating the mass lesion and providing an optimal milieu for brain recovery, which cannot repair the bio-structural neuronal damage. Marrow stromal cells (MSCs) administered at 24 h after brain injury have shown therapeutic benefit for functional recovery after stroke and TBI in animals (Chen et al., 2001; Lu et al., 2001; Mahmood et al., 2001a). The functional response was assessed using the modified neurological severity score (mNSS), Adhesive-Removal patch test, Rotarod motor test and Corner test scores.

No single test (e.g. Barthel index, Modified Rankin scale, Glasgow outcome scale, or NIH stroke scale for stroke patients) describes all dimensions of brain deficit and recovery. In animal research, the researcher may focus on a single outcome, even though none is considered the ‘gold standard.’ At the other extreme, we may report an array of tests without adjusting for the number of outcomes. Consequently, no integrated conclusion can be drawn. For example, a treatment comparison of each outcome at the critical value of 0.05 without adjusting for multiple outcomes can increase type I error (the probability of erroneously rejecting the null hypothesis) from 5 to 15%, for three outcome

* Corresponding author. Tel.: +1-313-874-6413; fax: +1-313-874-6730.

E-mail address: [email protected] (M. Lu).

0165-0270/03/$ - see front matter © 2003 Elsevier B.V. All rights reserved.

doi:10.1016/S0165-0270(03)00188-2


measurements. The Bonferroni approach, which divides

type I error as 0.05/3 ≈ 0.017 for each outcome of three

and detects a treatment benefit if there is significant

treatment benefit on every single outcome at the critical

value of 0.017, is too conservative (O’Brien, 1984;

Pocock et al., 1987). Several global test statistics have

been developed to test treatment efficacy for multiple

binary outcomes (Lipsitz et al., 1991; Lu and Tilley,

2001) with important applications for clinical trial

studies (National Institute of Neurological Disorders

and Stroke rt-PA Stroke Study Group, 1995; Lehmann

et al., 1986). In 1995, the NINDS t-PA Stroke Study

Group reported the effectiveness of t-PA on 3-month

recovery (National Institute of Neurological Disorders

and Stroke rt-PA Stroke Study Group, 1995). The

primary outcome (a favorable outcome) was defined

from four neurological scores at 3 months, each

dichotomized as success or failure. Success was defined

on the Barthel index (an ordinal scale in increments of 5)

as a score of 95 or 100, on the modified Rankin scale as

0 or 1, on the Glasgow scale as 1, and on the NIH stroke

scale as 0 or 1. A global test was used to test for a

common treatment effect on the four binary outcomes,

followed by testing the treatment benefit on each

individual score at the 0.05 level, if the global test was

significant at the 0.05 level. Among these four binary

outcomes, the proportion agreements are in a range of

0.77-0.94 and φ coefficients (Cohen and Cohen, 1983)

are in a range of 0.55-0.78 (Tilley et al., 1996). There

was significant t-PA benefit on 3-month stroke

recovery, based on the global test as well as

significant t-PA benefit on each outcome. For clinical

studies, many publications that include the global

test to handle multiple endpoints in stroke research are

limited to the binary outcomes (e.g. the evidence

of presence or absence). However, in pre-clinical investigations,

the global test has not been employed and

tests/outcomes tend to be measured as continuous

variables.
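
To make the type I error inflation mentioned in this paragraph concrete, the following minimal Python sketch computes the familywise error rate when K outcomes are each tested at level alpha, together with the Bonferroni per-outcome level. It assumes the K tests are independent, which is a simplification not made in the paper, and the function names are illustrative. Under independence, three outcomes give a rate of about 14.3%; the 15% quoted in the text corresponds to the upper bound K × alpha, and positively correlated outcomes fall somewhere between 5% and that bound.

```python
def familywise_error(alpha: float, k: int) -> float:
    """Probability of at least one false rejection among k independent tests."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_level(alpha: float, k: int) -> float:
    """Per-outcome significance level under the Bonferroni correction."""
    return alpha / k


# Inflation for one to four outcomes, each tested at the 0.05 level.
for k in (1, 2, 3, 4):
    print(k, round(familywise_error(0.05, k), 4), round(bonferroni_level(0.05, k), 4))
```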

It seems easy to generalize the global test statistics for

multiple continuous outcomes using the analysis approach

for longitudinal data (Liang and Zeger, 1986;

Zeger and Liang, 1986). However, in analyzing data

collected to assess the MSC dose effect on neuro-functional

recovery in rats with brain injury, we

encountered outcomes measured in various scales that

would lead to an invalid test statistic, if no adjustments

were made. In this paper, we extend analysis of

covariance (ANCOVA) to calculate the global test,

which measures a common dose effect among multiple

test scores after re-scaling outcomes. We then test the

MSC dose effect on neuro-functional recovery in rats

with MCAo (stroke) and with TBI using proper global

test statistics.

2. Materials and methods

2.1. Animal 2 h MCAo model and therapeutic MSCs

Young male Wistar rats were anesthetized with halothane. The right femoral artery and vein were cannulated for measuring blood gases (pH, pO2, pCO2) and blood pressure as basic physiological variables. MCAo was induced by advancing a surgical nylon suture (4-0 for rats) with an expanded (heated) tip from the external carotid artery into the lumen of the internal carotid artery to block the origin of the MCA, and reperfusion was performed by withdrawal of the suture (Chen et al., 1992). Donor MSCs were obtained from age-matched adult male Wistar rats. Physiological monitoring (e.g. blood gases, blood pressure, and body temperature) was performed on all animals. No immunosuppression was used. We administered MSCs IV with a dosage of zero (PBS), 1 × 10^6/ml, or 3 × 10^6/ml to rats subjected to 2 h of MCAo (Chen et al., 2001) at 1 day after MCAo.

2.2. Animal TBI model in rats

Young male Wistar rats (weighing 200-300 g) were anesthetized with chloral hydrate (35 mg/100 g) intraperitoneally, and then subjected to controlled cortical impact (Dixon et al., 1991; Mahmood et al., 2001b). MSCs were obtained from age-matched male Wistar donor rats and injected via the tail vein. No immunosuppression was used. The treatment MSCs were administered at 24 h after TBI with various doses: zero (PBS), 1 × 10^6/ml, 2 × 10^6/ml or 4 × 10^6/ml of MSCs.

2.3. Outcomes of interest

Several tests, such as the mNSS, Adhesive-Removal

test, Motor Rotarod test and the Corner test scores, are

used to evaluate neurological and behavior status of rats

with brain injury. For different injury models, different

tests might be performed. The assessments of neurological

and behavior recovery used in the two previous

injury models are described in the following:

(1) Motor Rotarod test: an accelerating Rotarod was used to measure the motor function (Lennmyr et al., 1998). Rats were trained 3 days before injury. A rat was placed on the Rotarod cylinder and the time remaining on the Rotarod was recorded. The speed was slowly increased from 4 to 40 rpm within 5 min. An experiment was ended if the rat fell off the rungs or gripped the device and spun around for two consecutive revolutions without attempting to walk on the rungs. The mean duration (in seconds) on the device was recorded in three replicates 1 day before the brain injury (pre-baseline). The motor test score is calculated as a percentage of the mean duration in three replications compared to the pre-baseline mean duration.

(2) Adhesive-Removal test (Schallert et al., 2000) measures somatosensory deficit both before and after surgery. All rats were familiarized with the testing environment. In the initial test, two small pieces of adhesive-backed paper dots (of equal size, 113.1 mm^2) were used as bilateral tactile stimuli occupying the distal-radial region on the wrist of each forelimb. The rat was then returned to its cage. The time to remove each stimulus from the forelimbs was recorded on five trials per day. Individual trials were separated by at least 5 min. The animals were trained for 3 days prior to surgery. Once the rats were able to remove the dots within 10 s, they were subjected to the surgery (MCAo or TBI).

(3) The mNSS test (Chen et al., 2001) is graded on a scale of 0-18 (normal score 0; maximal deficit score 18). The mNSS is a composite of motor, sensory, reflex and balance tests. In the severity scores of injury, one score point is awarded for the inability to perform the test or for the lack of a tested reflex, as described in Table 1.

(4) The Corner test measures sensory and motor functions and is more sensitive for long-term sensory and motor deficits (Zhang et al., 2002). A rat was placed between two boards, each with dimensions of 30 × 20 × 1 cm^3, in the home cage. The edges of the two boards were attached at a 30° angle with a small opening along the joint between the two boards to encourage entry into the corner. The rat was placed between the two angled boards facing the corner and half-way to the corner. When entering deep into the corner, both sides of the vibrissae were stimulated together. The rat then reared forward and upward, and then turned back to face the open end. A non-injured rat turned either left or right, but the injured rats preferentially turned toward the non-impaired, ipsilateral (right for MCAo and left for TBI) side. The turns in one versus the other direction were recorded from ten trials for each test, and the fraction of the turns was used as the Corner test score.

For the MCAo model, we used the mNSS, Adhesive-Removal and Rotarod tests, and for the TBI model, we used the mNSS and Corner tests for neurological and behavior status. MCAo/TBI induced neurological deficits include somatosensory, motor, balance and other deficits, which cannot be evaluated by one test. Therefore, we selected the Adhesive-Removal test to determine the somatosensory deficit, the Rotarod test for motor deficit detection, the Corner test for sensory and motor deficit of the TBI model, and the mNSS, which measures sensory and motor function, balance and reflexes and is similar to clinical assessments. Functional deficit/recovery, defined as the set of functional tests for each brain injury model (MCAo or TBI), is measured at baseline prior to the treatment and at 14 days after the brain injury. We are interested in testing the null hypothesis: there is no MSC treatment effect on functional recovery in rats with brain injury (stroke, TBI).

2.4. Statistical background of the global test statistic

We consider observations $(Y_i, X_i)$, for $i = 1, 2, \ldots, N$ (the number of subjects), where the vector $Y_i' = (y_{i,1}, y_{i,2}, \ldots, y_{i,K})$ represents $K$ outcomes for some integer $K$ (e.g. mNSS, Adhesive-Removal test and Rotarod test scores are the outcomes for the MCAo model, and $K = 3$) and $X_i' = (x_{i,1}, x_{i,2}, \ldots, x_{i,q})$ represents $q$ covariates for some integer $q$, where $x_{i,q}$ is the treatment indicator, for convenience, with the value of 0 as the MSC zero dose, 1 as MSC dose 1, and 2 as MSC dose 2. Suppose $Y_i$ has a distribution $f(\cdot, U_i, \phi)$, in which $E Y_i = U_i$ is a $K \times 1$ vector and $\phi$ is an unknown scale parameter. The statistical model can be expressed as

\[
\begin{bmatrix} g(u_{i,1}) \\ g(u_{i,2}) \\ \vdots \\ g(u_{i,K}) \end{bmatrix}_{K \times 1}
=
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & x_{i,1} & \cdots & x_{i,q} \\
0 & 1 & 0 & \cdots & 0 & x_{i,1} & \cdots & x_{i,q} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1 & x_{i,1} & \cdots & x_{i,q}
\end{bmatrix}_{K \times (K+q)}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_K \\ \beta_1 \\ \vdots \\ \beta_q \end{bmatrix}_{(K+q) \times 1}
\tag{1}
\]

Table 1

Modified neurological severity score

Test                                                                      Points

Motor tests
  Raising the rat by the tail                                               3
    1  Flexion of forelimb
    1  Flexion of hind limb
    1  Head moved more than 10° to the vertical axis within 30 s
  Walking on the floor (normal = 0; maximum = 3)                            3
    0  Normal walk
    1  Inability to walk straight
    2  Circling toward the paretic side
    3  Fall down to the paretic side
Sensory tests                                                               2
    1  Placing test (visual and tactile test)
    2  Proprioceptive test (deep sensation, pushing the paw against the
       table edge to stimulate limb muscles)
Beam balance tests (normal = 0; maximum = 6)                                6
    0  Balances with steady posture
    1  Grasps side of beam
    2  Hugs the beam and one limb fall down from the beam
    3  Hugs the beam and two limbs fall down from the beam, or spins on
       the beam (>60 s)
    4  Attempts to balance on the beam but falls off (>40 s)
    5  Attempts to balance on the beam but falls off (>20 s)
    6  Falls off: no attempt to balance or hang on to the beam (<20 s)
Reflexes absence and abnormal movements                                     4
    1  Pinna reflex (a head shake when touching the auditory meatus)
    1  Corneal reflex (an eye blink when lightly touching the cornea
       with cotton)
    1  Startle reflex (a motor response to a brief noise from snapping
       a clipboard paper)
    1  Seizures, myoclonus, myodystony
Maximum points                                                             18

One point is awarded for the inability to perform the tasks or for the lack of a tested reflex; 13-18 severe injury; 7-12 moderate injury; 1-6 mild injury.

where $g(\cdot)$ is a link function, $\alpha_k$ is the intercept for the $k$th outcome, $\beta_1, \beta_2, \ldots, \beta_{q-1}$ is the set of nuisance parameters (e.g. pre-treatment functional test scores), and $\beta_q$ is the parameter of interest (the coefficient for the common treatment effect). Assuming the $K \times K$ covariance matrix, $\mathrm{var}(Y_i) = \phi V(U_i)$, for some known $K \times K$ matrix $V$ (the working covariance matrix), the quasi-likelihood estimator vector $\beta$, $\beta' = (\alpha_1, \alpha_2, \ldots, \alpha_K, \beta_1, \ldots, \beta_q)$, is the solution of the score-like equation system

\[
\sum_{i=1}^{N} \left( \frac{\partial U_i}{\partial \beta} \right)' V^{-1}(U_i) \, (Y_i - U_i) = 0 \tag{2}
\]

Wedderburn (1974) first proposed the quasi-likelihood theory. Liang and Zeger (1986) developed generalized estimating equations (GEE) based on the score-like equations in Eq. (2) for multiple or clustered continuous outcomes and for discrete longitudinal data (Zeger and Liang, 1986). In Eq. (2), $\beta$ is estimated using a quasi-likelihood function, not a proper likelihood function. Therefore, we could assume the link function, $g(\cdot)$, and variance structure, matrix $V$, without attempting to specify the entire distribution of $Y_i$, for $i = 1, 2, \ldots, N$. When matrix $V$ is the true variance matrix of $Y_i$, Eq. (2) becomes the score equation system yielding a maximum likelihood estimate of $\beta$, which can be calculated using the ANCOVA approach (Park, 1993).
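
As an illustration only, the special case of Eqs. (1) and (2) with an identity link, a working-independence matrix V, and no covariates other than the dose code has a closed-form solution: the common slope is the slope pooled across the K outcomes. The Python sketch below computes it together with a cluster-robust (sandwich) standard error that treats each animal as a cluster. This is not the authors' SAS code. The function name and data layout are assumptions, the outcomes are assumed to be already aligned in direction and standardized, and the normal-approximation P-value can be optimistic when only a few animals are available.

```python
from math import sqrt
from statistics import NormalDist, mean


def global_dose_effect(dose, outcomes):
    """Pooled (common) dose slope across K outcomes under working independence.

    dose     -- list of N dose codes (e.g. 0, 1, 2), one per animal
    outcomes -- list of N rows, each holding that animal's K scores, assumed
                already oriented in the same direction and standardized
    Returns (beta_hat, robust_se, z, two_sided_p).
    """
    n, k = len(dose), len(outcomes[0])
    x_bar = mean(dose)
    xc = [x - x_bar for x in dose]  # centered dose codes
    sxx = sum(v * v for v in xc)
    y_bar = [mean(row[j] for row in outcomes) for j in range(k)]

    # Solving Eq. (2) with V = I and an identity link gives the pooled slope.
    sxy = sum(xc[i] * (outcomes[i][j] - y_bar[j])
              for i in range(n) for j in range(k))
    beta_hat = sxy / (k * sxx)

    # Sandwich variance: residuals are summed within each animal (cluster),
    # so the correlation among that animal's K outcomes is accounted for.
    meat = 0.0
    for i in range(n):
        resid_sum = sum(outcomes[i][j] - y_bar[j] - beta_hat * xc[i]
                        for j in range(k))
        meat += (xc[i] * resid_sum) ** 2
    robust_se = sqrt(meat) / (k * sxx)

    z = beta_hat / robust_se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return beta_hat, robust_se, z, p


# Hypothetical standardized scores for six animals and two outcomes.
example_dose = [0, 0, 1, 1, 2, 2]
example_scores = [[0.9, 1.1], [0.6, 0.4], [0.2, 0.1],
                  [-0.1, 0.0], [-0.7, -0.6], [-0.9, -1.0]]
print(global_dose_effect(example_dose, example_scores))
```

A negative slope in this example means that the deficit score falls as the dose code rises. Pre-treatment scores and other covariates would enter as additional columns of the design matrix in Eq. (1), which this closed form does not cover.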

We can conduct a global test for the treatment effect on correlated outcomes assuming a common treatment effect on all outcomes (a homogeneous effect). Testing the null hypothesis described in the above section is equivalent to testing $H_0: \beta_q = 0$ in Eqs. (1) and (2), where the estimate of $\beta_q$ is the global test statistic for the MSC effect. The treatment of MSC is effective if $\beta_q \neq 0$ at the critical level of 0.05. In addition, Eq. (1) can be extended to test the common treatment effect and the variable interaction (e.g. MSC and the time of MSC administration interaction) by adding the interaction term along with the two individual main effect terms.

Furthermore, in order to calculate a valid global test, based on Eqs. (1) and (2), we need to have a closer look at how the outcome data are collected. Two necessary steps need to be checked before conducting the global test for the MSC effect.

2.4.1. Data transformation or re-scaling the outcome variables

2.4.1.1. Outcome measurement consistency. Stroke recovery can be measured in different directions numerically. For example, the mNSS test score, in a range of 0-18, is a composite of motor, sensory, reflex and balance tests, where the higher the score, the more severe the brain injury and the less the recovery. In contrast, the Rotarod test measures the time the animals remained on the Rotarod cylinder; the higher the score, the less severe the brain injury and the more the recovery. Including the inconsistent outcome measures in Eq. (1) will provide an invalid test that diminishes the treatment effect. To solve this problem, all of the outcomes of interest should be evaluated for numerical direction of injury. Data transformation, such as subtracting from the maximum or calculating the reciprocal of the scale, if the scale $\neq 0$, will be considered.

2.4.1.2. Outcome re-scaling. From Eqs. (1) and (2), for a continuous outcome, the scale of the outcome plays a role in testing the treatment effect. The outcome with the larger scale will dominate the estimation of the treatment effect, compared to the outcome with a smaller scale, which is a violation of the common treatment assumption in Eq. (1). To overcome this, data will be expressed in common units via a transformation described in the following expression:

\[
y^{*}_{i,k} = \frac{y_{i,k} - \bar{y}_k}{s_k} \tag{3}
\]

where $\bar{y}_k$ is the mean and $s_k$ is the standard deviation of $y_{i,k}$, $i = 1, 2, \ldots, N$.

Note that, although data transformation must be

considered when outcomes are measured in different

directions or in different scales numerically, to be

realistic, we would prefer to present descriptive statistics

such as the mean or the standard deviation of each test

score based on the raw data for data illustration.
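
The two preparation steps of Section 2.4.1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the direction is aligned by subtracting from a maximum (one of the two options named in the text), and the sample standard deviation (N - 1 denominator) is assumed for $s_k$ because the text does not say which form was used.

```python
from statistics import mean, stdev


def align_direction(values, higher_is_worse, maximum=None):
    """Return scores oriented so that a higher value means a larger deficit.

    Scores that already run in that direction are returned unchanged;
    otherwise each value is subtracted from `maximum` (or from the largest
    observed value when no maximum is supplied).
    """
    if higher_is_worse:
        return list(values)
    top = max(values) if maximum is None else maximum
    return [top - v for v in values]


def standardize(values):
    """Express one outcome in standard deviation units, as in Eq. (3)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical Rotarod percentages (higher = better) for three animals.
rotarod = [40.0, 70.0, 100.0]
deficit = align_direction(rotarod, higher_is_worse=False, maximum=100.0)
print(deficit)               # oriented like the mNSS: higher = worse
print(standardize(deficit))  # unit-free scores ready for the global test
```

Standardizing after aligning the direction keeps every outcome on the same unit-free scale, so no single test dominates the common effect estimate.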

2.4.2. Statistical software used for the global test

The global test can be implemented using SAS, the commercial software for statistical analyses (SAS Institution, 1999). PROC GLM or PROC MIXED in SAS can be used if data are normal, and PROC GENMOD in SAS or the GEE SAS macro 2.03 (which may be obtained from U. Gromping, Fachbereich Statistik, Universitat), written using SAS MODULAR IML, if data are otherwise. A significant difference between the treated group and controls is detected if the P-value for a global test is less than 0.05. (SAS code for testing the global statistic can be obtained from the first author.)

2.4.3. Subgroup analysis of MSC effect on single outcome in controlling type I error

In addition to the global test for the common treatment effect for functional recovery measured from multiple test scores, researchers are also interested in the MSC effect on a single outcome or the common MSC effect among the multiple outcomes between pair-wise group comparisons, if more than two doses are involved. When the treatment difference is detected, the critical value of 0.05 is based on the global test. The sub-group analysis, such as the pair-wise dose comparison, can be further tested using the CONTRAST statement in SAS controlled by the same critical value (0.05). Any significant dose difference on functional recovery in the pair-wise comparison will then be tested for dose effect on the individual outcome at the same critical value (0.05). If the global test is not significant at the 0.05 level, the formative analysis should be stopped without further testing for pair-wise dose group comparisons or dose responses on the individual outcome, with a conclusion of no treatment effect. Otherwise, analysis of the individual outcome will be considered as exploratory analysis, which has to be confirmed in further study.
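
The stepwise rule in this section can be written as a small gatekeeping function. The sketch below is only an illustration of the decision logic. The function name and the dictionary layout are assumptions, and the P-values in the usage example are the MCAo values reported in Section 3.1.

```python
def gatekeep(global_p, pairwise_p, alpha=0.05):
    """Return the pair-wise comparisons that may be declared significant.

    Pair-wise comparisons are examined only when the overall global test is
    significant; otherwise the analysis stops with no treatment effect.
    """
    if global_p >= alpha:
        return []
    return [name for name, p in pairwise_p.items() if p < alpha]


# Global and pair-wise P-values reported for the MCAo study (Section 3.1).
mcao_pairs = {
    "3e6 vs control": 0.001,
    "1e6 vs control": 0.09,
    "3e6 vs 1e6": 0.17,
}
print(gatekeep(0.004, mcao_pairs))
```

Only the comparisons returned by the function would move on to testing of the individual outcomes, and those outcome-level results remain exploratory.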

2.5. Sample size/power calculation for global test

To design an experiment for testing a treatment effect

on multiple test scores, sample size (e.g. the number of subjects needed per group) will be considered to ensure

sufficient power (e.g. 80% of power) to detect the

treatment effect. Two approaches are discussed. One

approach is to use the single outcome to estimate the

sample size, assuming that test scores are correlated in

the same direction, given that the global test is more

efficient than a single outcome.

We will have

\[
n = \frac{2 (Z_{\alpha/2} + Z_{\beta})^2}{\delta^2} \tag{4}
\]

where $Z_p$ is the value cutting off the proportion $p$ in the upper tail of the standard normal distribution, $\alpha$ is type I error (e.g. 0.05), $\beta$ is type II error (e.g. $1 - \beta = 0.8$ for the power) and $\delta$ is the effect size, defined as the difference in treatment means expressed in units of standard deviation, $\delta = |\mu_1 - \mu_0| / \sigma$. Eq. (4) also provides a formula to calculate power for a fixed sample size. We can also see from Eq. (4) that with fixed type I and type II errors, the sample size determination is based on the effect size, $\delta$. The smaller the effect size to be detected, the more subjects are needed per group.
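
A minimal Python sketch of Eq. (4) follows, using only the standard library. The function names are illustrative, and the power function is obtained by solving Eq. (4) for $Z_{\beta}$ at a fixed n, which relies on the same normal approximation as the equation itself.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(delta, alpha=0.05, power=0.80):
    """Subjects per group from Eq. (4) for a two-sided test at level alpha."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 cut-off
    z_beta = z.inv_cdf(power)               # upper beta cut-off, beta = 1 - power
    return 2.0 * (z_alpha + z_beta) ** 2 / delta ** 2


def power_for_n(n, delta, alpha=0.05):
    """Power for a fixed n per group, obtained by inverting Eq. (4)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    return z.cdf(delta * sqrt(n / 2.0) - z_alpha)


# Halving the effect size roughly quadruples the required group size.
for delta in (1.0, 0.5):
    print(delta, ceil(n_per_group(delta)))
```

Rounding up with `ceil` gives the smallest whole number of animals per group that reaches the target power under these assumptions.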

The second approach is to consider the correlation ($\rho$) among outcomes, proposed by Diggle et al. (1994), with the expression

\[
n = \frac{2 (Z_{\alpha/2} + Z_{\beta})^2 \, (1 + (K - 1)\rho)}{K \delta^2} \tag{5}
\]

where $K$ is the number of outcomes. If the outcomes are perfectly correlated (e.g. $\rho = 1$), Eq. (5) becomes Eq. (4). With a fixed sample, the power for multiple outcomes would be equal to the power for a single outcome. However, when the correlation decreases, the sample size required decreases compared to the sample size calculation used in Eq. (4). If the correlation varies, to be conservative, one should use the highest correlation in Eq. (5), or use Eq. (4) for the sample size calculation.

In addition, if more than two treatment groups are involved, the effect size, $\delta$, can be defined from two treatment groups, as given by Cohen (1988):

\[
\delta = \frac{\mu_{\max} - \mu_{\min}}{\sigma} \tag{6}
\]

where $M$ is the number of treatment groups for $M > 2$, $\mu_{\max}$ is the largest of the $M$ means, $\mu_{\min}$ is the smallest of the $M$ means and $\sigma$ is the common standard deviation within the population. Assuming intermediate variability ($M$ means equally spaced over $\delta$), with the estimation of $\delta$ in Eq. (6), we could determine the sample size based on Eq. (4) or Eq. (5).
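
The correlation-adjusted calculation in Eq. (5) and the multi-group effect size in Eq. (6) can be sketched the same way. The function names and the numbers in the usage lines are hypothetical. The sketch makes the efficiency claim easy to check: with K = 3 outcomes and a correlation of 0.61, the multiplier (1 + (K - 1)ρ)/K is 2.22/3 = 0.74, so about three quarters of the single-outcome sample size is needed, and the multiplier drops to 1/3 when the outcomes are uncorrelated.

```python
from statistics import NormalDist


def n_per_group_multi(delta, k, rho, alpha=0.05, power=0.80):
    """Subjects per group from Eq. (5) for k outcomes with common correlation rho."""
    z = NormalDist()
    z_sum = z.inv_cdf(1.0 - alpha / 2.0) + z.inv_cdf(power)
    return 2.0 * z_sum ** 2 * (1.0 + (k - 1) * rho) / (k * delta ** 2)


def effect_size(group_means, sigma):
    """Eq. (6): range of the group means in units of the common standard deviation."""
    return (max(group_means) - min(group_means)) / sigma


# Hypothetical design: three dose groups with means equally spaced over one SD.
delta = effect_size([0.0, 0.5, 1.0], sigma=1.0)
for rho in (1.0, 0.61, 0.0):
    print(rho, round(n_per_group_multi(delta, k=3, rho=rho), 2))
```

Using the highest observed pairwise correlation, as the text recommends, gives the most conservative (largest) group size from Eq. (5).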

3. Experiment: data collection and analysis results

3.1. The MSC dose responses on functional recovery for rats with MCAo

The MCAo rat model was discussed in Section 2.1. The MSC dose response on functional recovery is measured from three test scores: Adhesive-Removal patch test, Rotarod test and mNSS. Nineteen rats were involved in this study with three dose groups: PBS (control, n = 6), 1 × 10^6 MSCs (n = 6) and 3 × 10^6 MSCs (n = 7) at an administration time of 1 day after MCAo. The functional status (Adhesive-Removal patch test, Rotarod test and mNSS) was collected at baseline (immediately before treatment) and at 14 days after stroke, where the Rotarod test had an inverse order (higher score and better recovery) compared to the mNSS or Adhesive-Removal test. The data were transformed to have measurement consistency and scaled outcomes. MCAo severity on functional scores prior to treatment was balanced among the dose groups (P-values > 0.23). The correlation coefficients among the three outcomes were in a range of 0.14 between the Rotarod and Adhesive-Removal tests to 0.61 between the mNSS and Adhesive-Removal test scores. The application of the global test showed significant MSC dose responses on functional recovery at 14 days (P = 0.004). Among them, there was a significant improvement on functional recovery at 14 days for rats treated with 3 × 10^6 MSCs compared to control rats (P = 0.001) with the significant MSC effect on each individual outcome except the Rotarod test score (P = 0.58); a marginal improvement for rats treated with 1 × 10^6 MSCs compared to the control rats (P = 0.09); and no significant difference between MSC dose of 3 × 10^6 and 1 × 10^6 (P = 0.17) (Table 2).

3.2. The MSC dose effect on functional recovery for rats with TBI

The TBI rat model was discussed in Section 2.2. A total of 36 TBI rats (n = 9 per group) were employed in this study: control (PBS), 1 × 10^6, 2 × 10^6 and 4 × 10^6 MSCs were administered at 1 day after TBI. Functional outcomes (mNSS and Corner test) were measured at baseline (before treatment), and at 1 month after treatment. The Corner test score has an inverse order as compared to the mNSS. Therefore, the analysis was conducted based on the transformed data described in Section 2.4.1. TBI severity on functional scores prior to treatment was balanced between the MSC treated groups and the control group (P > 0.66). The correlation coefficient between the two tests was 0.76. The results of the global test showed significant MSC dose response on functional recovery at 1 month after treatment. Of them, each MSC dose group significantly improved on 1-month functional recovery compared to the controls (P < 0.001 for both 2 × 10^6 and 4 × 10^6 MSC doses), except the low dose of 1 × 10^6 MSCs (P = 0.86). The 2 × 10^6 or 4 × 10^6 MSC dose improved 1-month functional recovery significantly compared to the low dose 1 × 10^6 MSCs (P < 0.01). Rats treated with the 4 × 10^6 MSC dose were significantly improved on 1-month functional recovery compared to rats treated with 2 × 10^6 MSCs, with significant improvement on both mNSS and the Corner test scores, respectively (Table 3).

4. Discussion and conclusions

The goal of this research is to establish a statistical algorithm for testing the MSC dose response or the treatment effect, so that the conclusion of the MSC effect does not rely on a single functional test score, but relies on multiple functional test scores in the animal models for brain injury and stroke. The global test for continuous outcomes can be implemented under careful statistical modeling using the framework of ANCOVA and necessary data transformation.
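As a rough illustration of how several outcomes can feed one test, the sketch below builds a composite score (similar in spirit to the procedures of O'Brien, 1984): each outcome is standardized, its sign is flipped where lower values mean better recovery, and the per-animal sums are compared between two groups with a permutation test. This is a simplified stand-in and not the ANCOVA-based global test described in this paper; it ignores baseline adjustment, and the function names and toy data are hypothetical.

```python
import random
import statistics


def composite_scores(outcomes, higher_is_better):
    """Return one direction-aligned sum of z-scores per animal.

    outcomes: list of per-animal tuples, one value per functional test.
    higher_is_better: one bool per test; False flips the sign so that a
    larger composite always means better recovery.
    """
    columns = list(zip(*outcomes))
    scores = [0.0] * len(outcomes)
    for column, better_high in zip(columns, higher_is_better):
        mean = statistics.mean(column)
        sd = statistics.stdev(column)  # SD over all animals, both groups
        sign = 1.0 if better_high else -1.0
        for i, value in enumerate(column):
            scores[i] += sign * (value - mean) / sd
    return scores


def permutation_p_value(scores, is_treated, n_perm=5000, seed=0):
    """Two-sided permutation p-value for the group difference in mean composite."""
    rng = random.Random(seed)

    def mean_difference(labels):
        treated = [s for s, t in zip(scores, labels) if t]
        control = [s for s, t in zip(scores, labels) if not t]
        return statistics.mean(treated) - statistics.mean(control)

    observed = abs(mean_difference(is_treated))
    labels = list(is_treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if abs(mean_difference(labels)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data (NOT from this study): (deficit score, performance %).
control = [(7, 60), (8, 55), (7, 65), (6, 70), (8, 50), (7, 62)]
treated = [(4, 80), (5, 75), (3, 85), (4, 78), (5, 72), (4, 82)]
scores = composite_scores(control + treated, higher_is_better=[False, True])
is_treated = [False] * len(control) + [True] * len(treated)
p_value = permutation_p_value(scores, is_treated)
```

The sign flip plays the role of the data transformation: without it, an improvement on one test and an improvement on the other would cancel in the sum.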

Table 2
MSC dose response on 14 days functional recovery

Treatment                 Adhesive-Removal (s)   Rotarod (%)     mNSS (score)
                          mean ± S.D.            mean ± S.D.     mean ± S.D.
Pre-treatment (baseline)
  Control (n = 6)         116.0 ± 8.0            43.3 ± 21.3     8.3 ± 1.5
  1 × 10^6 MSCs (n = 6)   111.7 ± 13.3           35.9 ± 21.7     8.7 ± 2.3
  3 × 10^6 MSCs (n = 7)   106.9 ± 23.8           49.1 ± 22.4     8.6 ± 2.8
Post-treatment (14 d)
  Control                 86.3 ± 28.0            68.8 ± 16.1     7.3 ± 0.8
  1 × 10^6 MSCs (a)       52.0 ± 22.2            65.6 ± 21.5     5.2 ± 3.1
  3 × 10^6 MSCs (a, b)    33.6 ± 15.8            73.5 ± 18.3     4.6 ± 1.8

(a) Significant improvement on functional recovery compared to controls based on the global test.
(b) P = 0.06, marginal differences on functional recovery between MSC dose 1 × 10^6 and 3 × 10^6.

Table 3
Dose response for TBI

Treatment (n = 9)         Corner test       mNSS (score)
                          mean ± S.D.       mean ± S.D.
Pre-treatment (baseline)
  Control                 6.7 ± 7.0         10.3 ± 1.5
  1 × 10^6 MSCs           8.9 ± 6.0         10.2 ± 1.6
  2 × 10^6 MSCs           7.8 ± 6.7         10.3 ± 2.1
  4 × 10^6 MSCs           10.0 ± 7.1        10.0 ± 2.3
Post-treatment (1 month)
  Control                 14.4 ± 7.3        5.0 ± 0.7
  1 × 10^6 MSCs           15.6 ± 5.3        4.9 ± 1.1
  2 × 10^6 MSCs (a)       30.0 ± 7.1        2.4 ± 0.7
  4 × 10^6 MSCs (a, b)    36.7 ± 5.0        1.8 ± 0.4

(a) Significant improvement on functional recovery compared to controls using the global test.
(b) Significant improvement on functional recovery compared to the 2 × 10^6 MSC dose group using the global test.

Multiple tests are often considered in neuroscience research, because no single test measures all dimensions of brain recovery, and there is no ‘gold standard’ test. We would be criticized for increasing type I error by presenting the treatment effect on each individual outcome without adjusting for multiple outcomes. The global test considers multiple outcomes and tests the common treatment effect that is relevant to the hypothesis of an MSC effect on functional recovery.
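The size of the type I error inflation can be checked with a short calculation. The sketch below assumes the individual tests are independent, which is an illustrative simplification (the outcomes here are correlated); the function names are hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_bound(alpha, n_tests):
    """Upper bound on that chance; holds for any correlation among tests."""
    return min(1.0, alpha * n_tests)


# Three outcomes, each tested at the 0.05 level without adjustment.
three_outcomes_independent = familywise_error(0.05, 3)  # about 0.143
three_outcomes_bound = bonferroni_bound(0.05, 3)        # about 0.15
```

With three outcomes the error rate lies between roughly 14% under independence and the 15% upper bound, compared with the nominal 5% of a single test.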

The global test on multiple outcomes is more efficient than a test using a single outcome when treatment effects are consistent across outcomes. The lower the correlation among outcomes, the greater the power and, therefore, the higher the efficiency of the global test (Tilley et al., 1996). If outcomes are perfectly correlated, the global test on multiple outcomes becomes a test for a single outcome. On the other hand, the global test may be less efficient than a test based on a single outcome if treatment effects are inconsistent among multiple outcomes. If the MSC effect were positive on one outcome and negative on another, the treatment effects would be diminished as a result. The global test will be less likely to detect a significant result, which is exactly what we expect and could be an advantage of the global test in analyzing the MSC treatment effect. We cannot define the effectiveness of MSC therapy based on one outcome, especially when the other outcomes are in the wrong direction.
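The link between correlation and power can be made concrete with a simplified calculation. The sketch below assumes K standardized outcomes Z_1, ..., Z_K with unit variance, a common pairwise correlation ρ, and a common treatment shift δ in the mean of each outcome. These symbols and the equicorrelated model are illustrative assumptions, not the derivation behind the global test used in this paper.

```latex
\bar{Z} = \frac{1}{K}\sum_{k=1}^{K} Z_k, \qquad
\mathrm{Var}(\bar{Z}) = \frac{1}{K^{2}}\left[K + K(K-1)\rho\right]
                      = \frac{1 + (K-1)\rho}{K}

\frac{\delta}{\sqrt{\mathrm{Var}(\bar{Z})}}
  = \delta\,\sqrt{\frac{K}{1 + (K-1)\rho}}
```

Relative to a single outcome, the standardized effect is multiplied by the factor sqrt(K / (1 + (K - 1)ρ)). At ρ = 1 the factor is 1, so the composite carries no more information than one outcome; at ρ = 0 it reaches sqrt(K). For K = 2 and the correlation of 0.76 reported in Section 3.2, the factor is sqrt(2 / 1.76), about 1.07.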

Note that consistency of the outcome measurements and consistency of the treatment effect across outcomes are different concepts. The former is a numerical issue and the latter is a pre-clinical or clinical issue. To have a valid statistical test, data transformation must be considered if outcome measurements are inconsistent or are not on a common scale. As an example, for the TBI rat model, using the re-scaled Corner test and mNSS scores without correcting the inconsistent measurement directions, we obtained an overall MSC dose effect on 1-month functional recovery with P = 0.59 and P > 0.32 for any pair-wise dose group comparison, and these results totally conflict with the results presented in Section 3.2. On the other hand, if the treatment effect is inconsistent across outcomes (e.g. a treatment improves functional recovery on one outcome but increases the chance of an adverse event on another), we would most likely not obtain significant results from the global test, as discussed previously.

The global test allows researchers to study the treatment effect on a single outcome using the step-down procedure at the critical value of 0.05 if the global test is significant. Otherwise, the individual tests are considered informative only, with no conclusion drawn.
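The step-down rule can be written as a small gate on the global test. The sketch below is a minimal illustration with a hypothetical function name. The global P = 0.001 and the Rotarod P = 0.58 are the values reported in Section 3.1, while the other two single-outcome P values are placeholders chosen only for the example.

```python
def step_down(global_p, outcome_p_values, alpha=0.05):
    """Interpret single-outcome tests only when the global test is significant."""
    if global_p >= alpha:
        # Global test not significant: individual tests carry no conclusion.
        return {name: "informative only" for name in outcome_p_values}
    return {
        name: "significant" if p < alpha else "not significant"
        for name, p in outcome_p_values.items()
    }


# Rotarod P is from Section 3.1; the other two values are placeholders.
single_outcome_p = {"adhesive_removal": 0.01, "rotarod": 0.58, "mnss": 0.02}

after_significant_global = step_down(0.001, single_outcome_p)
after_null_global = step_down(0.20, single_outcome_p)  # hypothetical global P
```

In the second call every outcome is labelled "informative only", even those with a small individual P value, because the gate was not passed.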

In summary, we have implemented the global test on the MSC treatment of stroke and TBI. The global test appears to be a useful statistical tool for testing the common dose effect when multiple outcomes are involved and correlated. The global test for continuous outcomes can be implemented under careful statistical modeling and proper data transformation.

Acknowledgements

The authors thank Lula Adams for editing. This work

was supported by NINDS grants PO1 NS23393, RO1

NS33627, RO1 NS38292 and RO1 HL64766.

References

Chen H, Chopp M, Zhang ZG, Garcia JH. The effect of hypothermia on transient middle cerebral artery occlusion in the rat. J Cereb Blood Flow Metab 1992;12:621–8.
Chen J, Li Y, Wang L, Zhang Z, Lu D, Lu M, et al. Therapeutic benefit of intravenous administration of bone marrow stromal cells after cerebral ischemia in rats. Stroke 2001;32:1005–11.
Cohen J. Statistical power analysis for the behavioral sciences, 2nd ed. New Jersey: Lawrence Erlbaum Associates, Inc, 1988.
Cohen J, Cohen P. Applied multiple regression. Hillsdale, NJ: Lawrence Erlbaum Associates, 1983.
Diggle PJ, Liang K, Zeger SL. Analysis of longitudinal data. Clarendon Press: Oxford, 1994.
Dixon CE, Clifton GL, Lighthall JW, Yaghmai AA, Hayes RL. A controlled cortical impact model of traumatic brain injury in the rat. J Neurosci Methods 1991;39:253–62.
Lehmann E. Testing statistical hypotheses, 1st ed. New York: Wiley, 1986.
Lennmyr F, Ata KA, Funa K, Olsson Y, Terent A. Expression of vascular endothelial growth factor (VEGF) and its receptors (Flt-1 and Flk-1) following permanent and transient occlusion of the middle cerebral artery in the rat. J Neuropathol Exp Neurol 1998;57:874–82.
Liang K, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika 1986;72(1):13–22.
Lipsitz SR, Laird NM, Harrington DP. Generalized estimation equations for correlated binary data: using the odds as a measure of association. Biometrika 1991;78:153–60.
Lu M, Tilley BC. Use of odds ratio or relative risk to measure a treatment effect in clinical trials with multiple correlated binary outcomes: data from the NINDS t-PA stroke trial. Stat Med 2001;20:1891–901.
Lu D, Mahmood A, Wang L, Li Y, Lu M, Chopp M. Adult bone marrow stromal cells administered intravenously to rats after traumatic brain injury migrate into brain and improve neurological outcome. Neuroreport 2001;12:559–63.
Mahmood A, Lu D, Wang L, Li Y, Lu M, Chopp M. Treatment of traumatic brain injury in female rats with intravenous administration of bone marrow stromal cells. Neurosurgery 2001a;49:1196–203.
Mahmood A, Lu D, Yi L, Chen JL, Chopp M. Intracranial bone marrow transplantation after traumatic brain injury improving functional outcome in adult rats. J Neurosurg 2001b;94:589–95.
National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group. Tissue plasminogen activator for acute ischemic stroke. N Eng J Med 1995;333:1581–7.
O’Brien PC. Procedures for comparing samples with multiple endpoints. Biometrics 1984;40(December):1079–87.
Park T. A comparison of the generalized estimating equation approach with the maximum likelihood approach for repeated measurements. Stat Med 1993;12:1723–32.
Pocock SJ, Geller NL, Tsiatis AA. The analysis of multiple endpoints in clinical trials. Biometrics 1987;43(September):487–98.
SAS Institution. SAS/STAT Software, 8th ed. Cary (NC): SAS Institution; 1999.


Schallert T, Fleming SM, Leasure JL, Tillerson JL, Bland ST. CNS plasticity and assessment of forelimb sensorimotor outcome in unilateral rat models of stroke, cortical ablation, parkinsonism and spinal cord injury. Neuropharmacology 2000;39:777–87.
Tilley BC, Marler J, Geller NL, Lu M, Legler J, Brott T, et al. Use of a global test for multiple outcomes in Stroke Trials with application to the National Institute of Neurological Disorders and Stroke t-PA Stroke Trial. Stroke 1996;27(11):2136–42.
Wedderburn RWM. Quasi-likelihood functions, generalized linear models, and the Gauss–Newton method. Biometrika 1974;61:439–47.
Zeger SL, Liang K. Longitudinal data analysis for discrete and continuous outcomes. Biometrics 1986;42:121–30.
Zhang L, Schallert T, Zhang ZG, Jiang Q, Arniego P, Li Q, et al. A test for detecting long-term sensorimotor dysfunction in the mouse after focal cerebral ischemia. J Neurosci Methods 2002;117:207–14.

M. Lu et al. / Journal of Neuroscience Methods 128 (2003) 183–190