Global test statistics for treatment effect of stroke and traumatic brain injury in rats with administration of bone marrow stromal cells
Mei Lu a,*, Jieli Chen b, Dunyue Lu c, Li Yi b, Asim Mahmood c, Michael Chopp d
a Department of Biostatistics and Research Epidemiology, Henry Ford Health Sciences Center, One Ford Place, 3E, Detroit, MI 48202, USA
b Department of Neurology, Henry Ford Health System, Detroit, MI 48202, USA
c Department of Neurosurgery, Henry Ford Health System, Detroit, MI 48202, USA
d Oakland University, Department of Physics, Rochester, MI 48309, USA
Received 20 February 2003; received in revised form 12 June 2003; accepted 12 June 2003
Journal of Neuroscience Methods 128 (2003) 183–190
www.elsevier.com/locate/jneumeth
Abstract
Because no single test measures disability in rats with middle cerebral artery occlusion/traumatic brain injury (MCAo/TBI),
multiple tests are needed to assess the effect of bone marrow stromal cell (MSC) on functional recovery. Testing the treatment effect
on each outcome at the 0.05 level without adjusting for multiple outcomes can increase type I error. Therefore, we applied the global
test to evaluate a common MSC dose effect on multiple outcomes in two applications: (i) MCAo rats with the MSC dose of zero
(PBS), 1 × 10⁶ and 3 × 10⁶, and (ii) TBI rats with the MSC dose of zero, 1 × 10⁶, 2 × 10⁶ and 4 × 10⁶, administered intravenously at
1 day after injury. For the MCAo rats, 3 × 10⁶ MSCs improved the 14-day functional recovery (P < 0.05) compared to the controls.
TBI rats with the MSC dose 4 × 10⁶ were improved significantly at 1 month compared to controls and to rats with 1 × 10⁶ or 2 × 10⁶
MSCs (P < 0.05). The global test on multiple outcomes is more efficient than a single outcome when treatment effects are consistent.
The less correlation among the outcomes, the more power and, therefore, the higher efficiency of the global test. We demonstrated
that the global test for continuous outcomes could be implemented under careful statistical modeling and proper data
transformation.
© 2003 Elsevier B.V. All rights reserved.
Keywords: Stroke; Traumatic brain injury; Rat; Data analysis; Global test
1. Introduction
Stroke is the number three cause of death and the
leading cause of serious long term disability, and
treatment of stroke is restricted to thrombolysis (rt-
PA) within a 3-h window after symptom onset. Trau-
matic brain injury (TBI) is an important cause of human
morbidity, and as many as 50 000 Americans are killed
and an equal number disabled by head trauma each
year. Currently, treatment of TBI consists of evacuating
mass lesion and providing an optimal milieu for brain
recovery, which cannot repair the bio-structure neuronal
damage. Marrow stromal cells (MSCs) administered at
24 h after brain injury have shown therapeutic benefit
on improvement of functional recovery after brain
injury from stroke and TBI in animals (Chen et al., 2001;
Lu et al., 2001; Mahmood et al., 2001a). The functional
response was assessed using modified neurological
severity score (mNSS), Adhesive-Removal patch test,
Rotarod motor test and Corner test scores.
No single test (e.g. Barthel index, Modified Rankin
scale, Glasgow outcome scale, or NIH stroke scale for
stroke patients) describes all dimensions of brain deficit
and recovery. In animal research, the researcher may
focus on a single outcome, even though none is
considered as the ‘gold standard.’ At the other extreme,
we may report an array of tests without adjusting for the
number of outcomes. Consequently, no integrated
conclusion can be drawn. For example, a treatment
comparison of each outcome at the critical value of 0.05
without adjusting for multiple outcomes can increase
type I error (the probability of erroneously rejecting the
null hypothesis) from 5 to 15%, for three outcome
measurements. The Bonferroni approach, which divides
type I error as 0.05/3 ≈ 0.016 for each outcome of three
and detects a treatment benefit if there is significant
treatment benefit on every single outcome at the critical
value of 0.016, is too conservative (O’Brien, 1984;
Pocock et al., 1987).
* Corresponding author. Tel.: +1-313-874-6413; fax: +1-313-874-6730.
E-mail address: [email protected] (M. Lu).
0165-0270/03/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/S0165-0270(03)00188-2
Several global test statistics have
been developed to test treatment efficacy for multiple
binary outcomes (Lipsitz et al., 1991; Lu and Tilley,
2001) with important applications for clinical trial
studies (National Institute of Neurological Disorders
and Stroke rt-PA Stroke Study Group, 1995; Lehmann
et al., 1986). In 1995, the NINDS t-PA Stroke Study
Group reported the effectiveness of t-PA on 3-month
recovery (National Institute of Neurological Disorders
and Stroke rt-PA Stroke Study Group, 1995). The
primary outcome (a favorable outcome) was defined
from four neurological scores at 3 months, each
dichotomized as success or failure. Success was defined
on the Barthel index (an ordinal scale in increments of 5)
as a score of 95 or 100, on the modified Rankin scale as
0 or 1, on the Glasgow scale as 1, and on the NIH stroke
scale as 0 or 1. A global test was used to test for a
common treatment effect on the four binary outcomes,
followed by testing the treatment benefit on each
individual score at the 0.05 level, if the global test was
significant at the 0.05 level. Among these four binary
outcomes, the proportion agreements are in a range of
0.77–0.94 and φ coefficients (Cohen and Cohen, 1983)
are in a range of 0.55–0.78 (Tilley et al., 1996). There
was significant t-PA benefit on 3-month stroke
recovery, based on the global test as well as
significant t-PA benefit on each outcome. For clinical
studies, many publications that include the global
test to handle multiple endpoints in stroke research are
limited to the binary outcomes (e.g. the evidence
of presence or absence). However, in pre-clinical in-
vestigations, the global test has not been employed and
tests/outcomes tend to be measured as continuous
variables.
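To make the multiplicity argument above concrete, the following minimal Python sketch computes the familywise type I error when K outcomes are each tested at level α, together with the Bonferroni per-outcome level. It assumes, for illustration only, that the K tests are independent; the function names are hypothetical and not part of the original analysis.

```python
def familywise_error(alpha: float, k: int) -> float:
    """Probability of at least one false rejection among k independent tests."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_level(alpha: float, k: int) -> float:
    """Per-outcome significance level under the Bonferroni correction."""
    return alpha / k


if __name__ == "__main__":
    # Tabulate the inflation for one to four outcomes tested at the 0.05 level.
    for k in (1, 2, 3, 4):
        print(k, round(familywise_error(0.05, k), 4), round(bonferroni_level(0.05, k), 4))
```

Under independence, three outcomes give a familywise error of about 14%; the simple upper bound K × α gives the 15% quoted in the text.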
It seems easy to generalize the global test statistics for
multiple continuous outcomes using the analysis ap-
proach for longitudinal data (Liang and Zeger, 1986;
Zeger and Liang, 1986). However, in analyzing data
collected to assess the MSC dose effect on neuro-
functional recovery in rats with brain injury, we
encountered outcomes measured in various scales that
would lead to an invalid test statistic, if no adjustments
were made. In this paper, we extend analysis of
covariance (ANCOVA) to calculate the global test,
which measures a common dose effect among multiple
test scores after re-scaling outcomes. We then test the
MSC dose effect on neuro-functional recovery in rats
with MCAo (stroke) and with TBI using proper global
test statistics.
2. Materials and methods
2.1. Animal 2 h MCAO model and therapeutic MSCs
Young male Wistar rats were anesthetized with halothane. The right femoral artery and vein were
cannulated for measuring blood gases (pH, pO2,
pCO2) and blood pressure as basic physiological variables. MCAo was induced by advancing a surgical nylon
suture (4-0 for rats) with an expanded (heated) tip from
the external carotid artery into the lumen of the internal
carotid artery to block the origin of the MCA, and
reperfusion was performed by withdrawal of the suture (Chen et al., 1992). Donor MSCs were obtained from
age-matched adult male Wistar rats. Physiological
monitoring (e.g. blood gases, blood pressure, and
body temperature) was performed on all animals.
No immunosuppression was used. We administered
MSCs IV with a dosage of zero (PBS), 1 × 10⁶/ml, or
3 × 10⁶/ml to rats subjected to 2 h of MCAo (Chen et
al., 2001) at 1 day after MCAo.
2.2. Animal TBI model in rats
Young male Wistar rats (weighing 200–300 g) were
anesthetized with chloral hydrate (35 mg/100 g) intraperitoneally, and then subjected to controlled cortical
impact (Dixon et al., 1991; Mahmood et al., 2001b).
MSCs were obtained from age-matched male Wistar donor rats and injected via the tail vein. No immunosuppression was used. The treatment MSCs were
administered at 24 h after TBI with various doses:
zero (PBS), 1 × 10⁶/ml, 2 × 10⁶/ml or 4 × 10⁶/ml of
MSCs.
2.3. Outcomes of interest
Several tests, such as the mNSS, Adhesive-Removal
test, Motor Rotarod test and the Corner test scores, are
used to evaluate neurological and behavior status of rats
with brain injury. For different injury models, different
tests might be performed. The assessments of neurolo-
gical and behavior recovery used in the two previous
injury models are described in the following:
(1) Motor Rotarod test: an accelerating Rotarod was used to measure the motor function (Lennmyr et al.,
1998). Rats were trained 3 days before injury. A rat was
placed on the Rotarod cylinder and the time remaining
on the Rotarod was recorded. The speed would be
slowly increased from 4 to 40 rpm within 5 min. An
experiment was ended if the rat fell off the rungs or
gripped the device and spun around for two consecutive
revolutions without attempting to walk on the rungs. The mean duration (in seconds) on the device was
recorded in three replicates 1 day before the brain injury
(pre-baseline). Motor test score is calculated as a
percentage of the mean duration in three replications
compared to the pre-baseline mean duration.
(2) Adhesive-Removal Test (Schallert et al., 2000)
measures somatosensory deficit both before and after surgery. All rats were familiarized with the testing
environment. In the initial test, two small pieces of
adhesive-backed paper dots (of equal size, 113.1 mm²)
were used as bilateral tactile stimuli occupying the
distal-radial region on the wrist of each forelimb. The
rat was then returned to its cage. The time to remove
each stimulus from forelimbs was recorded on five trials
per day. Individual trials were separated by at least 5 min. The animals were trained for 3 days prior to
surgery. Once the rats were able to remove the dots
within 10 s, they were subjected to the surgery (MCAo
or TBI).
(3) The mNSS test (Chen et al., 2001) is graded on a
scale of 0–18 (normal score 0; maximal deficit score 18).
The mNSS is a composite of motor, sensory, reflex and
balance tests. In the severity scores of injury, one score point is awarded for the inability to perform the test or
for the lack of a tested reflex, described in Table 1.
(4) The Corner test measures sensory and motor
functions and is more sensitive for long-term sensory
and motor deficits (Zhang et al., 2002). A rat was placed
between two boards, each with dimensions of 30 × 20 × 1 cm³, in the home cage. The edges of the two boards
were attached at a 30° angle with a small opening along the joint between the two boards to encourage entry into
the corner. The rat was placed between the two angled
boards facing the corner and half-way to the corner.
When entering deep into the corner, both sides of the
vibrissae were stimulated together. The rat then reared
forward and upward, and then turned back to face the
open end. A non-injured rat either turned left or right,
but the injured rats preferentially turned toward the non-impaired, ipsilateral (right for MCAo and left for
TBI) side. The turns in one versus the other direction
were recorded from ten trials for each test, and the
fraction of the turns was used as the Corner test score.
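The two derived scores described in the list above (the Rotarod motor score as a percentage of the pre-baseline mean, and the Corner test score as a fraction of turns over ten trials) can be sketched in Python as follows. This is an illustrative sketch only; the function names are hypothetical, and the choice of which turning direction is counted is an assumption left to the caller.

```python
from statistics import mean


def rotarod_score(post_durations, pre_durations):
    """Motor score: mean post-injury duration as a percentage of the pre-baseline mean."""
    return 100.0 * mean(post_durations) / mean(pre_durations)


def corner_score(turn_count, n_trials=10):
    """Corner test score: fraction of trials with a turn in the recorded direction."""
    if not 0 <= turn_count <= n_trials:
        raise ValueError("turn_count must lie between 0 and n_trials")
    return turn_count / n_trials
```

For example, three post-injury replicates averaging half of the pre-baseline mean duration give a motor score of 50%.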
For the MCAo model, we used the mNSS, Adhesive-
Removal and Rotarod tests, and for the TBI model, we
used the mNSS and Corner tests for neurological and
behavior status. MCAo/TBI-induced neurological deficits include somatosensory, motor, balance and other
deficits, which cannot be evaluated by one test. Therefore, we selected the Adhesive-Removal test to determine the somatosensory deficit, the Rotarod test for
motor deficit detection, the Corner test for sensory and
motor deficit of the TBI model and the mNSS, which measures
sensory and motor function, balance and reflexes and
is similar to clinical assessment. Functional deficit/recovery, defined as the set of functional tests for each brain injury model
(MCAo or TBI), is measured at baseline prior to the
treatment and at 14 days after the brain injury. We are
interested in testing the null hypothesis: there is no MSC
treatment effect on functional recovery in rats with
brain injury (stroke, TBI).
Table 1
Modified neurological severity score

Test                                                                  Points
Motor tests
  Raising the rat by the tail                                              3
    1  Flexion of forelimb
    1  Flexion of hind limb
    1  Head moved more than 10° to the vertical axis within 30 s
  Walking on the floor (normal = 0; maximum = 3)                           3
    0  Normal walk
    1  Inability to walk straight
    2  Circling toward the paretic side
    3  Fall down to the paretic side
Sensory tests                                                              2
    1  Placing test (visual and tactile test)
    2  Proprioceptive test (deep sensation, pushing the paw against
       the table edge to stimulate limb muscles)
Beam balance tests (normal = 0; maximum = 6)                               6
    0  Balances with steady posture
    1  Grasps side of beam
    2  Hugs the beam and one limb falls down from the beam
    3  Hugs the beam and two limbs fall down from the beam, or spins
       on the beam (>60 s)
    4  Attempts to balance on the beam but falls off (>40 s)
    5  Attempts to balance on the beam but falls off (>20 s)
    6  Falls off: no attempt to balance or hang on to the beam (<20 s)
Reflexes absence and abnormal movements                                    4
    1  Pinna reflex (a head shake when touching the auditory meatus)
    1  Corneal reflex (an eye blink when lightly touching the cornea
       with cotton)
    1  Startle reflex (a motor response to a brief noise from snapping
       a clipboard paper)
    1  Seizures, myoclonus, myodystony
Maximum points                                                            18

One point is awarded for the inability to perform the tasks or for the
lack of a tested reflex; 13–18 severe injury; 7–12 moderate injury; 1–6
mild injury.

2.4. Statistical background of the global test statistic
We consider observations $(Y_i, X_i)$, for $i = 1, 2, \ldots, N$
(the number of subjects), where vector $Y_i' = (y_{i,1}, y_{i,2}, \ldots, y_{i,K})$ represents K outcomes for some integer K
(e.g. mNSS, Adhesive-Removal test and Rotarod test
scores are the outcomes for the MCAo model, and $K = 3$) and $X_i' = (x_{i,1}, x_{i,2}, \ldots, x_{i,q})$ represents q covariates
for some integer q, and $x_{i,q}$ is the treatment indicator, for convenience, with the value of 0 as the MSC zero dose, 1
as MSC dose 1, and 2 as MSC dose 2. Suppose $Y_i$ has a
distribution of $f(\cdot, U_i, \phi)$, in which $E Y_i = U_i$ is a $K \times 1$
vector and $\phi$ is an unknown scale parameter. The
statistical model can be expressed as
$$
\begin{bmatrix} g(u_{i,1}) \\ g(u_{i,2}) \\ \vdots \\ g(u_{i,K}) \end{bmatrix}_{K \times 1}
=
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & x_{i,1} & \cdots & x_{i,q} \\
0 & 1 & 0 & \cdots & 0 & x_{i,1} & \cdots & x_{i,q} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1 & x_{i,1} & \cdots & x_{i,q}
\end{bmatrix}_{K \times (K+q)}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_K \\ \beta_1 \\ \vdots \\ \beta_q \end{bmatrix}_{(K+q) \times 1}
\tag{1}
$$
where $g(\cdot)$ is a link function, $\alpha_k$ is the intercept for the kth outcome, $\beta_1, \beta_2, \ldots, \beta_{q-1}$ is the set of nuisance
parameters (e.g. pre-treatment functional test scores),
and $\beta_q$ is the parameter of interest (the coefficient for
the common treatment effect). Assuming the $K \times K$
covariance matrix, $\mathrm{var}(Y_i) = \phi V(U_i)$, for some known
$K \times K$ matrix $V$ (the working covariance matrix), the
quasi-likelihood estimator vector $\beta$, $\beta = (\alpha_1, \alpha_2, \ldots, \alpha_K, \beta_1, \ldots, \beta_q)'$, is the solution of the score-like equation system
$$
\sum_{i=1}^{N} \left( \frac{dU_i}{d\beta} \right)' V^{-1}(U_i) \, (Y_i - U_i) = 0 \tag{2}
$$
Wedderburn (1974) first proposed the quasi-likelihood theory. Liang and Zeger (1986) developed generalized estimating equations (GEE) based on the score-like
equations in Eq. (2) for multiple or clustered continuous
outcomes and for discrete longitudinal data (Zeger and
Liang, 1986). In Eq. (2), $\beta$ is estimated using a quasi-likelihood function, not a proper likelihood function. Therefore, we could assume the link function, $g(\cdot)$, and
variance structure, matrix $V$, without attempting to
specify the entire distribution of $Y_i$, for $i = 1, 2, \ldots, N$.
When matrix $V$ is the true variance matrix of $Y_i$, Eq. (2)
becomes the score equation system yielding a maximum
likelihood estimate of $\beta$, which can be calculated using
the ANCOVA approach (Park, 1993).
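As an illustration of the structure of Eq. (1), the sketch below builds the K × (K + q) design block for one subject (an identity part for the K outcome-specific intercepts, followed by the subject's q covariates repeated on every row) and evaluates the linear predictor for a given coefficient vector. It is a minimal pure-Python sketch with hypothetical function names; it does not solve Eq. (2), which in practice would be done with GEE or ANCOVA software.

```python
def design_block(x_i, n_outcomes):
    """Return the K x (K+q) block of Eq. (1) for one subject: [I_K | 1_K x_i']."""
    rows = []
    for k in range(n_outcomes):
        # Indicator columns select the intercept alpha_k of the k-th outcome.
        identity_row = [1.0 if j == k else 0.0 for j in range(n_outcomes)]
        # The same covariates (treatment indicator last) appear on every row,
        # which is what forces a common treatment coefficient beta_q.
        rows.append(identity_row + [float(v) for v in x_i])
    return rows


def linear_predictor(block, coefficients):
    """Multiply the block by (alpha_1..alpha_K, beta_1..beta_q) to get g(u_i,k)."""
    return [sum(z * c for z, c in zip(row, coefficients)) for row in block]
```

Because every outcome row shares the same covariate columns, a single coefficient on the treatment indicator is estimated across all K outcomes, which is the common-effect assumption behind the global test.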
We can conduct a global test for the treatment effect on correlated outcomes assuming a common treatment
effect on all outcomes (a homogeneous effect). Testing
the null hypothesis, described in the above section, is
equivalent to testing $H_0: \beta_q = 0$ in Eqs. (1) and (2), where
$\beta_q$ is the global test statistic for the MSC effect. The
treatment of MSC is effective, if $\beta_q \neq 0$ at the critical
level of 0.05. In addition, Eq. (1) can be extended to test
the common treatment effect and the variable interaction (e.g. MSC and the time of MSC administration
interaction) by adding the interaction term along with
the two individual main effect terms.
Furthermore, in order to calculate a valid global test,
based on Eqs. (1) and (2), we need to have a closer look
at how the outcome data are collected. Two necessary
steps need to be checked before conducting the global test for the MSC effect.
2.4.1. Data transformation or re-scaling the outcome
variables
2.4.1.1. Outcome measurement consistency. Stroke recovery can be measured in different directions numerically. For example, the mNSS test score, in a range of
0–18, is a composite of motor, sensory, reflex and balance tests, where the higher the score, the more
severe the brain injury and the less the recovery. In contrast,
the Rotarod test measures the time the animals
remained on the Rotarod cylinder; the higher the score,
the less severe the brain injury and the greater the recovery.
Including the inconsistent outcome measures in Eq. (1)
will provide an invalid test that diminishes the treatment
effect. To solve this problem, all of the outcomes of interest should be evaluated for numerical direction of
injury. Data transformation, such as subtracting from
the maximum or calculating the reciprocal of the scale, if
the scale ≠ 0, will be considered.
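The two direction-aligning transformations mentioned above (subtracting from the scale maximum, or taking the reciprocal of a non-zero score) can be sketched in Python as follows. The function names are hypothetical, and which outcomes are flipped is an analysis decision that this sketch does not make.

```python
def reverse_by_maximum(scores, scale_max):
    """Flip a bounded scale so that a high value means the opposite of before."""
    return [scale_max - s for s in scores]


def reverse_by_reciprocal(scores):
    """Flip direction by taking reciprocals; only valid when no score is zero."""
    if any(s == 0 for s in scores):
        raise ValueError("reciprocal transformation requires non-zero scores")
    return [1.0 / s for s in scores]
```

For an mNSS-like score on a 0–18 scale, `reverse_by_maximum([7, 5, 4], 18)` yields values for which a higher number corresponds to better recovery, matching the direction of the Rotarod score.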
2.4.1.2. Outcome re-scaling. From Eqs. (1) and (2), as a
continuous outcome, the scale of outcome plays a role in
testing the treatment effect. The outcome with the larger
scale will dominate the estimation of the treatment
effect, compared to the outcome with a smaller scale,
which is a violation of the common treatment assumption in Eq. (1). To overcome this, data will be expressed
in common units via a transformation described in the following expression:
$$
y^{*}_{i,k} = \frac{y_{i,k} - \bar{y}_k}{s_k} \tag{3}
$$
where $\bar{y}_k$ is the mean and $s_k$ is the standard deviation of $y_{i,k}$, $i = 1, 2, \ldots, N$.
Note that, although data transformation must be
considered when outcomes are measured in different
directions or in different scales numerically, to be
realistic, we would prefer to present descriptive statistics
such as the mean or the standard deviation of each test
score based on the raw data for data illustration.
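The re-scaling in Eq. (3) can be sketched in Python as below. The sketch assumes the sample standard deviation (divisor N − 1), which the text does not specify; the function name is hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Apply Eq. (3) to one outcome: subtract its mean, divide by its standard deviation."""
    centre = mean(values)
    spread = stdev(values)  # assumption: sample SD with divisor N - 1
    if spread == 0:
        raise ValueError("outcome has zero variance and cannot be re-scaled")
    return [(v - centre) / spread for v in values]
```

Each outcome is standardized separately, so that after the transformation all outcomes are expressed in standard deviation units while descriptive statistics can still be reported from the raw data.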
2.4.2. Statistical software used for the global test
The global test can be implemented using SAS, the
commercial software for statistical analyses (SAS Institution, 1999). PROC GLM or PROC MIXED in SAS can be
used, if data are normal, and PROC GENMOD in SAS or the GEE SAS macro 2.03 (which may be obtained from U.
Gromping, Fachbereich Statistik, Universitat) written
using SAS MODULAR IML, if data are otherwise. A
significant difference between the treated group and
controls is detected if the P -value for a global test is less
than 0.05. (SAS code for testing the global statistic can
be obtained from the first author.)
2.4.3. Subgroup analysis of MSC effect on single
outcome in controlling type I error
In addition to the global test for the common
treatment effect for functional recovery measured from
multiple test scores, researchers are also interested in the
MSC effect on a single outcome or the common MSC effect among the multiple outcomes between pair-wise group
comparisons, if more than two doses are involved. When
the treatment difference is detected, the critical value of
0.05 is based on the global test. The sub-group analysis,
such as the pair-wise dose comparison, can be further
tested using the CONTRAST statement in SAS, controlled by the same critical value (0.05). Any significant
dose difference on functional recovery in the pair-wise comparison will then be tested for dose effect on the
individual outcome at the same critical value (0.05). If
the global test is not significant at the 0.05 level, the
formal analysis should be stopped without further
testing for pair-wise dose group comparisons or dose
responses on the individual outcome, with a conclusion
of no treatment effect. Otherwise, analysis of the
individual outcome will be considered as exploratory analysis, which has to be confirmed in further study.
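The step-down logic described above (global test first, then pair-wise dose comparisons, then individual outcomes, all at the same critical value) can be sketched in Python as follows. This is an illustrative sketch of the decision flow only; the function name and the dictionary layout of the P-values are assumptions, and the P-values themselves would come from the fitted model.

```python
def step_down(global_p, pairwise_p, outcome_p, alpha=0.05):
    """Gatekeeping sketch: global test, then pair-wise doses, then single outcomes.

    pairwise_p maps a comparison label to its global-test P-value.
    outcome_p maps a comparison label to {outcome name: P-value}.
    Returns {comparison: [significant outcomes]}; empty if the global test fails.
    """
    findings = {}
    if global_p >= alpha:
        # Stop here: no treatment effect is concluded and nothing further is tested.
        return findings
    for comparison, p in pairwise_p.items():
        if p < alpha:
            per_outcome = outcome_p.get(comparison, {})
            findings[comparison] = [name for name, po in per_outcome.items() if po < alpha]
    return findings
```

Individual outcomes are examined only for comparisons that are themselves significant, which is what keeps the overall type I error at the level of the global test.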
2.5. Sample size/power calculation for global test
To design an experiment for testing a treatment effect
on multiple test scores, sample size (e.g. the number of subjects needed per group) will be considered to ensure
sufficient power (e.g. 80% power) to detect the
treatment effect. Two approaches are discussed. One
approach is to use the single outcome to estimate the
sample size, assuming that test scores are correlated in
the same direction, given that the global test is more
efficient than a single outcome.
We will have
$$
n = \frac{2 (Z_{\alpha/2} + Z_{\beta})^2}{\delta^2} \tag{4}
$$
where $Z_p$ is the value cutting off the proportion p in the
upper tail of the standard normal distribution, $\alpha$ is
type I error (e.g. 0.05), $\beta$ is type II error (e.g. $1 - \beta = 0.8$
for the power) and $\delta$ is effect size, defined as the
difference in treatment means expressed in units of
standard deviation, $\delta = |\mu_1 - \mu_0| / \sigma$. Eq. (4) also provides a
formula to calculate power for a fixed sample size. We
can also see from Eq. (4) that with fixed type I and type II errors, the sample size determination is based on the
effect size, $\delta$. The smaller the effect size to be detected, the
more subjects are needed per group.
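A worked instance of Eq. (4) can be sketched in Python using the standard library's normal distribution. The function name and the default values (two-sided α = 0.05, 80% power) are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Subjects per group from Eq. (4): n = 2 (Z_{alpha/2} + Z_beta)^2 / delta^2."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 cut-off
    z_beta = NormalDist().inv_cdf(power)               # upper beta cut-off, beta = 1 - power
    return 2.0 * (z_alpha + z_beta) ** 2 / effect_size ** 2


if __name__ == "__main__":
    for delta in (1.0, 1.5, 2.0):
        print(delta, ceil(n_per_group(delta)))
```

Rounding up, an effect size of one standard deviation requires 16 subjects per group, while an effect size of two standard deviations requires 4, which illustrates how quickly the required sample grows as the detectable effect shrinks.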
The second approach is to consider the correlation ($\rho$)
among outcomes, proposed by Diggle et al. (1994) with
the expression
$$
n = \frac{2 (Z_{\alpha/2} + Z_{\beta})^2 \, (1 + (K - 1)\rho)}{K \delta^2} \tag{5}
$$
where K is the number of outcomes. If the outcomes are
perfectly correlated (e.g. $\rho = 1$), Eq. (5) becomes Eq. (4). With a fixed sample, the power for
multiple outcomes would be equal to the power for a
single outcome. However, when the correlation decreases, the sample size required decreases compared
to the sample size calculation used in Eq. (4). If the
correlation varies, to be conservative, one should use the
highest correlation in Eq. (5), or use Eq. (4) for the
sample size calculation.
In addition, if more than two treatment groups are
involved, the effect size, $\delta$, can be defined from the two
most extreme treatment groups, as given by Cohen (1988):
$$
\delta = \frac{\mu_{\max} - \mu_{\min}}{\sigma} \tag{6}
$$
where M is the number of treatment groups for $M > 2$, $\mu_{\max}$
is the largest of the M means, $\mu_{\min}$ is the smallest of the
M means and $\sigma$ is the common standard deviation
within the population. Assuming intermediate variability (M means equally spaced over $\delta$), with estimation of $\delta$
in Eq. (6), we could determine the sample size, based on
Eq. (4) or Eq. (5).
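The effect of the between-outcome correlation on the required sample size in Eq. (5), and the effect size of Eq. (6), can be sketched in Python as follows. Function names and defaults are illustrative assumptions; the sketch assumes a single common correlation among the K outcomes.

```python
from statistics import NormalDist


def n_per_group_multi(effect_size, n_outcomes, rho, alpha=0.05, power=0.80):
    """Subjects per group from Eq. (5) for K outcomes with common correlation rho."""
    z_sum = NormalDist().inv_cdf(1.0 - alpha / 2.0) + NormalDist().inv_cdf(power)
    inflation = 1.0 + (n_outcomes - 1) * rho
    return 2.0 * z_sum ** 2 * inflation / (n_outcomes * effect_size ** 2)


def effect_size_from_means(group_means, sigma):
    """Effect size from Eq. (6): range of the group means over the common SD."""
    return (max(group_means) - min(group_means)) / sigma
```

With ρ = 1 the function reproduces the single-outcome result of Eq. (4); with ρ = 0 and K = 3 the required sample is one third of that value, which is the efficiency gain the text attributes to weakly correlated outcomes.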
3. Experiment: data collection and analysis results
3.1. The MSC dose responses on functional recovery for
rats with MCAo
The MCAo rat model was discussed in Section 2.1. The MSC
dose response on functional recovery is measured from
three test scores: Adhesive-Removal patch test, Rotarod
test and mNSS. Nineteen rats were involved in this
study with three dose groups: PBS (control, n = 6), 1 × 10⁶ MSCs (n = 6) and 3 × 10⁶ MSCs (n = 7) at an
administration time of 1 day after MCAo. The functional status (Adhesive-Removal patch test, Rotarod test and mNSS) was collected at baseline (immediately
before treatment) and at 14 days after stroke, where the
Rotarod test had an inverse order (higher score and
better recovery) compared to the mNSS or Adhesive-Removal test. The data were transformed to have
measurement consistency and scaled outcomes. MCAo
severity on functional scores prior to treatment was
balanced among the dose groups (P-values > 0.23). The correlation coefficients among the three outcomes were
in a range of 0.14 between the Rotarod and Adhesive
Removal tests to 0.61 between the mNSS and Adhesive
Removal test scores. The application of the global test
showed significant MSC dose responses on functional
recovery at 14 days (P = 0.004). Among them, there was
a significant improvement on functional recovery at 14
days for rats treated with 3 × 10⁶ MSCs compared to
control rats (P = 0.001) with the significant MSC effect
on each individual outcome except the Rotarod test
score (P = 0.58); a marginal improvement for rats
treated with 1 × 10⁶ MSCs compared to the control
rats (P = 0.09); and no significant difference between
MSC dose of 3 × 10⁶ and 1 × 10⁶ (P = 0.17) (Table 2).
3.2. The dose effect of MSCs on functional
recovery for rats with TBI
The TBI rat model was discussed in Section 2.2. A
total of 36 TBI rats (n = 9 per group) were employed in
this study: control (PBS), 1 × 10⁶, 2 × 10⁶ and 4 × 10⁶
MSCs were administered at 1 day after TBI. Functional
outcomes (mNSS and Corner test) were measured at
baseline (before treatment), and at 1 month after
treatment. The Corner test score has an inverse order
as compared to the mNSS. Therefore, the analysis was
conducted based on the transformed data described in
Section 2.4.1. TBI severity on functional scores prior to
treatment was balanced between the MSC treated
groups and the control group (P > 0.66). The correlation coefficient between the two tests was 0.76. The
results of the global test showed significant MSC dose
response on functional recovery at 1 month after
treatment. Of them, each MSC dose group significantly
improved on 1-month functional recovery compared to
the controls (P < 0.001 for both 2 × 10⁶ and 4 × 10⁶
MSC doses), except the low dose of 1 × 10⁶ MSCs (P = 0.86). The 2 × 10⁶ or 4 × 10⁶ MSC dose improved 1-month functional recovery significantly compared to the
low dose 1 × 10⁶ MSCs (P < 0.01). Rats treated with
4 × 10⁶ MSC dose were significantly improved on 1-month functional recovery compared to rats treated
with 2 × 10⁶ MSCs with significant improvement on
both mNSS and the Corner test scores, respectively
(Table 3).
Table 2
MSC dose response on 14 days functional recovery

Treatment                  Adhesive-Removal (s)   Rotarod (%)     mNSS (score)
                           mean ± S.D.            mean ± S.D.     mean ± S.D.
Pre-treatment (baseline)
  Control (n = 6)          116.0 ± 8.0            43.3 ± 21.3     8.3 ± 1.5
  1 × 10⁶ MSCs (n = 6)     111.7 ± 13.3           35.9 ± 21.7     8.7 ± 2.3
  3 × 10⁶ MSCs (n = 7)     106.9 ± 23.8           49.1 ± 22.4     8.6 ± 2.8
Post-treatment (14 d)
  Control                  86.3 ± 28.0            68.8 ± 16.1     7.3 ± 0.8
  1 × 10⁶ MSCs (a)         52.0 ± 22.2            65.6 ± 21.5     5.2 ± 3.1
  3 × 10⁶ MSCs (a, b)      33.6 ± 15.8            73.5 ± 18.3     4.6 ± 1.8

a Significant improvement on functional recovery compared to controls based on the global test.
b P = 0.06, marginal differences on functional recovery between MSC dose 1 × 10⁶ and 3 × 10⁶.

Table 3
Dose response for TBI

Treatment (n = 9)          Corner test            mNSS (score)
                           mean ± S.D.            mean ± S.D.
Pre-treatment (baseline)
  Control                  6.7 ± 7.0              10.3 ± 1.5
  1 × 10⁶ MSCs             8.9 ± 6.0              10.2 ± 1.6
  2 × 10⁶ MSCs             7.8 ± 6.7              10.3 ± 2.1
  4 × 10⁶ MSCs             10.0 ± 7.1             10.0 ± 2.3
Post-treatment (1 month)
  Control                  14.4 ± 7.3             5.0 ± 0.7
  1 × 10⁶ MSCs             15.6 ± 5.3             4.9 ± 1.1
  2 × 10⁶ MSCs (a)         30.0 ± 7.1             2.4 ± 0.7
  4 × 10⁶ MSCs (a, b)      36.7 ± 5.0             1.8 ± 0.4

a Significant improvement on functional recovery compared to controls using the global test.
b Significant improvement on functional recovery compared to the 2 × 10⁶ MSC dose group using the global test.

4. Discussion and conclusions
The goal of this research is to establish a statistical
algorithm for testing the MSC dose response or the treatment effect, so that the conclusion of the MSC
effect does not rely on a single functional test score, but
relies on multiple functional test scores in the animal
models for brain injury and stroke. The global test for
continuous outcomes can be implemented under careful
statistical modeling using the framework of ANCOVA
and necessary data transformation.
Multiple tests are often considered in neuroscience research, because no single test measures all dimensions
of brain recovery, and there is no ‘gold standard’ test.
We would be criticized for increasing type I error by
presenting the treatment effect on each individual outcome without adjusting for multiple outcomes. The
global test considers multiple outcomes and tests the
common treatment effect that is relevant to the hypothesis of an MSC effect on functional recovery.
The global test on multiple outcomes is more efficient
than a test using a single outcome, when treatment
effects are consistent on outcomes. The less the correla-
tion among outcomes, the more power, and therefore,
the higher efficiency of the global test (Tilley et al.,
1996). If outcomes are perfectly correlated, the global
test on multiple outcomes becomes a test for a single outcome. On the other hand, the global test may be less
efficient than a test based on a single outcome if
treatment effects are inconsistent among multiple out-
comes. If the MSC effect were positive on one outcome
and negative on another, the treatment effects would be
diminished as a result. The global test will be less likely
to detect a significant result, which is exactly what we
expect and could be an advantage of the global test in analyzing the MSC treatment effect. We cannot define
the effectiveness of MSC therapy based on one outcome,
especially when the other outcomes are in the wrong
direction.
Note that the outcome measurement consistency and the treatment
effect consistency on outcomes are different concepts. The former is a
numerical issue and the latter is a pre-clinical or clinical issue. To have a valid statistical test, data
transformation must be considered if outcome measurements are inconsistent or they are not in a common
scale. As an example, for the TBI rat model, using the
re-scaled Corner test and mNSS test scores without
changing inconsistent measurements, we obtained an
overall MSC dose effect on 1 month functional recovery
with P = 0.59 and P > 0.32 for any pair-wise dose group comparisons, and the results totally conflict with results
presented in Section 3.2. On the other hand, if the
treatment effect is inconsistent on outcomes (i.e. a
treatment improves the functional recovery as one
outcome, but increases the chance of adverse event as
another outcome), more likely, we would not have
significant results based on the global test, as we
discussed previously.
The global test allows researchers to study the
treatment effect on a single outcome using the step-
down procedure at the critical value of 0.05 if the global
test is significant. Otherwise, the individual tests would
be considered informative with no conclusion.
In summary, we have implemented the global test on
the MSC treatment of stroke and TBI. The global test
appears as a useful statistical tool for testing the common dose effect when multiple outcomes are
involved and correlated. The global test for continuous
outcomes can be implemented under careful statistical
modeling and proper data transformation.
Acknowledgements
The authors thank Lula Adams for editing. This work
was supported by NINDS grants PO1 NS23393, RO1
NS33627, RO1 NS38292 and RO1 HL64766.
References
Chen H, Chopp M, Zhang ZG, Garcia JH. The effect of hypothermia
on transient middle cerebral artery occlusion in the rat. J Cereb
Blood Flow Metab 1992;12:621–8.
Chen J, Li Y, Wang L, Zhang Z, Lu D, Lu M, et al. Therapeutic
benefit of intravenous administration of bone marrow stromal cells
after cerebral ischemia in rats. Stroke 2001;32:1005–11.
Cohen J. Statistical power analysis for the behavioral sciences, 2nd ed.
New Jersey: Lawrence Erlbaum Associates, Inc, 1988.
Cohen J, Cohen P. Applied multiple regression. Hillsdale, NJ:
Lawrence Erlbaum Associates, 1983.
Diggle PJ, Liang K, Zeger SL. Analysis of longitudinal data.
Clarendon Press: Oxford, 1994.
Dixon CE, Clifton GL, Lighthall JW, Yaghmai AA, Hayes RL. A
controlled cortical impact model of traumatic brain injury in the
rat. J Neurosci Methods 1991;39:253–62.
Lehmann E. Testing statistical hypotheses, 1st ed. New York: Wiley,
1986.
Lennmyr F, Ata KA, Funa K, Olsson Y, Terent A. Expression of
vascular endothelial growth factor (VEGF) and its receptors (Flt-1
and Flk-1) following permanent and transient occlusion of the
middle cerebral artery in the rat. J Neuropathol Exp Neurol
1998;57:874–82.
Liang K, Zeger SL. Longitudinal data analysis using generalized linear
models. Biometrika 1986;72(1):13–22.
Lipsitz SR, Laird NM, Harrington DP. Generalized estimation
equations for correlated binary data: using the odds as a measure
of association. Biometrika 1991;78:153–60.
Lu M, Tilley BC. Use of odds ratio or relative risk to measure a
treatment effect in clinical trials with multiple correlated binary
outcomes: data from the NINDS t-PA stroke trial. Stat Med
2001;20:1891–901.
Lu D, Mahmood A, Wang L, Li Y, Lu M, Chopp M. Adult bone
marrow stromal cells administered intravenously to rats after
traumatic brain injury migrate into brain and improve neurological
outcome. Neuroreport 2001;12:559–63.
Mahmood A, Lu D, Wang L, Li Y, Lu M, Chopp M. Treatment of
traumatic brain injury in female rats with intravenous administration of bone marrow stromal cells. Neurosurgery 2001a;49:1196–203.
Mahmood A, Lu D, Yi L, Chen JL, Chopp M. Intracranial bone
marrow transplantation after traumatic brain injury improving
functional outcome in adult rats. J Neurosurg 2001b;94:589–95.
National Institute of Neurological Disorders and Stroke rt-PA Stroke
Study Group. Tissue plasminogen activator for acute ischemic
stroke. N Eng J Med 1995;333:1581–7.
O’Brien PC. Procedures for comparing samples with multiple endpoints. Biometrics 1984;40(December):1079–87.
Park T. A comparison of the generalized estimating equation approach
with the maximum likelihood approach for repeated measurements. Stat Med 1993;12:1723–32.
Pocock SJ, Geller NL, Tsiatis AA. The analysis of multiple endpoints
in clinical trials. Biometrics 1987;43(September):487–98.
SAS Institution. SAS/STAT Software, 8th ed. Cary (NC): SAS Institution; 1999.
Schallert T, Fleming SM, Leasure JL, Tillerson JL, Bland ST. CNS
plasticity and assessment of forelimb sensorimotor outcome in
unilateral rat models of stroke, cortical ablation, parkinsonism
and spinal cord injury. Neuropharmacology 2000;39:777–87.
Tilley BC, Marler J, Geller NL, Lu M, Legler J, Brott T, et al. Use of a
global test for multiple outcomes in Stroke Trials with application
to the National Institute of Neurological Disorders and Stroke t-PA Stroke Trial. Stroke 1996;27(11):2136–42.
Wedderburn RWM. Quasi-likelihood functions, generalized linear
models, and the Gauss–Newton method. Biometrika
1974;61:439–47.
Zeger SL, Liang K. Longitudinal data analysis for discrete and
continuous outcomes. Biometrics 1986;42:121–30.
Zhang L, Schallert T, Zhang ZG, Jiang Q, Arniego P, Li Q, et al. A
test for detecting long-term sensorimotor dysfunction in the mouse
after focal cerebral ischemia. J Neurosci Methods 2002;117:207–14.