
Journal of Abnormal Child Psychology, Vol. 9, No. 2, 1981, pp. 203-218

Format Effects in Two Teacher Rating Scales of Hyperactivity¹

Jonathan Sandoval
University of California, Davis

The object of this study was to investigate the effect of differences in format on the precision of teacher ratings and thus on the reliability and validity of two teacher rating scales of children's hyperactive behavior. Teachers (N = 242) rated a sample of children in their classrooms using rating scales assessing similar attributes with different formats. For a subsample the rating scales were readministered after 2 weeks. The results indicated that improvement can be made in the precision of teacher ratings that may be reflected in improved reliability and validity.

The most ubiquitous method used to identify and monitor the behavior of children considered to be hyperactive in school settings is the rating scale completed by teachers. Measures obtained from rating information are quick and relatively inexpensive to obtain, and teachers are often the professionals best able to observe children's reactions to tasks that require (a) sustained attention, (b) a systematic and deliberate approach to problem solving, (c) self-control of physical activity level, and (d) social interactional skills with peers and teachers. Teachers have a wide experience with children in similar situations and, the longer they teach, a standard of comparison that parents and many physicians do not have.

Teacher rating scales used in assessing hyperactivity are not without problems, however. In a previous paper I have criticized the existing rating scales on a number of grounds: (1) they contain leading questions

Manuscript received in final form September 18, 1980. ¹This research was supported by grants from the Spencer Foundation and the National Institute of Mental Health (MH 29495). A somewhat different version of this paper was presented at the annual convention of the American Educational Research Association, Toronto, Canada, March 1978. The author thanks Nadine Lambert and Leonard Marascuilo for their assistance in the preparation of this paper.


0091-0627/81/0600-0203$03.00/0 © 1981 Plenum Publishing Corporation


worded in the negative and are thus subject to rater bias and response sets such as reverse halo effects or reverse generosity errors (Selltiz, Wrightsman, & Cook, 1976), (2) the items consist of traits or symptoms and thus have no concrete descriptive anchors, so that there are no set standards for the behavior to be rated against, and (3) the scales' factorial structures indicate that teachers do not draw critical distinctions in children's behavior. The result of these forces, mainly originating in the rating format, is that the weaknesses already inherent in teacher rating procedures are magnified (Sandoval, 1977).

When grades or other teacher ratings are used as criterion measures for research purposes, the sought-after relationships are often hard to establish because teachers vary in their standards for a given rating, and differ in their propensity to avoid or to use particular rating categories. Many teachers report discomfort in rating only negative, socially undesirable behavior, such as the content of most hyperactivity rating scales. The extent of teacher bias is often difficult to assess because classrooms do differ in the level of performance of pupils and in the degree of heterogeneity with respect to the attribute being rated. As a result, teacher ratings may not have a great deal of precision. Both the mean and standard deviation of one teacher's ratings may be quite different from another's even though their classrooms may be made up of similar children. From the perspective of sources of error in a test score, the amount of error variance attributable to teachers could be considerable.

By changing the format of a teacher rating scale, the intent is to make the judgments of children's behavior by one teacher more comparable to those of another and to increase the distribution of scores across classrooms so that discriminations between pupils may be made more easily. Very serious decisions are made on the basis of teacher rating scales: the evaluation of a new medication in a research study by psychopharmacologists, the referral of a child to a physician by school personnel, the labeling of a child as hyperactive or the prescription of a drug by a physician, the estimation of the prevalence of hyperkinesis in a particular population. Finding procedures yielding the most precise measure is, or should be, an important quest for researchers in the area of childhood hyperactivity.

This study investigated the improvement in precision of teacher ratings that may come about from changes in format. If teachers become more precise in rating, concomitant changes may occur in the reliability and validity of the scale. The work reported represents an effort to develop an effective method for school personnel to contribute to the evaluation of children who may be considered hyperactive. The question under investigation was whether or not the addition of positively and negatively phrased items along with the use of more concrete behavior


descriptors in a new teacher rating scale influences the reliability and validity of teacher rating procedures.

METHOD

Subjects

In two northern California urban-suburban counties 242 teachers rated five or six children in their classrooms as part of a longitudinal study of children considered hyperactive. One, and sometimes two, of the children in the classroom were study children, who have been described by Lambert and Sandoval (1980). Approximately 22% of the children from the longitudinal study were identified as hyperactive by a physician and met the same standards as subjects in clinical medication studies: namely, they had teacher and parent ratings of nonmedicated hyperactive behavior in the top 15% of children, had physician confirmations of hyperactivity and the absence of a competing medical explanation for the behavior, had parental reports of long-standing hyperactive behavior, and were in the normal range of intellectual functioning. Thirteen percent were also identified as hyperactive, but for these subjects, physicians had established a competing diagnosis. Another 13% were identified as hyperactive by parents or teachers but not by physicians, 15% had 2 years earlier been considered extreme in their behavior and similar to hyperactive children by a teacher but were not considered hyperactive by their parents or teacher, and 37% had been selected at random. In addition to the project children, the teachers rated an additional four children selected at random. Their ratings provided a behavioral frame of reference for the teacher's rating, and data by which to compare project children to classroom peers. All teachers were blind as to the status of the project child, although they knew the child participated in a research project that included both hyperactive and control children. The final sample consisted of 1,120 children, grades 2 to 8, 12% of whom were considered hyperactive by physicians, parents, or teachers.

Measures

Four teacher rating measures were used, each of a different type. The first, the Behavior and Temperament Survey (BTS) (Sandoval, Lambert, & Sassone, 1980), had been designed for use in the project in the early 1970s before other rating scales had been in widespread use. It is


Table I. Items from the Behavior and Temperament Survey, the School Behavior Survey, and the Abbreviated Symptom Questionnaire

Motor restlessness (also known as spontaneous hyperactivity)

  Behavior and Temperament Survey
    1. Fidgetsᵃ
    2. Talks too muchᵃ
    4. Gets into things
    24. Sits fiddling with small objectsᵃ
    25. Hums and makes odd noises

  School Behavior Survey, negative items
    3. At recess prefers games such as tag or activities involving running
    10. Taps fingers, hands, or feet at desk
    15. Does not sit still at desk during seat work or regular class activities
    20. Talks constantly, always chattering

  School Behavior Survey, positive items
    4. Stay calmly seated during school assemblies or other programs
    16. Keeps desk neat and orderly
    17. Usually walks around objects rather than over them
    19. Sits still when given individual attention

  Abbreviated Symptom Questionnaire
    1. Restless or overactive
    3. Disturbs other children

Inattentiveness (also known as distractibility)

  Behavior and Temperament Survey
    7. Plays nomadicallyᵃ
    9. Doesn't complete projects, doesn't stay with gamesᵃ
    11. Inattentive, distractibleᵃ
    18. Doesn't follow directions

  School Behavior Survey, negative items
    7. When asked to perform a complex classroom chore, has difficulty completing it
    8. Has a hard time finishing most class assignments
    11. Does not attend when other pupils read aloud
    22. Cannot do seat work when an unfamiliar child or adult is in the classroom

  School Behavior Survey, positive items
    1. Listens well to directions in math, spelling, etc., and needs very little follow-up help
    5. When chooses to do something, sticks with it to completion
    14. Attends when given one-to-one assistance
    27. Can work well in a noisy or "busy" environment

  Abbreviated Symptom Questionnaire
    4. Fails to finish things he starts, short attention span
    6. Inattentive, distractible

Impulsivity

  Behavior and Temperament Survey
    8. Unable to delay gratification
    12. Unpredictable show of affection
    32. Impulsiveᵃ
    27. Overly anxious to please

  School Behavior Survey, negative items
    12. Bursts through doors or out into halls without looking
    18. In class discussion, blurts out answers without raising hand or following other procedures
    29. Does many assignments incorrectly because does not wait for direction
    32. In math or other subjects, jumps to the answer without checking the calculations or facts, never checks the work

  School Behavior Survey, positive items
    2. In kickball, baseball, and other like sports, waits until a good pitch or ball is delivered before acting
    21. Waits until the teacher is unoccupied before asking questions
    23. Does not give an opinion until all the facts are in and have been considered
    24. Writing and penmanship show good evidence of foresight and planning

  Abbreviated Symptom Questionnaire
    2. Excitable, impulsive

Aggressiveness/excitability (also known as conduct problem recklessness and emotional lability)

  Behavior and Temperament Survey
    3. Destructive of toys, material, and furniture
    5. Accident-prone
    6. Recklessᵃ
    10. Adapts slowly to changes in the environmentᵃ
    13. Constantly demands attentionᵃ
    14. Can't accept correction
    15. Teases other children or interferes with their activitiesᵃ
    16. Unresponsive to discipline
    17. Defiant
    19. Lies
    20. Unpopular with peers
    21. Unusually aggressive in behaviorᵃ
    22. Plays so as to provoke adult intervention
    23. Withdraws from new objects or personsᵃ
    26. Excitableᵃ
    28. Clumsy, poor general coordination
    29. Personality is very changeable and unpredictable with great changes in mood
    30. Fights
    31. Difficulty handling frustrationᵃ

  School Behavior Survey, negative items
    6. Snatches pencils, crayons, materials, etc., from other children without asking
    25. Gets very wound up when something new is brought into the classroom
    26. Is bossy, tries to dominate other children
    30. When other children are excited picks up the mood and becomes uncontrollable

  School Behavior Survey, positive items
    9. Placid in the face of surprises or sudden noises
    13. Takes losing at a game in good stride
    28. Texts and workbooks given to pupil remain undamaged and in good condition
    31. Is able to shrug off accidental bumps or brushes with other pupils

  Abbreviated Symptom Questionnaire
    5. Constantly fighting
    7. Demands must be met immediately, easily frustrated
    8. Cries often and easily
    9. Mood changes quickly and drastically
    10. Temper outbursts, explosive and unpredictable behavior

ᵃMatched negative item.


similar in form to the Conners scale (Conners, 1969) and the Greenberg scale (Greenberg, Deem, & McMahon, 1972) in that it is a list of general attributes, usually deviant behaviors, that early research had indicated were characteristic of hyperactive children. Thirty-two descriptors of hyperactivity from articles by Stewart, Pitts, Craig, and Dieruf (1966), Clements and Peters (1962), and Laufer and Denhoff (1957) are used as scale items. The attributes to be assessed were motor restlessness, inattentiveness, impulsivity, and aggressiveness/emotional lability. The respondent checks a 4-point scale from "not at all characteristic" to "very much characteristic." By coincidence, over half the items were later compared to and found to be similar or identical to those on the Conners scale. The BTS is comparable to those used in most research on hyperactive children.

The second measure, the School Behavior Survey (SBS), was designed to respond to many of the problems inherent in the BTS. Rather than general attributes, it consists of descriptors of children in school settings. In this respect it is similar to the Davids scales (Davids, 1971) and the Bell scale (Bell, Waldrop, & Weller, 1972), although it contains a greater number of items. Instead of the list containing only negative or undesirable behaviors, half the 32 items are positively stated. In addition, the items were constructed with an intent to tap four characteristics: motor restlessness, impulsivity, attention/distractibility, and social aggressiveness, with the same number of items for each attribute. On this rating form the 4-point scale descriptors consisted of the frequency with which the child acted in a particular way: every day (or nearly always), once or twice a week, rarely, and never. Table I presents the items of both hyperactivity rating scales.

The third measure is the Pupil Behavior Rating Scale (PBRS) (Lambert & Hartsough, 1973, 1979). This rating scale is made up with an entirely different format using anchored behavioral descriptors that have been empirically verified on each of 11 attributes:

1. Gets into fights and quarrels with other pupils more often than others.

2. Has difficulty in following directions in the classroom or in games on the playground.

3. Makes immature or inappropriate responses during normal school activities.

4. Is overly dependent on his teacher for choice of activity and becomes uneasy without continued supervision.

5. Has to be coaxed or forced to work or play with other pupils.

6. Is easily distracted.

7. Behaves in ways that are dangerous to self and others.

8. Has no enthusiasm for school and does not respond to or maintain interest in learning tasks.

9. Has difficulty in learning school subjects.

10. Becomes sick or upset or may stay home from school when faced with a difficult problem or situation.

11. Seems unhappy or depressed.

The anchors for each attribute have been assigned values ranging from .00 to 3.00. The 11 attributes have been reduced to three factors: (a) adaptation, (b) interpersonal functioning, and (c) affective or intrapersonal functioning. A behaviorally anchored scale has the virtue of encouraging raters to be more precise in differentiating children. For further descriptions of this procedure, see Smith and Kendall (1963) and Campbell, Dunnette, Arvey, and Hellervik (1973).

Another concurrent measure of hyperactive behavior, the Abbreviated Symptom Questionnaire (ASQ; Sprague & Sleator, 1973) was employed as the fourth measure. The 10-item scale, derived from the Conners 36-item teacher rating scale, is similar in format to the BTS and is widely used in research on the use of medication with hyperactive children. It too has items tapping the four attributes of motor restlessness, inattentiveness, impulsivity, and aggressiveness/excitability. Only 1 item deals with impulsivity, however.

Procedure

All teachers were asked to rate the sample children using the BTS and the PBRS. A randomly selected subsample of 80 was also asked to rate the children with the SBS (the balance completed another experimental questionnaire dealing with curriculum practices). Two weeks later, all teachers were asked to volunteer to again rate the children with two of the following three measures: the SBS, the ASQ, or the BTS. The ASQ was administered only on the second occasion and to the majority of teachers, but the assignment was done randomly. Some teachers declined to participate in the rerating or did not complete all items on the ratings, so that complete data were available on 902 pupils from the second ratings. This crude matrix sampling permitted estimates of test-retest reliability for the BTS and the SBS as well as estimates of the concurrent validity for all four measures.

RESULTS AND DISCUSSION

The major focus of this inquiry is on the two rating scales for hyperactivity, the BTS and the SBS. The other scales, the PBRS and the


ASQ, were included for comparison purposes. If the differences between the BTS and the SBS had the result of making the teachers more sensitive to individual differences and more discriminating, a number of outcomes are likely. First, the total score distribution would be different. Next, the teachers would use more of the categories available to them during their rating on the SBS than on the BTS. In addition, the score obtained with the SBS would be more reliable and have greater validity in that it would correctly classify children considered hyperactive by other means.

To examine the effect of the different rating scales on the distribution of scores, the total score range on each measure was divided into 10 equal intervals. The number 10 is arbitrary but represents the distribution well and permits the analysis below. Next, the total score for each of the 418 children rated on both measures concurrently was assigned to the appropriate interval. Finally a cross-tabulated matrix (Table II) was prepared indicating where each pupil was rated on both measures. An inspection of the marginal figures illustrates that the BTS scores are highly skewed, whereas the SBS scores are more evenly spread across the range.
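The interval assignment and cross-tabulation described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function names, the score ranges passed in, and the four score pairs are all hypothetical, and the intervals printed in Table II are not exactly equal in width, so the original rounding rule evidently differed from the one assumed here.

```python
from collections import Counter

def interval_index(score, low, high, n_intervals=10):
    """Return the 1-based equal-width interval that `score` falls in."""
    width = (high - low) / n_intervals
    index = int((score - low) / width) + 1
    return min(index, n_intervals)  # the maximum score belongs to the last interval

def cross_tabulate(pairs, range_a, range_b, n_intervals=10):
    """Count pupils in each (interval on scale A, interval on scale B) cell."""
    cells = Counter()
    for score_a, score_b in pairs:
        cell = (interval_index(score_a, *range_a, n_intervals),
                interval_index(score_b, *range_b, n_intervals))
        cells[cell] += 1
    return cells

# Hypothetical (BTS total, SBS total) pairs for four pupils; the 32-128 range
# is the theoretical range of a 32-item scale scored 1 to 4 (an assumption).
pupils = [(35, 40), (36, 60), (70, 95), (120, 115)]
table = cross_tabulate(pupils, range_a=(32, 128), range_b=(32, 128))
for cell, count in sorted(table.items()):
    print(cell, count)
```

Each key of the resulting counter is one cell of the matrix, so the marginal distributions can be obtained by summing over either coordinate.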

Since the scores are correlated, the comparability of the distributions may be evaluated by examining the symmetry of the off-diagonal probabilities in the matrix with Bowker's test (Marascuilo & McSweeney, 1977). This procedure indicated that the probabilities are not symmetrical (χ² = 295, p < .001) and that the distributions are different for the two measures. Teachers using the SBS make more discriminations in their ratings and distribute children evenly across a behavioral dimension.
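For readers who wish to retrace the symmetry test, the following is a minimal sketch of the Bowker statistic, which sums (n_ij - n_ji)² / (n_ij + n_ji) over all pairs of off-diagonal cells. The function name is hypothetical, the cell counts are transcribed from Table II, and skipping pairs in which both cells are empty is an assumption about how such pairs are handled.

```python
def bowker_statistic(table):
    """Return (chi-square statistic, number of informative off-diagonal pairs)."""
    k = len(table)
    statistic = 0.0
    informative_pairs = 0
    for i in range(k):
        for j in range(i + 1, k):
            total = table[i][j] + table[j][i]
            if total == 0:
                continue  # an empty pair carries no information about symmetry
            statistic += (table[i][j] - table[j][i]) ** 2 / total
            informative_pairs += 1
    return statistic, informative_pairs

# Rows: SBS intervals 1-10; columns: BTS intervals 1-10 (counts as read from Table II).
TABLE_II = [
    [29,  2,  0,  0,  0, 0, 0, 0, 0, 0],
    [44,  8,  0,  0,  0, 0, 0, 0, 0, 0],
    [41, 16,  3,  1,  3, 0, 0, 0, 0, 0],
    [16, 24, 15,  2,  2, 0, 0, 0, 0, 0],
    [14, 14, 19, 14,  1, 2, 0, 0, 0, 0],
    [ 1, 11, 11, 10, 13, 6, 1, 3, 0, 0],
    [ 0,  2,  3, 11,  9, 7, 2, 1, 1, 2],
    [ 0,  2,  1,  3,  7, 4, 7, 1, 2, 0],
    [ 1,  0,  0,  1,  1, 4, 5, 3, 5, 1],
    [ 0,  0,  0,  0,  0, 0, 2, 0, 3, 1],
]

statistic, pairs = bowker_statistic(TABLE_II)
print(round(statistic, 1), pairs)
```

With these counts the statistic comes out at roughly 295, in line with the value reported above; the number of informative pairs is one possible choice for the degrees of freedom.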

It is also possible to examine directly differences in the use of the four categories on the BTS and the SBS. Thirty-seven teachers each rated five children in their classrooms on both rating scales concurrently. Table III lists the mean number of times the teachers used each rating category on each of the surveys in the rating of all their children. A separate tabulation was made for each of the four categories over the 32 items on each survey and for the 16-item subscale of the BTS consisting of the 16 items matched to the negatively worded item subtest of the SBS. These items are footnoted in Table I.

By comparing the frequency with which each rating category was used on the entire BTS and SBS, it is possible to judge the influence of the differences in the two measures. When the frequency of use of categories 1 through 4 on the SBS is compared with the parallel categories on the BTS by four matched-pairs tests, all are used significantly differently (category 2, p < .05; the rest, p < .02). Teachers appear to be more willing to use the extreme rating categories on the SBS, and, as the distribution of the total scores implies, the greater the dispersion of the ratings, the more discriminating the measurement.
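A matched-pairs comparison of this kind can be sketched as a paired t statistic computed on each teacher's difference in category usage between the two surveys. The counts below are hypothetical and the function name is an assumption; the sketch shows only the form of the calculation, not the analysis reported here.

```python
import math
import statistics

def paired_t(first, second):
    """Return the matched-pairs t statistic and its degrees of freedom."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1

# Hypothetical mean uses of category 4 per pupil for five teachers.
sbs_category_4 = [5, 4, 6, 3, 5]
bts_category_4 = [2, 3, 2, 1, 3]

t_value, df = paired_t(sbs_category_4, bts_category_4)
print(round(t_value, 2), df)
```

Because each teacher supplies both ratings, the test works on within-teacher differences, which removes stable differences between teachers in overall severity.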


Table II. Cross-Tabulation of Individual Total Scores Falling in Deciles on the Behavior and Temperament Survey and the School Behavior Survey

Rows are School Behavior Survey (SBS) intervals; columns are Behavior and Temperament Survey (BTS) intervals.

BTS interval score ranges: 1 = 32-40, 2 = 41-51, 3 = 52-59, 4 = 60-68, 5 = 69-78, 6 = 79-87, 7 = 88-97, 8 = 98-106, 9 = 107-116, 10 = 117-125.

SBS interval  BTS interval
(score range)       1      2      3      4      5      6      7      8      9     10   Total     (%)
 1 (32-40)         29      2      0      0      0      0      0      0      0      0      31   (7.5)
 2 (41-48)         44      8      0      0      0      0      0      0      0      0      52  (12.4)
 3 (49-57)         41     16      3      1      3      0      0      0      0      0      64  (15.3)
 4 (58-65)         16     24     15      2      2      0      0      0      0      0      59  (14.1)
 5 (66-74)         14     14     19     14      1      2      0      0      0      0      64  (15.3)
 6 (75-83)          1     11     11     10     13      6      1      3      0      0      56  (13.4)
 7 (84-91)          0      2      3     11      9      7      2      1      1      2      38   (9.1)
 8 (92-100)         0      2      1      3      7      4      7      1      2      0      27   (6.5)
 9 (101-108)        1      0      0      1      1      4      5      3      5      1      21   (5.0)
10 (109-117)        0      0      0      0      0      0      2      0      3      1       6   (1.4)
Total             146     79     52     42     36     23     17      8     11      4     418
(%)            (34.9) (18.9) (12.4) (10.0)  (8.6)  (5.5)  (4.1)  (1.9)  (2.6)  (1.0)

Dotted lines in the original table set off BTS intervals 5 through 10 (24% of pupils) and SBS intervals 8 through 10 (13% of pupils).


Table III. Mean Frequency Teachers Use Each of Four Rating Categories in Rating Pupils on the Behavior and Temperament Survey and the School Behavior Survey (N = 37)

Category                                      1                2                3                4

                                              Not at all       A little         Quite a bit      Very much
                                              characteristic   characteristic   characteristic   characteristic
Behavior and Temperament Survey       Mean    18.63            7.66             3.39             2.32
(32 items)                            SD      (5.54)           (3.28)           (2.39)           (2.36)
Sixteen-item subscale of items        Mean    8.16             5.39             2.22             1.46
matched to negatively worded          SD      (2.84)           (2.16)           (1.47)           (1.44)
SBS items

                                              Never            Rarely           Once or twice    Every day
                                                                                per week
School Behavior Surveyᵃ               Mean    12.11            8.90             6.81             4.18
(32 items)                            SD      (4.86)           (3.10)           (2.39)           (2.62)
Sixteen-item subscale of              Mean    3.39             5.39             3.98             3.26
negatively worded items               SD      (2.99)           (2.16)           (1.68)           (1.83)

                                              Every day        Once or twice    Rarely           Never
                                                               per week
Sixteen-item subscale of              Mean    8.73             3.52             2.84             .91
positively worded items               SD      (2.43)           (1.81)           (1.09)           (1.02)

ᵃCategory values reflected for items worded positively so that high scores related to hyperactivity.

It is instructive to examine the differences in the usage of the rating categories 1 through 4 on the two negatively worded subscales consisting of items common to both the BTS and the SBS. Again, using matched-pairs tests, the usage was shown to be different on each measure (all categories p < .01). Teachers were more willing to use the rating categories indicating moderate and extreme deviance on the SBS if the negative items were interspersed with positive ones and had more concrete descriptors.

In a study of attitudes toward old people, Kogan (1967) found differential correlations with other measures of attitude subscales made up of positively worded items and versions made up of comparable negatively worded items. Because there were two such subscales on the


SBS, the positive and negative subscales could be examined for category usage. The correlated t-test comparisons of the category usage on these subscales also indicated that each category was used significantly differently (p < .01). The categories indicating deviance were selected more often for negatively worded items than for positively worded items. This finding comes as no surprise but suggests that the addition of positive items does cause more consideration to be given to all of the items rated.

Reliability

Changes in the precision of the ratings should lead to improved reliability, particularly retest reliability. Two estimates of reliability have been calculated for the BTS and the SBS: an internal consistency measure and a stability coefficient. Table IV presents these results. Although the alpha coefficients of the two scales are virtually identical, the retest reliability of the SBS is higher, but the difference is not quite significant. The reliabilities are sufficiently high to use the results of these ratings to make decisions about pupils.
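The two kinds of estimate can be illustrated with a short sketch. It computes raw Cronbach's alpha from item variances and a test-retest coefficient as a Pearson correlation between total scores on two occasions. Note that Table IV reports a standardized item alpha, which is based on inter-item correlations, so the raw alpha below is a stand-in chosen for brevity; all ratings and names are hypothetical.

```python
import math
import statistics

def cronbach_alpha(item_scores):
    """item_scores: one list per item, holding that item's rating for every pupil."""
    k = len(item_scores)
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    item_variance_sum = sum(statistics.variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / statistics.variance(totals))

def pearson_r(x, y):
    """Pearson correlation, used here as a test-retest stability coefficient."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread = math.sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    return cross / spread

# Hypothetical ratings of four pupils on three items (first occasion).
items = [[1, 2, 3, 4], [2, 2, 3, 4], [1, 3, 3, 4]]
first_totals = [sum(ratings) for ratings in zip(*items)]
second_totals = [5, 7, 10, 11]  # hypothetical totals two weeks later

alpha = cronbach_alpha(items)
stability = pearson_r(first_totals, second_totals)
print(round(alpha, 3), round(stability, 3))
```

Alpha is computed from a single administration, whereas the stability coefficient needs two, which is why the retest estimates in Table IV rest on much smaller samples.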

Concurrent Validity

Table V presents the concurrent validity coefficients for the three rating scales. Examining the correlations between ratings made at the same time, it is clear that all of the ratings correlate to a fairly high degree with each other. The BTS correlates significantly higher with the ASQ than does the SBS (the α = .05 confidence intervals for the correlations do not overlap). Since the ASQ and the BTS have the same format and

Table IV. Reliability Coefficients for Hyperactivity Teacher Rating Scale

Scale                                   Mean                SD                  Test-retest stability   Standardized item alpha
                                                                                coefficient             internal consistency
Behavior and Temperament Survey Total   15.01 (N = 1,120)   18.79 (N = 1,120)   .89 (N = 152)           .96 (N = 1,120)
School Behavior Survey Total            67.15 (N = 448)     19.43 (N = 448)     .95 (N = 33)            .95 (N = 448)
Conners                                 16.99 (N = 716)     6.79 (N = 716)      --                      .92ᵃ (N = 716)

ᵃ.97 corrected to length 30 items.
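The length correction noted under Table IV can be retraced with the Spearman-Brown formula, which projects the reliability of a scale lengthened by a factor k as k·r / (1 + (k − 1)·r). That this is the formula actually used for the footnote is an assumption; the sketch only shows that it reproduces the figure when the 10-item scale is projected to 30 items.

```python
def spearman_brown(reliability, length_factor):
    """Project reliability for a scale lengthened by `length_factor`."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

# Alpha of .92 on the 10-item scale, projected to 30 items (factor of 3).
projected = spearman_brown(0.92, 30 / 10)
print(round(projected, 2))
```

The projection puts the short scale on roughly the same footing as the two 32-item surveys when internal consistency is compared.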


Table V. Concurrent Validity Estimates for Hyperactivity Rating Scales

Measure                                        Behavior and          School Behavior
                                               Temperament Survey    Survey
Behavior and Temperament Survey                --                    .82 (N = 418)
School Behavior Survey                         .82 (N = 418)         --
Conners Abbreviated Symptoms Questionnaire     .89 (N = 672)         .76 (N = 95)
Pupil Behavior Rating Scale Total              .75 (N = 1,120)       .81 (N = 421)
  1. Fights and quarrels                       .66                   .65
  2. Difficulty following directions           .60                   .70
  3. Immature                                  .69                   .71
  4. Dependent on teacher                      .56                   .66
  5. Coaxed or forced to work                  .28                   .17
  6. Easily distracted                         .63                   .76
  7. Dangerous to self and others              .69                   .71
  8. No enthusiasm for school                  .57                   .69
  9. Difficulty learning                       .40                   .57
  10. Sick or upset                            .41                   .50
  11. Unhappy or depressed                     .53                   .51

very similar content, the result may reflect reliability of the teacher's "rating set" rather than reliability of the behaviors being rated.
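The non-overlap of the confidence intervals for the two ASQ correlations can be checked with a short sketch using the Fisher z transformation. The choice of that method, the critical value of 1.96, and the function name are assumptions; the correlations and sample sizes are those in Table V. The calculation treats the two correlations as coming from independent samples, which is a simplification because the same teachers contributed to both.

```python
import math

def correlation_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation via Fisher's z."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

bts_asq = correlation_ci(0.89, 672)  # BTS with ASQ
sbs_asq = correlation_ci(0.76, 95)   # SBS with ASQ

print([round(bound, 2) for bound in bts_asq])
print([round(bound, 2) for bound in sbs_asq])
print("intervals overlap:", sbs_asq[1] >= bts_asq[0])
```

Under these assumptions the upper bound for the SBS correlation falls below the lower bound for the BTS correlation, which agrees with the statement that the intervals do not overlap.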

Table V also demonstrates interesting convergent and divergent validity information in the correlations between the individual attributes of the PBRS and the two scales. Each of the attributes dealing with characteristics associated with hyperactive children correlates .60 or better with the two scales. The anchors on the distractibility attribute are related to attention span, those on difficulty following directions are related to impulsivity and distractibility, those on immaturity to poor social skills and self-control, those on the fights and quarrels attribute to impulsivity and poor social skills and aggression, those on the attribute dangerous to self and others to impulsiveness and high activity level, and those on no enthusiasm for school to lack of sustained activity on school-related work. On the other hand, the attribute needs to be coaxed to work or play, which contains anchors descriptive of shy, withdrawn children, and the attribute sick or upset have not been associated with hyperactive children and have lower correlations with the two scales.

Classificatory Efficiency

If the mean of physician-identified nonmedicated hyperactive children is used as a cutoff for the designation of hyperactivity (and this method is used in research on the prevalence of hyperactivity), we may assess how likely each scale is to identify a child as hyperactive. In our project, the means for the physician-diagnosed hyperactive children, rated off medication, were 68.6 for the BTS, 95.2 for the SBS, and 24.6 for the Conners. Returning to Table II, we see that the BTS and SBS means fall into intervals 5 and 8, respectively. In spite of the skewed distribution, 24% of the children fall at or above the mean interval of the hyperactive children on the BTS (to the right of the dotted line), whereas only 13% fall at or above the mean interval of the hyperactive children on the SBS (below the dotted line). Examination of the total scores of the project children who were consensually agreed to be hyperactive and of the control children shows that 92.3% of those falling more than 1 SD above the mean on the SBS (an arbitrary cutoff) were hyperactive and 7.7% were controls. On the BTS, 83.6% of those scoring above the cutoff were hyperactive and 16.4% were controls. These proportions are significantly different (χ² = 51.61, p < .001). Use of the BTS would lead a practitioner to identify more children as potentially hyperactive than would use of the SBS; using the SBS, there would be fewer false-positive identifications.
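The classification comparison can be illustrated with a short Python sketch: flag children scoring more than 1 SD above the sample mean, compute the percentage of flagged children who are hyperactive, and compare two scales with a chi-square statistic. The sketch assumes an ordinary Pearson chi-square on a 2 × 2 table without continuity correction; the paper does not report which procedure produced its statistic, and the same children were rated on both scales, so this is not necessarily the test used. All function names and counts are hypothetical.

```python
def flag_above_cutoff(scores, n_sd=1.0):
    """Return indices of scores more than n_sd sample SDs above the mean."""
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((s - mean) ** 2 for s in scores) / (n - 1)) ** 0.5
    cutoff = mean + n_sd * sd
    return [i for i, s in enumerate(scores) if s > cutoff]

def percent_hyperactive(n_hyperactive, n_control):
    """Percentage of flagged children who belong to the hyperactive group."""
    return 100.0 * n_hyperactive / (n_hyperactive + n_control)

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts of flagged children (NOT the study's data):
# rows are scales, columns are (hyperactive, control).
scale_a = (46, 4)
scale_b = (42, 8)

pct_a = percent_hyperactive(*scale_a)
pct_b = percent_hyperactive(*scale_b)
chi2 = chi_square_2x2(scale_a[0], scale_a[1], scale_b[0], scale_b[1])
```

With one degree of freedom, a statistic above 10.83 corresponds to p < .001, so a value as large as the one reported in the text would be far beyond that threshold.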

This study is only suggestive in judging the effect of using both positively and negatively worded items because content differences and category label differences are confounded with the use of positively worded items on the SBS. Nevertheless, the changes made in constructing the SBS did have a desirable effect. Format can make raters more sensitive to individual differences. Because the validity of teacher ratings of hyperactivity can be improved, a new generation of rating procedures is required. One model for future rating scales is the SBS. A great deal of research on hyperactive children has been done with rating scales that are not as precise as they should be. Researchers need to attend to the types of rating scales they are using to collect information from teachers.

REFERENCES

Bell, R. Q., Waldrop, M. F., & Weller, G. M. A rating system for the assessment of hyperactive and withdrawn children in preschool samples. American Journal of Orthopsychiatry, 1972, 42, 23-24.

Campbell, J. P., Dunnette, M. D., Arvey, R. D., & Hellervik, L. W. The development and evaluation of behaviorally based rating scales. Journal of Applied Psychology, 1973, 57, 15-22.

Clements, S. D., & Peters, J. E. Minimal brain dysfunctions in the school-age child. Archives of General Psychiatry, 1962, 6, 185-197.

Conners, C. K. A teacher rating scale for use in drug studies with children. American Journal of Psychiatry, 1969, 126, 884-888.

Davids, A. An objective instrument for assessing hyperkinesis in children. Journal of Learning Disabilities, 1971, 4, 499-501.

Greenberg, L. M., Deem, M. A., & McMahon, S. Effects of dextroamphetamine, chlorpromazine and hydroxyzine on behavior and performance in hyperactive children. American Journal of Psychiatry, 1972, 129, 532-539.

Kogan, N. Attitudes toward old people. In D. N. Jackson & S. Messick (Eds.), Problems in human assessment. New York: McGraw-Hill, 1967.

Lambert, N. M., & Hartsough, C. S. Scaling behavioral attributes of children using multiple teacher judgments of pupil characteristics. Educational and Psychological Measurement, 1973, 33, 859-874.

Lambert, N. M., & Hartsough, C. S. Pupil behavior rating scale. In N. M. Lambert, E. M. Bower, & C. S. Hartsough (Eds.), A process for the assessment of effective student functioning. Monterey, California: Publisher's Test Service, CTB/McGraw-Hill, 1979.

Lambert, N. M., & Sandoval, J. The prevalence of learning disabilities in a sample of children considered hyperactive. Journal of Abnormal Child Psychology, 1980, 8, 33-50.

Laufer, M. W., & Denhoff, E. Hyperkinetic behavior syndrome in children. Journal of Pediatrics, 1957, 50, 463-474.

Marascuilo, L. A., & McSweeney, M. Nonparametric and distribution free methods for the social sciences. Monterey, California: Brooks/Cole, 1977.

Sandoval, J. The measurement of the hyperactive syndrome in children. Review of Educational Research, 1977, 47, 292-318.

Sandoval, J., Lambert, N. M., & Sassone, D. M. The identification and labeling of hyperactivity in children: An interactive model. In C. K. Whalen & B. Henker (Eds.), Hyperactive children: The social ecology of identification and treatment. New York: Academic Press, 1980.

Selltiz, C., Wrightsman, L. S., & Cook, S. W. Research methods in social relations (3rd ed.). New York: Holt, Rinehart & Winston, 1976.

Smith, P. C., & Kendall, L. M. Retranslation of expectations: An approach to the construction of unambiguous anchors for rating scales. Journal of Applied Psychology, 1963, 47, 149-155.

Sprague, R. L., & Sleator, E. K. Effects of psychopharmacologic agents on learning disorders. Pediatric Clinics of North America, 1973, 20, 719-735.

Stewart, M. A., Pitts, F. N., Craig, A. G., & Dieruf, W. The hyperactive child syndrome. American Journal of Orthopsychiatry, 1966, 36, 861-867.