The development of the Children's Empathic Attitudes Questionnaire using classical and Rasch analyses

Journal of Applied Developmental Psychology 29 (2008) 187–196

Jeanne Funk a,⁎, Christine Fox b, Margaret Chan a, Kathleen Curtiss a

a Department of Psychology, University of Toledo, United States
b Department of Foundations of Education, University of Toledo, United States

Article info

Available online 18 March 2008

Keywords: Empathy; Measure development; Rasch analysis; Item Response Theory

Abstract

Empathic responding is implicated in antisocial behaviors such as bullying, sexual offending, and violent crime. Identifying children and adolescents at risk for antisocial behavior and evaluating interventions designed to address problem behaviors require valid and reliable measures. Definitional controversies and limited measurement models have hindered measurement. This study describes the development and analysis of the Children's Empathic Attitudes Questionnaire (CEAQ) using both classical and modern techniques. Rasch analyses provided probabilistic results over large item and person groups, enabling meaningful inferences from patterns of responses at the construct level. Analyses of fifth through seventh graders' responses to the final version of the CEAQ provide support for its reliability, validity, and functionality. Four meaningful item clusters were identified, each reflecting more cognitively advanced empathic attitudes. These analyses suggest that the CEAQ provides a theoretically sound, hierarchically meaningful measure of empathic attitudes that will be useful in identification and intervention with children and adolescents at risk for antisocial behavior.

© 2008 Elsevier Inc. All rights reserved.

⁎ Corresponding author. Department of Psychology, University of Toledo, 2801 West Bancroft, Toledo, OH 43606-3390, United States. E-mail address: [email protected] (J. Funk).

0193-3973/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.appdev.2008.02.005

1. Introduction

Examining empathy in children and adolescents is important because deficits in empathic responding have been implicated in the development of antisocial behavior such as bullying (Hanish et al., 2004), aggression (Guerra, Nucci, & Huesmann, 1994; Richardson, Hammock, Smith, Gardner, & Signo, 1994; Sams & Truscott, 2004), sexual offending (Joliffe & Farrington, 2004), and serious violent crime (Loper, Hoffschmidt, & Ash, 2001). Identifying those at risk for antisocial behavior and evaluating interventions designed to address these problems require valid and reliable measures. However, although the construct of empathy has been examined from numerous disciplinary perspectives since the late 19th century, “empathy” is a term that has defied consensus definition and measurement. The present paper reviews definitional controversies and existing measures, and describes a new approach to the measurement of empathy in children that takes advantage of both classical and contemporary (Rasch) analysis.

One of the major historical controversies is whether empathy is primarily a cognitive or an affective phenomenon (Decety & Jackson, 2004; Kerem, Fishman, & Josselson, 2001; Preston & de Waal, 2002; Rankin, Kramer, & Miller, 2005). Some advocates of the cognitive view emphasize that the intellectual understanding of the emotions of another is necessary for the affective empathic response (e.g., Duan & Hill, 1996). This social cognitive viewpoint emphasizes the cognitive, perspective-taking element of empathy and predictive accuracy (Davis et al., 2004; Strayer, 1987). Those who emphasize its affective nature define empathy as fundamentally the experience of another's emotional state (Eisenberg, Wentzel, & Harris, 1998). Once empathy occurs, then empathy-related responding occurs. This may include sympathy (defined by Eisenberg and colleagues as “other-oriented concern”), or personal distress in which one has an aversive response to experiencing another's distress (Batson, 1991). Although sympathy is associated with socially appropriate behavior and helping, personal distress may lead to escape rather than helping (Eisenberg et al., 1998).

Integrative approaches to defining empathy combine both cognitive and affective perspectives. For example, Hoffman's developmental theory addresses how the individual's emotional reactions interact with their developing cognitive capabilities to produce a specific empathic response (Hoffman, 1987, 2000). Feshbach (1997) also proposed an integrative developmental model in which the ability to discriminate the emotional state of another interacts with the child's ability to assume another's perspective, and with the ability to share the emotional state of the other. More recently, integrative theories have combined behavioral data with findings from functional neuroanatomy studies (Decety & Jackson, 2004; Preston & de Waal, 2002). In this conceptualization, empathy is viewed as a process, not a simple unitary response, and although empathic responding does not require conscious awareness, cognitive processes can contribute to the behavioral outcome. For example, prefrontal functions are believed to facilitate empathic responding through enhancing working memory and improving the ability to assess likely outcomes (Preston & de Waal, 2002).

Despite continuing discussion of the relative importance of cognitive or affective elements, as well as disagreement about which is process and which is outcome, there appears to be general consensus that the cognitive and affective features of empathic processing and responding are both necessary, and that the relative importance of each varies with situation (level of distress perceived or experienced), developmental level, and individual differences such as temperament (Feshbach, 1997; Zahn-Waxler, 1999). There is also agreement that at least the precursors of empathy emerge at a very early age, but that empathic responsiveness becomes more discriminating and deliberate with advances in development (Eisenberg & Fabes, 1998; Hoffman, 1987). Most agree that by later childhood, children should be able to empathize with complex situations and emotions, even with abstractions such as the condition of war victims. Most researchers have found girls to be generally more empathic than boys (Zahn-Waxler, 1998; Zahn-Waxler & Robinson, 1995); however, results vary somewhat by how empathy is measured (Eisenberg & Fabes, 1998; Eisenberg & Lennon, 1983).

1.1. Measuring empathy

Several approaches have been taken to measuring empathy in children and adolescents. Comprehensive reviews can be found in Eisenberg and Strayer (1987), and Feshbach (1997). In basic research on the concept and development of empathy, multimethod approaches are favored, with measures ranging from physiological indices (see Holmgren, Eisenberg, & Fabes, 1998), response to vignettes (see Eisenberg et al., 1996), and self-report (Bryant, 1982). Although the inherent disadvantages of self-report must be acknowledged, self-report can be a vital tool for some research questions, with responses reflecting attitudes and likely behavior (Andershed, Gustafson, Kerr, & Stattin, 2002).

1.2. Measuring empathic attitudes through self-report

In reviewing self-report measures of empathy in both children and adults, it appears that what is being measured by self-report is the more cognitively-based component of empathy that can be best conceptualized as empathic attitudes. Although there is not one-to-one correspondence, due in some cases to attitudinal ambivalence (Armitage & Christian, 2003; Conner et al., 2002), there is considerable evidence that expressed attitudes are predictive of behavior across a variety of behavioral domains: for example, aggression (Dahlberg, Toal, & Behrens, 1998; McConville & Cornell, 2003; Van Schoiack-Edstrom, Frey, & Beland, 2002), cigarette smoking (Brook, Morojele, Brook, & Rosen, 2005), and adolescent pregnancy (Zabin, Astone, & Emerson, 1993). Self-report measures are one technique used to evaluate groups of children at risk for violence, or who have participated in violence prevention interventions, when it is not practical to use measures that require individual administration (see, for example, Dahlberg, Toal, & Behrens, 1998; Farrell, Meyer, & White, 2001; Farrell & Sullivan, 2004; Guerra, Huesmann, & Spindler, 2003).

The evaluation of empathic attitudes, though only one facet of empathy, is a meaningful endeavor for both basic and applied questions. Empathic attitudes are stable but modifiable knowledge structures: coherent, memory-based mental structures that influence behavioral choice (Eisenberg et al., 1999; Eisenberg, Cumberland, Guthrie, Murphy, & Shepard, 2005). The definition of attitude as a psychological tendency that requires an evaluative component (Eagley & Chaiken, 1993) shares many characteristics with extant definitions of empathy. Attitudes may have both innate and learned components, can be stable but modifiable, and involve cognitive, affective, and behavioral evaluations and responses. Attitudes are not directly observable and must be inferred from some type of overt response.

Bryant's (1982) empathy questionnaire is a widely used self-report measure of empathy, here conceptualized as empathic attitudes, designed for use with children and adolescents. Twenty-two “yes/no” questions based on Mehrabian and Epstein's (1972) adult empathy measure assess the individual's response to common situations that typically should prompt an empathic response (e.g., “I get upset when I see a boy being hurt;” “Kids who have no friends probably don't want any” [reverse-scored]). Cronbach's alpha coefficients were .79 for a sample of seventh graders, and other psychometric properties were satisfactory (Bryant, 1982). This is a well-known and frequently used instrument, but the wording of some items is awkward and dated (“People who kiss and hug in public are silly” and “I don't feel upset when I see a classmate being punished by a teacher for not obeying school rules”), and the measure has been criticized for item heterogeneity (Eisenberg & Strayer, 1987). Additionally, having only two response options precludes the detection of possible nuances in attitudes and also limits the variability in the resultant measure.

The Interpersonal Reactivity Index (IRI) is another self-report measure of empathy designed for use with adults, but is also used with adolescents (Davis, 1994). The IRI has four scales: Empathic Concern, Perspective-Taking, Fantasy, and Personal Distress. Factor analysis supported this structure, and provided evidence for Eisenberg's contention that empathy (sympathy in her terminology) and personal distress are separate constructs (Pulos, Elison, & Lennon, 2004). The IRI was reworded for use in a study with fourth through sixth graders (Litvak-Miller & McDougall, 1997); however, indices of reliability and validity were poor and the factors were only somewhat similar to those found by Davis (1983) in his original research.

Both the Bryant questionnaire and the IRI were developed using the principles of Classical Test Theory (CTT; Spearman, 1907, 1913). Unlike CTT, modern measurement theory, that is, Item Response Theory (IRT), provides a more rigorous approach to measurement and is widely used in high-stakes measurement situations, such as the development of board examinations for licensing physicians. However, IRT is only slowly gaining acceptance in classical academic social science measurement. Therefore, features of both models were used in the present study to develop a psychometrically sound and clinically relevant self-report measure of children's empathic attitudes, the Children's Empathic Attitudes Questionnaire.

1.3. The Children's Empathic Attitudes Questionnaire

Our approach to instrument validation combined Classical Test Theory with the more modern family of Rasch models (Rasch, 1960, 1980). The Rasch models use the original response data as sufficient for estimating probabilities of responses; these probabilities are expressed on a log-odds scale, and the units are called logits (for log-odds units).¹ Classical Test Theory describes the data at hand, whereas the Rasch analyses use these descriptive data to develop predictions over larger item and person groups. The predictive information provides a better understanding of the construct by making more meaningful comparisons of responses across different groups, times, and items.
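As a rough illustration of the logit metric, the following Python sketch evaluates the simplest (dichotomous) form of the Rasch model, in which the log-odds of endorsing an item equal the difference between the person location and the item location. The function names and the example locations are assumptions made for illustration only; the present study used a rating scale extension of this model, not the dichotomous form shown here.

```python
import math


def rasch_probability(person: float, item: float) -> float:
    """Dichotomous Rasch model: probability of endorsing an item,
    given person and item locations expressed in logits."""
    return 1.0 / (1.0 + math.exp(-(person - item)))


def logit(p: float) -> float:
    """Convert a probability to log-odds (logits)."""
    return math.log(p / (1.0 - p))


# Hypothetical person locations evaluated against an item at 0.0 logits.
for theta in (-1.0, 0.0, 1.0, 2.0):
    p = rasch_probability(theta, 0.0)
    print(f"person {theta:+.1f} logits: P(endorse) = {p:.3f}, "
          f"log-odds = {logit(p):+.2f}")
```

Converting each probability back to log-odds recovers the person-minus-item difference, which is why persons and items can be placed on one common scale.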

The Rasch models construct unidimensional measures (sometimes referred to as “rulers” for the sake of analogy) where persons and items are placed on the same metric scale. For example, consider the construction of a motor development ruler for infants. This would likely be based on a unidimensional hierarchy where skills such as “rolling front-to-back” and “sitting unaided” are near the bottom of the ruler (less advanced), and skills such as “standing without support” are near the top (more advanced). Children can then be placed along this “motor development ruler” to measure their likelihood of achieving different milestones. Although development does follow a hierarchical structure, the sequence through which children should/might progress can be predicted only in a probabilistic sense. This is not deterministic – some children do not follow the pattern the same way as other children – but in order to quantify a human attribute such as motor development, Rasch tools can be used to create useful rulers with additive and linear properties. How well any given data fit this conception of a quantitative linear scale is determined by the diagnostic statistics that accompany the model.

Rasch analyses also provide a set of diagnostics to help evaluate the reliability and the validity of the scores for the intended purpose. Reliability estimates not only include reliability coefficients (similar in interpretation to Cronbach's alpha), but also include estimates of the standard error spread of persons and items (referred to as the separation statistic, or G), and the number of statistically distinct groups or strata of persons that can be differentiated by the measure. Unlike their classical counterparts, these Rasch reliabilities are based on linear measures rather than raw data and therefore are suitable for statistical calculations (Merbitz, Morris, & Grip, 1989).

Rasch fit statistics assess the extent to which each item and each person perform according to expectation. For example, children with the most empathy should more strongly endorse empathic items as compared to children with less empathy. Likewise, items that elicit stronger empathic responses should be more difficult to endorse than items that elicit weaker empathic responses.

Rasch threshold statistics are provided for rating scale items (Wright & Masters, 1982). These thresholds are the estimated difficulties in choosing one response over another (for example, the difficulty in choosing “yes” over “maybe”). Thresholds should increase across the scale (i.e., it should be more difficult to answer “yes” than it is to answer “maybe”), and should be appropriately distanced from one another (i.e., at least 1.4 logits, but not further than 5 logits; Linacre, 1999) so that each rating scale category specifies a distinct meaning along the measured variable (for more detailed explanations of Rasch analyses see Bond & Fox, 2007; Wright & Masters, 1982; Wright & Stone, 1979).
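The threshold guidelines above (ordered thresholds, spaced at least 1.4 but no more than 5 logits apart) can be expressed as a simple check. The Python sketch below is a minimal illustration under stated assumptions: the function name and the example threshold values are hypothetical, and the check is not the diagnostic output of any particular Rasch program.

```python
def check_thresholds(thresholds, min_gap=1.4, max_gap=5.0):
    """Return a list of problems found in a sequence of rating scale
    threshold estimates (in logits), using the spacing guidelines
    cited in the text (at least 1.4 and at most 5 logits apart)."""
    problems = []
    for k in range(1, len(thresholds)):
        gap = thresholds[k] - thresholds[k - 1]
        if gap <= 0:
            problems.append(f"thresholds {k} and {k + 1} are disordered")
        elif gap < min_gap:
            problems.append(f"thresholds {k} and {k + 1} are too close ({gap:.2f} logits)")
        elif gap > max_gap:
            problems.append(f"thresholds {k} and {k + 1} are too far apart ({gap:.2f} logits)")
    return problems


# Hypothetical threshold estimates for a three-category scale (No, Maybe, Yes).
print(check_thresholds([-1.0, 1.0]))   # well spaced: no problems reported
print(check_thresholds([-0.3, 0.4]))   # adjacent categories hard to distinguish
```

An empty list indicates that each category occupies a distinct region of the measured variable under these guidelines.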

In the present study, Rasch analyses were used to evaluate the extent to which children's empathic attitudes can be reliably quantified using the Children's Empathic Attitudes Questionnaire. Fifth through seventh graders were studied because by this age empathic attitudes should be fairly stable, individuals should have the cognitive capacity needed to respond to a self-report instrument, and because this is a critical age range for interventions to address emerging behavioral problems reflecting empathic deficits. The following questions guided the analysis and interpretation of the scores:

1. How many reliably distinct groups of children can be differentiated along the empathy continuum?
2. Do the items form together in a single unidimensional measure?
3. Is the rating scale functioning properly?
4. Does the item hierarchy make sense theoretically?
5. Are there sex differences in CEAQ responding?
6. Is there evidence of convergent and divergent validity?

¹ Logits are a unit of measurement that results when the Rasch model is used to transform raw scores obtained from ordinal data to log odds ratios on a common interval scale. The value of 0.0 logits is routinely allocated to the average item difficulty estimate for purposes of comparison, much like a z score (Wright & Stone, 1979).

2. Method

2.1. Initial phase of measure development

Measure development proceeded in several phases, with initial phases relying on classical measure development procedures. For each phase of the project, consent was obtained from parents and all children gave assent. First, the literature on empathy definitions and development, and existing questionnaires such as the Bryant (1982) questionnaire and the Davis Interpersonal Reactivity Index (Davis, 1994), were carefully examined. Following this process, a 9-item measure, the Children's Empathy Questionnaire, was constructed. This version had four response options: No, Maybe, Probably, and Yes. The questionnaire was administered to 728 children in fourth through sixth grade. Cronbach's alpha was .79. To better cover the range of empathic attitudes (from those statements that most children would agree with to those that only a few children would endorse), an expanded 12-item version was constructed and administered to 349 fourth through sixth graders. In this version Cronbach's alpha was .74. The questionnaire was again revised, resulting in a 15-item version that was administered to 149 fourth and fifth grade children. For this version Cronbach's alpha was .68. A Rasch analysis identified items with poor fit, and determined that more difficult-to-empathize items were needed to identify the most empathic individuals. To better reflect its content, the resulting 16-item revision based upon the Rasch analysis was renamed the Children's Empathic Attitudes Questionnaire. Flesch–Kincaid readability level for this version is grade 5.1.
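For readers less familiar with the internal consistency coefficient reported for each draft version, the following Python sketch computes Cronbach's alpha from a persons-by-items score matrix using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the small response matrix are hypothetical and serve only to illustrate the calculation; they are not data from this study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, where each respondent
    is a list of item scores (all respondents answer the same k items)."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five children to three items scored 0, 1, or 2.
toy_responses = [
    [2, 2, 1],
    [1, 1, 1],
    [0, 1, 0],
    [2, 1, 2],
    [0, 0, 0],
]
print(f"alpha = {cronbach_alpha(toy_responses):.2f}")
```

Because alpha depends on the number of items as well as their intercorrelations, values from versions of different lengths are not directly comparable.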

The Rasch model also provides statistics that are used for diagnosing the utility of the rating scale so that the researcher can assess the extent to which the original rating scale matches the number and type of response categories that are salient and meaningful to the respondents (Bond & Fox, 2007; Wright & Linacre, 1992). Rasch diagnostic statistics for the five response categories in the 15-item version showed that adjacent rating scale categories did not distinguish different levels of the variable. Therefore, for the final 16-item version, these categories were collapsed into a three-response version (No, Maybe, and Yes) to elicit clearer information from respondents (a validity issue), while maintaining the same level of reliability found with the 5-category version (Bond & Fox, 2007).

2.2. Participants and procedures

One urban (A) and one suburban (B) school system from a mid-sized, midwestern US city participated. System A provided names and addresses of all fifth through seventh graders (N = 7770). Two hundred names were randomly selected and questionnaire packets were mailed with return envelopes. System B distributed the same packets to all sixth and seventh grade students in one middle school (N = 445). From System A, 56 (28%) completed questionnaires were returned. From System B, 157 (35%) questionnaires were returned. Participants were mailed a check for $10 upon receipt of the questionnaire. See Table 1 for a description of the sex, grade, and parental socioeconomic status of the final sample. Socioeconomic status was measured using Nakao and Treas's (1992) occupational codes, described by Entwisle and Astone (1994). Higher occupational category ratings indicate higher socioeconomic status; for example, physicians receive a score of 97, retail activities typically rate in the 40–50 range, secretaries receive a 39, and garbage collectors receive a rating of about 25. Overall, the present sample was generally classified as being middle class, with System B participants having a slightly higher mean socioeconomic index rating for both mothers and fathers (see Table 1).

2.3. Measures

2.3.1. Index of empathy for children and adolescents

Bryant's 22-item (example items: “Even when I don't know why someone is laughing, I laugh too,” and “It's silly to treat dogs and cats as though they have feelings like people” [reverse-scored]) “yes–no” self-report measure was briefly described in the introduction section (Bryant, 1982). Eleven of the 22 items are reverse-scored, and higher scores indicate higher empathy. Bryant (1982) also reported data supporting convergent and discriminant validity.

Table 1
Sex, grade, ethnicity, and socioeconomic index distribution (percentages)

                                                                 Mean socioeconomic index
System       Sex              Ethnicity          Grade             Mother     Father
System A a   Female  57.7%    EA b     43.3%     Fifth    15.8%    43.97      46.68
             Male    42.3%    AA c     46.8%     Sixth    52.7%
                              HA d      7.2%     Seventh  30.9%
                              Other     2.7%     Missing    .6%
System B     Female  48.0%    EA       58.7%     Sixth    31.0%    56.94      55.44
             Male    52.0%    AA        1.9%     Seventh  39.5%
                              HA        1.4%     Missing   3.3%
                              Other     8.9%
                              Missing  29.0%

a Ethnicity percentages based on district statistics. b European American. c African American. d Hispanic American.

2.3.2. Strengths and Difficulties Questionnaire (SDQ)

The SDQ is a brief behavioral screening questionnaire designed for children ages 3–16 (YouthinMind, 2007). Respondents answer each item on a 3-point scale (0, 1, or 2; “Not true,” “Somewhat true,” “Certainly true”), and a total score is calculated for each scale by summing across responses to the relevant scale items. The SDQ includes four specific problem scales, with five items each (Emotional Symptoms, Conduct Problems, Hyperactivity/Inattention, Peer Relationship Problems), and one Prosocial Behaviour scale, also with five items. A Total Difficulties Score, the sum of scores on all of the problem scales, can also be calculated, with higher scores indicating more behavioral and emotional difficulties (maximum possible = 40). Only the Prosocial and Conduct Problems scales were utilized in the present study. The Prosocial scale includes items that reflect behaviors based on empathic attitudes (sample items include “Considerate of other people's feelings,” “Kind to younger children”). Scores on the Prosocial scale range from 0 to 10, with higher scores indicating more prosocial behavior. The Conduct Problems scale consists of common behavior problems (sample items include “Often fights with other youth, or bullies them,” and “Often loses temper”) that reflect lower empathy. Scores range from 0 to 10, with higher scores indicating more conduct problems. There are several different versions of the SDQ based on age of child and respondent. The SDQ can be completed by children (age 11 and older), parents or teachers. In the present study, parents reported on their child's behavior.
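The scale scoring just described (five items per scale, each coded 0 to 2 and summed, with the four problem scales summed for Total Difficulties) can be sketched in a few lines of Python. This is a simplified illustration: the function names, dictionary keys, and responses are hypothetical, and the sketch assumes the item responses have already been coded in the scored direction, so any reverse keying in the published instrument is not handled here.

```python
PROBLEM_SCALES = ("emotional", "conduct", "hyperactivity", "peer")


def scale_score(responses):
    """Sum five item responses, each coded 0, 1, or 2."""
    if len(responses) != 5:
        raise ValueError("each SDQ scale has five items")
    if any(r not in (0, 1, 2) for r in responses):
        raise ValueError("responses must be coded 0, 1, or 2")
    return sum(responses)


def total_difficulties(scores):
    """Sum the four problem-scale scores; the Prosocial scale is excluded."""
    return sum(scores[name] for name in PROBLEM_SCALES)


# Hypothetical parent report for one child (not study data).
responses = {
    "emotional": [0, 1, 0, 0, 1],
    "conduct": [1, 0, 0, 2, 0],
    "hyperactivity": [2, 1, 1, 0, 1],
    "peer": [0, 0, 1, 0, 0],
    "prosocial": [2, 2, 1, 2, 2],
}
scores = {name: scale_score(items) for name, items in responses.items()}
print(scores)
print("Total Difficulties:", total_difficulties(scores))
```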

The SDQ has been used as a research tool in many countries, including studies examining the outcome of clinical interventions. One of the early descriptions of the SDQ's psychometric properties confirmed a five factor structure with Cronbach's alphas for parent report ranging from .57 (Conduct Problems) to .77 (Hyperactivity/Inattention; Goodman, 2001). In an Australian sample, parent report resulted in similar alphas ranging from .59 (Peer Problems) to .80 (Hyperactivity/Inattention; Hawes & Dadds, 2004).

2.3.3. Crandall Social Desirability Test for Children, Short Form (CSDTC-SF)

The original 48-item version of the CSDTC (Carifio, 2001) was developed to measure the tendency of children and adolescents to give socially desirable answers, rather than answers reflecting the individual's true opinion (Crandall, 1975). Carifio (2001) reduced this 48-item measure to 12 stand-alone items using factor analytic instrument reduction techniques to enhance its usefulness in research and program evaluation. Cronbach's alpha for the short form was .73. Higher scores indicate stronger tendencies to respond in a socially desirable manner. Carifio (2001) notes that this short form is useful in evaluating the impact of giving socially desirable responses on questionnaire responses in a variety of situations.

3. Results

3.1. Reliability

The CEAQ data were analyzed using the Rasch rating scale model (Andrich, 1978), with WINSTEPS software (Linacre & Wright, 2004). This model is used for rating scale data where each item format is the same across the instrument.
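To make the rating scale model more concrete, the sketch below computes category probabilities for a single item in Python, using the usual formulation in which the probability of each category is proportional to the exponential of the cumulative sum of (person − item − threshold) terms. The function name, the item and person locations, and the two threshold values are hypothetical; they are not estimates from the CEAQ analysis, and the sketch does not reproduce the estimation performed by WINSTEPS.

```python
import math


def rsm_category_probabilities(person, item, thresholds):
    """Category probabilities under the rating scale model for one item.

    `thresholds` holds one value per step between adjacent categories,
    shared by every item; all locations are in logits."""
    cumulative = [0.0]  # the lowest category has an empty sum
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (person - item - tau))
    weights = [math.exp(value) for value in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical thresholds for a No / Maybe / Yes scale.
thresholds = [-1.0, 1.0]
for person in (-2.0, 0.0, 2.0):
    probs = rsm_category_probabilities(person, item=0.0, thresholds=thresholds)
    labelled = ", ".join(f"{label} {p:.2f}" for label, p in zip(("No", "Maybe", "Yes"), probs))
    print(f"person {person:+.1f} logits: {labelled}")
```

With these assumed values, the most probable response moves from “No” to “Maybe” to “Yes” as the person location increases relative to the item.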

Internal consistency reliability, as estimated from Cronbach's alpha, was moderate at .77. The Rasch model analogue to this, i.e., person reliability, was .75. The Rasch estimate differs in that extreme persons (persons who “top out” or “bottom out” on the scale) are removed from the estimate, thus giving a more conservative reliability estimate than classical analysis.

The Rasch person separation index is a reliability index determined on the basis of how many statistically different levels (or strata) of empathy were distinguished by the items. For these data, the separation of 1.75 was transformed into a strata index [Strata = (4G + 1)/3; Wright & Masters, 1982] of 2.67 (rounded down to 2.0). This indicates that there were two statistically distinct groups of children on this variable, providing evidence that there is a quantifiable distinction among children along the empathy measure.
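The arithmetic behind the strata index can be reproduced with a short Python sketch. The strata formula is the one quoted above; the conversion from reliability to separation, G = sqrt(R / (1 − R)), is the standard relation between the two statistics and is included here as an assumption about how the reported values relate, since the text does not state how G was obtained. Function names are hypothetical.

```python
import math


def separation_from_reliability(reliability):
    """Separation G implied by a reliability coefficient R: sqrt(R / (1 - R))."""
    return math.sqrt(reliability / (1.0 - reliability))


def strata(separation):
    """Number of statistically distinct strata: (4G + 1) / 3."""
    return (4.0 * separation + 1.0) / 3.0


g_reported = 1.75  # person separation reported in the text
print(f"G implied by person reliability .75: {separation_from_reliability(0.75):.2f}")
print(f"strata for G = {g_reported}: {strata(g_reported):.2f}")
print(f"distinct groups (rounded down): {math.floor(strata(g_reported))}")
```

The separation implied by a reliability of .75 is close to, though not identical with, the reported 1.75, which is consistent with the reliability having been rounded for reporting.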

In order to better understand the reliability statistics provided by the Rasch analysis, it is important to visualize the measured variable. Fig. 1 plots the sample of children against the response options, once they have been converted to an interval scale across all items. This figure aids in making probabilistic statements about how different children would be expected to respond to this set of items. For example, the average child in this sample (indicated by the arrow drawn through the diagram and centered at the average child response) would most likely respond “yes” (Y) to the items eliciting immediate perceived harm, between “maybe” (M) and “yes” (Y) to the items that require identification with others' problems/feelings in familiar situations, and “maybe” (M) to those items that require identification with others' problems/feelings (where the child cannot relate to the reason for the feeling). Because the average child's score is beyond the halfway point, between “no” and “maybe,” the most likely response is “maybe.” Note that the average child in this sample is not likely to respond “no” (N) to any of these types of items.

In this analysis, the x-axis is scaled to the standard error of measure so that person locations that are more than 2 (SEM) units apart will differ significantly (p < .05) from one another on the variable (Smith, 2001). The Xs below the SEM markers indicate how many children, on average, fall at that level. Using the SEM as a guide, it is clear that almost all of the children in the sample (M = “mean”, S = “standard deviation”, and Q = “quartile”) are within 2 SEM of one another. This indicates that the only significant differences in empathy in this sample are between those children with higher empathy and those with lower empathy.

The interpretation of two statistically different groups of children can be meaningfully understood when combined with the variable interpretation on the keymap (Fig. 1). This map shows not only which groups of children are significantly different from one another, but also helps to identify how they differ. Children labeled “lower” in empathy are those who are most likely to respond “maybe” to items of perceived immediate harm or meanness, and to respond “no” to identification with others' feelings.

Fig. 1. Children's empathy scores (each x represents one child) centered at mean = 0, bounded by +/− 2 standard errors. Items are arranged in difficulty order, with the easiest to agree with at the bottom of the scale. N, M, and Y refer to “No”, “Maybe”, and “Yes” respectively, with the colon in between referring to the halfway (or 50/50 chance) of responding in one category or the other.

192 J. Funk et al. / Journal of Applied Developmental Psychology 29 (2008) 187–196

3.2. Validity

3.2.1. Rasch principal components analysis

Unlike classical principal components analysis, Rasch PCA analyzes only the variability in the set of items after the linear measure has been accounted for (Bond & Fox, 2007; Wright, 1996). This residual variance is factor analyzed to determine whether any substantial amount of systematic variance exists in the data that is unrelated to the original linear measure that is being constructed. The first extracted factor, based on the residual variance, accounted for 1.8 units (items) out of a total of 16 residual variance units (items). These indices were then used to calculate a factor sensitivity ratio (Wright & Stone, 2004) by taking the common residual units divided by the explained variance units. Dividing 1.8/16, yielding .11, reveals that 11% of the measure

Table 2. Standardized factor loadings for principal components analysis of secondary factor

| Item | Factor loading |
| --- | --- |
| Seeing a kid who is crying makes me feel like crying (13) | .69 |
| Other people's problems really bother me (6) | .60 |
| I would feel bad if the kid sitting next to me got in trouble (9) | .45 |
| It bothers me when my teacher doesn't feel well (11) | .22 |
| When I see a kid who is upset it really bothers me (8) | .19 |
| It would bother me if my friend got grounded (15) | .15 |
| I understand how other kids feel (4) | .11 |
| When I see someone who's happy, I feel happy too (16) | .03 |
| I would feel bad if my mom's friend got sick (5) | −.42 |
| I feel sorry for kids who can't find anyone to hang out with (12) | −.41 |
| I'm happy when the teacher says my friend did a good job (2) | −.36 |
| I feel happy when my friend gets a good grade (7) | −.21 |
| When I'm mean to someone, I usually feel bad about it later (1) | −.20 |
| It's easy for me to tell when my mom or dad has a good day at work (10) | −.18 |
| If two kids are fighting, someone should stop it (14) | −.14 |
| I would get upset if I saw someone hurt an animal (3) | −.14 |

Note. Bold items loaded .4 or greater on secondary factor.

Table 3. Items, fit, and item–total correlations

| Item and number | Measure | Error | Infit Mnsq | Item–total correlation |
| --- | --- | --- | --- | --- |
| Seeing a kid who is crying makes me feel like crying (13) | 1.60 | .11 | .95 | .61 |
| Other people's problems really bother me (6) | 1.39 | .11 | .97 | .49 |
| I would feel bad if the kid sitting next to me got in trouble (9) | 1.28 | .11 | .91 | .58 |
| It bothers me when my teacher doesn't feel well (11) | 1.06 | .11 | .76 | .65 |
| When I see a kid who is upset it really bothers me (8) | .70 | .11 | .73 | .64 |
| It would bother me if my friend got grounded (15) | .17 | .11 | 1.30 | .37 |
| I understand how other kids feel (4) | .07 | .12 | 1.12 | .38 |
| When I see someone who's happy, I feel happy too (16) | −.03 | .12 | .93 | .51 |
| I would feel bad if my mom's friend got sick (5) | −.23 | .12 | .86 | .52 |
| I feel sorry for kids who can't find anyone to hang out with (12) | −.59 | .13 | .96 | .51 |
| I'm happy when the teacher says my friend did a good job (2) | −.61 | .13 | .96 | .55 |
| I feel happy when my friend gets a good grade (7) | −.64 | .13 | .82 | .55 |
| When I'm mean to someone, I usually feel bad about it later (1) | −.86 | .14 | 1.09 | .48 |
| It's easy for me to tell when my mom or dad has a good day at work (10) a | −.98 | .14 | 1.91 | .11 |
| If two kids are fighting, someone should stop it (14) | −1.07 | .15 | 1.18 | .42 |
| I would get upset if I saw someone hurt an animal (3) | −1.25 | .15 | 1.34 | .39 |

a Misfitting item.


stability is affected by an additional factor in the responses. Although 11% is not substantial enough to cause concern about the measure, the factor loadings were examined to better understand this small amount of systematic variance that was unrelated to the linear measure. This 11% of variance is analogous to the percent of variance reported by eigenvalues in a classical PCA. Given this frame of reference, it is clear that a factor explaining 11% of the variance is not large enough to cause a meaningful measurement disturbance. Table 2 presents the factor loadings for this secondary factor, with loadings of .4 or greater in bold font. The three items with high positive loadings all contain the word “happy,” and were as follows: (7) I feel happy when my friend gets a good grade; (2) I'm happy when the teacher says my friend did a good job; and (16) When I see someone who's happy, I feel happy too.

The three items with high negative loadings all contain a form of the word “bother” and were as follows: (6) Other people's problems really bother me; (11) It bothers me when my teacher doesn't feel well; and (8) When I see a kid who is upset it really bothers me. This secondary factor appears to represent a small amount of variance associated with response styles that are elicited by similar item wording.
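The factor sensitivity ratio reported in this subsection is a single division, reproduced in the Python sketch below. The function name is illustrative; the inputs (1.8 units for the first residual factor, 16 residual variance units in total) are the values given in the text.

```python
def factor_sensitivity_ratio(factor_units: float, total_units: float) -> float:
    """Share of residual variance units attributable to the first extracted factor."""
    return factor_units / total_units


ratio = factor_sensitivity_ratio(1.8, 16)
print(f"{ratio:.2f}")  # 0.11, i.e. about 11%
```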

3.2.2. Item fit

Table 3 shows the items (in logits) in difficulty order, item errors, fit statistics, and point-biserials (item–total correlations). Both the fit statistics and item–total correlations indicate that Item 10 does not function in a manner similar to the other items in the measure. The item–total correlation for Item 10 was extremely low (.11) and the infit mean square was greater than 1.4 (the cut-off that is recommended for rating scales; Bond & Fox, 2007). Thus, these indices suggest that Item 10 (“It's easy for me to tell when my mom or dad has a good day at work”) elicited responses that were qualitatively different from the other items. In other words, the pattern of responses to this item was not consistent with the pattern of responses to the other fifteen items, indicating that this item should be dropped from the measure. This item required the child to not only empathize with an adult, but also to draw more sophisticated conclusions about the source of distress than required by other items. When Item 10 was omitted from the measure and the analysis conducted without the item, person reliability increased to .77 whereas fit statistics and item difficulty order remained unchanged. In future use of the scale, Item 10 should be omitted.
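The screening step described above can be illustrated with a small Python sketch that applies the 1.4 infit cut-off to the values in Table 3. The data structure and variable names are assumptions made for illustration; only the numbers come from the table, and this is not the software the authors used.

```python
INFIT_CUTOFF = 1.4  # rating-scale cut-off cited in the text (Bond & Fox, 2007)

# (item number, infit mean square, item-total correlation), copied from Table 3
table3 = [
    (13, 0.95, 0.61), (6, 0.97, 0.49), (9, 0.91, 0.58), (11, 0.76, 0.65),
    (8, 0.73, 0.64), (15, 1.30, 0.37), (4, 1.12, 0.38), (16, 0.93, 0.51),
    (5, 0.86, 0.52), (12, 0.96, 0.51), (2, 0.96, 0.55), (7, 0.82, 0.55),
    (1, 1.09, 0.48), (10, 1.91, 0.11), (14, 1.18, 0.42), (3, 1.34, 0.39),
]

# Items whose infit mean square exceeds the cut-off
misfitting = [item for item, infit, _ in table3 if infit > INFIT_CUTOFF]

# Item with the lowest item-total correlation
lowest_r_item = min(table3, key=lambda row: row[2])[0]

print(misfitting)     # [10]
print(lowest_r_item)  # 10
```

Both criteria single out Item 10, which matches the conclusion drawn in the text.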


3.2.3. Rating scale functioning

Assessment of rating scale clarity involved examination of the step thresholds and category fit statistics. Step thresholds reflect the magnitude of the distance between adjacent rating scale categories. If adjacent categories are too close together, this implies ambiguity of meaning (for example, in other measures categories labeled ‘sometimes’ and ‘frequently’ typically have smaller than recommended step thresholds). Thresholds that are too far apart imply the need for additional distinctions among the measured variable. The step thresholds between the rating scale categories of the CEAQ (−.74 for threshold 1 and .74 for threshold 2) are within the proper distances as recommended by Linacre (1999). Furthermore, each of the three categories displayed adequate fit (1.03 for ‘No’, .90 for ‘Maybe’, and 1.03 for ‘Yes’). This means that the number of rating scale points is neither too few nor too many, and that each rating scale category reflects a meaningful distinction in empathy along the measured variable.
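To show how step thresholds translate into response probabilities, the following Python sketch assumes the Andrich (1978) rating scale parameterization, in which the log-odds of choosing category k over category k − 1 equal the person measure minus the item difficulty minus threshold k. The thresholds (−.74, .74) and item difficulties are taken from the text and Table 3. The person measure of 1.0 logits is hypothetical, and the function name is illustrative.

```python
import math

CATEGORIES = ("No", "Maybe", "Yes")


def category_probabilities(theta, delta, thresholds=(-0.74, 0.74)):
    """Rating scale model probabilities for each ordered category.

    theta: person measure (logits); delta: item difficulty (logits);
    thresholds: step thresholds shared by all items.
    """
    cumulative = [0.0]  # the lowest category has a log-numerator of 0
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta - tau))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def most_likely(theta, delta):
    probs = category_probabilities(theta, delta)
    return CATEGORIES[probs.index(max(probs))]


theta = 1.0  # hypothetical person measure, for illustration only
print(most_likely(theta, -1.25))  # Yes   (easiest item, Item 3)
print(most_likely(theta, 1.60))   # Maybe (hardest item, Item 13)
```

Under these assumptions a child located at 1.0 logits would most likely answer “yes” to the easiest item and “maybe” to the hardest, the same qualitative pattern described for the average child in Fig. 1.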

3.2.4. Sex differences

Girls had significantly higher mean empathy scores than boys on the CEAQ, t(200) = 4.42, p < .01 (M for girls = 1.37, SD = .94, n = 94; M for boys = .75, SD = 1.04, n = 108). This sex difference is consistent with most self-report research on children's empathy (Eisenberg & Lennon, 1983).
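The reported test statistic can be recovered from the summary statistics alone. The Python sketch below assumes a pooled-variance (Student) independent-samples t test, which the reported 200 degrees of freedom suggest; the function name is illustrative.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Girls: M = 1.37, SD = .94, n = 94; boys: M = .75, SD = 1.04, n = 108
t, df = pooled_t(1.37, 0.94, 94, 0.75, 1.04, 108)
print(round(t, 2), df)  # 4.42 200
```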

3.2.5. Convergent and divergent validity

Both the Bryant empathy scores (child self-report) and the SDQ Prosocial scale scores (parent report of child behavior) were correlated with the CEAQ to examine convergent validity. The convergent validity correlation with the Bryant was r = .57, and was r = .39 with the SDQ Prosocial scale scores (parent report). Divergent validity was examined by correlating the CEAQ with the Conduct Problems scores of the SDQ. The divergent validity index of −.17 was marginally acceptable. Finally, the CEAQ was correlated with the Crandall social desirability questionnaire (child self-report), yielding r = .39. This suggests that there is some overlap between the empathy construct being measured by the CEAQ and social desirability as assessed by the Crandall measure. This finding is not surprising given that empathy is an inherently socially desirable characteristic. Additional research on convergent and divergent validity will be useful in clarifying the unique contribution of the CEAQ.

4. Discussion

The Children's Empathic Attitudes Questionnaire (CEAQ) was developed as a self-report measure of empathic attitudes for children and adolescents. Both classical analyses and Rasch analyses provide support for the reliability, validity, and functionality of the CEAQ scores when used with children in later elementary school years and early adolescence. Rasch analyses demonstrated adequate separation, fit, rating scale functioning, and dimensionality of the CEAQ, thus implying that empathic attitudes are quantifiable, an important step forward in understanding the complex construct of empathy.

Most researchers agree that the perspective-taking component of empathic attitudes develops later than a reflexive sense of personal distress (Eisenberg et al., 1998; Litvak-Miller & McDougall, 1997; Zahn-Waxler & Robinson, 1995). This progression is reflected in the Rasch-derived increasing difficulty of the items in the CEAQ. An item cluster referencing immediate perceived harm was easiest for children to endorse, followed by items reflecting a similar response to perceived happiness. The third cluster involves identifying with others' feelings in familiar situations, and the fourth and most difficult to agree with cluster requires an empathic response to a person in a situation that the individual may not really understand. Each cluster requires increasingly abstract processing, reflecting more cognitively advanced empathic attitudes. In summary, the Rasch analysis of the CEAQ suggests four item clusters that are theoretically consistent and developmentally appropriate.

Rasch models put data into a probabilistic framework, making it possible to understand fit-to-response patterns, to clarify the relations among the persons and items, and to make inferences about other samples of children and other samples of items. As a result, empirical results from Rasch analyses often inform theory (Bond & Fox, 2007) by highlighting patterns in the data that would not be detected by classical statistical methods (Cole, Rabin, Smith, & Kaufman, 2004). These patterns can then be interpreted in light of the present theory (in a data-driven dialectic), often resulting in refinement of the theory and providing guidance for further instrument development and analysis.

In the present study, two distinct groups were identified along the empathy continuum: children who can be characterized as having lower empathy and those with higher empathy (see Fig. 1). Children with lower empathy were unwilling to endorse even those empathic statements referencing immediate perceived harm or meanness, and were unable to identify with others' feelings even in more familiar situations. The findings of the present study add to the general understanding of empathy as a construct: which types of items (not just these specific ones on the CEAQ) elicit the most likely empathic responses (perceived immediate harm or meanness) and which types of items elicit more sophisticated empathic responses (identifying with others' feelings when the child cannot relate to the reason). These findings suggest that future discussions of empathic attitudes can move from a focus on the specific items on a given instrument to a discussion of the meaning of different levels of empathic responding.

4.1. Implications

The CEAQ fills a gap in the measurement of empathy by providing a psychometrically strong measure of empathic attitudes, which are modifiable knowledge structures that influence behavioral choice. Assessing empathic attitudes could be one component of an overall violence risk assessment in high risk populations (Borum, 2000). Sams and Truscott (2004) found that low empathy, in association with exposure to community violence, predicted use of violence in at-risk adolescents. Several projects have successfully targeted empathy development in at-risk children and adolescents (Eisenberg et al., 1998; Feshbach, 1997; Hatcher & Nadeau, 1994; Kelly, Longbottom, Potts, & Williamson, 2004). For example, developing empathy is one goal of the Promoting Alternative Thinking Strategies (PATHS) program (Kelly et al., 2004). This school-based program targets components of emotional intelligence through a structured curriculum. The program was delivered to one class of fourth and fifth graders, and children with problems with aggression were targeted for additional assessment. Based on student and teacher impressions, improvement in empathy was one outcome for these children but no empirically derived measure was used. Second Step, another violence prevention program, also targeted empathy in fifth through eighth grade African American students (McMahon & Washburn, 2003). Using a five-item measure with low internal consistency (alpha = .54), an increase in empathy that was associated with a decline in self-reported aggression was identified. The CEAQ could be especially useful in monitoring the impact of these types of interventions because, as previously noted, the CEAQ can define deficiencies and progress within the framework of levels of empathic responding, not just items. Even in programs where changing empathic attitudes may not be the primary target, the brevity of the CEAQ lends itself to monitoring secondary change.

The CEAQ could also be used to monitor treatment effects for individuals. Kazdin (2005) notes that new clinical measures should not only contribute to the understanding of where an individual stands in relation to a norm group, but also be feasible for use in measuring individual clinical progress. The CEAQ is positioned to meet these standards for several reasons. First, the brevity of the CEAQ lends itself to repeated administration. Next, because measures derived from Rasch analysis are sample independent (within standard errors), findings have consistent meaning from one sample and from one person to another. Given additional normative data, it should be possible to develop a representation of the typical pattern of empathic attitudes in particular developmental periods. Deviations, which could include either weaknesses in empathic attitudes, or pathologically strong empathic attitudes, could inform treatment planning.

Some limitations of the CEAQ should be noted. The sample was selected from one geographical area, with moderate ethnic diversity. Additional work is needed to determine how the CEAQ performs with other samples, including more diverse ethnic groups and children with identified aggressive behavioral problems and presumably lower empathy. Validity indices were based on child self-report and parent report measures. Cross-validation of the measure using additional methods will be important in establishing the measure as a useful addition to evidence-based assessment batteries. Finally, although empathic attitudes are critical to understanding empathic capacity, it is important to recognize that this cognitive component is only one facet of the spectrum of empathy. Despite these limitations, the current analyses suggest that the CEAQ provides a theoretically sound, hierarchically meaningful measure of empathic attitudes.

Acknowledgment

We thank Dr. Robert Elliot for his assistance in the first phase of the development of this measure.

References

Andershed, H. A., Gustafson, S. B., Kerr, M., & Stattin, H. (2002). The usefulness of self-reported psychopathy-like traits in the study of antisocial behaviour among non-referred adolescents. European Journal of Personality, 16, 383–402.

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 357–374.

Armitage, C. J., & Christian, J. (2003). From attitudes to behavior: Basic and applied research on the theory of planned behaviour. Current Psychology: Developmental, Learning, Personality, Social, 22, 187–195.

Batson, C. D. (1991). The altruism question: Toward a social–psychological answer. Hillsdale, NJ: Erlbaum.

Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences, 2nd ed. Hillsdale, NJ: Erlbaum.

Borum, R. (2000). Assessing violence risk among youth. Journal of Clinical Psychology, 56, 1263–1288.

Brook, J. S., Morojele, N. K., Brook, D. B., & Rosen, Z. (2005). Predictors of cigarette use among South African adolescents. International Journal of Behavioral Medicine, 12, 207–217.

Bryant, B. K. (1982). An index of empathy for children and adolescents. Child Development, 53, 413–425.

Carifio, J. (2001). Sensitive data and students' tendencies to give socially desirable responses. Journal of Alcohol and Drug Education, 39, 74–84.

Cole, J. C., Rabin, A. S., Smith, T. L., & Kaufman, A. S. (2004). Development and validation of a Rasch-derived CES-D Short Form. Psychological Assessment, 16, 360–372.

Conner, M., Sparks, P., Povey, R., James, R., Shepherd, R., & Armitage, C. J. (2002). Moderator effects of attitudinal ambivalence on attitude–behaviour relationships. European Journal of Social Psychology, 32, 705–718.

Crandall, V. (1975). A children's social desirability questionnaire for children. Journal of Consulting Psychology, 29, 28–36.

Dahlberg, L. L., Toal, S. B., & Behrens, C. B. (1998). Measuring violence-related attitudes, beliefs, and behaviors among youths: A compendium of assessment tools. National Center for Injury Prevention and Control, 1998. Atlanta, GA: Centers for Disease Control and Prevention.

Davis, M. H. (1983). Measuring individual differences in empathy: Evidence for a multidimensional approach. Journal of Personality and Social Psychology, 44, 167–184.

Davis, M. H. (1994). Empathy: A social psychological approach. Dubuque, IA: Wm. C. Brown.

Davis, M. H., Soderlund, T., Cole, J., Gadol, E., Kute, M., Myers, M., et al. (2004). Cognitions associated with attempts to empathize: How do we imagine the perspective of another? Personality and Social Psychology Bulletin, 30, 1625–1635.

Decety, J., & Jackson, P. L. (2004). The functional architecture of human empathy. Behavioral and Cognitive Neuroscience Reviews, 3, 71–100.

Duan, C., & Hill, C. E. (1996). The current state of empathy research. Journal of Counseling Psychology, 43, 261–274.

Eagley, A. H., & Chaiken, S. (1993). The psychology of attitudes. NY: Harcourt Brace Jovanovich.

Eisenberg, N., Cumberland, A., Guthrie, I. K., Murphy, B. C., & Shepard, S. A. (2005). Age changes in prosocial responding and moral reasoning in adolescence and early adulthood. Journal of Research in Adolescence, 15, 235–260.

Eisenberg, N., & Fabes, R. A. (1998). Prosocial development. In N. Eisenberg (Ed.), Handbook of child psychology: Social, emotional, and personality development, Vol. 3. (pp. 701–778). New York: Wiley.

Eisenberg, N., Fabes, R. A., Murphy, B., Karbon, M., Smith, M., & Maszk, P. (1996). The relations of children's dispositional empathy-related responding to their emotionality, regulation, and social functioning. Developmental Psychology, 32, 195–209.

Eisenberg, N., Guthrie, I. K., Murphy, B. C., Shepard, S. A., Cumberland, A., & Carlo, G. (1999). Consistency and development of prosocial dispositions: A longitudinal study. Child Development, 70, 1360–1372.


Eisenberg, N., & Lennon, R. (1983). Sex differences in empathy and related capacities. Psychological Bulletin, 94, 100–131.

Eisenberg, N., & Strayer, J. (1987). Critical issues in the study of empathy. In N. Eisenberg, & J. Strayer (Eds.), Empathy and its development (pp. 3–13). NY: Cambridge.

Eisenberg, N., Wentzel, N. M., & Harris, J. D. (1998). The role of emotionality and regulation in empathy-related responding. School Psychology Review, 27, 506–522.

Entwisle, D. R., & Astone, N. M. (1994). Some practical guidelines for measuring youth's race/ethnicity and socioeconomic status. Child Development, 65, 1521–1540.

Farrell, A. D., Meyer, A. L., & White, K. S. (2001). Evaluation of responding in peaceful and positive ways (RIPP): A school-based prevention program for reducing violence among urban adolescents. Journal of Clinical Child Psychology, 30, 451–463.

Farrell, A. D., & Sullivan, T. N. (2004). Impact of witnessing violence on growth curves for problem behaviors among early adolescents in urban and rural settings. Journal of Community Psychology, 32, 505–525.

Feshbach, N. D. (1997). Empathy, the formative years: Implications for clinical practice. In A. C. Bohart, & L. S. Greenberg (Eds.), Empathy reconsidered (pp. 33–59). Washington, DC: American Psychological Association.

Goodman, R. (2001). Psychometric properties of the Strengths and Difficulties Questionnaire (SDQ). Journal of the American Academy of Child and Adolescent Psychiatry, 40, 1337–1345.

Guerra, N. G., Huesmann, L. R., & Spindler, A. (2003). Community violence exposure, social cognition, and aggression among urban elementary school children. Child Development, 74, 1561–1576.

Guerra, N., Nucci, L., & Huesmann, L. R. (1994). Moral cognition and childhood aggression. In L. R. Huesmann (Ed.), Aggressive behavior: Current perspectives (pp. 13–33). New York: Plenum.

Hanish, L. D., Eisenberg, N., Fabes, R. A., Spinrad, T. L., Ryan, P., & Schmidt, S. (2004). The expression and regulation of negative emotions: Risk factors for young children's peer victimization. Development and Psychopathology, 16, 335–353.

Hatcher, S., & Nadeau, M. S. (1994). The teaching of empathy for high school and college students. Adolescence, 29, 961–973.

Hawes, D. J., & Dadds, M. R. (2004). Australian data and psychometric properties of the Strengths and Difficulties Questionnaire. Australian and New Zealand Journal of Psychiatry, 38, 644–651.

Hoffman, M. (1987). The contribution of empathy to justice and moral judgment. In N. Eisenberg, & J. Strayer (Eds.), Empathy and its development (pp. 47–80). Cambridge: Cambridge University Press.

Hoffman, M. (2000). Empathy and moral development: Implications for caring and justice. Cambridge, England: Cambridge University Press.

Holmgren, R. A., Eisenberg, N., & Fabes, R. (1998). The relations of children's situational empathy-related emotions to dispositional prosocial behaviour. International Journal of Behavioural Development, 22, 169–193.

Joliffe, D., & Farrington, D. P. (2004). Empathy and offending: A systematic review and meta-analysis. Aggression and Violent Behavior, 9, 441–476.

Kazdin, A. E. (2005). Evidence-based assessment for children and adolescents: Issues in measurement development and clinical application. Journal of Clinical Child and Adolescent Psychology, 34, 548–558.

Kelly, B., Longbottom, J., Potts, F., & Williamson, J. (2004). Applying emotional intelligence: Exploring the Promoting Alternative Thinking Strategies curriculum. Educational Psychology in Practice, 20, 221–240.

Kerem, E., Fishman, N., & Josselson, R. (2001). The experience of empathy in everyday relationships: Cognitive and affective elements. Journal of Social and Personal Relationships, 18, 709–729.

Linacre, J. M. (1999). Investigating rating scale category utility. Journal of Outcome Measurement, 3, 103–122.

Linacre, J. M., & Wright, B. D. (2004). WINSTEPS: Multiple-choice, rating scale, and partial credit Rasch analysis [computer software]. Chicago: MESA Press.

Litvak-Miller, W., & McDougall, D. (1997). The structure of empathy during middle childhood and its relationship to prosocial behavior. Genetic, Social, and General Psychology Monographs, 123, 303–324.

Loper, A. B., Hoffschmidt, S. J., & Ash, E. (2001). Personality features and characteristics of violent events committed by juvenile offenders. Behavioral Sciences and the Law, 19, 81–96.

McConville, D. W., & Cornell, D. G. (2003). Aggressive attitudes predict aggressive behavior in middle school students. Journal of Emotional and Behavioral Disorders, 11, 179–187.

McMahon, S. D., & Washburn, J. J. (2003). Violence prevention: An evaluation of program effects with urban African American students. The Journal of Primary Prevention, 24, 43–62.

Mehrabian, A., & Epstein, N. (1972). A measure of emotional empathy. Journal of Personality, 40, 525–543.

Merbitz, C., Morris, J., & Grip, J. C. (1989). Ordinal scales and foundations of misinference. Archives of Physical Medicine and Rehabilitation, 70, 308–312.

Nakao, K., & Treas, J. (1992). The 1989 socioeconomic index of occupations: Construction from the 1989 occupational prestige scores (General Social Survey Methodological Report No. 74). Chicago: University of Chicago, National Opinion Research Center.

Preston, S. D., & de Waal, F. B. M. (2002). Empathy: Its ultimate and proximate bases. Behavioral and Brain Science, 25, 1–72.

Pulos, S., Elison, J., & Lennon, R. (2004). The hierarchical structure of the Interpersonal Reactivity Index. Social Behavior and Personality, 32, 355–360.

Rankin, K. P., Kramer, J. H., & Miller, B. L. (2005). Patterns of cognitive and emotional empathy in frontotemporal lobar degeneration. Cognitive and Behavioral Neurology, 18, 28–36.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen, Danmarks: Paedagogiske Institut.

Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests (Expanded ed.). Chicago: University of Chicago.

Richardson, D. R., Hammock, G. S., Smith, S. M., Gardner, W., & Signo, M. (1994). Empathy as a cognitive inhibitor of interpersonal aggression. Aggressive Behavior, 20, 275–289.

Sams, D. P., & Truscott, S. (2004). Empathy, exposure to community violence, and use of violence among urban, at-risk adolescents. Child and Youth Care Forum, 33, 33–50.

Smith, E. V., Jr. (2001). Evidence for the reliability of measures and validity of measure interpretation: A Rasch measurement perspective. Journal of Applied Measurement, 2, 281–311.

Spearman, C. (1907). Demonstration of formulae for true measurement of correlation. American Journal of Psychology, 15, 72–101.

Spearman, C. (1913). Correlations of sums and differences. British Journal of Psychology, 5, 417–426.

Strayer, J. (1987). Affective and cognitive perspectives on empathy. In N. Eisenberg, & J. Strayer (Eds.), Empathy and its development (pp. 218–244). New York: Cambridge University Press.

Van Schoiack-Edstrom, L., Frey, K., & Beland, K. (2002). Changing adolescents' attitudes about relational and physical aggression: An early evaluation of a school-based intervention. School Psychology Review, 31, 201–216.

Wright, B. D. (1996). Comparing Rasch measurement and factor analysis. Structural Equation Modeling, 3, 3–24.

Wright, B. D., & Linacre, J. M. (1992). Combining and splitting of categories. Rasch Measurement Transactions, 6, 233. Retrieved September 17, 2007 from http://rasch.org/rmt/rmt63f.htm.

Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago: MESA Press.

Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago: MESA Press.

Wright, B. D., & Stone, M. H. (2004). Making measures. Chicago: Phaneron Press.

YouthinMind (2007). Strengths and Difficulties Questionnaire. Retrieved September 12, 2007 at http://sdqinfo.com/b1.html.

Zabin, L. S., Astone, N. M., & Emerson, M. R. (1993). Do adolescents want babies? The relationship between attitudes and behavior. Journal of Research on Adolescence, 3, 67–86.

Zahn-Waxler, C. (1998, Fall). From the enlightenment to the millenium: Changing conceptions of the moral sentiments. Developmental Psychology, 1–7.

Zahn-Waxler, C. (1999). The development of empathy, guilt, and internalization of distress: Implications for gender differences in internalizing and externalizing problems. In R. Davidson (Ed.), Anxiety, depression, and emotion: The First Wisconsin Symposium on Emotion, Vol. 1. (pp. 222–266). NY: Oxford.

Zahn-Waxler, C., & Robinson, J. (1995). Empathy and guilt: Early origins of feelings of responsibility. In J. P. Tangney, & K. W. Fischer (Eds.), Self-conscious emotions: The psychology of shame, guilt, embarrassment, and pride (pp. 143–173). NY: Guilford.