
The Relation Between Item Format and the Structure of the Eysenck Personality Inventory

Wayne F. Velicer and John F. Stevenson
University of Rhode Island

A Likert seven-choice response format for personality inventories allows finer distinctions by subjects than the traditional two-choice format. The Eysenck Personality Inventory was employed in the present study to test the hypothesis that use of the expanded format would result in a clearer and more accurate indication of test structure. The subjects, volunteers in a psychology course, took the standard two-choice version of the EPI and a seven-choice version one week apart, with the order counterbalanced. A principal components analysis with a varimax rotation yielded two components for the two-choice format, clearly identifiable as Eysenck's "Neuroticism" and "Extraversion," which together accounted for 18% of the variance. The seven-choice version resulted in six components accounting for 46% of the variance. The expanded format suggested inadequacies in the structure of the EPI, defined the factor structure more clearly, and explained a greater proportion of the variance. It thus demonstrated the apparent advantages of the multiple-response format for scale construction.

Traditionally, structured personality inventories have employed a two-choice item format (e.g., true-false, agree-disagree, a forced choice between two alternatives). The reasons are primarily practical: (1) ease of administration (i.e., simplicity of instruction for subjects); (2) reduced administration time; (3) ease of scoring; and (4) avoidance of scaling issues. The recent Comrey Personality Scales (Comrey, 1970) are one of the few exceptions to this trend, employing a Likert-type seven-choice item format. The present study investigates the value of the multi-category approach, based on the reasoning that such items will permit the subject to make finer distinctions and, therefore, will provide more precise and meaningful responses. This should result in (1) increased item and scale reliability, (2) more favorable subject reactions to the inventory, and (3) a clearer and more accurate indication of the test structure.

Of these questions, the reliability issue has been the most widely researched, but conflicting results have been produced. For example, Jahoda, Deutsch, and Cook (1951) and Ferguson (1941) report that reliability increases as the number of response categories increases. Bendig (1954), Komorita (1963), Peabody (1962), and Matell and Jacoby (1971) have found reliability to be generally independent of the number of response categories. Komorita and Graham (1965) found an increase in reliability with an increase in number of categories only for scales with relatively homogeneous items. Masters (1974) reported a relation only for a scale that had an initial low total score variation. It is difficult to resolve these differences because (1) different test instruments were used in the various studies and (2) different methodologies were employed.

APPLIED PSYCHOLOGICAL MEASUREMENT, Vol. 2, No. 2, Spring 1978, pp. 293-304. © Copyright 1978 West Publishing Co.

Downloaded from the Digital Conservancy at the University of Minnesota, http://purl.umn.edu/93227. May be reproduced with no cost by students and faculty for academic use. Non-academic reproduction requires payment of royalties through the Copyright Clearance Center, http://www.copyright.com/

A potentially confounding and uncontrolled effect on reliability is the possibility that the increase in the number of response categories altered the structure of the instrument. The structure is usually determined on the two-response form of the instrument or established "theoretically." If the multiple-category versions of the instruments have a different structure, then measures of internal consistency based on the scales of the original (binary) form would be inappropriate, particularly if the structure is more complex. This would explain the Komorita and Graham (1965) result with respect to "heterogeneous" scales. The only study to control for structure is the recent Monte Carlo study by Lissitz and Green (1975), which found an increase in reliability as the number of categories increased from two to five and no change for further increases in the number of categories.

The issue of subject reaction to type of format has not been extensively investigated. Jones (1968) reported that subjects generally preferred a multi-category form to a two-choice format. This finding is supported by the first author's informal observations.

The relation between item format and the structure of the inventory has not been studied extensively. If the multi-category format does, in fact, provide meaningful finer distinctions, an analysis of structure at the item level could result in a better defined scale and inventory structure. Improvements in structure resulting from this approach would include (1) accounting for more of the total variation, (2) higher component loadings, and/or (3) identification of additional components. Joe and John (1973) employed both the traditional forced-choice (two-response) format and a six-choice response with the Rotter I-E Scale. The six-choice format resulted in two clearly interpretable factors, illustrating the potential value of the multi-category approach for probing scale structure.

In the present study, the Eysenck Personality Inventory (EPI) was selected for use in an investigation of the effects of response format on inventory structure. The EPI was originally derived factorially, possesses good psychometric properties, and has been used extensively. The limited number of items (57) permits a full analysis at the item level. The reasoning in the preceding paragraph led to the expectation that the standard EPI with two-choice format should yield two components (or three if the Lie Scale emerges as a separate component), while a Likert format version of the EPI should yield additional components and/or better defined components; these components should then account for more of the total variance.

Method

The subjects were students enrolled in two sections of a lower-level psychology course. Participation was voluntary, and the students were told the general nature of the task, but not the specific hypothesis. Each student completed the standard Eysenck Personality Inventory and a seven-choice version of the EPI. The seven-choice version was developed by employing the item format of the Comrey Personality Scales with the 57 items of the EPI. Comrey employed two different scalings for different items: one based on a frequency concept (ranging from "Always" to "Never") and the other based on a likelihood concept (ranging from "Definitely" to "Definitely Not"). This method was easily adapted to the items of the EPI. Administration was in two sessions one week apart. Approximately half the students took the standard EPI first and then the Likert EPI, while the order was reversed for the remaining students. Only students who completed both forms were included in the final sample (N = 77).

For each version of the inventory, a principal components analysis was performed on the 57 × 57 matrix of item intercorrelations. Velicer's (1976) Minimum Average Partial (MAP) correlation method was used to determine the number of components to extract. A varimax rotation was performed on the component pattern.

Table 1

Comparison of Results of Component Analysis with EPI Scoring Keys.*

*Loadings greater than .30 are underlined.

Table 2

Cross Classification Table for Scale and Components

A critical issue in this study is the method of determining the number of components to be extracted. The MAP method (Velicer, 1976) is rather new, but it possesses a number of advantages for a study of this type. The procedure determines the number of components by successively partialing out components until the average squared partial correlation reaches a minimum. Components retained by this procedure are clearly "common" components. For comparison purposes, an exact stopping point that is empirically determined is necessary. Alternative, more traditional methods, such as the scree test or the eigenvalue-greater-than-one criterion, give less satisfactory results. For both analyses, the scree test provided no clear solution. The eigenvalue-greater-than-one method resulted in 21 components for the binary format and 17 for the Likert format. However, in both cases, the additional components were either poorly defined components or "unique" components.
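The stopping rule described above can be illustrated with a short sketch. It is not the authors' program: it assumes NumPy, works on a small simulated correlation matrix rather than the EPI data, and omits the varimax rotation. The function names `map_test` and `simulate_two_factor_items`, the eight-item layout, and the loadings of .8 and .7 are all hypothetical choices made for illustration.

```python
import numpy as np


def map_test(R):
    """Minimum average partial (MAP) criterion, sketched for illustration.

    R is a p x p correlation matrix. Returns (k, f), where f[m] is the
    average squared partial correlation after the first m principal
    components have been partialed out, and k is the m that minimizes f.
    """
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)           # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]              # re-sort, largest first
    eigvals = np.clip(eigvals[order], 0.0, None)   # guard against tiny negatives
    loadings = eigvecs[:, order] * np.sqrt(eigvals)
    off_diag = ~np.eye(p, dtype=bool)

    f = []
    for m in range(p - 1):                         # m = 0, 1, ..., p - 2
        A = loadings[:, :m]                        # first m component loadings
        C = R - A @ A.T                            # partial covariance matrix
        d = np.diag(C)
        if np.any(d <= 1e-10):                     # residual variance exhausted
            break
        partial = C / np.sqrt(np.outer(d, d))      # partial correlations
        f.append(float(np.mean(partial[off_diag] ** 2)))
    return int(np.argmin(f)), f


def simulate_two_factor_items(n=500, seed=0):
    """Hypothetical data: 8 items, two uncorrelated factors, 4 items each."""
    rng = np.random.default_rng(seed)
    pattern = np.zeros((8, 2))
    pattern[:4, 0] = 0.8                           # assumed loadings, factor 1
    pattern[4:, 1] = 0.7                           # assumed loadings, factor 2
    factors = rng.standard_normal((n, 2))
    unique_sd = np.sqrt(1.0 - (pattern ** 2).sum(axis=1))
    items = factors @ pattern.T + rng.standard_normal((n, 8)) * unique_sd
    return np.corrcoef(items, rowvar=False)


R = simulate_two_factor_items()
n_components, avg_sq_partials = map_test(R)
```

Here `avg_sq_partials[0]` is simply the average squared off-diagonal correlation. Because the rule takes the minimum of a single sequence, it yields the exact, empirically determined stopping point that the comparison between formats requires.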

Results

The analysis of the traditional two-choice version of the Eysenck Personality Inventory resulted in two components by the MAP method (see Table 1). These were clearly identifiable as "Neuroticism" and "Extraversion." However, the two components together accounted for only 18% of the total variance. An analysis of the varimax-rotated pattern showed that 19 items did not load on either component and were unclassifiable (see Table 2).

The seven-choice version of the EPI resulted in six components (see Table 3). The six components together accounted for 46% of the total variance, with the first two components alone accounting for 26% of the variance. If six components had been extracted for the two-choice version, a total of 37% of the variance would have been accounted for. The first two components were identified as "General Anxiety" and "Social Extraversion." The four new components were identified as "Compulsivity," "Impulse Control," "Health Concerns," and "Affiliative Concern." Each component was identified by at least four different items with loadings of .49 or higher. Table 4 lists marker items for each of these six components.

The pool of items represented in Table 4 was employed to calculate a scale score for each of the six components from the Likert format. Items with negative loadings were reflected, and the unweighted sum of the items formed the scale score. The binary version was scored in the standard manner. The means, standard deviations, and intercorrelations for the three traditional scales and six new scales are presented in Table 5.

An examination of the correlations suggests a match of the Extraversion and Social Extraversion, Neuroticism and General Anxiety, and Lie and Compulsivity Scales. Since Health Concerns and General Anxiety correlated with Neuroticism and with each other, there is some support for Neuroticism as a second-order factor. Impulse Control and Affiliative Concern were essentially uncorrelated with the other seven scales.
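The scoring rule used for the Likert-format scales (reflect the negatively loading items, then take the unweighted sum) can be sketched as follows. The sketch assumes responses are coded 1 through 7, so that reflection is 8 minus the response; the paper does not state the numeric coding. The function name and the item numbers in the example are hypothetical.

```python
def score_scale(responses, keyed_items, n_points=7):
    """Unweighted sum score with reflection of negatively keyed items.

    responses:   dict mapping item number -> response coded 1..n_points
                 (the 1..7 coding is an assumption, not from the paper)
    keyed_items: dict mapping item number -> +1 (positive loading)
                 or -1 (negative loading, to be reflected)
    """
    total = 0
    for item, sign in keyed_items.items():
        response = responses[item]
        if not 1 <= response <= n_points:
            raise ValueError(f"item {item}: response {response} out of range")
        if sign < 0:
            response = (n_points + 1) - response   # reflect: 1 <-> 7, 2 <-> 6, ...
        total += response
    return total


# Hypothetical three-item scale: items 1 and 3 keyed positively, item 2 reflected.
example_key = {1: +1, 2: -1, 3: +1}
example_responses = {1: 7, 2: 1, 3: 4}
example_score = score_scale(example_responses, example_key)
```

An unweighted sum treats every marker item equally, so the scale score depends only on which items are keyed and in which direction, not on the size of their loadings.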

Discussion

These results have implications for several different issues. The appropriateness of the present form of the EPI is a relatively specific issue raised by the data and will be discussed first.

The extensive examination of the factor structure of the EPI by Eysenck and Eysenck (1969) fails to report the percent of total variance accounted for by factors in any of the several analyses cited; such figures are also absent from a more recent investigation of the EPI (Howarth, 1976). Hence the 18% figure obtained in the present study must stand as the only available estimate of the variance accounted for by Eysenck's two factors when the two-choice response format is used. This low figure and the large residue of unclassifiable items call for further investigation and tentatively point to the need for revision of the inventory. The meaningfulness of neuroticism and extraversion scores derived according to the test manual is also called into question.

Table 3

Varimax Rotated Component Pattern for Likert Item Version of EPI*

*Loadings greater than .30 are underlined.

Use of the expanded response format has the apparent advantage of producing components which account for a much higher proportion of total variance. Even when only the first two components are considered (those which most clearly represent Eysenck's Neuroticism and Extraversion factors), 26% of the total variance was accounted for. These results indicate that users of the EPI might obtain more meaningful scores on Neuroticism and Extraversion by basing them on the items which load highest on these two components.
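Because a principal components analysis of a correlation matrix of p items has total variance p, the percentages quoted above translate directly into sums of eigenvalues. The values below are only what the rounded percentages imply for p = 57; they are not eigenvalues reported in the study.

```latex
\[
\text{proportion of variance for } m \text{ components}
  = \frac{1}{p}\sum_{i=1}^{m}\lambda_i , \qquad p = 57
\]
\[
\begin{aligned}
\text{two-choice, } m = 2:   &\quad \sum_{i=1}^{2}\lambda_i \approx 0.18 \times 57 \approx 10.3 \\
\text{seven-choice, } m = 2: &\quad \sum_{i=1}^{2}\lambda_i \approx 0.26 \times 57 \approx 14.8 \\
\text{seven-choice, } m = 6: &\quad \sum_{i=1}^{6}\lambda_i \approx 0.46 \times 57 \approx 26.2
\end{aligned}
\]
```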



Table 4

Marker Variables for Varimax Rotated Components of the Seven-Choice Item Format Version of the EPI.


The four additional components which emerged with the multi-choice format may have some value in their own right. However, extensive research would be needed before these components could be viewed as established. Additional items would be needed, and the relation to previously established personality scales of other inventories would have to be investigated. The component names should be viewed as tentative working names only.

Limited support for the component structure which emerged in the present study may be found in Guilford's (1975) critical analysis of the Eysenck Personality Inventory. He suggested that the Extraversion-Introversion Scale does not represent a factor at any level, but is rather a "shotgun wedding" of two first-order factors, impulsivity and sociability. The present study did extract a Social Extraversion component and an Impulse Control component, consisting primarily of items drawn from the Extraversion Scale. However, there was also a third component, Affiliative Concern, composed of E-scale items. The present study found the Neuroticism scale items also split into two (correlated) components; this somewhat parallels the substructure anticipated by Guilford, who viewed Neuroticism as a legitimate second-order factor.

A more general question toward which the present study is directed concerns the value of a finer-grained response format in the development of personality scales and inventories. The results indicate that a more precise definition of scales and a greater explained portion of the total variance follow from use of the expanded format. Research employing scales constructed in this way would be more likely to obtain meaningful relationships to other variables, and the items involved in these relationships would be more accurately identified. The results also support the hypothesis that conflict in the research on the relation between item format and reliability is due to changes in structure. Since reliability as measured by coefficient alpha is directly related to the size of the eigenvalues, the reliabilities of the components from the seven-choice format are clearly better.
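The link between coefficient alpha and the eigenvalues can be made concrete. One standard form of that relation, for a principal component with eigenvalue λ extracted from the correlation matrix of p items, is alpha = (p / (p - 1)) (1 - 1 / λ). The sketch below assumes this is the relation intended here; the function name and the example values are illustrative and are not figures from the study.

```python
def alpha_from_eigenvalue(eigenvalue, n_items):
    """Coefficient alpha implied by a principal component's eigenvalue.

    Uses alpha = (p / (p - 1)) * (1 - 1 / eigenvalue), where p is the number
    of items in the correlation matrix that was analyzed.
    """
    if n_items < 2:
        raise ValueError("need at least two items")
    if eigenvalue <= 0:
        raise ValueError("eigenvalue must be positive")
    return (n_items / (n_items - 1)) * (1.0 - 1.0 / eigenvalue)


# Illustrative values only: with 57 items, a larger eigenvalue implies a
# higher alpha, and an eigenvalue of exactly 1 implies an alpha of zero.
alpha_small = alpha_from_eigenvalue(2.0, 57)
alpha_large = alpha_from_eigenvalue(6.0, 57)
```

Under this relation alpha increases monotonically with the eigenvalue, which is the sense in which components with larger eigenvalues are more reliable.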

Two additional methodological issues must be considered. The obtained results might have occurred if either the correlation coefficients for the binary format were underestimated or the coefficients for the Likert format were inflated. In the first case, the substitution of tetrachoric r's for the phi coefficients employed would result in higher values in the correlation matrix. This also would result in 22 negative eigenvalues. It seems more reasonable to view the (potentially) lower correlation values resulting from the use of the phi coefficient as accurately reflecting a problem implicit in the use of binary data; therefore a purely statistical correction need not be attempted. In the second case, inflated values in the Likert format data could occur as a result of an extremity response set. Such a response set would result in increased item standard deviations. The average standard deviation for the 57 items was 1.29. While this was slightly larger than might be expected if each item were normally distributed, it can be accounted for by the presence of a number of skewed items. Likewise, a review of the original responses by subject does not support the presence of an extremity response set.

A number of limitations restrict the strength of generalizations from this study. First, the sample size is somewhat small. Second, a critical decision, regarding the number of components which should be extracted, is based on a relatively new criterion. Third, generalizations about the advantages of the multiple-response format require replications using other personality inventories; of particular interest would be the effect of a format change on inventories which have been developed by different procedures, such as theoretical approaches or first-order factor analytic procedures.

With the above limitations in mind, the following conclusions may be drawn from this study. The present scoring procedure of the binary form of the Eysenck Personality Inventory is acceptable, but inefficient. The factor structure is more clearly defined with the multiple-response format and more variance is explained. The Likert format generally supports Guilford's (1975) position. More generally, the multiple-response format has demonstrated very good potential for improving the quality of personality inventories. Potential benefits include greater scale reliability, a more clearly defined factor structure, and more favorable subject reactions.
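The first methodological issue raised above, that phi coefficients computed on binary items run lower than the correlation between the underlying continuous variables, can be illustrated with a small simulation. It is a sketch under stated assumptions rather than a reanalysis: the data are simulated bivariate normal scores split at zero, the underlying correlation of .5 is arbitrary, and the function names are hypothetical. For a split at the median of bivariate normal data, phi is expected to be near (2/π) arcsin(ρ).

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; applied to 0/1 data it equals the phi coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def continuous_vs_phi(rho=0.5, n=20000, seed=1):
    """Return (r on continuous scores, phi after dichotomizing at zero)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        xs.append(a)
        ys.append(rho * a + math.sqrt(1.0 - rho ** 2) * b)  # corr(x, y) = rho
    binary_x = [1 if v > 0 else 0 for v in xs]              # two-choice analogue
    binary_y = [1 if v > 0 else 0 for v in ys]
    return pearson(xs, ys), pearson(binary_x, binary_y)


r_continuous, phi = continuous_vs_phi()
expected_phi = (2.0 / math.pi) * math.asin(0.5)             # about one third
```

The attenuation shown here is a property of dichotomized data as such, which is consistent with the paper's decision to treat the lower phi values as a real limitation of the binary format rather than something to be corrected statistically.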

References

Bendig, A. W. Reliability and the number of rating scale categories. Journal of Applied Psychology, 1954, 38, 38-40.

Comrey, A. L. EITS Manual for the Comrey Personality Scales. San Diego: Educational and Industrial Testing Service, 1970.

Eysenck, H. J., & Eysenck, S. B. G. Personality structure and measurement. San Diego: Knapp, 1969.

Ferguson, L. W. A study of the Likert technique of attitude scale construction. Journal of Social Psychology, 1941, 13, 51-57.

Guilford, J. P. Factors and factors of personality. Psychological Bulletin, 1975, 82, 802-814.

Howarth, E. A psychometric investigation of Eysenck's personality inventory. Journal of Personality Assessment, 1976, 40, 173-185.

Jahoda, M., Deutsch, M., & Cook, S. W. (Eds.). Research methods in social relations. New York: Dryden Press, 1951.

Joe, V. C., & John, J. C. Factor structure of the Rotter I-E Scale. Journal of Clinical Psychology, 1973, 29, 66-68.

Jones, R. R. Differences in response consistency and subjects' preferences for three personality inventory response formats. Proceedings of the 76th Annual Convention of the American Psychological Association, 1968, 3, 247-248.

Komorita, S. S. Attitude content, intensity, and the neutral point on a Likert scale. Journal of Social Psychology, 1963, 61, 327-334.

Komorita, S. S., & Graham, W. K. Number of scale points and the reliability of scales. Educational and Psychological Measurement, 1965, 4, 987-995.

Lissitz, R. W., & Green, S. B. Effect of the number of scale points on reliability: A Monte Carlo approach. Journal of Applied Psychology, 1975, 60, 10-13.

Masters, J. R. The relationship between number of response categories and reliability of Likert-type questionnaires. Journal of Educational Measurement, 1974, 11, 49-53.

Matell, M. S., & Jacoby, J. Is there an optimal number of alternatives for Likert scale items? Study I: Reliability and validity. Educational and Psychological Measurement, 1971, 31, 657-674.

Peabody, D. Two components in bipolar scales: Direction and extremeness. Psychological Review, 1962, 69, 65-73.

Velicer, W. Determining the number of components from the matrix of partial correlations. Psychometrika, 1976, 41, 321-327.

Acknowledgments

An early version of this paper was presented at the Spring 1977 Eastern Psychological Association Meeting, Boston. The authors acknowledge, with deep thanks, the assistance of Raymond Kilduff in the initial stages of the study.

Author's Address

Wayne F. Velicer, Department of Psychology, University of Rhode Island, Kingston, RI 02881.


