
Needed Research on the Measurement of Creativity


DONALD J. TREFFINGER

JOHN P. POGGIO

Needed Research on the Measurement of Creativity*

Although the volume of literature on creativity has increased very rapidly since the early 1950's, there are many difficult problems which have not been solved. Central among these difficulties - perhaps because of its pervasiveness - is the issue of assessing creativity. How can we recognize creativity? Can we identify creative behavior and creative potential with confidence and accuracy? By what standards will individual or group differences be described, or the effects of training programs be documented? These are practical questions which, in their simplest form, say, "How can creativity be assessed?" The purposes of this paper are, therefore, to review briefly and selectively some major issues concerning the assessment of creativity, to identify theoretical and methodological issues in the study of creativity, and to examine the areas in which research is needed.

In dealing with problems of psychological measurement or assessment, three general categories may be employed: validity, reliability, and usability. This paper has been divided into three major sections, corresponding to these categories; within each, major problems and research needs will be identified.

VALIDITY

Among our several concerns in assessing creativity, perhaps none is more important or more complex than validity. The question of whether or not some measure of creativity "really" taps something that is genuinely "creativity" is probably the foremost concern of the researcher as well as the general audience. No psychological procedure, regardless of its stability, consistency, or ease and economy of use, is of much value unless there is some unequivocal evidence for its validity.

* Many of the ideas in this article are presented in greater detail in a report of the Creativity Task Force (E. Paul Torrance, Chairman) of a project on the Critical Appraisal of Research in the Personality - Emotions - Motivation Domain, directed by S. B. Sells and supported by the U. S. Office of Education.

The Journal of Creative Behavior, Volume 6, Number 4, Fourth Quarter 1972, page 253

It is customary among psychologists to describe three general categories in which the validity of a test can be documented. These are content validity, criterion-related validity, and construct validity. Many theoretical and methodological problems confront the creativity researcher in each of these three areas; thus, each area will be considered separately.

Content Validity

Content validity is defined as "the systematic examination of the test content to determine whether it covers a representative sample of the behavior domain to be measured" (Anastasi, 1968).

Theoretical Issues. Although traditionally associated with the measurement of achievement, the problem of content validity also confronts the creativity researcher. In order to argue for content validity, it is necessary to present evidence that one's test or assessment procedure samples in a representative manner the domain of concern. In attempting to establish the content validity of creativity measures we are confronted with three major problems. First, what is the universe from which we must sample? Without an adequately defined universe from which to sample, it seems virtually impossible to establish content validity for a creativity measure. Torrance (1966) has contended that it would be impossible to develop a comprehensive battery of tests of creative thinking that would sample any kind of universe of creative thinking abilities.

A second problem in establishing the content validity of measures of creativity results from the absence of a simple, generally accepted theory of creativity which would serve to unify or direct efforts at specifying assessment procedures. This problem, which has resulted in the availability of numerous creativity tests, each differing from the others in a number of ways, has been pointed out by Treffinger, Renzulli, and Feldhusen (1971).

In viewing the problem of content validity of creativity measures, another related issue concerns the complexity of creativity as a psychological construct. Does creativity represent a unitary psychological construct, comprised of a specific set of basic aptitudes and traits which are common across a variety of creative expressions? Or are there "many creativities," each comprised of a unique structure of aptitudes and traits? In the first case, the problem of establishing content validity focuses upon the adequacy with which we can define, and sample, the basic aptitudes and traits (cf., Guilford, 1971). In the latter case, the general term "creativity" may have actually been misleading, in that we have attempted to define and sample one universe rather than several (cf., Ausubel, 1963; Wallach and Kogan, 1965; Wallach, 1968).

It is possible that creativity may represent such a complex human phenomenon that we may never be able to represent it adequately as a single, unidimensional operational variable, or even as a small set of operations. There remains a clear challenge for contemporary students of creativity: to engage in significant theoretical work which may lead to improvements in our ability to define the universe of creative abilities, and subsequently to sample that universe more effectively in new measures.

Methodological Problems. Several methodological problems in measuring creativity also are related to the question of content validity.

Covington (in press) argued that, in our attempts to develop measures of creativity that "fit" well into established psychometric procedures, we have often sacrificed some of the essential attributes of the creative process. He contended that traditional mental measurement procedures are characterized by timed, speeded performance on a large number of discrete items, items which represent artificial and highly contrived situations, and an emphasis on standardized scoring procedures and unique, specific abilities, with clearly defined and presented requirements and directions. By contrast, Covington argued, the creative process is usually characterized by intense, personal involvement in one real problem, over a long period of time, with an emphasis on ordering the problem, co-ordinating or managing one's efforts, and attaining a personal solution.

Guilford (1971) also warned of common misconceptions which must be avoided in studying creative talent. He observed that creativity has too often been associated only with "divergent thinking," although he has argued strongly that many other aptitudes and traits are involved. The clear implication is that any operational definition of creativity which is restricted to divergent thinking cannot be content valid as an assessment of creativity, since it is known to sample only a small portion of the abilities which contribute to creative talent. What must be stressed is that in order to sample accurately a particular part of the universe of creative abilities, we must be cautious about the selection and use of test tasks. It also raises serious questions about the comparability, and perhaps directly about the content validity, of studies in which experimenters do not report carefully the tasks selected, or in which tasks vary from study to study or are modified in some way by the experimenter.

Criterion-related Validity

Criterion-related validity is defined as "the effectiveness of a test in predicting an individual's behavior in specified situations" (Anastasi, 1968, p. 105). The criterion may be an immediate criterion, in which case we usually discuss "concurrent" validity, or a long-term criterion, in which case we discuss "predictive" validity.

Theoretical Issues. The greatest single problem in establishing criterion-related validity (either concurrent or predictive) is, of course, the selection of criteria. What are the external criteria against which measures of creativity may be validated?

There is great concern about the identification of acceptable criteria against which measures of creativity may be validated. This concern is not new; indeed, as one reads through the reports of many of the pioneering Utah conferences on creativity (Taylor, 1964a; Taylor, 1964b), the striking impression created is that we still have with us, almost a decade later, the same fundamental problems with which the conference researchers grappled. Brogden and Sprecher (1964), in their essay on criteria for creativity, raised many still-familiar concerns: product-process distinctions, difficulties of identifying reliable criteria, and problems of generalization and control variables.

Establishing criteria for concurrent validity measures has also been difficult because of disagreement over a variety of specific issues: the evaluation of products, the possibility of determining process criteria, the question of novelty (for whom?), and the persistent criticism that "creativity" may in fact be used better to describe a rare quality or genius rather than a psychologically distinct set of individual difference variables. It is clear that one's positions on these issues will determine to a rather great extent the suitability (or unsuitability) of various criteria proposed for the validation of creativity measures. Finally, in establishing criteria, much more must be known about the effects of a variety of control variables. Are different criteria needed for the sexes, for various age groups, or for different cultural settings?

In considering long-term studies of criterion-related validity (i.e., predictive validity), numerous additional questions are raised. Foremost, there is the need to conduct longitudinal studies of creative development over a substantial period of time, involving large-scale psychological assessment. It seems true that, again, there is a substantial need for general theoretical work in the area of creativity to provide a better conceptual framework for the identification of criteria, both immediate and long range.

Methodological Problems. A variety of specific methodological issues relate to the problem of criterion-related validity.

First, it must be made clear that measures of creativity, as an extremely complex construct, will not be likely to be substantially validated against any single criterion (Guilford, 1971). Because of the number and extent of aptitude factors involved in creative talent, it is unlikely that any small, relatively arbitrary selection of tests will predict well a complex, multidimensional criterion of creative behavior. This suggests, in addition to the need for broadening the selection of test tasks, the need to utilize complex multivariate statistical procedures rather than simple bivariate correlational procedures.
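The contrast between bivariate and multivariate procedures may be illustrated with a minimal sketch in Python. The scores are invented for illustration, the variable names (fluency, originality, criterion) are hypothetical, and the two-predictor formula for the squared multiple correlation is a standard textbook result rather than a procedure taken from the studies cited here.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def multiple_r_squared(y, x1, x2):
    """Squared multiple correlation of y with two predictors,
    computed from the three pairwise correlations."""
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical scores for eight subjects (illustrative values only).
fluency = [12, 15, 9, 20, 14, 11, 18, 16]
originality = [5, 9, 4, 8, 10, 3, 7, 6]
criterion = [30, 41, 22, 44, 42, 21, 40, 35]

# Variance in the criterion accounted for by each test alone, then jointly.
r2_fluency = pearson_r(criterion, fluency) ** 2
r2_originality = pearson_r(criterion, originality) ** 2
r2_both = multiple_r_squared(criterion, fluency, originality)
print(round(r2_fluency, 3), round(r2_originality, 3), round(r2_both, 3))
```

The joint value can never fall below either single-test value, which is the arithmetic sense in which a broader selection of tasks may account for more of a complex criterion.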

Next, increased attention must be given to the adequacy of the criteria themselves. New approaches to the identification of criteria and the sampling of complex behavior must be sought, which will lead to more appropriate and reliable criterion assessments. Finally, as Guilford (1971) has also argued, it is necessary to examine carefully the variety of commonly-used criteria to evaluate their adequacy, and possibly identify improvements.

Construct Validity

The problem of establishing construct validity is a complex matter of determining the extent to which a test may be said to measure a theoretical construct or trait (Anastasi, 1968). The American Psychological Association's Standards for Educational and Psychological Tests and Manuals (French and Michael, 1966) holds that there are three essential steps in the construct validation procedure. First, on the basis of the theory upon which the test has been developed, the researcher develops hypotheses concerning the behavior of high and low scorers. Then, data are gathered to test those hypotheses. Third, the data collected provide evidence for inferring whether the theory is adequate. If the theory fails to account for the actual evidence, there is need for revision of the test, reformulation of the theory, or rejection of the theory. Therefore the data used to investigate construct validity are preferably experimental, although correlational evidence may be useful to test certain construct validation hypotheses.

Theoretical Issues. The problems of definition and criteria, which create problems in relation to content and criterion-related validity, are also related to construct validity. Differences among writers concerning definitions and criteria lead to substantial difficulty in formulating testable hypotheses, or in documenting the theoretical or empirical rationale for certain hypotheses. This is further compounded by the fact that many research studies have employed widely-differing tasks (as in the area of problem-solving; cf., Davis, 1966) or varying subsets of tasks. Selection of sub-tests may imply that fundamentally different psychological processes are being assessed in each study, so the problem of developing a consistent theoretical basis for the interpretation of results or derivation of new hypotheses is very important.

Methodological Problems. Because of the complexity of assessing construct validity of creativity measures, there are many methodological problems, involving general concerns for construct validation, as well as some which relate to very specific issues. The two greatest areas of concern appear to be: (1) the theoretical and empirical distinction between creativity and intelligence; and (2) the need for the development of experimental studies of creative behavior.

A complete consideration of the creativity-intelligence question is beyond the scope of the present paper. This problem has not been fully resolved, however, and it is related in part to a broader theoretical problem. The researcher, as noted above, must assume the responsibility for stating fully his theoretical position and the interpretation of his data; in addition, he must distinguish the variables with which he is concerned from other constructs. It is certain that much of the controversy concerning the creativity-intelligence relationship is related to problems in the definition and theoretical interpretation of both creativity and intelligence. When "creativity" is defined, for example, by performance on a specific measure of divergent production, and "intelligence" by reference to performance on a specific IQ test, the theoretical prediction of the relationship may be more clearly stated than when we argue about the relationship between "creativity" and "intelligence" as general (but non-operational) constructs.

A more complex reformulation of the creativity-intelligence question involves what has been called convergent-discriminant validation (Campbell and Fiske, 1959). Stated simply, the problem holds that measures of a certain construct should correlate highly with other measures of the same construct, but negligibly with measures of some different construct. Several measures which purport to assess "creativity" should, therefore, intercorrelate substantially (convergent validity), whereas they should yield low correlations with measures of some other, different construct (discriminant validity). Wallach and Kogan's (1965) criticism that tests of "creativity" often correlate as well or better with measures of IQ (presumably a different construct) than they correlate among themselves illustrates such a concern. However, the problem is complex, and the Wallach-Kogan results have not always been supported (Williams and Fleming, 1969; Feldhusen et al., 1971). In addition, Guilford (1971) has argued that creative talent may be so complex that current measures of related aptitudes (such as divergent thinking) may well tap quite unique aspects of the construct, and so may not be expected to display high intercorrelations.
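The convergent-discriminant comparison described above can be sketched in a few lines of Python. This is a simplified illustration only: it compares the mean intercorrelation among several "creativity" measures with their mean correlation with one measure of a different construct, and does not reproduce the full procedure of Campbell and Fiske (1959). The measure names and all scores are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def convergent_discriminant(same_construct, other_construct):
    """Return (mean r among same-construct measures,
    mean r between those measures and the other-construct measures)."""
    names = list(same_construct)
    within = [pearson_r(same_construct[a], same_construct[b])
              for a, b in combinations(names, 2)]
    across = [pearson_r(same_construct[a], scores)
              for a in names for scores in other_construct.values()]
    return sum(within) / len(within), sum(across) / len(across)

# Hypothetical scores for eight subjects (illustrative values only).
creativity = {
    "unusual_uses": [14, 9, 17, 11, 20, 8, 15, 12],
    "consequences": [10, 7, 13, 9, 15, 6, 12, 8],
    "figural_task": [22, 15, 25, 19, 30, 14, 21, 18],
}
intelligence = {"iq": [110, 118, 104, 125, 112, 99, 121, 108]}

convergent, discriminant = convergent_discriminant(creativity, intelligence)
print(round(convergent, 3), round(discriminant, 3))
```

Under the convergent-discriminant argument, the first value should be clearly larger than the second; the Wallach and Kogan criticism corresponds to the case in which it is not.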

It is also true that many studies of creativity and creativity measures have been simple correlational studies, from which only limited theoretical information may be obtained. From simple correlational studies, it is possible to describe the magnitude and direction of a relationship between the variables studied; typically the underlying cause(s) of such a relationship is not open to examination. Thus, in order to test adequately a full range of hypotheses concerning the nature and assessment of creativity, more complex research methodologies should be employed. These include:

(a) the use of multivariate statistical techniques, to allow for the investigation of the more complex multiple aptitudes which are involved in creative talent;

(b) the use of experimental and quasi-experimental research designs, including large-scale sampling of populations of interest, well-controlled studies, and replication studies;

(c) the development and implementation of longitudinal studies of creative talent.

Table 1 summarizes theoretical and methodological problems in determining the validity of creativity measures, and recommendations for needed research.

TABLE 1. Needed research relating to the validation of creativity measures.

Area of Validation: CONTENT VALIDITY

  Theoretical Problems
    What is the universe from which we must sample?
    What do different measures attempt to sample?
    Is there one "creativity" or "many"?
    What is an adequate conceptual definition of "creativity"?

  Methodological Problems
    Are traditional psychometric procedures inappropriate?
    What new procedures can be used?
    What other aptitudes besides divergent thinking are involved in creativity? How can they be assessed?
    How can studies which employ different measures be compared?
    What constitutes an adequate operational definition of "creativity"?

  Research Needs
    1. Integration of theories and research literature.
    2. More adequate conceptual and operational definitions.
    3. Development of criteria for new measures of creative talent.
    4. Development of new procedures for implementing those criteria.

Area of Validation: CRITERION-RELATED VALIDITY (short-term or concurrent and long-term or predictive)

  Theoretical Problems
    Selection of appropriate criteria.
    Product-process distinctions.
    Generalization and moderator variables.
    Need for theoretical basis for predictions.
    Identification of cognitive and affective components.
    Novelty for whom?
    Need for long-term studies, wide sampling, more extensive criteria.

  Methodological Problems
    Criteria will probably be multiple and complex.
    Small selection of measures may be too limited to account for complex behavior.
    Establishing the validity and reliability of criteria.
    Inadequacy of teacher and peer judgments.
    Diversity of tasks employed in literature.
    Statistical problems in originality criteria.
    Need for more extensive sampling and testing, spanning cognitive abilities, affective factors, behavioral indices.
    Developmental and cross-cultural differences unclear.

  Research Needs
    1. Evaluation of validity and reliability of external criteria of creativity.
    2. Long-term, multidimensional studies of creative abilities, personality, and behavior.
    3. Need to conduct developmental and cross-cultural studies.

Area of Validation: CONSTRUCT VALIDITY

  Theoretical Problems
    Need for extensive theory on which to base predictions and interpretations.
    Need to explain theoretical rationale for predictions and interpretations, and to distinguish creativity from other constructs.
    Need for integration and evaluation of extensive research literature.
    Problems of definition, criteria, and selection of measures.

  Methodological Problems
    Theoretical and empirical distinction between creativity and other variables (particularly intelligence).
    Age differentiation; contrasts between high and low scoring groups.
    Convergent-discriminant validation.
    Need for multivariate analyses, experimental and quasi-experimental studies, replication, and effective controls.
    Need for complex, long-term studies of creative behavior.

  Research Needs
    1. Multivariate research procedures applied to correlational problems.
    2. Experimental and quasi-experimental studies of creativity, including adequate controls.
    3. Replication studies.
    4. Long-term studies of creative behavior.
    5. Extensive theoretical work, synthesis and evaluation of the literature, urgently needed.

RELIABILITY

Reliability is often defined as "the accuracy (consistency and stability) of measurement by a test" (French and Michael, 1966). Since, however, there are a variety of sources of inaccuracy or 'error' in the measurement of some psychological constructs, there are several approaches to the establishment of reliability.

Stability

One approach inquires about the stability of test scores over a period of time. This method for assessing stability is commonly referred to as "test-retest" reliability. A consideration of some of the theoretical and methodological postulates involved with determining the stability of test scores does not provide clear evidence for automatic acceptance of what might be termed the stability of measures of creativity. These include:

(1) Determining whether creativity is, in fact, a stable human characteristic. Since certain theoretical formulations of creativity stress an irrational, preconscious, or emotional component (e.g. Kubie, 1958; Gordon, 1961), it may not be possible to expect stability in measures of creativity. To the extent that one is influenced by such a theoretical orientation, it becomes irrelevant to inquire about test-retest reliability. Alternate views, however, such as Guilford's aptitude approach, would place more emphasis on the stability of measures of creative abilities. Additionally, assuming that creativity is a multidimensional construct, it seems questionable at best to refer to the "stability of creativity," and rather more appropriate to determine the consistency of each component part.

(2) Identifying an appropriate interval. Crucial to the estimation of test-retest reliability is the length of time or the interval between test administrations. No "ideal" interval can be specified, however, and research is needed to investigate this question. Until more information is made available, researchers of creativity should at least state clearly the intervals employed.

(3) Motivational influences. As Torrance (1966) has pointed out, test-retest reliabilities in measures of creative thinking may be influenced substantially by the motivational levels of the subject tested. Torrance concluded that researchers were often more adequate in their consideration of such motivational factors in experimental studies than when collecting normative data (1966, p. 22). This suggests that, in research on the measurement of creativity, such factors must be considered, manipulated, and, at very least, clearly described (cf., Elkind et al., 1970).

(4) Incomplete or partial sampling of the measurement universe. A test can only attempt to provide a representative sample of a content universe. Retesting is not a theoretically desirable approach to determining reliability when the test samples only one of many real or hypothetical sets of items which might have been used to assess the trait. In view of the complexity of the aptitudes and personality traits which may be involved in creative talent, and our tendency to employ only a limited sample of measures in most studies, this limitation appears to have considerable importance in the measurement of creativity.
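For readers who wish to see the arithmetic, test-retest reliability is conventionally estimated as the correlation between scores from two administrations of the same measure. The sketch below, in Python, assumes invented scores and a hypothetical retest interval; the function name retest_reliability and the field interval_days are illustrative only. The interval is carried along with the coefficient because, as noted in point (2), it should always be reported.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def retest_reliability(first, second, interval_days):
    """Correlate two administrations of the same measure and keep
    the retest interval alongside the coefficient."""
    return {"r": pearson_r(first, second), "interval_days": interval_days}

# Hypothetical fluency scores for the same eight subjects on two occasions.
first_administration = [12, 15, 9, 20, 14, 11, 18, 16]
second_administration = [13, 14, 10, 18, 15, 9, 19, 15]

result = retest_reliability(first_administration, second_administration,
                            interval_days=14)
print(round(result["r"], 3), result["interval_days"])
```

In line with the caution in point (1), such a coefficient is better computed separately for each component score (fluency, flexibility, and so on) than for a single composite.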

Equivalence or Comparability

A second general approach to determining the reliability of a test has to do with the "equivalence" or "comparability" of various forms of a test. Customarily, this approach to assessing the reliability of a test involves the administration of alternate forms of a test to the subject. If there are many tasks or items which might comprise a certain test, there is often no reason to assume that one particular sampling of that pool will yield a score which is systematically superior or inferior to any other sampling of the same number of items from the same item pool.

Of course, when we attempt to measure creativity, we cannot be certain that a selection of a certain set of tasks represents a random and representative sample of some general "item pool." The great problem, then, in considering the use of alternate forms reliability, is to verify that the presumably alternate forms do, in fact, measure the same aptitudes. Reliance solely on this particular index of reliability seems weak at this time. What is needed is a table of specifications (Ahmann and Glock, 1971) for each of the many commercially available tests of creativity.

Internal Consistency

A third approach for determining reliability involves several methods for assessing internal consistency. These also are problematic for the creativity researcher. Measures of internal consistency (odd-even or other split-half measures, or the more general Kuder-Richardson formulas) generally assume that the subject's performance on one part of a test should not ordinarily be greatly different from his performance on another part, and as a complement to this first assumption, that the test score reflects a unidimensional trait or behavior. Such measures may be entirely inappropriate, however, in the case of creativity measures which are open-ended, rather than comprised of discrete "items," and which are often selected, as Torrance (1966) argues, to represent a range of distinctly different abilities and performances.
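The two indices named above can be written out as a minimal Python sketch. It assumes items scored 0 or 1, which is exactly the discrete-item case the paragraph distinguishes from open-ended creativity tasks; the response matrix is invented, and the function names are illustrative. The odd-even estimate is stepped up with the Spearman-Brown formula, and the second function is Kuder-Richardson formula 20.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half_reliability(item_matrix):
    """Odd-even split-half estimate, corrected with Spearman-Brown.
    item_matrix holds one row of 0/1 item scores per subject."""
    odd_half = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    return 2 * r_half / (1 + r_half)

def kr20(item_matrix):
    """Kuder-Richardson formula 20 for dichotomously scored items."""
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n  # proportion passing item j
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical responses: six subjects by six items (illustrative only).
responses = [
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 1],
]
print(round(split_half_reliability(responses), 3), round(kr20(responses), 3))
```

Both functions presuppose a single underlying dimension; applied to tasks deliberately chosen to tap different abilities, a low value would say little about the accuracy of measurement.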

It is not clear, then, that the traditional approaches to determining the reliability of a test are well-suited to the measurement of creativity. Except in the case of single-response, discrete-item tests (where validity may be doubtful against any complex criteria of creative talent), such measures may be difficult to employ, and may yield misleading data concerning the accuracy of measurement. Nevertheless, the general idea of determining the accuracy or reliability of creativity measures seems to have significance in evaluating research which must be conducted in this area.

USABILITY

Although usability usually refers to several practical considerations in the selection and evaluation of a test, such as cost, availability, and supporting information or technical manuals, it also subsumes several problems which relate to research on the measurement of creativity. Primarily, these problems are: test administration, test scoring, and norms.

Test Administration

Basically, if conditions under which tests of creativity are to be administered are not controlled, the resulting influences on scores obtained will impair our ability to interpret the scores. Research evidence to date, although not often conclusive, suggests that if factors such as working time, instructions for test taking, administration procedures, warm-up activities, and the test environment itself are not controlled, differences in subjects' scores may be attributable to these factors as well as to individual differences on the test itself. What is hazardous is that once scores are affected by such extraneous variables it becomes impossible to distinguish what is attributable to creativity and what stems from lack of control over the variability of subjects' scores.

Our suggestion is twofold. First, more controlled research is needed to determine the effects of these variables on creativity test scores, and second, research which is conducted must control for these conditions.

Test Scoring

When it becomes necessary that subjective processes enter into the scoring of a test of creative ability, evidence concerning the degree of agreement among independent scorers should be provided. Documentation should also be provided outlining the basis for scoring and procedures for training scorers, so as to permit the unbiased replication of research or effective use of the research results.

In view of the complexity of scoring "open-ended" measures of creative thinking, research should be conducted on two levels: first, on the development of new scoring procedures which will yield more accurate assessments of originality and imagination; and second, on ways to improve the accuracy of existing scoring procedures, such as through the utilization of natural language computing for the scoring of tests (Paulus and Renzulli, 1969).
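One way in which agreement among independent scorers might be reported is sketched below in Python. The choice of Cohen's kappa, which corrects simple percent agreement for agreement expected by chance, is our assumption for illustration and is not an index prescribed by the sources cited; the rating categories and the two scorers' ratings are hypothetical.

```python
from collections import Counter

def percent_agreement(scorer_a, scorer_b):
    """Proportion of responses given the same rating by both scorers."""
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return matches / len(scorer_a)

def cohen_kappa(scorer_a, scorer_b):
    """Agreement between two scorers, corrected for chance agreement."""
    n = len(scorer_a)
    observed = percent_agreement(scorer_a, scorer_b)
    counts_a, counts_b = Counter(scorer_a), Counter(scorer_b)
    categories = set(scorer_a) | set(scorer_b)
    # Chance agreement: product of the two scorers' marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical originality ratings (0 = common, 1 = unusual, 2 = rare)
# assigned by two independent scorers to the same ten responses.
scorer_a = [0, 1, 2, 1, 0, 2, 1, 0, 1, 2]
scorer_b = [0, 1, 2, 0, 0, 2, 1, 1, 1, 2]

print(round(percent_agreement(scorer_a, scorer_b), 3),
      round(cohen_kappa(scorer_a, scorer_b), 3))
```

Whatever index is used, it should be reported together with the scoring guide and the training given to scorers, so that the agreement figure can be interpreted and the scoring replicated.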

Norms

The development of norms for use in the measurement of creativity represents another very difficult problem. Indeed, there are some who contend that, because of the very nature of creativity, it is impossible to develop or apply normative scoring procedures. In this view, the creative response is, by definition, one which cannot be anticipated, and one which represents essentially a departure from the ordinary. As such, it is impossible to specify in advance what kind of response will be considered creative. The initial problem in this view, of course, is that it seems to remove the potential for creative behavior from the domain of most persons, considering as "creative" only rare instances of exceptional or unusual accomplishment. It seems more fruitful to consider creativity as a complex construct involving numerous individual difference characteristics, suggesting that inter-individual variations in creative thinking are present (and predictable) and that profiles may be more useful than single or composite scores. Under such a view, in which every subject shares creative potential, although some will demonstrate greater potential or more exceptional actual performance than others, some distinctions among the responses of subjects can be classified and scored against normative criteria. Provision for the exceptional responses, unanticipated in the norms, must also be made. This approach seems to be consistent with that employed in Torrance's assessment of creative thinking abilities (1966) and Guilford's assessment of Structure of Intellect aptitudes in creativity (1967).

Under this view, the problem is not whether there can be norms for scoring such variables as fluency, flexibility, originality, or elaboration. The question of interest to the researcher is, how can such norms most effectively be developed? A strong criticism of existing tests of creative thinking, in the present writers' view, is not their utilization of normative scoring criteria; it is that the norms used are frequently inadequate. If normative scoring procedures are to be utilized, research must clarify the population for which the norms are appropriate; specific predictions for variations in other populations; the differentiation of norms according to age, socioeconomic status, educational attainment, standing on other related cognitive or affective characteristics, or other relevant variables; and, the provision of adequate information for the standardization of test scores.
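The dependence of a standardized score on the norm group can be made concrete with a small sketch. The norm groups and raw fluency scores below are hypothetical, and the T-score convention (mean 50, standard deviation 10) is one common choice, not a procedure prescribed by any of the tests discussed here.

```python
from statistics import mean, stdev

# Hypothetical raw fluency scores for two norm groups, keyed by age in years.
NORM_GROUPS = {
    9: [10, 12, 14, 16, 18],
    12: [16, 18, 20, 22, 24],
}


def t_score(raw, age, norm_groups=NORM_GROUPS):
    """Convert a raw score to a T-score using the norm group for that age."""
    scores = norm_groups.get(age)
    if scores is None:
        # No norms for this population: refuse rather than borrow
        # norms that may be inappropriate.
        raise KeyError(f"no norm group for age {age}")
    z = (raw - mean(scores)) / stdev(scores)
    return 50 + 10 * z
```

With these made-up groups, a raw score of 20 is exactly average for the twelve-year-old norm group but well above average for the nine-year-old group, which is why the population behind a set of norms has to be stated.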

A related issue has to do with the selection and combination of sub-tests. In reviewing research which employs the Torrance Tests of Creative Thinking (1966), for example, one problem involves the fact that researchers have frequently employed different samplings of subtests, which renders comparability of results across studies virtually impossible. In addition, some researchers have reported only undifferentiated total fluency, flexibility, originality, or elaboration scores. In some cases, it even appears that verbal and figural scores may not have been differentiated. Other studies have used total scores, derived from different groups of tasks, and some have utilized scores derived successively from single tasks. These variations among studies have further reduced the comparability of test results. In addition, research by Harvey et al. (1970) suggested that the sub-tests or tasks selected by the researcher may substantially influence the nature of the abilities measured. In addition,


Harvey et al. (1970) suggested that it was doubtful that scoring dimensions (fluency, flexibility, etc.) could be accurately combined across tasks.

SUMMARY

The purpose of this paper was to identify several critical problems and areas of needed research on the measurement of creativity. The area was surveyed in three general categories: validity, reliability, and usability. In each of these areas, major problems and research needs included:

Validity

1. There is a substantial need for extensive theoretical work in the field of creativity, as well as for synthesis, integration, and evaluation of the research literature.

2. Progress in developing adequate operational definitions of creativity depends greatly on progress in developing adequate conceptual definitions.

3. There is a need for extensive studies of new, more adequate external criteria for the validation of creativity measures, as well as for inquiry into the validity and reliability of existing criteria.

4. There is a need for multivariate methods to be employed in correlational studies of creative talent.

5. There are needs for longitudinal studies, well-controlled experimental studies, replications, and for developmental and cross-cultural studies.

Reliability

1. Studies are needed which investigate new methods of determining the accuracy or reliability of measures of creativity, with emphasis on the specification of "error" components more comprehensively.

2. In employing traditional stability indices, attention must be given to determining the extent to which creativity should be expected to be a stable trait, to identifying appropriate intervals for assessing stability, and to assessing systematically the influence of motivation, moods, and other situational variables on the reliability of test scores.

3. In considering the utilization of alternate forms or internal consistency indices of reliability, attention must be given to the problems involved in selection and use of subtests from larger batteries. It must be recognized that tasks in creativity tests may not be discrete "items," and that scores derived from various tasks may neither be additive, nor meet many fundamental assumptions involved in the traditional determination of reliability indices.


Usability

1. Research must be addressed to developing a systematic theoretical and empirical understanding of the effects of variations in test administration procedures and conditions (including directions, testing environment, working time, and response modes).

2. Problems relating to test scoring are very important in the measurement of creativity. In addition to research on the comparability of scores derived from different tasks and different methods of testing, studies should also be conducted which investigate new methods and criteria for scoring (particularly for originality and "imagination").

3. Problems of the validity and reliability of scorers are extremely important, and all research employing creativity measures should provide full information concerning inter-scorer correlations, as well as comparison of means and variances among scorers and between scorers and test norms.

4. Creativity measures which involve normative scoring procedures must be accompanied by extensive supporting data concerning the norm groups employed and the tasks involved.

These problems are very complex, and may not soon be resolved. It seems necessary to recognize them, however, and to take into account such problems in the interpretation of research in which "creativity" measures are used. It would also be of significant value to researchers in the psychological study of creativity if support were increased for research in these areas.
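The scorer information requested under Usability (inter-scorer correlations together with a comparison of means and variances) might be computed along the following lines. This is a minimal sketch: the two score lists are hypothetical, and the helper names `pearson_r` and `scorer_report` are introduced here for illustration only.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def scorer_report(scorer_a, scorer_b):
    """Summarize agreement between two scorers who rated the same protocols."""
    return {
        "r": pearson_r(scorer_a, scorer_b),
        "mean_a": mean(scorer_a),
        "mean_b": mean(scorer_b),
        "var_a": variance(scorer_a),
        "var_b": variance(scorer_b),
    }


# Hypothetical originality scores assigned to the same five protocols.
scorer_a = [4, 7, 5, 9, 6]
scorer_b = [6, 9, 7, 11, 8]  # same ordering, but two points more lenient

report = scorer_report(scorer_a, scorer_b)
```

In this made-up case the correlation is essentially perfect, yet the second scorer's mean is two points higher. A correlation reported alone would conceal that difference, which is the reason for reporting means and variances alongside it.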

REFERENCES

AHMANN, J. S. & GLOCK, M. D. Evaluating pupil growth (4th ed.). Boston: Allyn and Bacon, 1971.

ANASTASI, A. Psychological testing. NYC: Macmillan, 1968.

AUSUBEL, D. P. The psychology of meaningful verbal learning. NYC: Grune and Stratton, 1963.

BROGDEN, H. E. & SPRECHER, T. B. Criteria of creativity. In Taylor, C. W. (ed.), Creativity: progress and potential. NYC: McGraw, 1964.

CAMPBELL, D. T. & FISKE, D. W. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 1959, 56, 81-105.

COVINGTON, M. V. New directions in the appraisal of creative thinking. In Treffinger, D. J. (ed.), Readings on creativity in education. Englewood Cliffs: Prentice (In press).

DAVIS, G. A. Current status of research and theory in human problem solving. Psychological Bulletin, 1966, 66, 36-54.

ELKIND, D., DEBLINGER, J. & ADLER, D. Motivation and creativity: the context effect. American Educational Research Journal, 1970, 7, 351-358.


FELDHUSEN, J. F., TREFFINGER, D. J., VAN MONDFRANS, A. P. & FERRIS, D. R. The relationship between academic grades and divergent thinking scores derived from four different methods of testing. Journal of Experimental Education, 1971, 40, 35-40.

FRENCH, J. W. & MICHAEL, W. B. Standards for educational and psychological tests and manuals. (Prepared by a joint committee of APA, AERA, and NCME.) Washington: American Psychological Association, 1966.

GORDON, W. J. J. Synectics: the development of creative capacity. NYC: Harper, 1961.

GUILFORD, J. P. The nature of human intelligence. NYC: McGraw, 1967.

HARVEY, O. J., HOFFMEISTER, J. K., COATES, C. & WHITE, B. J. A partial evaluation of Torrance's test of creativity. American Educational Research Journal, 1970, 7, 359-372.

KUBIE, L. S. Neurotic distortion of the creative process. Lawrence: University of Kansas, 1958.

PAULUS, D. H. & RENZULLI, J. S. Computer scoring of creativity tests. Gifted Child Quarterly, 1968, 12, 79-83.

TAYLOR, C. W. (ed.). Widening horizons in creativity. NYC: Wiley, 1964 (a).

TAYLOR, C. W. (ed.). Creativity: progress and potential. NYC: McGraw, 1964 (b).

TORRANCE, E. P. Torrance Tests of Creative Thinking: norms and technical manual. Princeton: Personnel Press, 1966.

TREFFINGER, D. J., RENZULLI, J. S. & FELDHUSEN, J. F. Problems in the assessment of creative thinking. Journal of Creative Behavior, 1971, 5, 104-112.

WALLACH, M. A. Review of the Torrance Tests of Creative Thinking. American Educational Research Journal, 1968, 5, 272-281.

WALLACH, M. A. & KOGAN, N. Modes of thinking in young children. NYC: Holt, 1965.

WILLIAMS, T. M. & FLEMING, J. W. Methodological study of the relationship between associate fluency and intelligence. Developmental Psychology, 1969, 1, 155-162.

Donald J. Treffinger, Associate Professor and Chairman, Department of Educational Psychology and Research. Address: University of Kansas, Bailey Hall, Lawrence, Kansas 66044.

John P. Poggio, Assistant Professor of Educational Psychology and Research. Address: University of Kansas, Lawrence, Kansas 66044.
