DONALD J. TREFFINGER
JOHN P. POGGIO
Needed Research on the Measurement of Creativity*
Although the volume of literature on creativity has increased very rapidly since the early 1950's, there are many difficult problems which have not been solved. Central among these difficulties - perhaps because of its pervasiveness - is the issue of assessing creativity. How can we recognize creativity? Can we identify creative behavior and creative potential with confidence and accuracy? By what standards will individual or group differences be described, or the effects of training programs be documented? These are practical questions which, in their simplest form, say, "How can creativity be assessed?" The purposes of this paper are, therefore, to review briefly and selectively some major issues concerning the assessment of creativity, to identify theoretical and methodological issues in the study of creativity, and to examine the areas in which research is needed.
In dealing with problems of psychological measurement or assessment, three general categories may be employed: validity, reliability, and usability. This paper has been divided into three major sections, corresponding to these categories; within each, major problems and research needs will be identified.
VALIDITY

Among our several concerns in assessing creativity, perhaps none is more important or more complex than validity. The question of whether or not some measure of creativity "really"
* Many of the ideas in this article are presented in greater detail in a report of the Creativity Task Force (E. Paul Torrance, Chairman) of a project on the Critical Appraisal of Research in the Personality - Emotions - Motivation Domain, directed by S. B. Sells and supported by the U. S. Office of Education.
Volume 6, Number 4, Fourth Quarter 1972
taps something that is genuinely "creativity" is probably the foremost concern of the researcher as well as the general audience. No psychological procedure, regardless of its stability, consistency, or ease and economy of use, is of much value unless there is some unequivocal evidence for its validity.
It is customary among psychologists to describe three general categories in which the validity of a test can be documented. These are content validity, criterion-related validity, and construct validity. Many theoretical and methodological problems confront the creativity researcher in each of these three areas; thus, each area will be considered separately.
Content Validity

Content validity is defined as "the systematic examination of the test content to determine whether it covers a representative sample of the behavior domain to be measured" (Anastasi, 1968).
Theoretical Issues. Although traditionally associated with the measurement of achievement, the problem of content validity also confronts the creativity researcher. In order to argue for content validity, it is necessary to present evidence that one's test or assessment procedure samples in a representative manner the domain of concern. In attempting to establish the content validity of creativity measures we are confronted with three major problems. First, what is the universe from which we must sample? Without an adequately defined universe from which to sample, it seems virtually impossible to establish content validity for a creativity measure. Torrance (1966) has contended that it would be impossible to develop a comprehensive battery of tests of creative thinking that would sample any kind of universe of creative thinking abilities.
A second problem in establishing the content validity of measures of creativity results from the absence of a simple, generally accepted theory of creativity which would serve to unify or direct efforts at specifying assessment procedures. This problem, which has resulted in the availability of numerous creativity tests, each differing in a number of ways, has been pointed out by Treffinger, Renzulli, and Feldhusen (1971).
In viewing the problem of content validity of creativity measures, another related issue concerns the complexity of creativity as a psychological construct. Does creativity represent a unitary psychological construct, comprised of a specific set of basic aptitudes and traits which are common across a variety of creative expressions? Or are there "many creativities," each comprised of a unique structure of aptitudes and traits? In the first case, the problem of establishing content
The Journal of Creative Behavior
validity focuses upon the adequacy with which we can define, and sample, the basic aptitudes and traits (cf., Guilford, 1971). In the latter case, the general term "creativity" may have actually been misleading, in that we have attempted to define and sample one universe rather than several (cf., Ausubel, 1963; Wallach and Kogan, 1965; Wallach, 1968).
It is possible that creativity may represent such a complex human phenomenon that we may never be able to represent it adequately as a single, unidimensional operational variable, or even as a small set of operations. There remains a clear challenge for contemporary students of creativity: to engage in significant theoretical work which may lead to improvements in our ability to define the universe of creative abilities, and subsequently to sample that universe more effectively in new measures.
Methodological Problems. Several methodological problems in measuring creativity also are related to the question of content validity.
Covington (in press) argued that, in our attempts to develop measures of creativity that "fit" well into established psychometric procedures, we have often sacrificed some of the essential attributes of the creative process. He contended that traditional mental measurement procedures are characterized by timed, speeded performance on a large number of discrete items, items which represent artificial and highly contrived situations, and an emphasis on standardized scoring procedures and unique, specific abilities, with clearly defined and presented requirements and directions. By contrast, Covington argued, the creative process is usually characterized by intense, personal involvement in one real problem, over a long period of time, with an emphasis on ordering the problem, co-ordinating or managing one's efforts, and attaining a personal solution.
Guilford (1971) also warned of common misconceptions which must be avoided in studying creative talent. He observed that creativity has too often been associated only with "divergent thinking," although he has argued strongly that many other aptitudes and traits are involved. The clear implication is that any operational definition of creativity which is restricted to divergent thinking cannot be content valid as an assessment of creativity, since it is known to sample only a small portion of the abilities which contribute to creative talent. What must be stressed is that in order to sample accurately a particular part of the universe of creative abilities, we must be cautious about the selection and use of test tasks. It also raises serious questions about the comparability, and
perhaps directly about the content validity, of studies in which experimenters do not report carefully the tasks selected, or in which tasks vary from study to study or are modified in some way by the experimenter.

Criterion-related Validity
Criterion-related validity is defined as "the effectiveness of a test in predicting an individual's behavior in specified situations" (Anastasi, 1968, p. 105). The criterion may be an immediate criterion, in which case we usually discuss "concurrent" validity, or a long-term criterion, in which case we discuss "predictive" validity.
Theoretical Issues. The greatest single problem in establishing criterion-related validity (either concurrent or predictive) is, of course, the selection of criteria. What are the external criteria against which measures of creativity may be validated?
There is great concern about the identification of acceptable criteria against which measures of creativity may be validated. This concern is not new; indeed, as one reads through the reports of many of the pioneering Utah conferences on creativity (Taylor, 1964a; Taylor, 1964b), the striking impression created is that we still have with us, almost a decade later, the same fundamental problems with which the conference researchers grappled. Brogden and Sprecher (1964), in their essay on criteria for creativity, raised many still-familiar concerns: product-process distinctions, difficulties of identifying reliable criteria, and problems of generalization and control variables.
Establishing criteria for concurrent validity measures has also been difficult because of disagreement over a variety of specific issues: the evaluation of products, the possibility of determining process criteria, the question of novelty (for whom?), and the persistent criticism that "creativity" may in fact be used better to describe a rare quality of genius rather than a psychologically distinct set of individual difference variables. It is clear that one's positions on these issues will determine to a rather great extent the suitability (or unsuitability) of various criteria proposed for the validation of creativity measures. Finally, in establishing criteria, much more must be known about the effects of a variety of control variables. Are different criteria needed for the sexes, various age groups, or in different cultural settings?
In considering long-term studies of criterion-related validity (i.e., predictive validity), numerous additional questions are raised. Foremost, there is the need to conduct longitudinal studies of creative development over a substantial period of time, and involving large-scale psychological assessment. It
seems true that, again, there is a substantial need for general theoretical work in the area of creativity to provide a better conceptual framework for the identification of criteria, both immediate and long-range.
Methodological Problems. A variety of specific methodological issues relate to the problem of criterion-related validity.
First, it must be made clear that measures of creativity, as an extremely complex construct, will not be likely to be substantially validated against any single criterion (Guilford, 1971). Because of the number and extent of aptitude factors involved in creative talent, it is unlikely that any small, relatively arbitrary selection of tests will predict well a complex, multidimensional criterion of creative behavior. This suggests, in addition to the need for broadening the selection of test tasks, the need to utilize complex multivariate statistical procedures rather than simple bivariate correlational procedures.
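The contrast between bivariate and multivariate treatment can be sketched numerically. The following fragment (all scores are hypothetical, invented purely for illustration) computes the zero-order correlation of each of two predictor tests with a rated criterion, and then the squared multiple correlation obtained when both predictors are used jointly, via the standard two-predictor formula:

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

# Hypothetical scores for eight subjects: two predictor tests and a
# criterion rating of creative performance (illustrative values only).
fluency     = [12, 18, 9, 22, 15, 11, 20, 14]
originality = [5, 7, 4, 11, 9, 6, 8, 10]
criterion   = [3, 8, 2, 10, 6, 4, 9, 5]

r1  = pearson(fluency, criterion)       # bivariate validity coefficient
r2  = pearson(originality, criterion)
r12 = pearson(fluency, originality)     # predictor intercorrelation

# Squared multiple correlation for two predictors: the proportion of
# criterion variance accounted for when both tests are used together.
R2 = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
```

The point of the sketch is only that the multiple coefficient can never fall below, and will generally exceed, the best single bivariate coefficient; validating creativity measures against a multidimensional criterion calls for this kind of multivariate treatment.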
Next, increased attention must be given to the adequacy of the criteria themselves. New approaches to the identification of criteria and the sampling of complex behavior must be sought, which will lead to more appropriate and reliable criterion assessments. Finally, as Guilford (1971) has also argued, it is necessary to examine carefully the variety of commonly-used criteria to evaluate their adequacy, and possibly identify improvements.
Construct Validity

The problem of establishing construct validity is a complex matter of determining the extent to which a test may be said to measure a theoretical construct or trait (Anastasi, 1968). The American Psychological Association's Standards for Educational and Psychological Tests and Manuals (French and Michael, 1966) holds that there are three essential steps in the construct validation procedure. First, on the basis of the theory upon which the test has been developed, the researcher develops hypotheses concerning the behavior of high and low scorers. Then, data are gathered to test those hypotheses. Third, the data collected provide evidence for inferring whether the theory is adequate. If the theory fails to account for the actual evidence, there is need for revision of the test, reformulation of the theory, or rejection of the theory. Therefore the data used to investigate construct validity are preferably experimental, although correlational evidence may be useful to test certain construct validation hypotheses.
Theoretical Issues. The problems of definition and criteria, which create problems in relation to content and criterion-related validity, are also related to construct validity. Differences among writers concerning definitions and criteria lead
to substantial difficulty in formulating testable hypotheses, or in documenting the theoretical or empirical rationale for certain hypotheses. This is further compounded by the fact that many research studies have employed widely-differing tasks (as in the area of problem-solving; cf., Davis, 1966) or varying sub-sets of tasks. Selection of sub-tests may imply that fundamentally different psychological processes are being assessed in each study, so the problem of developing a consistent theoretical basis for the interpretation of results or derivation of new hypotheses is very important.
Methodological Problems. Because of the complexity of assessing construct validity of creativity measures, there are many methodological problems, involving general concerns for construct validation, as well as some which relate to very specific issues. The two greatest areas of concern appear to be: (1) the theoretical and empirical distinction between creativity and intelligence; and (2) the need for the development of experimental studies of creative behavior.
A complete consideration of the creativity-intelligence question is beyond the scope of the present paper.
This problem has not been fully resolved, however, and it is related in part to a broader theoretical problem. The researcher, as noted above, must assume the responsibility for stating fully his theoretical position and the interpretation of his data; in addition, he must distinguish the variables with which he is concerned from other constructs. It is certain that much of the controversy concerning the creativity-intelligence relationship is related to problems in the definition and theoretical interpretation of both creativity and intelligence. When "creativity" is defined, for example, by performance on a specific measure of divergent production, and "intelligence" by reference to performance on a specific IQ test, the theoretical prediction of the relationship may be more clearly stated than when we argue about the relationship between "creativity" and "intelligence" as general (but non-operational) constructs.
A more complex reformulation of the creativity-intelligence question involves what has been called convergent-discriminant validation (Campbell and Fiske, 1959). Stated simply, the problem holds that measures of a certain construct should correlate highly with other measures of the same construct, but negligibly with measures of some different construct. Several measures which purport to assess "creativity" should, therefore, intercorrelate substantially (convergent validity), whereas they should yield low correlations with measures of some other, different construct (discriminant validity). Wallach
and Kogan's (1965) criticism that tests of "creativity" often correlate as well or better with measures of IQ (presumably a different construct) than they correlate among themselves illustrates such a concern. However, the problem is complex, and the Wallach-Kogan results have not always been supported (Williams and Fleming, 1969; Feldhusen et al., 1971). In addition, Guilford (1971) has argued that creative talent may be so complex that current measures of related aptitudes (such as divergent thinking) may well tap quite unique aspects of the construct, and so may not be expected to display high intercorrelations.
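The Campbell-Fiske pattern can be illustrated with a small numerical sketch. In this hypothetical example (all scores invented for illustration), two putative creativity measures should correlate more highly with one another than either does with an IQ measure:

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

# Hypothetical scores for eight subjects on two creativity measures
# and one IQ measure (values invented for illustration).
creativity_a = [10, 14, 8, 18, 12, 16, 9, 13]
creativity_b = [11, 15, 7, 17, 13, 15, 10, 12]
iq           = [100, 95, 110, 105, 98, 102, 115, 99]

convergent     = pearson(creativity_a, creativity_b)  # same construct: should be high
discriminant_a = pearson(creativity_a, iq)            # different construct: should be low
discriminant_b = pearson(creativity_b, iq)

# The convergent-discriminant pattern requires convergent > discriminant.
assert convergent > discriminant_a and convergent > discriminant_b
```

The Wallach-Kogan criticism is precisely that, for real data, the final assertion often fails: the "discriminant" correlations with IQ rival the "convergent" correlations among creativity tests.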
It is also true that many studies of creativity and creativity measures have been simple correlational studies, from which only limited theoretical information may be obtained. From simple correlational studies, it is possible to describe the magnitude and direction of a relationship between the variables studied; typically, the underlying cause(s) of such a relationship is not open to examination. Thus, in order to test adequately a full range of hypotheses concerning the nature and assessment of creativity, more complex research methodologies should be employed. These include:
(a) use of multivariate statistical techniques, to allow for the investigation of the more complex multiple aptitudes which are involved in creative talent;

(b) the use of experimental and quasi-experimental research designs, including large-scale sampling of populations of interest, well-controlled studies, and replication studies;

(c) the development and implementation of longitudinal studies of creative talent.
Table 1 summarizes theoretical and methodological problems in determining the validity of creativity measures, and recommendations for needed research.
RELIABILITY

Reliability is often defined as "the accuracy (consistency and stability) of measurement by a test" (French and Michael, 1966). Since, however, there are a variety of sources of inaccuracy or 'error' in the measurement of some psychological constructs, there are several approaches to the establishment of reliability.
Stability

One approach inquires about the stability of test scores over a period of time. This method for assessing stability is commonly referred to as "test-retest" reliability. A consideration of some of the theoretical and methodological postulates involved with determining the stability of test scores does not provide clear evidence for automatic acceptance of what might be termed the stability of measures of creativity. These include:
TABLE 1. Needed research relating to the validation of creativity measures.

CONTENT VALIDITY
Theoretical problems: What is the universe from which we must sample? What do different measures attempt to sample? Is there one "creativity" or "many"? What is an adequate conceptual definition of "creativity"?
Methodological problems: Are traditional psychometric procedures inappropriate? What new procedures can be used? What other aptitudes besides divergent thinking are involved in creativity? How can they be assessed? How can studies which employ different measures be compared? What constitutes an adequate operational definition of "creativity"?
Research needs: 1. Integration of theories and research literature. 2. More adequate conceptual and operational definitions. 3. Development of criteria for new measures of creative talent. 4. Development of new procedures for implementing those criteria.

CRITERION-RELATED VALIDITY (short-term or concurrent, and long-term or predictive)
Theoretical problems: Selection of appropriate criteria. Product-process distinctions. Generalization and moderator variables. Need for theoretical basis for predictions. Identification of cognitive and affective components. Novelty for whom? Need for long-term studies, wide sampling, more extensive criteria.
Methodological problems: Criteria will probably be multiple-complex. Small selection of measures may be too limited to account for complex behavior. Establishing the validity and reliability of criteria. Inadequacy of teacher and peer judgments. Diversity of tasks employed in literature. Statistical problems in originality criteria. Need for more extensive sampling and testing spanning cognitive abilities, affective factors, behavioral indices. Developmental and cross-cultural differences unclear.
Research needs: 1. Evaluation of validity and reliability of external criteria of creativity. 2. Long-term, multidimensional studies of creative abilities, personality, and behavior. 3. Need to conduct developmental and cross-cultural studies.

CONSTRUCT VALIDITY
Theoretical problems: Need for extensive theory on which to base predictions and interpretations. Need to explain theoretical rationale for predictions and interpretations, and to distinguish creativity from other constructs. Need for integration and evaluation of extensive research literature. Problems of definition, criteria, and selection of measures.
Methodological problems: Theoretical and empirical distinction between creativity and other variables (particularly intelligence). Age differentiation; contrasts between high- and low-scoring groups. Convergent-discriminant validation. Need for multivariate analyses, experimental and quasi-experimental studies, replication, and effective controls. Need for complex, long-term studies of creative behavior.
Research needs: 1. Multivariate research procedures applied to correlational problems. 2. Experimental and quasi-experimental studies of creativity, including adequate controls. 3. Replication studies. 4. Long-term studies of creative behavior. 5. Extensive theoretical work, synthesis and evaluation of the literature, urgently needed.
(1) Determining whether creativity is, in fact, a stable human characteristic. Since certain theoretical formulations of creativity stress an irrational, preconscious, or emotional component (e.g., Kubie, 1958; Gordon, 1961), it may not be possible to expect stability in measures of creativity. To the extent that one is influenced by such a theoretical orientation, it becomes irrelevant to inquire about test-retest reliability. Alternate views, however, such as Guilford's aptitude approach, would place more emphasis on the stability of measures of creative abilities. Additionally, assuming that creativity is a multidimensional construct, it seems questionable at best to refer to the "stability of creativity," and rather more appropriate to determine the consistency of each component part.
(2) Identifying an appropriate interval. Crucial to the estimation of test-retest reliability is the length of time, or the interval, between test administrations. No "ideal" interval can be specified, however, and research is needed to investigate this question. Until more information is made available, researchers of creativity should at least state clearly the intervals employed.
(3) Motivational influences. As Torrance (1966) has pointed out, test-retest reliabilities in measures of creative thinking may be influenced substantially by the motivational levels of the subjects tested. Torrance concluded that researchers were often more adequate in their consideration of such motivational factors in experimental studies than when collecting normative data (1966, p. 22). This suggests that, in research on the measurement of creativity, such factors must be considered, manipulated, and, at the very least, clearly described (cf., Elkind et al., 1970).
(4) Incomplete or partial sampling of the measurement universe. A test can only attempt to provide a representative sample of a content universe. Retesting is not a theoretically desirable approach to determining reliability when the test samples only one of many real or hypothetical sets of items which might have been used to assess the trait. In view of the complexity of the aptitudes and personality traits which may be involved in creative talent, and our tendency to employ only a limited sample of measures in most studies, this limitation appears to have considerable importance in the measurement of creativity.
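Where stability is a defensible expectation, the test-retest coefficient itself is simply the correlation between scores from two administrations. A minimal sketch (scores hypothetical; the interval between administrations would, per point (2) above, need to be reported):

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

# Hypothetical fluency scores for eight subjects, tested twice.
first_administration  = [14, 18, 11, 22, 16, 13, 20, 15]
second_administration = [13, 19, 12, 21, 15, 14, 19, 16]

test_retest_r = pearson(first_administration, second_administration)
```

The computation is trivial; the substantive questions raised above - whether the trait should be stable at all, what interval is appropriate, and what motivational conditions held at each administration - are what determine whether the coefficient means anything.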
Equivalence or Comparability

A second general approach to determining the reliability of a test has to do with the "equivalence" or "comparability" of various forms of a test. Customarily, this approach to assessing the reliability of a test involves the administration of alternate
forms of a test to the subject. If there are many tasks or items which might comprise a certain test, there is often no reason to assume that one particular sampling of that pool will yield a score which is systematically superior or inferior to any other sampling of the same number of items from the same item pool.
Of course, when we attempt to measure creativity, we cannot be certain that a selection of a certain set of tasks represents a random and representative sample of some general "item pool." The great problem, then, in considering the use of alternate-forms reliability, is to verify that the presumably alternate forms do, in fact, measure the same aptitudes. Reliance solely on this particular index of reliability seems at this time weak. What is needed is a table of specifications (Ahmann and Glock, 1971) for each of the many commercially available tests of creativity.
Internal Consistency

A third approach for determining reliability involves several methods for assessing internal consistency. These also are problematic for the creativity researcher. Measures of internal consistency (odd-even or other split-half measures, or the more general Kuder-Richardson formulas) generally assume that the subject's performance on one part of a test should not ordinarily be greatly different from his performance on another part and, as a complement to this first assumption, that the test score reflects a unidimensional trait or behavior. Such measures may be entirely inappropriate, however, in the case of creativity measures which are open-ended, rather than comprised of discrete "items," and which are often selected, as Torrance (1966) argues, to represent a range of distinctly different abilities and performances.
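For tests that are built from discrete, dichotomously scored items, the split-half logic runs as follows (responses hypothetical, invented for illustration): the items are divided into odd and even halves, the half-scores are correlated, and the Spearman-Brown formula projects that correlation to the full test length.

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

# Hypothetical responses: rows are examinees, columns are eight
# dichotomously scored items (1 = pass, 0 = fail).
responses = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 0, 0, 1, 1],
]

odd_half  = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, 7
even_half = [sum(row[1::2]) for row in responses]   # items 2, 4, 6, 8

r_half = pearson(odd_half, even_half)
# Spearman-Brown correction: estimated reliability of the full-length test.
r_full = (2 * r_half) / (1 + r_half)
```

As the text argues, this procedure presumes discrete items and a unidimensional score; for open-ended creativity tasks those assumptions are precisely what is in doubt.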
It is not clear, then, that the traditional approaches to determining the reliability of a test are well-suited to the measurement of creativity. Except in the case of single-response, discrete-item tests (where validity may be doubtful against any complex criteria of creative talent), such measures may be difficult to employ, and may yield misleading data concerning the accuracy of measurement. Nevertheless, the general idea of determining the accuracy or reliability of creativity measures seems to have significance in evaluating research which must be conducted in this area.

USABILITY

Although usability usually refers to several practical considerations in the selection and evaluation of a test, such as cost, availability, and supporting information or technical manuals, it also subsumes several problems which relate to research on the measurement of creativity. Primarily, these problems are: test administration, test scoring, and norms.
Test Administration

Basically, if conditions under which tests of creativity are to be administered are not controlled, the resulting influences on scores obtained will impair our ability to interpret the scores. Research evidence to date, although not often conclusive, suggests that if factors such as working time, instructions for test taking, administration procedures, warm-up activities, and the test environment itself are not controlled, differences in subject scores can be found attributable to these factors as well as to individual differences on the test itself. What is hazardous is that once scores are affected by such extraneous variables, it becomes impossible to distinguish what is attributable to creativity and what stems from lack of control over the variability of subjects' scores.
Our suggestion is twofold. First, more controlled research is needed to determine the effects of these variables on creativity test scores; and second, research which is conducted must control for these conditions.
Test Scoring

When it becomes necessary that subjective processes enter into the scoring of a test of creative ability, evidence concerning the degree of agreement among independent scorers should be provided. Documentation should also be provided outlining the basis for scoring and procedures for training scorers, so as to permit the unbiased replication of research or effective use of the research results.
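A minimal sketch of the inter-scorer evidence suggested here (ratings invented for illustration): both the proportion of exact agreements and the correlation between two scorers' ratings of the same responses can be reported.

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

# Hypothetical originality ratings (0-5 scale) assigned independently
# by two trained scorers to the same ten responses.
scorer_1 = [3, 4, 2, 5, 1, 4, 3, 2, 5, 3]
scorer_2 = [3, 4, 3, 5, 1, 3, 3, 2, 4, 3]

exact_agreement = sum(a == b for a, b in zip(scorer_1, scorer_2)) / len(scorer_1)
inter_scorer_r  = pearson(scorer_1, scorer_2)
```

Reporting both indices, together with the scoring rules and training procedures, is what permits the unbiased replication the text calls for.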
In view of the complexity of scoring "open-ended" measures of creative thinking, research should be conducted on two levels: first, on the development of new scoring procedures which will yield more accurate assessments of originality and imagination; and second, on ways to improve the accuracy of existing scoring procedures, such as through the utilization of natural language computing for the scoring of tests (Paulus and Renzulli, 1969).
Norms

The development of norms for use in the measurement of creativity represents another very difficult problem. Indeed, there are some who contend that, because of the very nature of creativity, it is impossible to develop or apply normative scoring procedures. In this view, the creative response is, by definition, one which cannot be anticipated, and one which represents essentially a departure from the ordinary. As such, it is impossible to specify in advance what kind of response will be considered creative. The initial problem in this view, of course, is that it seems to remove the potential for creative behavior from the domain of most persons, considering as "creative" only rare instances of exceptional or unusual accomplishment. It seems more fruitful to consider creativity as
a complex construct involving numerous individual difference characteristics, suggesting that inter-individual variations in creative thinking are present (and predictable), and that profiles may be more useful than single or composite scores. Under such a view, in which every subject shares creative potential, although some will demonstrate greater potential or more exceptional actual performance than others, some distinctions among the responses of subjects can be classified and scored against normative criteria. Provision for the exceptional responses, unanticipated in the norms, must also be made. This approach seems to be consistent with that employed in Torrance's assessment of creative thinking abilities (1966) and Guilford's assessment of Structure of Intellect aptitudes in creativity (1967).
Under this view, the problem is not whether there can be norms for scoring such variables as fluency, flexibility, originality, or elaboration. The question of interest to the researcher is, how can such norms most effectively be developed? A strong criticism of existing tests of creative thinking, in the present writers' view, is not their utilization of normative scoring criteria; it is that the norms used are frequently inadequate. If normative scoring procedures are to be utilized, research must clarify the population for which the norms are appropriate; specific predictions for variations in other populations; the differentiation of norms according to age, socioeconomic status, educational attainment, standing on other related cognitive or affective characteristics, or other relevant variables; and the provision of adequate information for the standardization of test scores.
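The normative logic argued for here can be sketched simply (norm-group scores invented for illustration): a raw score is located relative to a norm group by a standard score and a percentile.

```python
from math import sqrt

# Hypothetical norm-group fluency scores and one examinee's raw score.
norm_group = [8, 12, 15, 10, 14, 9, 11, 13, 16, 12]
raw_score = 15

mean = sum(norm_group) / len(norm_group)
sd = sqrt(sum((s - mean) ** 2 for s in norm_group) / len(norm_group))

z_score = (raw_score - mean) / sd                        # standard score
percentile = sum(s <= raw_score for s in norm_group) / len(norm_group)
```

A profile of such standardized scores across fluency, flexibility, originality, and elaboration, rather than a single composite, is the kind of reporting the text recommends; the adequacy and differentiation of the norm group itself remains the central research problem.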
A related issue has to do with the selection and combination of sub-tests. In reviewing research which employs the Torrance Tests of Creative Thinking (1966), for example, one problem involves the fact that researchers have frequently employed different samplings of sub-tests, which renders comparability of results across studies virtually impossible. In addition, some researchers have reported only undifferentiated total fluency, flexibility, originality, or elaboration scores. In some cases, it even appears that verbal and figural scores may not have been differentiated. Other studies have used total scores derived from different groups of tasks, and some have utilized scores derived successively from single tasks. These variations among studies have further reduced the comparability of test results. In addition, research by Harvey et al. (1970) suggested that the sub-tests or tasks selected by the researcher may substantially influence the nature of the abilities measured. Further,
Harvey et al. (1970) suggested that it was doubtful that scoring dimensions (fluency, flexibility, etc.) could be accurately combined across tasks.
SUMMARY

The purpose of this paper was to identify several critical problems and areas of needed research on the measurement of creativity. The area was surveyed in three general categories: validity, reliability, and usability. In each of these areas, major problems and research needs included:
Validity
1. There is a substantial need for extensive theoretical work in the field of creativity, as well as for synthesis, integration, and evaluation of the research literature.
2. Progress in developing adequate operational definitions of creativity depends greatly on progress in developing adequate conceptual definitions.
3. There is a need for extensive studies of new, more adequate external criteria for the validation of creativity measures, as well as for inquiry into the validity and reliability of existing criteria.
4. There is a need for multivariate methods to be employed in correlational studies of creative talent.
5. There are needs for longitudinal studies, well-controlled experimental studies, replications, and developmental and cross-cultural studies.
Reliability
1. Studies are needed which investigate new methods of determining the accuracy or reliability of measures of creativity, with emphasis on more comprehensive specification of "error" components.
2. In employing traditional stability indices, attention must be given to determining the extent to which creativity should be expected to be a stable trait, to identifying appropriate intervals for assessing stability, and to assessing systematically the influence of motivation, moods, and other situational variables on the reliability of test scores.
3. In considering the utilization of alternate forms or internal consistency indices of reliability, attention must be given to the problems involved in the selection and use of sub-tests from larger batteries. It must be recognized that tasks in creativity tests may not be discrete "items," and that scores derived from various tasks may neither be additive nor meet many fundamental assumptions involved in the traditional determination of reliability indices.
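The assumptions at issue in point 3 can be made concrete with a sketch of a conventional internal-consistency coefficient, Cronbach's alpha, applied to invented scores from a hypothetical four-task battery. The coefficient treats tasks as additive, interchangeable "items"; when creativity tasks violate that assumption, the resulting value is not interpretable in the usual way:

```python
# Cronbach's alpha for a hypothetical battery of four creativity tasks
# (rows = examinees, columns = tasks; all data are invented).
# Point 3 above is precisely that creativity tasks may violate the
# additivity assumptions this coefficient rests on.

def cronbach_alpha(scores):
    k = len(scores[0])                       # number of tasks
    def var(xs):                             # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    task_vars = [var([row[j] for row in scores]) for j in range(k)]
    total_var = var([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(task_vars) / total_var)

data = [
    [12, 9, 14, 11],
    [18, 15, 17, 16],
    [7, 8, 9, 6],
    [15, 12, 16, 14],
    [10, 11, 12, 9],
]
alpha = cronbach_alpha(data)
```

A high alpha here reflects only the internal consistency of the summed battery; it says nothing about whether summing these particular tasks was legitimate in the first place.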
Usability
1. Research must be addressed to developing a systematic theoretical and empirical understanding of the effects of variations in test administration procedures and conditions (including directions, testing environment, working time, and response modes).
2. Problems relating to test scoring are very important in the measurement of creativity. In addition to research on the comparability of scores derived from different tasks and different methods of testing, studies should also be conducted which investigate new methods and criteria for scoring (particularly for originality and "imagination").
3. Problems of the validity and reliability of scorers are extremely important, and all research employing creativity measures should provide full information concerning inter-scorer correlations, as well as comparisons of means and variances among scorers and between scorers and test norms.
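A minimal sketch of the inter-scorer check point 3 calls for, using invented originality scores assigned to the same six protocols by two hypothetical scorers:

```python
# Hypothetical inter-scorer check: the same six protocols scored for
# originality by two scorers (all numbers invented for illustration).

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

scorer_a = [10, 14, 8, 17, 12, 9]
scorer_b = [11, 15, 7, 18, 13, 10]

r = pearson_r(scorer_a, scorer_b)
mean_a = sum(scorer_a) / len(scorer_a)
mean_b = sum(scorer_b) / len(scorer_b)
# A high r alone is not sufficient: two scorers can correlate almost
# perfectly while differing systematically in level, which is why the
# text asks for comparisons of means and variances as well.
```

In this invented example the correlation is very high, yet scorer B runs consistently higher than scorer A; reporting r without the means would conceal that systematic difference.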
4. Creativity measures which involve normative scoring procedures must be accompanied by extensive supporting data concerning the norm groups employed and the tasks involved.

These problems are very complex, and may not soon be resolved. It seems necessary to recognize them, however, and to take such problems into account in the interpretation of research in which "creativity" measures are used. It would also be of significant value to researchers in the psychological study of creativity if support were increased for research in these areas.
REFERENCES AHMANN, J. S. & GLOCK, M. D. Evaluating pupil growth (4th ed.). Boston: Allyn and Bacon, 1971.
ANASTASI, A. Psychological testing. NYC: Macmillan, 1968.
AUSUBEL, D. P. The psychology of meaningful verbal learning. NYC: Grune and Stratton, 1963.
BROGDEN, H. E. & SPRECHER, T. B. Criteria of creativity. In Taylor, C. W. (ed.), Creativity: progress and potential. NYC: McGraw, 1964.
CAMPBELL, D. T. & FISKE, D. W. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 1959, 56, 81-105.
COVINGTON, M. V. New directions in the appraisal of creative thinking. In Treffinger, D. J. (ed.), Readings on creativity in education. Englewood Cliffs: Prentice (In press).
DAVIS, G. A. Current status of research and theory in human problem solving. Psychological Bulletin, 1966, 66, 36-54.
ELKIND, D., DEBLINGER, J. & ADLER, D. Motivation and creativity: the context effect. American Educational Research Journal, 1970, 7, 351-358.
FELDHUSEN, J. F., TREFFINGER, D. J., VAN MONDFRANS, A. P. & FERRIS, D. R. The relationship between academic grades and divergent thinking scores derived from four different methods of testing. Journal of Experimental Education, 1971, 40, 35-40.
FRENCH, J. W. & MICHAEL, W. B. Standards for educational and psychological tests and manuals. (Prepared by a joint committee of APA, AERA, and NCME.) Washington: American Psychological Association, 1966.
GORDON, W. J. J. Synectics: the development of creative capacity. NYC: Harper, 1961.
GUILFORD, J. P. The nature of human intelligence. NYC: McGraw, 1967.
HARVEY, O. J., HOFFMEISTER, J. K., COATES, C. & WHITE, B. J. A partial evaluation of Torrance's test of creativity. American Educational Research Journal, 1970, 7, 359-372.
KUBIE, L. S. Neurotic distortion of the creative process. Lawrence: University of Kansas, 1958.
PAULUS, D. H. & RENZULLI, J. S. Computer scoring of creativity tests. Gifted Child Quarterly, 1968, 12, 79-83.
TAYLOR, C. W. (ed.). Widening horizons in creativity. NYC: Wiley, 1964 (a).
TAYLOR, C. W. (ed.). Creativity: progress and potential. NYC: McGraw, 1964 (b).
TORRANCE, E. P. Torrance Tests of Creative Thinking: norms and technical manual. Princeton: Personnel Press, 1966.
TREFFINGER, D. J., RENZULLI, J. S. & FELDHUSEN, J. F. Problems in the assessment of creative thinking. Journal of Creative Behavior, 1971, 5, 104-112.
WALLACH, M. A. Review of the Torrance Tests of Creative Thinking. American Educational Research Journal, 1968, 5, 272-281.
WALLACH, M. A. & KOGAN, N. Modes of thinking in young children. NYC: Holt, 1965.
WILLIAMS, T. M. & FLEMING, J. W. Methodological study of the relationship between associative fluency and intelligence. Developmental Psychology, 1969, 1, 155-162.
Donald J. Treffinger, Associate Professor and Chairman, Department of Educational Psychology and Research. Address: University of Kansas, Bailey Hall, Lawrence, Kansas 66044.
John P. Poggio, Assistant Professor of Educational Psychology and Research. Address: University of Kansas, Lawrence, Kansas 66044.