Characteristics of a good test


aliheydari.tums@gmail.com

Definitions

Test: an instrument or systematic procedure for observing and describing one or more characteristics of a student, using either a numerical scale or a classification scheme.

Measurement: a procedure for assigning numbers to a specified attribute or characteristic of a person.

Evaluation: the process of making a value judgment about the worth of someone or something (Nitko, 2001).

Literature Review

• 13% of student failures in class are caused by faulty test questions (World Watch, The Philadelphia Trumpet, 2005).

• An estimated 90% of test items are of poor quality (Wilen, W. W., 1992).

Learning Objectives

A learning objective (target) specifies what you would like students to achieve at the completion of an instructional segment.

Stages in Test Construction

I. Planning the Test

A. Determining the Objectives
B. Preparing the Table of Specifications
C. Selecting the Appropriate Item Format
D. Writing the Test Items
E. Editing the Test Items

Stages in Test Construction

II. Trying Out the Test

A. Administering the Test
B. Item Analysis
C. Preparing the Final Form of the Test

Stages in Test Construction

III. Establishing Test Validity

IV. Establishing Test Reliability

V. Interpreting the Test Scores

A Table of Specifications (TOS) is:

The teacher’s blueprint for constructing a test for classroom use.

A TOS ensures that the test balances items that test lower-level thinking skills with items that test higher-order thinking skills.

Steps in Preparing TOS

1. List the topics to be covered in the test.

2. Determine the objectives (Bloom’s Taxonomy) to be assessed by the test.

3. Determine the percentage allocation of the test items for each topic.
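
The percentage-allocation step can be sketched in a few lines of Python. This is only an illustration: the topic names, the weights, the 24-item test length, the function name allocate_items, and the largest-remainder rounding rule are all assumptions and do not come from the slides.

```python
def allocate_items(topic_weights, total_items):
    """Distribute total_items across topics in proportion to their weights.

    topic_weights maps a topic name to its percentage weight.
    Largest-remainder rounding keeps the counts summing to total_items.
    """
    total_weight = sum(topic_weights.values())
    exact = {t: total_items * w / total_weight for t, w in topic_weights.items()}
    counts = {t: int(x) for t, x in exact.items()}  # whole-number part of each share
    leftover = total_items - sum(counts.values())
    # Give the remaining items to the topics with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda t: exact[t] - counts[t], reverse=True)
    for topic in by_remainder[:leftover]:
        counts[topic] += 1
    return counts


# Hypothetical topics and percentage weights, for illustration only.
weights = {"Validity": 50, "Reliability": 30, "Practicality": 20}
allocation = allocate_items(weights, total_items=24)
print(allocation)
```

With these assumed weights, the exact shares are 12, 7.2 and 4.8 items, so the one leftover item goes to the topic with the largest fractional share.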

Characteristics of a Good Test

• Validity
• Reliability
• Practicality
• Administrability
• Comprehensiveness
• Objectivity
• Simplicity
• Scorability

Validity

A test is valid if it measures what we want it to measure and nothing else.

Validity is a more test-dependent concept, whereas reliability is a purely statistical parameter.

Types of Validity

• Content Validity
• Criterion-Related Validity
• Construct Validity
• Face Validity

Content Validity

Does the test measure the objectives of the course?

The extent to which a test measures a representative sample of the content to be tested at the intended level of learning.

Criterion-related Validity

Criterion-related validity investigates the correspondence between the scores obtained from the newly developed test and the scores obtained from some independent outside criterion.

Criterion-related Validity: Depending on the Time of Administration

Concurrent validity: correlation between the scores on the new test and the scores on a recognized measure taken at the same time.

Predictive validity: correlation between students' test scores and a criterion measured at a later time.
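
The correlation behind a concurrent (or predictive) validity check can be sketched as below. The slides do not name a specific coefficient; the Pearson correlation is assumed here, and the scores and the helper name pearson_r are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six students, for illustration only.
new_test = [55, 62, 70, 74, 81, 90]    # newly developed test
criterion = [50, 65, 68, 77, 80, 92]   # recognized measure taken at the same time
validity_coefficient = pearson_r(new_test, criterion)
print(round(validity_coefficient, 2))
```

For predictive validity the same calculation would apply, with the criterion scores collected at a later time.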

Construct Validity

Refers to measuring certain traits or theoretical constructs.

It is based on the degree to which the items in a test reflect the essential aspects of the theory on which the test is based.

Face Validity

Does the test appear to measure what it is supposed to measure?

Factors Affecting Validity

a. Directions (clear and simple)

b. Difficulty level of the test (neither too easy nor too difficult)

c. Structure of the items (poorly constructed and/or ambiguous items contribute to invalidity)

d. Arrangement of items and correct responses (start with the easiest items and end with the most difficult ones; arrange item responses randomly rather than in an identifiable pattern)

Reliability

A test is reliable if we get the same results repeatedly.

On an “unreliable” test, on the other hand, one’s score might fluctuate from one administration to the next.

Several Ways of Measuring Reliability

• Internal Consistency
• Test-Retest Reliability
• Split-Half Methods
• Inter-Rater Reliability
• Parallel-Forms

Test-Retest

A given test is administered to a particular group twice, and the correlation between the two sets of scores is calculated.

Since there has to be a reasonable amount of time between the two administrations, this kind of reliability is referred to as reliability or consistency over time.

Disadvantages of Test-Retest

It requires two administrations.

Ensuring similar conditions for both administrations adds to the complications of this method.

The interval between the two administrations should be neither too short nor too long. To keep the balance, a period of two weeks between them is recommended.

Parallel-Forms

Two similar, or parallel, forms of the same test are administered to a group of examinees just once.

The two forms of the test should be the same.

Subtests should also be the same.

The problem here is that constructing two parallel forms of a test is a difficult job.

Split-Half Test

In this method, a single test with homogeneous items is administered to a group of examinees and then split, or divided, into two equal halves. The correlation between the two halves is an estimate of the test score reliability.

Easy and difficult items should be equally distributed between the two halves.
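
A minimal sketch of the split-half calculation follows. It assumes an odd/even item split, one common way to spread easy and difficult items over both halves when items are ordered by difficulty, and it assumes Pearson's coefficient for the correlation. It also applies the Spearman-Brown formula, a standard correction that is not mentioned in the slides, to project the half-test correlation to full test length. The item scores and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """item_scores holds one list of 0/1 item scores per examinee."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    r_full = 2 * r_half / (1 + r_half)  # Spearman-Brown correction (added assumption)
    return r_half, r_full


# Hypothetical right/wrong scores of five examinees on a six-item test.
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 0, 0],
]
r_half, r_full = split_half_reliability(scores)
print(round(r_half, 2), round(r_full, 2))
```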

Split-Half Advantages and Disadvantages

Advantages: There is no need to administer the same test twice, nor is it necessary to develop two parallel forms of the same test.

Disadvantages: A test with homogeneous items has to be developed.

Which method should we use?

It depends on the function of the test.

The test-retest method is appropriate when the consistency of scores over a particular time interval (stability of test scores over time) is important.

The parallel-forms method is desirable when the consistency of scores over different forms is of importance.

When the go-togetherness of the items of a test (internal consistency) is of significance, the split-half and KR-21 methods are the most appropriate.
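
The slides name KR-21 without giving its formula. A sketch using the usual textbook form, KR-21 = (k / (k - 1)) * (1 - M(k - M) / (k * s^2)), is shown below, where k is the number of items, M the mean total score and s^2 the variance of the total scores. Using the population variance is an assumption, since some texts use the sample variance, and the scores are hypothetical.

```python
from statistics import mean, pvariance


def kr21(total_scores, n_items):
    """Kuder-Richardson formula 21 for a test of right/wrong items.

    Needs only each examinee's total score and the number of items.
    """
    k = n_items
    m = mean(total_scores)
    variance = pvariance(total_scores)  # population variance (assumed convention)
    return (k / (k - 1)) * (1 - (m * (k - m)) / (k * variance))


# Hypothetical total scores of five examinees on a 20-item test.
totals = [4, 8, 10, 12, 16]
kr21_estimate = kr21(totals, n_items=20)
print(round(kr21_estimate, 3))
```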

Factors Influencing Reliability

To have a reliability estimate, one or two sets of scores should be obtained from the same group of testees. Thus, two factors contribute to test reliability: the testee and the test itself.

Validity and Reliability

Neither Valid nor Reliable

Reliable but not Valid

Valid & Reliable

A test must be reliable to be valid, but reliability does not guarantee validity

Practicality

Practicality refers to the ease of administration and scoring of a test.

Administrability

The test should be administered uniformly to all students so that the scores obtained do not vary because of factors other than differences in the students’ knowledge and skills. There should be clear instructions both for the students and for the person who will check the test (clear directions and processes).

Scorability

The test should be easy to score: the directions for scoring are clear, and an answer sheet and an answer key are provided.

Comprehensiveness

A test is said to have comprehensiveness if it encompasses all aspects of a particular subject of study.

Simplicity

A test is said to be simple if it is easy to understand, along with its instructions and other details.

Objectivity

Objectivity represents the agreement of two or more raters or test administrators concerning the score of a student.

Scoring is not influenced by emotion or personal prejudice.

Lack of objectivity reduces test validity in the same way that lack of reliability does.
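
Agreement between two raters can be quantified in several ways; the slides do not prescribe one. The sketch below computes simple percent agreement and Cohen's kappa, which adjusts the observed agreement for agreement expected by chance. The pass/fail ratings and function names are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of students to whom both raters gave the same rating."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)  # observed agreement
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    # Chance agreement from each rater's marginal proportions.
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical pass (P) / fail (F) ratings of ten students by two raters.
rater_a = ["P", "P", "F", "P", "F", "P", "F", "P", "P", "F"]
rater_b = ["P", "P", "F", "F", "F", "P", "F", "P", "P", "P"]
agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(agreement, round(kappa, 2))
```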

The Other Factors

• Test length
• Speed
• Item difficulty

References

Miller, M. D., Linn, R. L., & Gronlund, N. E. (2008). Measurement and Assessment in Teaching (10th ed.). Pearson.

Nitko, A. J. (2001). Educational Assessment of Students (3rd ed.). Merrill Prentice Hall.

http://www.ehow.com/how_4913690_steps-preparing-test.html
