Testing New 2007 Second Session

  • 8/2/2019 Testing New 2007 Second Session

    Chapter Six, Characteristics of a Good Test

    1. What are the characteristics of a good test?

    Reliability,

    validity, and

    practicality

    2. What is the main concern of reliability?

    The central concern of reliability is the consistency of one's score with respect to one's average score over repeated administrations.

    3. What is classical test theory's assumption about the concept of reliability?

    Repeated measurements of some attribute of the same individual almost never duplicate one another; this is called unreliability.

    On the other hand, the tendency toward consistency from one set of measurements to the next is called reliability.

    4. What is meant by systematic variation, or predictable change, in one's score across different administrations of a test?

    These are the changes in scores that are due to some sort of learning.

    5. What is meant by unsystematic variation, or unpredictable change, in one's score across different administrations of a test?

    These are the changes in students' scores that are due to factors other than learning and may not be predictable.

    6. What is the difference between the observed score and the true score?

    The observed score includes the measurement error, or error score, whereas the true score is the errorless score.

    7. When is the observed score equal to the true score?

    When there is absolutely no error of measurement, the

    observed score is equal to the true score.

    8. What are the possibilities for the relationship between the observed and the

    true score?

    The observed score can be greater than, equal to, or

    smaller than the true score.

    X = T, X > T, X < T

    9. How is the variance of observed score calculated?

    The magnitude of the observed variance equals the magnitude of the true variance plus the magnitude of the error variance.

    Vx = Vt + Ve

    10. What is the technical definition of the term reliability?

    It is the consistency of scores produced by a given test. It is the ratio of true score variance to observed score variance.

    r = Vt / Vx

    11. What is the formula for reliability?

    Reliability is the total variance in test scores minus the error variance, taken as a proportion of the total variance.

    r = (Vx - Ve) / Vx = 1 - Ve / Vx

    12. What does a reliability of zero mean?

    It means that all observed variation is due to error.

    When there is the greatest amount of error in

    measurement, the reliability will equal zero.
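The variance definitions above can be checked with a few lines of Python. This is only a minimal sketch: the function name and the sample variances are hypothetical, and it assumes the relation Vx = Vt + Ve stated earlier.

```python
def reliability_from_variances(true_variance: float, error_variance: float) -> float:
    """Reliability as true score variance divided by observed score variance."""
    # Observed variance is assumed to be the sum of true and error variance (Vx = Vt + Ve).
    observed_variance = true_variance + error_variance
    if observed_variance <= 0:
        raise ValueError("observed variance must be positive")
    return true_variance / observed_variance


# Hypothetical variances, chosen only for illustration.
print(reliability_from_variances(true_variance=16.0, error_variance=4.0))  # 0.8
# When all observed variation is error, the reliability is zero.
print(reliability_from_variances(true_variance=0.0, error_variance=9.0))   # 0.0
```

In practice the true variance cannot be observed directly, which is why the later slides turn to estimation methods.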

    13. What is standard error of measurement (SEM)?

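    It is the standard deviation of all error scores obtained from a given measure in different situations.

    SEM = Sx √(1 - r), where Sx is the standard deviation of the observed scores and r is the reliability of the test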

    14. What is the relationship between reliability

    and SEM?

    There is a negative relationship between reliability and

    SEM. The higher the reliability, the smaller the

    standard error of measurement. By the same token, the

    lower the reliability, the greater the SEM.
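The negative relationship can be seen numerically. The sketch below assumes the standard classical test theory formula SEM = Sx √(1 - r); the function name and the standard deviation of 6 points are hypothetical.

```python
import math


def standard_error_of_measurement(sd_observed: float, reliability: float) -> float:
    """SEM from the observed-score standard deviation and the reliability."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd_observed * math.sqrt(1.0 - reliability)


# Hypothetical test whose observed scores have a standard deviation of 6 points.
for r in (0.0, 0.36, 0.75, 1.0):
    sem = standard_error_of_measurement(6.0, r)
    print(f"reliability={r:.2f}  SEM={sem:.2f}")
```

As the reliability rises toward 1, the printed SEM shrinks toward 0; at a reliability of 0 the SEM equals the full standard deviation of the observed scores.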

    15. How is the true score estimated from the SEM and the observed score?

    The true score is interpreted within the range of plus or minus one SEM from the observed score.

    Example: X = 14, SEM = 3, so the true score = 14 ± 3.

    14 + 3 = 17 and 14 - 3 = 11, so the true score is interpreted as lying between 11 and 17.
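The same band can be computed in code. This is a small sketch; the function name and the `n_sem` parameter (how many SEMs wide the band is) are assumptions added for illustration and are not part of the slides.

```python
def true_score_band(observed: float, sem: float, n_sem: float = 1.0) -> tuple[float, float]:
    """Return the (lower, upper) range within which the true score is interpreted."""
    margin = n_sem * sem
    return (observed - margin, observed + margin)


# The slide's example: observed score 14, SEM 3.
low, high = true_score_band(14, 3)
print(low, high)  # 11.0 17.0
```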

    16. Why is calculating the real magnitude of reliability impossible? What is the implication of this issue?

    Calculation is impossible because the true score cannot be measured. The implication is that the mathematical value of reliability is always estimated rather than calculated.

    17. What are different practical methods of estimating

    reliability?

    Correlation,

    test-retest method,

    parallel-forms method,

    split-half method,

    KR-21 method

    18. Which factors influence reliability?

    The effect of testees,

    the effect of test factors,

    the structure of the test,

    the effect of administration factors,

    the influence of scoring factors

    19. What is reliability?

    Reliability is the consistency of scores produced by a given test. Equivalently, it is the total variance in test scores minus the error variance, expressed as a proportion of the total variance.

    20. What does the difference between the scores of the two administrations contribute to?

    The difference contributes to the unreliability of the scores, that is, the degree of error in measurement.

    21. How is reliability calculated from the correlation coefficient?

    The correlation coefficient between the two sets of scores obtained from two administrations of the same test to the same group should be calculated.

    22. How is reliability estimated in the test-retest method?

    Reliability is obtained by administering a given test to a particular group twice and calculating the correlation between the two sets of scores.
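The test-retest procedure can be sketched as a Pearson correlation between the two administrations. The slides do not name a specific coefficient, so Pearson's r is an assumption here, and the score lists are hypothetical.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2 testees)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary within each administration")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores of the same five testees on two administrations.
first_administration = [12, 15, 9, 18, 14]
second_administration = [13, 14, 10, 17, 15]
print(round(pearson_r(first_administration, second_administration), 3))
```

The closer the printed coefficient is to 1, the more consistent the two sets of scores, and the higher the estimated reliability.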

    23. What are the disadvantages of the test-retest method?

    a. It requires two administrations (it is difficult to arrange two testing sessions for the same group).

    b. Human beings are intelligent and dynamic creatures, so their performance may change between administrations.

    c. There is a test effect, especially when the interval is short.

    24. What is the main procedure for estimating reliability in the test-retest, parallel-forms, split-half, and KR-21 methods?

    The first three methods use a correlational procedure; KR-21 uses a formula.

    25. Why is the test-retest method referred to as reliability of scores over time?

    Because there should be a reasonable amount of time between the two administrations in the test-retest method, it is referred to as reliability of scores over time.

    26. Which factors affect the interval between administrations in the test-retest method?

    Memory and change. The longer the interval, the more change will occur in the testees' behavior, but the smaller the memory factor; the shorter the interval, the less change will occur in the testees' behavior, but the larger the memory factor.

    27. What is the main disadvantage of the test-retest method?

    The major disadvantage is the difficulty involved in administering a single test to the same group twice.

    28. How is reliability estimated in the parallel-forms method?

    Two parallel forms of the same test are administered to

    a group of examinees just once. Then the correlation

    coefficient between the scores of the two forms will be

    an estimate of reliability.

    29. What is the disadvantage of the parallel-forms method?

    Constructing two parallel forms of a test is not an easy

    task.

    30. What are the main issues in constructing two parallel forms of a test?

    a. The table of specifications for the two forms of the test must be the same.

    b. The components of the two tests, i.e., the subtests, should also be the same.

    31. What is the main idea behind the split-half method?

    The main idea is that the items comprising a test are

    homogeneous.

    32. What is the main assumption behind the split-half method?

    The assumption is similar to that of the parallel-forms method. Here, the two halves of a single test are treated as parallel forms of each other.

    33. Why is the split-half method also called the internal consistency of the test scores?

    Because the method assumes that there is internal homogeneity among the items.

    34. How is reliability obtained in the split-half method?

    The correlation between the two halves is an estimate

    of the test reliability.

    35. What is the appropriate procedure for dividing a test

    into two halves?

    Test developers should select the odd-numbered items for one half and the even-numbered items for the other. Through this procedure, easy and difficult items will be equally distributed between the two halves.

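    36. What is the formula for estimating the total test reliability in the split-half method?

    r(total) = 2 × r(half) / (1 + r(half)), where r(half) is the correlation between the two halves. This is the Spearman-Brown formula.

    * The total test reliability will always be higher than the reliability of half of the test.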
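The split-half procedure (odd and even halves, correlation, then correction to full length) can be sketched as follows. It assumes the standard Spearman-Brown correction, r(total) = 2 × r(half) / (1 + r(half)), and a Pearson correlation between the halves; the function names and the response matrix are hypothetical.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


def spearman_brown(half_r: float) -> float:
    """Step the half-test correlation up to an estimate for the full test."""
    return 2 * half_r / (1 + half_r)


def split_half_reliability(item_scores: list[list[int]]) -> float:
    """item_scores holds one row per testee: 1 = correct item, 0 = incorrect item."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Hypothetical responses of five testees to a six-item test.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
print(round(split_half_reliability(responses), 3))
print(spearman_brown(0.6))  # a half-test correlation of 0.6 steps up to 0.75
```

For any half-test correlation between 0 and 1, the corrected value is larger than the half-test value, which matches the note that the total test reliability is higher than that of half of the test.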

    37. What are the main advantages of the split-half method?

    a. It is more practical than the others.

    b. There is no need to administer the same test twice.

    c. It is not necessary to develop two parallel forms of the same test.

    38. What is the disadvantage of the split-half method?

    Developing a test with homogeneous items is difficult.

    39. What is the assumption behind the KR-21 method?

    The assumption is that all items in a test are designed

    to measure a single trait.

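    40. What is the formula in the KR-21 method?

    (KR-21) r = [K / (K - 1)] × [1 - M(K - M) / (K × Vx)], where K is the number of items, M is the mean of the total scores, and Vx is the variance of the total scores.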
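A minimal sketch of KR-21 in code, assuming the standard form r = [K / (K - 1)] × [1 - M(K - M) / (K × Vx)] with K items, mean M, and score variance Vx. The function name and the scores are hypothetical, and using the population variance (dividing by the number of testees) is a choice made here, not something the slides specify.

```python
def kr21(total_scores: list[float], n_items: int) -> float:
    """KR-21 reliability estimate from total scores on a test of n_items items."""
    if n_items < 2:
        raise ValueError("KR-21 needs at least two items")
    n = len(total_scores)
    mean = sum(total_scores) / n
    # Population variance of the total scores (an assumption of this sketch).
    variance = sum((s - mean) ** 2 for s in total_scores) / n
    if variance == 0:
        raise ValueError("total scores must vary")
    return (n_items / (n_items - 1)) * (1 - mean * (n_items - mean) / (n_items * variance))


# Hypothetical total scores of four testees on a ten-item test.
print(round(kr21([2, 4, 6, 8], n_items=10), 3))
```

Only the number of items, the mean, and the variance are needed, which is why the method requires neither a second administration nor a correlation.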

    41. What are the advantages of the KR-21 method?

    a. It does not require two administrations.

    b. It does not require the use of a correlational procedure.

    42. What are the factors which influence reliability?

    a. the effects of testees.

    b. the effect of test factors.

    43. How do the testees affect test reliability?

    a. The psychological conditions of the testees and the dynamic nature of human attributes affect reliability.

    b. The homogeneity of the testees' ability on the measured attribute influences reliability.

    44. How do the test factors affect the reliability of a test?

    The test factors that influence reliability are:

    a. the structure of the content of the test,

    b. the administration procedure of the test, and

    c. the scoring process of the test.

    45. Which parameters of the structure of the test affect reliability?

    a. homogeneity of the items.

    b. the speed with which a test is performed.

    c. the length of the test.

    46. What is the relationship between the homogeneity of the

    items and reliability?

    The more homogeneous the test items, the more consistent the scores the test will produce.

    47. What is the relationship between the length of the test and reliability?

    The longer the test, the more reliable it will be.
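One way to illustrate the effect of length is the general Spearman-Brown prophecy formula, r(new) = n × r / (1 + (n - 1) × r), where n is the factor by which the test is lengthened. The slides do not state this formula, so it is brought in here as an assumption; it also presumes the added items are comparable to the existing ones. The function name and the starting reliability are hypothetical.

```python
def lengthened_reliability(reliability: float, length_factor: float) -> float:
    """Predicted reliability after multiplying the test length by length_factor."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical test with a reliability of 0.60, lengthened with comparable items.
for factor in (1, 2, 3, 4):
    print(factor, round(lengthened_reliability(0.60, factor), 3))
```

Each added block of comparable items raises the predicted reliability, but by a smaller amount than the block before it.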

    48. What are the main reasons for the fluctuations of scores in test administration?

    a. The ambiguity of instructions.

    b. The time of administration and the extent of interaction between the tester and the testees.

    c. Irregularities in the administration process (e.g., regular time announcements, giving extra explanations, and answering questions).

    d. The testing environment.

    49. What is meant by validity?

    It refers to the extent to which a test measures what it is supposed to measure. It is concerned with whether or not the test is achieving what it is intended to achieve.

    50. What are the different types of validity?

    a. content validity,

    b. criterion-related validity,

    c. construct validity

    51. What is content validity?

    Content validity refers to the degree of correspondence between the test content and the content of the materials to be tested.

    52. Why is content validity called the appropriateness of the test?

    Because content validity focuses on the appropriateness of the elements included in the test and of the learning level it targets, it is called the appropriateness of the test.

    53. What are the two major criteria for a test to be content valid?

    a. The content of the test should be selected appropriately to correspond to the content of the materials to be tested.

    b. The test should be aimed at measuring the appropriate level of the students' learning.

    54. What is the main drawback of content validity?

    There is no numerical expression for content validity

    and it provides just subjective information about the

    appropriateness of the test.

    55. What are the ways to avoid too much subjectivity in judgments about the content of a test?

    a. One way is to have the test reviewed by more than one expert.

    b. Another is to define the content to be tested in detailed terms in a table of specifications.

    56. What is face validity?

    It is simply whether the test looks valid on the face of it

    or not.

    57. What is criterion-related validity?

    It investigates the correspondence between the scores

    obtained from the newly developed test and the scores

    obtained from some independent outside criteria.

    58. What are the different types of criterion-related validity?

    a. Concurrent validity: the newly developed test is administered concurrently with another well-known, reputable test whose validity is already established.

    b. Predictive validity: the newly developed test and the reputable test are not administered concurrently but are separated by some time interval.

    59. What is construct validity? How can test developers determine it?

    It examines whether the test measures what it is purported to measure.

    Construct validity can be determined through a sophisticated statistical technique called factor analysis.

    60. What are the factors which influence validity?

    a. Directions: directions should be quite clear and simple.

    b. Difficulty level of the test: items that are too easy or too difficult will reduce test validity.

    c. Structure of the items: poorly constructed and ambiguous items will contribute to the invalidity of the test.

    d. Arrangement of items and correct responses: the items should be arranged from easy to difficult.

    61. What does the practicality of a test depend on?

    a. ease of administration,

    b. ease of scoring,

    c. cost of testing,

    d. ease of interpretation of scores,

    e. ease of application of scores,

    f. availability of comparable forms

    62. Why is validity more important than reliability?

    A reliable test may or may not be valid, but a valid test

    is, to some extent, reliable.

    63. What is the difference between reliability and validity?

    Reliability is an independent statistical concept, and its computation depends on a set of scores. Validity, on the other hand, has a direct correspondence to the content of the test.