Assessment Principles Carlo Magno, PhD Counseling and Educational Psychology Department De La Salle University, Manila

Assessment principles

Page 1: Assessment principles

Assessment Principles

Carlo Magno, PhD

Counseling and Educational Psychology Department

De La Salle University, Manila

Page 2: Assessment principles

ASSESSMENT COMPETENCIES FOR TEACHERS

• Constructed by the AFT, NCME, and NEA.
• Teachers should be skilled in:

1. Choosing assessment methods appropriate for instructional decisions.

2. Administering, scoring, and interpreting the results of both externally produced and teacher-produced assessment methods.

3. Using assessment results when making decisions about individual students, planning teaching, and developing curriculum and school improvement.

AFT, NCME, NEA: American Federation of Teachers, National Council on Measurement in Education, and National Education Association in the United States of America.

Page 3: Assessment principles

ASSESSMENT COMPETENCIES FOR TEACHERS

4. Developing valid pupil grading procedures that use pupil assessment.

5. Communicating assessment results to students, parents, other lay audiences, and other educators.

6. Recognizing unethical, illegal, and otherwise inappropriate assessment methods and uses of assessment information.

Page 4: Assessment principles

SHIFTS IN ASSESSMENT

• Testing → Alternative assessment
• Paper and pencil → Performance assessment
• Multiple choice → Supply
• Single correct answer → Many correct answers
• Summative → Formative
• Outcome only → Process and outcome
• Skill focused → Task-based
• Isolated facts → Application of knowledge
• Decontextualized task → Contextualized task

Page 5: Assessment principles

Assessment Literacy

• (1) Assessment comes with a clear purpose

• (2) focusing on achievement targets

• (3) selecting proper assessment methods

• (4) sampling student achievement

Page 6: Assessment principles

ALTERNATIVE FORMS OF ASSESSMENT

• Performance based assessment

• Authentic assessment

• Portfolio assessment

Page 7: Assessment principles

OBJECTIVES

• 1. Distinguish performance-based assessment from traditional paper-and-pencil tests.

• 2. Construct tasks that are performance based.

• 3. Design a rubric to assess a performance-based task.

Page 8: Assessment principles

TERMS

• Authentic assessment
• Direct assessment
• Alternative assessment
• Performance testing
• Performance assessment
• Changes are taking place in assessment

Page 9: Assessment principles

METHOD

• Assessment should measure what is really important in the curriculum.

• Assessment should look more like instructional activities than like tests.

• Educational assessment should approximate the learning tasks of interest, so that, when students practice for the assessment, some useful learning takes place.

Page 10: Assessment principles

WHAT IS PERFORMANCE ASSESSMENT?

• Testing that requires a student to create an answer or a product that demonstrates his/her knowledge or skills (Rudner & Boston, 1991).

Page 11: Assessment principles

FEATURES OF PERFORMANCE ASSESSMENT

• Intended to assess what it is that students know and can do with the emphasis on doing.

• Have a high degree of realism about them.

• Involve: (a) activities for which there is no correct answer, (b) assessing groups rather than individuals, (c) testing that would continue over an extended period of time, (d) self-evaluation of performances.

• Likely use open-ended tasks aimed at assessing higher level cognitive skills.

Page 12: Assessment principles

PUSH ON PERFORMANCE ASSESSMENT

• Bring testing methods more in line with instruction.

• Assessment should approximate closely what it is students should know and be able to do.

Page 13: Assessment principles

EMPHASIS OF PERFORMANCE ASSESSMENT

• Should assess higher level cognitive skills rather than narrow and lower level discrete skills.

• Direct measures of skills of interest.

Page 14: Assessment principles

CHARACTERISTICS OF PERFORMANCE-BASED ASSESSMENT

• Students perform, create, construct, produce, or do something.

• Deep understanding and/or reasoning skills are needed and assessed.

• Involves sustained work, often days and weeks.

• Calls on students to explain, justify, and defend.

• Performance is directly observable.

• Involves engaging in ideas of importance and substance.

• Relies on trained assessor’s judgments for scoring.

• Multiple criteria and standards are prespecified and public.

• There is no single correct answer.

• If authentic, the performance is grounded in real world contexts and constraints.

Page 15: Assessment principles

VARIATION OF AUTHENTICITY

Relatively authentic | Somewhat authentic | Authentic
Indicate which parts of a garden design are accurate | Design a garden | Create a garden
Write a paper on zoning | Write a proposal to change fictitious zoning laws | Write a proposal to present to city council to change zoning laws
Explain what you would teach to students learning basketball | Show how to perform basketball skills in practice | Play a basketball game

Page 16: Assessment principles

• Answer worksheet 2

Page 17: Assessment principles

CONSTRUCTING PERFORMANCE BASED TASKS

1. Identify the performance task in which students will be engaged

2. Develop descriptions of the task and the context in which the performance is to be conducted.

3. Write the specific question, prompt, or problem that the student will receive.

• Structure: Individual or group?
• Content: Specific or integrated?
• Complexity: Restricted or extended?

Page 18: Assessment principles

COMPLEXITY OF TASK

• Restricted-type task
– Narrowly defined and requires brief responses
– Task is structured and specific
– Examples:
  • Construct a bar graph from data provided
  • Demonstrate a short conversation in French about what is on a menu
  • Read an article from the newspaper and answer questions
  • Flip a coin ten times. Predict what the next ten flips of the coin will be, and explain why.
  • Listen to the evening news on television and explain if you believe the stories are biased.
  • Construct a circle, square, and triangle from provided materials that have the same circumference.

Page 19: Assessment principles

• Extended-type task
– Complex, elaborate, and time-consuming.
– Often includes collaborative work with a small group of students.
– Requires the use of a variety of information
– Examples:
  • Design a playhouse and estimate cost of materials and labor
  • Plan a trip to another country: Include the budget and itinerary, and justify why you want to visit certain places
  • Conduct a historical reenactment (e.g. impeachment trial of ERAP)
  • Diagnose and repair a car problem
  • Design an advertising campaign for a new or existing product

Page 20: Assessment principles

IDENTIFYING PERFORMANCE TASK DESCRIPTION

• Prepare a task description
• Listing of specifications to ensure that essential criteria are met
• Includes the following:
– Content and skill targets to be assessed
– Description of student activities
  • Group or individual
  • Help allowed
– Resources needed
– Teacher role
– Administrative process
– Scoring procedures

Page 21: Assessment principles

PERFORMANCE-BASED TASK QUESTION PROMPT

• Task prompts and questions will be based on the task descriptions.

• Clearly identifies the outcomes, outlines what the students are encouraged to do, and explains the criteria for judgment.

Page 22: Assessment principles

EXAMPLE OF A TASK PROMPT:

Page 23: Assessment principles

PERFORMANCE CRITERIA

• What you look for in student responses to evaluate their progress toward meeting the learning target.

• Dimensions or traits in performance that are used to illustrate understanding, reasoning, and proficiency.

• Start with identifying the most important dimensions of the performance

• What distinguishes an adequate from an inadequate demonstration of the target?

Page 24: Assessment principles

EXAMPLE OF CRITERIA

• Learning target:
– Students will be able to write a persuasive paper to encourage the reader to accept a specific course of action or point of view.

• Criteria:
– Appropriateness of language for the audience
– Plausibility and relevance of supporting arguments
– Level of detail presented
– Evidence of creative, innovative thinking
– Clarity of expression
– Organization of ideas

Page 25: Assessment principles

• Watch video of Cody Green

Page 26: Assessment principles

RATING SCALES

• Indicate the degree to which a particular dimension is present.
• Three kinds: Numerical, qualitative, combined qualitative/quantitative

Page 27: Assessment principles

• Numerical Scale
– Numbers on a continuum to indicate different levels of proficiency in terms of frequency or quality

Example:

No understanding 1 2 3 4 5 Complete understanding

No organization 1 2 3 4 5 Clear organization

Emergent reader 1 2 3 4 5 Fluent reader

Page 28: Assessment principles

• Qualitative scale
– Uses verbal descriptions to indicate student performance.
– Provides a way to check whether each dimension was evidenced.
  • Type A: Indicate different gradations of the dimension
  • Type B: Checklist

Page 29: Assessment principles

• Example of Type A:
– Minimal, partial, complete
– Never, seldom, occasionally, frequently, always
– Consistently, sporadically, rarely
– None, some, complete
– Novice, intermediate, advanced, superior
– Inadequate, needs improvement, good, excellent
– Excellent, proficient, needs improvement
– Absent, developing, adequate, fully developed
– Limited, partial, thorough
– Emerging, developing, achieving
– Not there yet, shows growth, proficient
– Excellent, good, fair, poor

Page 30: Assessment principles

• Example of Type B: Checklist

Page 31: Assessment principles

• Holistic scale
– The category of the scale contains several criteria, yielding a single score that gives an overall impression or rating

Example

Level 4: Sophisticated understanding of text indicated with constructed meaning
Level 3: Solid understanding of text indicated with some constructed meaning
Level 2: Partial understanding of text indicated with tenuous constructed meaning
Level 1: Superficial understanding of text with little or no constructed meaning

Page 32: Assessment principles

EXAMPLE HOLISTIC SCALE

Page 33: Assessment principles

• Analytic Scale
– One in which each criterion receives a separate score.

Example

Criteria | Outstanding (5, 4) | Competent (3) | Marginal (2, 1)
Creative ideas | | |
Logical organization | | |
Relevance of detail | | |
Variety in words and sentences | | |
Vivid images | | |

Page 34: Assessment principles

RUBRICS

• When scoring criteria are combined with a rating scale, a complete scoring guideline, or rubric, is produced.

• A scoring guide that uses criteria to differentiate between levels of student proficiency.
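
As a rough illustration of how criteria and a rating scale combine into a rubric, the Python sketch below scores one student on an analytic scale. The criteria names come from the analytic scale example; the function name `score_analytic`, the 1-5 range check, and the sample ratings are assumptions made only for illustration.

```python
# Hypothetical sketch: an analytic rubric pairs each criterion with its own
# rating on a shared numerical scale (assumed here to run from 1 to 5).
CRITERIA = [
    "Creative ideas",
    "Logical organization",
    "Relevance of detail",
    "Variety in words and sentences",
    "Vivid images",
]


def score_analytic(ratings, low=1, high=5):
    """Check that every criterion is rated in range, then return the total."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"Unrated criteria: {missing}")
    for criterion in CRITERIA:
        value = ratings[criterion]
        if not low <= value <= high:
            raise ValueError(f"{criterion}: rating {value} is outside {low}-{high}")
    return sum(ratings[c] for c in CRITERIA)


# Invented ratings for one student paper (not taken from the slides).
sample = {
    "Creative ideas": 4,
    "Logical organization": 3,
    "Relevance of detail": 5,
    "Variety in words and sentences": 3,
    "Vivid images": 4,
}
total = score_analytic(sample)
print(f"Total: {total} out of {5 * len(CRITERIA)}")
```

Keeping the per-criterion ratings alongside the total preserves the point of an analytic scale: the student sees a separate score for each criterion rather than one overall impression.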

Page 35: Assessment principles

EXAMPLE OF A RUBRIC

Page 36: Assessment principles

GUIDELINES IN CREATING A RUBRIC

1. Be sure the criteria focus on important aspects of the performance

2. Match the type of rating with the purpose of the assessment

3. The descriptions of the criteria should be directly observable

4. The criteria should be written so that students, parents, and others understand them.

5. The characteristics and traits used in the scale should be clearly and specifically defined.

6. Take appropriate steps to minimize scoring error

Page 37: Assessment principles

PORTFOLIO ASSESSMENT: EXPLORATION

• Have you ever done a portfolio?

• Tell me about this experience. Did you enjoy it?

• What elements did you include in your portfolio?

• Are the materials placed in the portfolio required?

Page 38: Assessment principles

What are Portfolios?

• Purposeful, systematic process of collecting and evaluating student products to document progress toward the attainment of learning targets or show evidence that a learning target has been achieved.

• Includes student participation in the selection and student self-reflection.

• “A collection of artifacts accompanied by a reflective narrative that not only helps the learner to understand and extend learning, but invites the reader of the portfolio to gain insight about learning and the learner” (Porter & Cleland, 1995).

Page 39: Assessment principles

Characteristics of Portfolio assessment

• Clearly defined purpose and learning targets
• Systematic and organized collection of student products
• Preestablished guidelines for what will be included
• Student selection of some works that will be included
• Student self-reflection and self-evaluation
• Progress documented with specific products and/or evaluations
• Portfolio conferences between students and teachers

Page 40: Assessment principles

A portfolio is:

• Purposeful

• Systematic and well-organized

• Preestablished guidelines are set up

• Students are engaged in the selection of some materials

• Clear and well-specified scoring criteria

Page 41: Assessment principles

Purpose of Portfolio

• Showcase portfolio: Selection of best works. The student chooses the work, and a profile of accomplishments and an individual profile emerge.

• Documentation portfolio: Like a scrapbook of information and examples. Includes observations, tests, checklists, and rating scales.

• Evaluation portfolio: More standardized. Assess student learning with self-reflection. Examples are selected by teachers and predetermined.

Page 42: Assessment principles

Advantages of portfolio

• Students are actively involved in self-evaluation and self-reflection
• Involves collaborative assessment
• Ongoing process where students demonstrate performance, evaluate, revise, and produce quality work.
• Focus on self-improvement rather than comparison with others
• Students become more engaged in learning because both instruction and assessment shift from teacher controlled to a mix of internal and external control.
• Products help teachers diagnose learning difficulties
• Clarify reasons for evaluation
• Flexible

Page 43: Assessment principles

Disadvantages

• Scoring difficulties may lead to low reliability

• Teacher training needed

• Time-consuming to develop criteria, score and meet students

• Students may not make good selections of which materials to include

• Sampling of student products may lead to weak generalization

• Parents find the portfolio difficult to understand

Page 44: Assessment principles

Steps in Planning and Implementing Portfolio Assessment

1. Determine the purpose
2. Identify physical structure
3. Determine sources of content
4. Determine student reflective guidelines and scoring criteria
5. Review with students
6. Portfolio content supplied by teacher and/or student
7. Student self-evaluation of contents
8. Teacher evaluation of content and student self-evaluation
9. Student-teacher conference
10. Portfolios returned to students for school

Page 45: Assessment principles

Purpose

• Based on specific learning targets
• Ideal for assessing product, skill, and reasoning targets

Uses:
• Showcase portfolio: to illustrate what students are capable of doing
• Evaluation portfolio: standardization of what to include
• For parents: what will make sense to parents

“Provide specific attention to purpose and corresponding implications when implementing a portfolio.”

Page 46: Assessment principles

Physical structure

• What will it look like?
• How large will the portfolios be?
• Where are they stored so that students can easily access them?
• Will it be in folders or scrapbooks?
• How will the works be arranged in the portfolio?
• What materials are needed to separate the works in the portfolio?

Page 47: Assessment principles

Sources of content

• Work samples
• Student and teacher evaluations

Guidelines:
• Select categories that will allow you to meet the purpose of the portfolio.
• Show improvement in the portfolio
• Provide feedback to the students on the procedures they are putting together
• Provide indicator system

Page 48: Assessment principles

Self-reflective guidelines and scoring

• Establish guidelines for student self-reflection and the scoring criteria

• Scoring guidelines are explained to the students before they begin instruction

Page 49: Assessment principles

Implementing portfolio assessment

• Review with students: Explain to students what is involved in doing a portfolio.
• Begin with learning targets
• Show examples
• Give opportunities to ask questions
• Provide just enough structure so that they can get started without telling them exactly what to do.
• Selection of content will depend on the age and previous experience of students
• Students and teachers decide together what to include with nonrestrictive guidelines

Page 50: Assessment principles

Some organization

• Include table of contents

• Brief description of activities

• Date produced

• Date submitted

• Date evaluated

Page 51: Assessment principles

Student self-evaluations

• Reflective and self-evaluation activities need to be taught.

• Some guide questions for students:
– Can you tell me what you did?
– What did you like best about this sample of your writing?
– What will you do next?

• Self-reflective questions:
– What did you learn from writing this piece?
– What would you have done differently if you had more time?
– What are your greatest strengths and weaknesses in this sample?
– What would you do differently if you did this over?

Page 52: Assessment principles

Peer evaluations

• Analysis and constructive, supportive criticism of strategies, styles, and other concrete aspects of the product.

• Can include comments or a review by parents

Teacher evaluations:

• Checklist of content

• Portfolio structure evaluation: selection of samples, thoroughness, appearance, self-reflection, and organization.

• Evaluation of individual entries: use rubrics

• Evaluation of entire content: use rubrics

Page 53: Assessment principles
Page 54: Assessment principles
Page 55: Assessment principles
Page 56: Assessment principles
Page 57: Assessment principles
Page 58: Assessment principles

Student-teacher conferences

• Conference is conducted with students before returning the portfolio

• Scheduled throughout the school year; some have it monthly

• Clarify purposes and procedure with students, answer questions and establish trust

• Give guidelines to prepare for each conference

• Allow the students to do most of the talking

• Have students compare your reflections with theirs

• Weaknesses and areas for improvement need to be communicated: show them what is possible for progress

Page 59: Assessment principles

Student-teacher conferences

• At the end of the conference there is an action plan for the future

• Limit the conference to no more than 10 minutes

• Students are encouraged to take notes

• Focus on one or two major areas in each conference; this helps to have a thoughtful discussion

Page 60: Assessment principles

Advance Organizer

1 The Test Blueprint
  Outline of the Test Development Process
  Table of Specifications

2 Designing Selected-Response Items
  Binary-choice items
    Instructions in Writing Binary Type of Items
  Multiple-choice items
    Guidelines in Writing Multiple-choice Items
  Matching items
    Guidelines in Writing Matching Items

3 Designing Constructed-Response Items
  Short-answer items
    Guidelines in Writing Short Answer Items
  Essay items

4 Designing Interpretive Exercise
  Guidelines in Writing Interpretive Exercise
  Examples of Interpretive Exercise

Page 61: Assessment principles

Objectives

• 1. Explain the theories and concepts that rationalize the practice of assessment.

• 2. Make a table of specifications of the test items.

• 3. Design pen-and-paper tests that are aligned to the learning intents.

• 4. Justify the advantages and disadvantages of any pen-and-paper test.

• 5. Evaluate the test items according to the guidelines presented.

Page 62: Assessment principles

Outline of Test Development Process

• 1. Specify the ultimate goals of the education process

• 2. Derive from these the goals of the portion of the system under study

• 3. Specify these goals in terms of expected student behavior. If relevant, specify the acceptance level of successful learning.

• 4. Determine the relative emphasis or importance of various objectives, their content, and their behaviors.

• 5. Select or develop situations that will elicit the desired behavior in the appropriate context or environment, assuming the student has learned it.

• 6. Assemble a sample of such situations that together represent accurately the emphasis on content and behavior previously determined.

Page 63: Assessment principles

Outline of Test Development Process

• 7. Provide for the recording of responses in a form that will facilitate scoring but will not so distort the nature of the behavior elicited that it is no longer a true sample or index of the behavior desired.

• 8. Establish scoring criteria and guides to provide objective and unbiased judgment.

• 9. Try out the instrument in preliminary form.

• 10. Revise the sample of situations on the basis of tryout information.

• 11. Analyze reliability, validity, and score distribution in accordance with the projected use of scores.

• 12. Develop test norms and a manual, and reproduce and distribute the test.

Page 64: Assessment principles

Test Length

• The test must be of sufficient length to yield reliable scores

• The longer the test, the more reliable the results

• A test must be reliable in order to be valid

• For the grade school, one must consider the stamina and attention span of the pupils

• The test should be long enough to be adequately reliable and short enough to be administered
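
The link between test length and reliability is commonly estimated with the Spearman-Brown prophecy formula. The slide does not name it, so it is offered here only as an assumed illustration: if a test with reliability r is lengthened by a factor k using comparable items, the predicted reliability is kr / (1 + (k - 1)r). The function name and the starting reliability of 0.60 below are made up for the example.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by `length_factor`.

    Assumes the added (or removed) items are comparable in quality to the
    existing ones, which is the usual condition for this formula.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# A hypothetical test with reliability 0.60, kept as is, doubled, and tripled.
for k in (1, 2, 3):
    print(k, round(spearman_brown(0.60, k), 2))
```

Under these assumptions, doubling the test raises the estimate from 0.60 to about 0.75 and tripling it to about 0.82, so each added block of items buys less reliability. This is one way to weigh reliability against the stamina and attention span of the pupils.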

Page 65: Assessment principles

Test Instruction

• It is the function of the test instructions to furnish the learning experiences needed to enable each examinee to understand clearly what he or she is being asked to do.

• Instructions may be oral; a combination of written and oral instructions is probably desirable, except with very young children.

• Instructions should be clear, concise, and specific.

Page 66: Assessment principles

Test layout

• The arrangement of the test items influences the speed and accuracy of the examinee
• Utilize the space available while retaining readability.
• Items of the same type should be grouped together
• Arrange test items from easiest to most difficult as a means of reducing test anxiety.
• The test should be ordered first by type, then by content
• Each item should be completed in the column and page in which it is started.
• If reference material is needed, it should appear on the same page as the item
• If you are using numbers to identify items, it is better to use letters for the options

Page 67: Assessment principles

Scoring the test

• Use separate answer sheets
• Punched key
• Overlay key
• Strip key

Plight of the student

• The teacher should discuss with the class the content areas and levels of the cognitive domain to be examined
• The discussion should utilize a vocabulary and a level of complexity appropriate to the developmental level of the student
• Types of test
• Examples of test type

Page 68: Assessment principles

Table of Specifications

One Grid TOS

Content outline | No. of items
1. Table of specifications | 10
2. Test and item characteristics | 20
3. Test layout | 5
4. Test instructions | 5
5. Reproducing the test | 5
6. Test length | 5
7. Scoring the test | 5
TOTAL | 55

Page 69: Assessment principles

Table of Specifications

Two Grid TOS

Weight (time frame) | Content outline | Knowledge 30% | Comprehension 40% | Application 30% | No. of items by content area
35% | 1. Table of specifications | 1 | 4 | 4 | 9
30% | 2. Test and item characteristics | 2 | 3 | 3 | 8
10% | 3. Test layout | 1 | 1 | 0 | 2
5% | 4. Test instructions | 0 | 1 | 0 | 1
5% | 5. Reproducing the test | 1 | 0 | 0 | 1
5% | 6. Test length | 1 | 0 | 1 | 2
10% | 7. Scoring the test | 2 | 1 | 0 | 3
TOTAL | | 8 | 10 | 8 | 26

The number of items in a cell is computed using the formula:

\[
\text{items} = \frac{\text{given time}}{\text{total time}} \times \text{percentage of cognitive skill} \times \text{total number of items}
\]
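
The cell computation can be sketched in Python as follows, reading the formula as (given time / total time) × percentage of cognitive skill × total number of items. The function names and the rounding to whole items are assumptions for illustration. The rounded results need not reproduce every cell of the grid above, where the counts appear to have been adjusted by hand.

```python
def items_in_cell(given_time, total_time, skill_share, total_items):
    """Raw (unrounded) number of items for one content-by-skill cell."""
    return (given_time / total_time) * skill_share * total_items


def allocate_row(given_time, total_time, skill_shares, total_items):
    """Rounded item counts for one content area across all cognitive skills."""
    return {
        skill: round(items_in_cell(given_time, total_time, share, total_items))
        for skill, share in skill_shares.items()
    }


# Cognitive skill percentages taken from the two-grid example.
SKILL_SHARES = {"Knowledge": 0.30, "Comprehension": 0.40, "Application": 0.30}

# Content area 1 takes 35% of the time frame; the whole test has 26 items.
row = allocate_row(35, 100, SKILL_SHARES, 26)
print(row)
```

For the first content area this gives a raw Comprehension value of 0.35 × 0.40 × 26 = 3.64, which rounds to 4 items. Because rounding each cell separately can make the totals drift from the planned test length, a final manual adjustment is usually still needed.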

Page 70: Assessment principles

Classification of test Items

• Selected Response
– Binary Choices
– Multiple Choice
– Matching Type

• Constructed Response “Supply Test”
– Short Form answers: identification
– Completion: fill in the blanks, cloze test
– Essay

• Performance Type
– Paper and pencil type
– Identification type
– Simulation

Page 71: Assessment principles

Item Writing Commandments

• Thou shall not produce opaque directions to students regarding how to respond to your instructions (opaque directions)

• Thou shall not employ ambiguous statements in your assessment item (ambiguous statements)

• Thou shall not unintentionally provide students with clues regarding appropriate response (unintended clues)

• Thou shall not employ complex syntax in your assessment item (complex syntax)

• Thou shall not use vocabulary that is more advanced than required (Difficult vocabulary)

Page 72: Assessment principles

SHORT ANSWER ITEMS

• 1. Word the item so that the answer is both brief and definite.

• 2. Do not take statements directly from books to use as a basis for short answer items.

• 3. A direct question is generally more acceptable than an incomplete statement.

• 4. If the answer is to be expressed in numerical units, indicate the type of answer wanted.

• 5. Blanks for answers should be equal in length.

• 6. Do not use too many blanks.

Page 73: Assessment principles

Writing supply items

1. Require short, definite, clear-cut, and explicit answers

FAULTY: Ernest Hemingway wrote ______

IMPROVED: The Old Man and the Sea was written by _______.

Who wrote The Old Man and the Sea?

2. Avoid multi-mutilated statements

FAULTY: _____ pointed out in ____ the freedom of thought in America was seriously hampered by ___, ____, & __.

IMPROVED: That freedom of thought in America was seriously hampered by social pressures toward conformity was pointed out in 1830 by ______.

Page 74: Assessment principles

Writing supply items

3. If several answers are equally correct, equal credit should be given to each one.

4. Specify and announce in advance whether scoring will take spelling into account.

5. In testing for comprehension of terms and knowledge of definition, it is often better to supply the term and require a definition than to provide a definition and require the term.

FAULTY: What is the general measurement term describing the consistency with which items in a test measure the same thing?

IMPROVED: Define “internal consistency reliability.”

Page 75: Assessment principles

Writing supply items

6. It is generally recommended that in completion items the blanks come at the end of the statement.

FAULTY: A (an) ________ is the index obtained by dividing a mental age score by chronological age and multiplying by 100.

IMPROVED: The index obtained by dividing a mental age score by chronological age and multiplying by 100 is called a (an) ________

7. Minimize the use of textbook expressions and stereotyped language.

FAULTY: The power to declare war is vested in ______

IMPROVED: Which national legislative body has the authority to declare war?

Page 76: Assessment principles

Writing supply items

8. Specify the terms in which the response is to be given.

FAULTY: Where does the Security Council of the United Nations hold its meetings?

IMPROVED: In what city of the United States does the Security Council of the United Nations hold its meetings?

FAULTY: If a circle has a 4-inch diameter, its area is _____

IMPROVED: A circle has a 4-inch diameter. Its area in square inches, correct to two decimal places, is _____

9. In general, direct questions are preferable to incomplete declarative sentences.

FAULTY: Gold was discovered in California in the year ___

IMPROVED: In what year was gold discovered in California?

Page 77: Assessment principles

Writing supply items

10. Avoid extraneous clues to the correct answer

FAULTY: A fraction whose denominator is greater than its numerator is a _____

IMPROVED: Fractions whose denominators are greater than their numerators are called _____

Page 78: Assessment principles

ALTERNATIVE RESPONSE ITEM

• 1. Avoid broad general statements if they are to be judged true or false.

• 2. Avoid trivial statements.

• 3. Avoid the use of negative statements.

• 4. Avoid long complex sentences.

• 5. Avoid including two ideas in one statement unless cause-and-effect relationships are being measured.

• 6. If an opinion is used, attribute it to some source unless the ability to identify opinion is being specifically measured.

• 7. True statements and false statements should be equal in length.

• 8. The number of true and false statements should be approximately equal.

Page 79: Assessment principles

Writing TRUE-FALSE Items

1. Avoid the use of “specific determiners.”

FAULTY: No picture-no sound in a television set may indicate a bad 5U4G.

IMPROVED: A bad 5U4G tube in a television set will result in no picture and no sound.

2. Base true-false items upon statements that are absolutely true or false, without qualifications or exceptions.

FAULTY: World War II was fought in Europe and the Far East.

IMPROVED: The primary combat locations in terms of military personnel during World War II were Europe and the Far East.

Page 80: Assessment principles

Writing TRUE-FALSE Items

3. Avoid negatively stated items when possible and eliminate all double negatives.

FAULTY: It is not frequently observed that copper turns green as a result of oxidation.

IMPROVED: Copper will turn green upon oxidizing.

4. Use quantitative and precise rather than qualitative language where possible.

FAULTY: Many people voted for Gloria Arroyo in the 2004 Presidential election.

IMPROVED: Gloria Arroyo received more than 60 percent of the popular vote cast in the Presidential election of 2004.

Page 81: Assessment principles

Writing TRUE-FALSE Items

5. Avoid stereotypic and textbook statements.

FAULTY: From time to time efforts have been made to explode the notion that there may be a cause-and-effect relationship between arboreal life and primate anatomy.

IMPROVED: There is a known relationship between primate anatomy and arboreal life.

6. Avoid making the true items consistently longer than the false items.

7. Avoid the use of unfamiliar or esoteric language.

FAULTY: According to some peripatetic politicos, the raison d’etre for capital punishment is retribution.

IMPROVED: According to some politicians, justification for the existence of capital punishment can be traced to the Biblical statement, “An eye for an eye.”

Page 82: Assessment principles

Writing TRUE-FALSE Items

8. Avoid complex sentences with many dependent clauses.

FAULTY: Jane Austen, an American novelist born in 1790, was a prolific writer and is best known for her novel Pride and Prejudice, which was published in 1820.

IMPROVED: Jane Austen is best known for her novel Pride and Prejudice.

9. It is suggested that the crucial elements of an item be placed at the end of the statement.

FAULTY: Oxygen reduction occurs more readily because carbon monoxide combines with hemoglobin faster than oxygen does.

IMPROVED: Carbon monoxide poisoning occurs because carbon monoxide dissolves delicate lung tissue.

Page 83: Assessment principles

Writing Matching Type Test

1. Matching Exercises should be complete on a single page.

2. Use response categories that are related but mutually exclusive.

3. Keep the number of stimuli relatively small (10-15), and let the number of possible responses exceed the number of stimuli by two or three.

4. The directions should clearly specify the basis for matching stimuli and responses.

5. Keep the statements in the response column short and list them in some logical order

Page 84: Assessment principles

FAULTY: Match List A with List B. You will be given one point for each correct match.

List A | List B
a. cotton gin | a. Eli Whitney
b. reaper | b. Alexander Graham Bell
c. wheel | c. David Brinkley
d. TU54G tube | d. Louisa May Alcott
e. steamboat | e. None of these

• Directions failed to specify the basis for matching
• Lists are enumerated identically
• Responses not listed logically
• Lacks homogeneity
• Equal number of elements
• Use of “None of the above”

Page 85: Assessment principles

85

IMPROVED: Famous inventions are listed in the left-hand column and inventors in the right-hand column below. Place the letter corresponding to the inventor in the space next to the invention for which he is famous. Each match is worth 1 point, and “None of these” may be the correct answer. Inventors may be used more than once.

Inventions Inventors

__ 1. steamboat a. Alexander Graham Bell

__ 2. cotton gin b. Robert Fulton

__ 3. sewing machine c. Elias Howe

__ 4. reaper d. Cyrus McCormick

e. Eli Whitney

f. None of these

Page 86: Assessment principles

86

Writing Multiple Choice

1. It is recommended that the stem be a direct question.

2. The stem should pose a clear, defined, explicit, and singular problem.

FAULTY: Salvador Dali is

a. a famous Indian.

b. important in international law.

c. known for his surrealistic art.

d. the author of many avant-garde plays.

IMPROVED: With which one of the fine arts is Salvador Dali associated?

a. surrealistic painting

b. avant-garde theatre

c. polytonal symphonic music

d. impressionistic poetry

Page 87: Assessment principles

87

Writing Multiple Choice

3. Include in the stem any words that might otherwise be repeated in each response.

FAULTY: Milk can be pasteurized at home by

a. heating it to a temperature of 130°

b. heating it to a temperature of 145°

c. heating it to a temperature of 160°

d. heating it to a temperature of 175°

IMPROVED: The minimum temperature that can be used to pasteurize milk at home is:

a. 130°

b. 145°

c. 160°

d. 175°

Page 88: Assessment principles

88

Writing Multiple Choice

4. Items should be stated simply and understandably, excluding all nonfunctional words from stem and alternatives.

FAULTY: Although the experimental research, particularly that by Hansmocker must be considered equivocal and assumptions viewed as too restrictive, most testing experts would recommend as the easiest method of significantly improving paper-and-pencil achievement test reliability to

a. increase the size of the group being tested.

b. increase the differential weighting of items.

c. increase the objectivity of scoring.

d. increase the number of items.

e. increase the amount of testing time.

IMPROVED: Assume a 10-item, 10-minute paper-and-pencil multiple choice achievement test has a reliability of .40. The easiest way of increasing the reliability to .80 would be to increase

a. group size

b. scoring objectivity

c. differential item scoring weights

d. the number of items

e. testing time

Page 89: Assessment principles

89

Writing Multiple Choice

5. Avoid interrelated items

6. Avoid negatively stated items

FAULTY: None of the following cities is a state capital except

a. Bangor

b. Los Angeles

c. Denver

d. New Haven

IMPROVED: Which of the following cities is a state capital?

a. Bangor

b. Los Angeles

c. Denver

d. New Haven

Page 90: Assessment principles

90

Writing Multiple Choice

7. Avoid making the correct alternative systematically different from other options

8. If possible the alternatives should be presented in some logical, numerical, or systematic order.

9. Response alternatives should be mutually exclusive.

FAULTY: Who wrote Harry Potter and the Goblet of Fire?

a. J. K. Rowling

b. Manny Paquiao

c. Lea Salonga

d. Mark Twain

IMPROVED: Who wrote Penrod?

a. J. K. Rowling

b. J. R. R. Tolkien

c. V. Hugo

d. L. Carroll

Page 91: Assessment principles

91

Writing Multiple Choice

10. Make all responses plausible and attractive to the less knowledgeable and skillful student.

FAULTY: Which of the following statements makes clear the meaning of the word “electron”?

a. An electronic tool

b. Neutral particles

c. Negative particles

d. A voting machine

e. The nuclei of atoms

IMPROVED: Which of the following phrases is a description of an “electron”?

a. Neutral particle

b. Negative particle

c. Neutralized proton

d. Radiated particle

e. Atom nucleus

Page 92: Assessment principles

92

Writing Multiple Choice

11. The response alternative “None of the above” should be used with caution, if at all.

FAULTY: What is the area of a right triangle whose sides adjacent to the right angle are 4 inches and 3 inches long respectively?

a. 7

b. 12

c. 25

d. None of the above

IMPROVED: What is the area of a right triangle whose sides adjacent to the right angle are 4 inches and 3 inches respectively?

a. 6 sq. inches

b. 7 sq. inches

c. 12 sq. inches

d. 25 sq. inches

e. None of the above

Page 93: Assessment principles

93

Writing Multiple Choice

12. Make options grammatically parallel to each other and consistent with the stem.

FAULTY: As compared with the American factory worker in the early part of the 19th century, the American factory worker at the close of the century

a. was working long hours

b. received greater social security benefits

c. was to receive lower money wages

d. was less likely to belong to a labor union.

e. became less likely to have personal contact with employers

IMPROVED: As compared with the American factory worker in the early part of the century, the American factory worker at the close of the century

a. worked longer hours.

b. had more social security.

c. received lower money wages.

d. was less likely to belong to a labor union.

e. had less personal contact with his employer.

Page 94: Assessment principles

94

Writing Multiple Choice

13. Avoid such irrelevant cues as “common elements” and “pat verbal associations.”

FAULTY: The “standard error of estimate” refers to

a. the objectivity of scoring.

b. the percentage of reduced error variance.

c. an absolute amount of possible error.

d. the amount of error in estimating criterion scores.

IMPROVED: The “standard error of estimate” is most directly related to which of the following test characteristics?

a. Objectivity

b. Reliability

c. Validity

d. Usability

e. Specificity

Page 95: Assessment principles

95

Writing Multiple Choice

14. In testing for understanding of a term or concept, it is generally preferable to present the term in the stem and alternative definitions in the options.

FAULTY: What name is given to the group of complex organic compounds that occur in small quantities in natural foods that are essential to normal nutrition?

a. Calorie

b. Minerals

c. Nutrients

d. Vitamins

IMPROVED: Which of the following statements is the best description of a vitamin?

15. Use objective items – items whose correct answers are agreed upon by experts.

Page 96: Assessment principles

96

Factual Knowledge

1. The Monroe Doctrine was announced about 10 years after the

a. Revolutionary War

b. War of 1812

c. Civil War

d. Spanish-American War

Conceptual Knowledge

2. Which of the following statements of the relationship between market price and normal price is true?

a. Over a short period of time, market price varies directly with changes in normal price.

b. Over a long period of time, market price tends to equal normal price.

c. Market price is usually lower than normal price.

d. Over a long period of time, market price determines normal price.

Page 97: Assessment principles

97

Translation from symbolic form to another form, or vice versa

3. Which of the graphs below best represents the supply situation where a monopolist maintains a uniform price regardless of the amounts which people buy?

[Figure: four graphs labeled A, B, C, and D, each showing a supply curve S with axes labeled Price and Quantity]

Page 98: Assessment principles

98

Application

In the following items (4-8) you are to judge the effects of a particular policy on the distribution of income. In each case assume that there are no other changes in policy that would counteract the effect of the policy described in the item. Mark the item:

A. If the policy described would tend to reduce the existing degree of inequality in the distribution of income,

B. If the policy described would tend to increase the existing degree of inequality in the distribution of income, or

C. If the policy described would have no effect, or an indeterminate effect, on the distribution of income.

__ 4. Increasingly progressive income taxes.

__ 5. Confiscation of rent on unimproved

__ 6. Introduction of a national sales tax

__ 7. Increasing the personal exemptions from income taxes

__ 8. Distributing a subsidy to sharecroppers on southern farms

Page 99: Assessment principles

99

Analysis

9. An assumption basic to Lindsay’s preference for voluntary associations rather than government order… is a belief

a. that government is not organized to make the best use of experts

b. that freedom of speech, freedom of meeting, and freedom of association are possible only under a system of voluntary associations.

c. in the value of experiment and initiative as a means of attaining an ever improving society

d. in the benefits of competition

Page 100: Assessment principles

100

Judgments in terms of external criteria

For items 14-16, assume that in doing research for a paper about the English language you find a statement by Otto Jespersen that contradicts one point of view on language you have always accepted. Indicate which of the statements would be significant in determining the value of Jespersen’s statement. For the purpose of these items, you may assume that these statements are accurate. Mark each item using the following key.

A. Significant positively – that is, might lead you to trust his statement and to revise your own opinion.

B. Significant negatively – that is, might lead you to distrust his statement.

C. Has no significance.

__ 14. Mr. Jespersen was professor of English at Copenhagen University.

__ 15. The statement in question was taken from the very first article that Jespersen published.

__ 16. Mr. Jespersen’s books are frequently referred to in other works that you consult.

Page 101: Assessment principles

101

Essay Questions

• 1. Ask questions or set tasks that will require the examinee to demonstrate a command of essential knowledge.

• 2. Ask questions that are determinate, in the sense that experts could agree that one answer is better than another.

• 3. Define the examinee’s task as completely and specifically as possible without interfering with measurements of the achievement intended.

• 4. In general, give preference to more specific questions that can be answered briefly.

• 5. Avoid giving the examinee a choice among optional questions unless special circumstances make such option necessary.

• 6. Test the questions by writing an ideal answer

Page 102: Assessment principles

102

Types of Essays:

• General – extensiveness of responses

• Restrictive Response – reliable scoring

Learning outcomes measured by Essay:

• Explain cause-effect relationship

• Describe applications of principles

• Present relevant arguments

• Formulate tangible hypothesis

• Formulate valid conclusions

• State necessary assumptions

• Describe the limitations of data

• Explain methods and procedures

• Produce, organize, and express ideas

• Integrate learnings in different areas

• Create original forms

• Evaluate the worth of ideas

Page 103: Assessment principles

103

Understanding:

A. Comparison of two phenomena on a single designated basis

Compare the writers of the English Renaissance to those of the nineteenth century with respect to their ability to describe nature.

B. Comparison of two phenomena in general

Compare the French and Russian Revolutions.

C. Explanation of the use or exact meaning of a phrase or statement

The book of John begins “In the beginning was the word…” From what philosophical system does this statement derive?

D. Summary of a text or some portion of it

State the central theme of the Communist Manifesto.

E. Statement of an artist’s purpose in the selection or organization of material

Why did Hemingway describe in detail the episode in which Gordon, lying wounded, engages the oncoming enemy?

What was Beethoven’s purpose in deviating from the orthodox form of a symphony in Symphony No. 6?

Page 104: Assessment principles

104

Application:

A. Causes or effects

Why may too frequent reliance on penicillin for the treatment of minor ailments eventually result in its diminished effectiveness against major invasion of body tissues by infectious bacteria?

B. Analysis

Why was Hamlet torn by conflicting desires?

C. Statement of relationship

It is said that intelligence correlates with school achievement at about .65. Explain this relationship.

D. Illustrations or examples of principles

Name three examples of uses of the lever in typical American homes.

E. Application of rules or principles

Would you weigh more or less on the moon? On the sun? Explain.

F. Reorganization of facts

Some writers have said that the American Revolution was not merely a political revolution against England but also a social revolution, within the colonies, of the poor against the wealthy. Using the same evidence, what other conclusion is possible?

Page 105: Assessment principles

105

Judgment:

A. Decision for or against

Should members of the Communist Party be allowed to teach in American colleges? Why or why not?

B. Discussion

Discuss the likelihood that four-year private liberal arts colleges will gradually be replaced by junior colleges and state universities.

C. Criticism of the adequacy, correctness, or relevance of a statement

The discovery of penicillin has often been called an accident. Comment on the adequacy of this explanation.

D. Formulation of new questions

What should one find out in order to explain why some students of high intelligence fail in school?

Page 106: Assessment principles

Designing Interpretive Exercise

• Guidelines in Writing Interpretive Exercise

• 1. Select introductory material that is in harmony with the objectives of the course.

– Amount of emphasis of various interpretive skills is a factor.

– Do not overload test takers with interpretive items in a particular area.

– Selection of introductory material should be guided by the general emphasis to be given to the measurement of complex achievement.

• 2. Select introductory material that is appropriate to the curricular experience and reading ability of the examinees.

106

Page 107: Assessment principles

Guidelines in Writing Interpretive Exercise

• 3. Select introductory material that is new to pupils.

• 4. Select introductory material that is brief but meaningful.

• 5. Revise introductory material for clarity, conciseness, and greater interpretive value.

• 6. Construct test items that require analysis and interpretation of introductory material.

• 7. Make the number of items roughly proportional to the length of the introductory material.

• 8. Observe all suggestions for constructing objective test items.

107

Page 108: Assessment principles

• Ability to Recognize the Relevance of Information

108

Page 109: Assessment principles

• Ability to Recognize Warranted and Unwarranted Generalizations

109

Page 110: Assessment principles

• Ability to Recognize Inferences

110

Page 111: Assessment principles

• Ability to Interpret Experimental Findings

111

Page 112: Assessment principles

• Ability to Apply Principles

112

Page 113: Assessment principles

• Ability to Recognize Assumptions

113

Page 114: Assessment principles

Reading comprehension

• Bem (1975) has argued that androgynous people are “better off” than their sex-typed counterparts because they are not constrained by rigid sex-role concepts and are freer to respond to a wider variety of situations. Seeking to test this hypothesis, Bem exposed masculine, feminine, and androgynous men and women to situations that called for independence (a masculine attribute) or nurturance (a feminine attribute). The test for masculine independence assessed the subject’s willingness to resist social pressure by refusing to agree with peers who gave bogus judgments when rating cartoons for funniness (for example, several peers might say that a very unfunny cartoon was hilarious). Nurturance, or feminine expressiveness, was measured by observing the behavior of the subject when left alone for ten minutes with a 5-month-old baby. The result confirmed Bem’s hypothesis. Both the masculine sex-typed and the androgynous subjects were more independent (less conforming) on the “independence” test than feminine sex-typed individuals. Furthermore, both the feminine and the androgynous subjects were more “nurturant” than the masculine sex-typed individuals when interacting with the baby. Thus, the androgynous subjects were quite flexible: they performed as masculine subjects did on the “masculine” task and as feminine subjects did on the “feminine” task.

114

35. What is the independent variable in the study?

a. Situations calling for independence and nurturance

b. Situation to make the sex type react

c. Situations to make the androgynous be flexible

d. Situations like sex type, androgynous and sex role concepts

36. What are the levels of the IV?

a. masculine attribute and feminine attribute

b. rating cartoons and taking care of a baby

c. independence and nurturance

d. flexibility and rigidity

Page 115: Assessment principles

Interpreting Diagrams

Instruction. Study the following illustrations and answer the following questions.

Figure 1 [image not reproduced; horizontal axis labels: Pretest, Posttest]

101. Which group received the treatment?

a. group A

b. group B

c. none of the above

102. Why did group B remain stable across the experiment?

a. there is an EV

b. there is no treatment

c. there is the occurrence of ceiling effect

103. What is the problem at the start of the experiment?

a. the groups are nonequivalent

b. the groups are competing with each other

c. the treatment took place immediately

Page 116: Assessment principles

Analysis of Test Results

Reliability, Validity, and Item Analysis

Page 117: Assessment principles

Learning Content

• Levels of Measurement

• Correlation Coefficient

• Reliability

• Validity

• Item Analysis

Page 118: Assessment principles

Objectives

• 1. Determine the use of the different ways of establishing an assessment tool’s validity and reliability.

• 2. Become familiar with the different methods of establishing an assessment tool’s validity and reliability.

• 3. Assess how good an assessment tool is by determining the index of validity, reliability, item discrimination, and item difficulty.

Page 119: Assessment principles

Levels of Measurement

• Nominal

• Ordinal

• Interval

• Ratio

Page 120: Assessment principles

Correlation Coefficient

• Relationship of two variables (X & Y)

• Direction: positive or negative (illustrated on the slide with plots of X and Y)

Page 121: Assessment principles

Degree of Relationship

• 0.80 – 1.00 Very high relationship

• 0.60 – 0.79 High relationship

• 0.40 – 0.59 Substantial/Marked relationship

• 0.20 – 0.39 Low relationship

• 0.00 – 0.19 Negligible relationship
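The bands above can be applied mechanically once a coefficient is in hand. The following is a minimal sketch, assuming the Pearson product-moment formula and made-up quiz scores; the function names and the data are illustrative and do not come from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def describe_relationship(r):
    """Label |r| with the bands listed on the slide; the sign gives the direction."""
    size = abs(r)
    if size >= 0.80:
        label = "very high"
    elif size >= 0.60:
        label = "high"
    elif size >= 0.40:
        label = "substantial/marked"
    elif size >= 0.20:
        label = "low"
    else:
        label = "negligible"
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    return label, direction

# Hypothetical scores of five students on two quizzes.
quiz_1 = [10, 12, 15, 18, 20]
quiz_2 = [11, 14, 14, 19, 22]
r = pearson_r(quiz_1, quiz_2)
print(round(r, 2), describe_relationship(r))
```

The descriptive label says nothing about statistical significance; that is a separate check.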

Page 122: Assessment principles

Testing for Significance

• Nominal: Phi coefficient

• Ordinal: Spearman rho

• Interval & Ratio: Pearson r

• Interval with nominal: Point biserial

• Decision rule:

• If p value ≤ .05: significant relationship

• If p value > .05: no significant relationship
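As a rough sketch of how the coefficients named on this slide relate to one another: Spearman rho is Pearson r computed on ranks, and the point-biserial and phi coefficients are Pearson r applied to variables coded 0/1. The helper names and the data below are hypothetical, and obtaining the p value itself is left to a statistics package.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r for two equal-length numeric lists (interval/ratio data)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Ordinal data: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

def is_significant(p_value, alpha=0.05):
    """Decision rule from the slide, reading the cutoff as p <= .05."""
    return p_value <= alpha

# Hypothetical data: pass/fail on one item and total test scores.
passed = [0, 0, 1, 1]
total = [10, 12, 18, 20]
r_pb = pearson_r(passed, total)                # point-biserial
phi = pearson_r([1, 1, 0, 0], [1, 0, 0, 0])    # phi for two 0/1 variables
rho = spearman_rho([1, 2, 3, 4], [10, 30, 20, 40])
print(round(r_pb, 3), round(phi, 3), round(rho, 3))
```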

Page 123: Assessment principles

Variance

• r² (coefficient of determination)

• Square the correlation coefficient

• Interpretation: proportion of the variability in Y that is accounted for by the variability in X.
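A one-line illustration of the squaring step, using a made-up coefficient of .60 (the value is an assumption chosen only for the example):

```python
r = 0.60                      # hypothetical correlation between X and Y
r_squared = r ** 2            # coefficient of determination
print(f"r = {r:.2f}, r squared = {r_squared:.2f}")
```

Here a correlation of .60 corresponds to roughly 36% shared variability, which shows how quickly the explained share drops as r falls below 1.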

Page 124: Assessment principles

Reliability

• Consistency of scores obtained by the same person when retested with the identical test or with an equivalent form of the test

Page 125: Assessment principles

Test-Retest Reliability

• Repeating the identical test on a second occasion

• Temporal stability

• When variables are stable ex: motor coordination, finger dexterity, aptitude, capacity to learn

• Correlate the scores from the first test and second test.

• The higher the correlation the more reliable

Page 126: Assessment principles

Alternate Form/Parallel Form

• Same person is tested with one form on the first occasion and with another equivalent form on the second

• Equivalence

• Temporal stability and consistency of response

• Used for personality and mental ability tests

• Correlate scores on the first form and scores on the second form

Page 127: Assessment principles

Split half

• Two scores are obtained for each person by dividing the test into equivalent halves

• Internal consistency

• Homogeneity of items

• Used for personality and mental ability tests

• The test should have many items

• Correlate scores of the odd and even numbered items

• Convert the obtained correlation coefficient into a reliability estimate for the full-length test using the Spearman-Brown formula
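The odd-even procedure can be sketched as follows. This assumes the usual two-half form of the Spearman-Brown formula, 2r / (1 + r), and uses invented 0/1 item scores; the function names are illustrative.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown(r_half):
    """Step the half-test correlation up to an estimate for the full-length test."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(item_scores):
    """item_scores: one row per examinee, one score per item."""
    odd = [sum(row[0::2]) for row in item_scores]    # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]   # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))

# Hypothetical responses of five examinees to a six-item test (1 = correct).
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(round(split_half_reliability(scores), 3))
```

The correction is needed because each half is only half as long as the real test, and shorter tests tend to be less reliable.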

Page 128: Assessment principles

Kuder-Richardson (KR #20/KR #21)

• When computing for binary (e.g., true/false) items

• Consistency of responses to all items

• Used if there is a correct answer (right or wrong)

• Use KR #20 or KR #21 formula
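The slide names the two formulas without writing them out. A minimal sketch under common textbook definitions: KR #20 uses the proportion passing each item, while KR #21 uses only the mean and variance of total scores and assumes items of equal difficulty. Using the population variance of the totals is an assumption here (some texts use the sample variance), and the data are invented.

```python
from statistics import mean, pvariance

def kr20(item_scores):
    """KR #20 for rows of 0/1 item scores (one row per examinee)."""
    k = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    p = [mean(row[j] for row in item_scores) for j in range(k)]  # proportion correct per item
    sum_pq = sum(pj * (1 - pj) for pj in p)
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))

def kr21(item_scores):
    """KR #21: needs only the number of items and the mean and variance of totals."""
    k = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    m = mean(totals)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * pvariance(totals)))

# Hypothetical responses of five examinees to a six-item test (1 = correct).
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(round(kr20(scores), 3), round(kr21(scores), 3))
```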

Page 129: Assessment principles

Coefficient Alpha

• The reliability that would result if all values for each item were standardized (z transformed)

• Consistency of responses to all items

• Homogeneity of items

• Used for personality tests with multiple scored-items

• Use the Cronbach’s alpha formula
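A short sketch of the raw-score form of Cronbach's alpha, k/(k-1) × (1 − sum of item variances / variance of total scores). Note this is the unstandardized version, not the z-transformed variant mentioned in the first bullet; the Likert-type responses are invented for illustration.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Raw coefficient alpha for rows of item scores (one row per respondent)."""
    k = len(item_scores[0])
    item_vars = [pvariance([row[j] for row in item_scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 5-point ratings from four respondents on three items.
ratings = [
    [4, 5, 4],
    [3, 4, 3],
    [2, 2, 3],
    [5, 5, 5],
]
print(round(cronbach_alpha(ratings), 3))
```

Because the same variance definition is used in the numerator and denominator, choosing population or sample variance does not change the result.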

Page 130: Assessment principles

Inter-item reliability

• Consistency of responses to all items

• Homogeneity of items

• Used for personality tests with multiple scored-items

• Each item is correlated with every item in the test
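One way to summarize "each item correlated with every item" is the mean of all pairwise item correlations. The sketch below assumes that summary and uses invented ratings; it is not a procedure stated on the slide.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mean_inter_item_correlation(item_scores):
    """Average Pearson r over every pair of items (columns)."""
    k = len(item_scores[0])
    columns = [[row[j] for row in item_scores] for j in range(k)]
    pairs = list(combinations(range(k), 2))
    return sum(pearson_r(columns[a], columns[b]) for a, b in pairs) / len(pairs)

# Hypothetical 5-point ratings from four respondents on three items.
ratings = [
    [4, 5, 4],
    [3, 4, 3],
    [2, 2, 3],
    [5, 5, 5],
]
print(round(mean_inter_item_correlation(ratings), 3))
```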

Page 131: Assessment principles

Scorer Reliability

• Having a sample of test papers independently scored by two examiners

• To decrease examiner or scorer variance

• Clinical instruments employed in intensive individual tests ex. projective tests

• The two scores obtained from the two raters are correlated with each other

Page 132: Assessment principles

Validity

• Degree to which the test actually measures what it purports to measure

Page 133: Assessment principles

Content Validity

• Systematic examination of the test content to determine whether it covers a representative sample of the behavior domain to be measured.

• More appropriate for achievement tests & teacher made tests

• Items are based on instructional objectives, course syllabi & textbooks

• Consultation with experts• Making test-specifications

Page 134: Assessment principles

Criterion-Prediction Validity

• Prediction from the test to any criterion situation over a time interval

• Hiring job applicants, selecting students for admission to college, assigning military personnel to occupational training programs

• Test scores are correlated with other criterion measures ex: mechanical aptitude and job performance as a machinist

Page 135: Assessment principles

Concurrent validity

• Tests are administered to a group on whom criterion data are already available

• Diagnosing for existing status ex. entrance exam scores of students for college with their average grade for their senior year.

• Correlate the test score with the other existing measure

Page 136: Assessment principles

Construct Validity

• The extent to which the test may be said to measure a theoretical construct or trait.

• Used for personality tests. Measures that are multidimensional

– Correlate a new test with a similar earlier test that measures approximately the same general behavior

– Factor analysis

– Comparison of the upper and lower group

– Point-biserial correlation (pass and fail with total test score)

– Correlate subtest with the entire test

Page 137: Assessment principles

Convergent Validity

• The test should correlate significantly with variables it is related to

• Commonly for personality measures

• Multitrait-multimethod matrix

Page 138: Assessment principles

Divergent Validity

• The test should not correlate significantly with variables from which it should differ

• Commonly for personality measures

• Multitrait-multimethod matrix

Page 139: Assessment principles
Page 140: Assessment principles
Page 141: Assessment principles
Page 142: Assessment principles
Page 143: Assessment principles

Item Analysis

• Item Difficulty – The percentage of respondents who answered an item correctly

• Item Discrimination – Degree to which an item differentiates correctly among test takers in the behavior that the test is designed to measure.

Page 144: Assessment principles

Difficulty Index

• Difficulty Index Remark

• .76 or higher Easy Item

• .25 to .75 Average Item

• .24 or lower Difficult Item
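The difficulty index is simply the proportion of examinees who answered the item correctly. A minimal sketch with invented responses follows; treating values between .75 and .76 as easy and values between .24 and .25 as difficult is an assumption made to close the gaps in the table.

```python
def difficulty_index(responses):
    """Proportion of examinees answering the item correctly (1 = correct, 0 = wrong)."""
    return sum(responses) / len(responses)

def difficulty_remark(p):
    """Remark from the table above; boundary handling between bands is assumed."""
    if p > 0.75:
        return "easy"
    if p >= 0.25:
        return "average"
    return "difficult"

# Hypothetical responses of ten examinees to one item.
item = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
p = difficulty_index(item)
print(p, difficulty_remark(p))
```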

Page 145: Assessment principles

Discrimination Index

• .40 and above - Very good item

• .30 - .39 - Good item

• .20 - .29 - Reasonably Good item

• .10 - .19 - Marginal item

• Below .10 - Poor item
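The slides do not say how the discrimination index is computed. One common approach, assumed here, is the upper-lower group method: rank examinees by total score, take the top and bottom 27%, and subtract the proportion correct in the lower group from that in the upper group. The 27% fraction, the function names, and the data are all illustrative.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Proportion correct in the upper group minus proportion correct in the lower group."""
    n_group = max(1, round(len(total_scores) * fraction))
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:n_group], order[-n_group:]
    p_upper = sum(item_correct[i] for i in upper) / n_group
    p_lower = sum(item_correct[i] for i in lower) / n_group
    return p_upper - p_lower

def discrimination_remark(d):
    """Remark from the table above."""
    if d >= 0.40:
        return "very good item"
    if d >= 0.30:
        return "good item"
    if d >= 0.20:
        return "reasonably good item"
    if d >= 0.10:
        return "marginal item"
    return "poor item"

# Hypothetical total scores of ten examinees and their results on one item.
totals = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
d = discrimination_index(item, totals)
print(round(d, 2), discrimination_remark(d))
```

An item that high scorers and low scorers answer correctly at about the same rate yields an index near zero and would be flagged as poor.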