
ASSESSMENT

ADIBAH BINTI ABDUL LATIF

SCHOOL OF EDUCATION, FACULTY OF SOCIAL SCIENCES AND HUMANITIES

TERMINOLOGY

Testing

Measurement

Evaluation

Assessment

TESTING

• A tool to determine a student's ability to complete specific tasks or demonstrate mastery of a skill or knowledge of content.

• The most critical basis for ensuring the validity of the interpretation of students' scores.

• Examples: Q&A session in class, assignment, performance task, test, quiz and final exam.

MEASUREMENT

• A systematic process of assigning numerals (quantitative) to the test administered.

• It can be in raw scores, percentiles, standard scores, etc.

• Examples: assignment marks, total score in a final exam, mean of PLO, KPI score, mean of e-LPPT, ranking score.

12 August 2020 | SPP2032: Educational Measurement and Evaluation

MEASUREMENT

[Figure: measurement diagram showing O1, O2, ..., On and x1, x2, ..., xn, labelled Ability and Score]

EVALUATION

• The process of describing, obtaining, and providing useful information for judging decision alternatives. This process allows one to make a judgment about the desirability or value of something.

• Examples: ABCD; Pass and Fail; HL, TM, MM; description derived from the value (Good, Excellent, Average); description of P1 to P5 in e-LPPT.

ASSESSMENT

• The process of gathering information to monitor and reflect on progress in learning and teaching, and to make educational decisions if necessary.

• Example: Dr A found that her students are excellent in formative assessment, but they cannot perform well in their final exam. What should she do?

• Example: Dr B realises that there are two groups of ability / skill level in his class. What can he do to make sure the learning environment supports students' learning?

WHAT IS ASSESSMENT?

• The word “assess” comes from the Latin verb “assidere” meaning ‘to sit with’

• In assessment, one is supposed to sit with the learner. This implies it is something we do ‘with’ and ‘for’ students and not ‘to’ students.


“Assessment is at the heart of student experience” Brown and Knight (1994)

“If you want to change student learning then change the method of assessment”

Brown, Bull & Pendlebury (1997)

ASSESSMENT

FORMATIVE (continuously) / SUMMATIVE (at the end of lesson)

TRADITIONAL / ALTERNATIVE

ONLINE / OFFLINE

Synchronous

Asynchronous

Online Examination

Asynchronous

Traditional

Restricted Responses

Extended Responses

Alternative: Performance Based

Synchronous Traditional

Manual Invigilation

Online Proctoring

Without Invigilation and

Proctoring

Types of Assessment (Function)

• Assessment FOR Learning: continuous assessment, learning evidence (process), give feedback to students on how to improve.

• Assessment OF Learning: at the end of learning, learning evidence (product), give judgement on students' achievement.

• Assessment AS Learning: students monitor their own learning, use self-assessment.

Types of Assessment (Time)

ASSESSMENT

Type          When           Examples
Placement     Continuous     Pre-test
Formative     Continuous     Assignment, exercise, discussion
Diagnostic    Continuous     Tutorial, remedial
Summative     At the end     Final exam, viva, mid-term exam, one-attempt quiz

Types of Assessment (Interpretation)

• Norm Referenced Test: compare with the norm

• Criterion Referenced Test: compare with the standard

ASSESSMENT IN OBE

Identify outcomes

Determine assessment

Learning activities

Curriculum

PLO-CLO-ULO / STRATEGIES / METHODS / TOOLS

CONSTRUCTIVE RELEVANCY

Distribution of Marks

Table 1: Subjects Without Practical Components

Percentage    Parts Assessed
10-20%        Soft skills (e.g. communication, teamwork, problem solving, responsibility)
40-60%        Academic coursework (tests, quizzes, assignments, papers)
30-40%        Final examination

Distribution of Marks

Table 2: Subjects With Practical Components

Percentage    Parts Assessed
10-20%        Soft skills (e.g. discipline, teamwork, problem solving, ethics)
80-90%        Practical knowledge and skills

Differences between Assessment and Evaluation

Dimension of Difference                               Assessment          Evaluation
Timing                                                Formative           Summative
Focus of Measurement                                  Process-Oriented    Product-Oriented
Relationship Between Administrator and Recipient      Reflective          Prescriptive
Findings, Uses Thereof                                Diagnostic          Judgmental
Ongoing Modifiability of Criteria, Measures Thereof   Flexible            Fixed
Standards of Measurement                              Absolute            Comparative
Relation Between Objects of A/E                       Coöperative         Competitive

PRINCIPLE OF ASSESSMENT

• Well-aligned with educational learning outcomes.

• Assessment should be valid and reliable

• Formative assessments need to scaffold students toward the summative assessment.

• Students should receive feedback on their work in a timely manner.

• Assessment should be inclusive and equitable for all students.

• Assessment is not used to threaten and intimidate students.

• Assessment should help students achieve mastery learning.

VALIDITY (KESAHAN)

• Measuring what should be measured

• The appropriateness of the interpretations made from test scores and other evaluation results with regard to a particular use.

CHARACTERISTIC OF A GOOD TEST

CONTENT VALIDITY

• Most closely related to achievement tests.

• The test represents the topics and cognitive processes in the syllabus.

• Does it measure the learning objectives (cognitive / affective / psychomotor)?

• Table of specification: quiz, test, exam

• Construct formation and operational definition: rubric

• Subject matter expert: for both

TABLE OF SPECIFICATION

ACTIVITY 1

When is the Table of Specifications (Jadual Spesifikasi Item, JSI) constructed?

INTRODUCTION

Kubiszyn & Borich (2003) emphasized the following significance and components of TOS:

1. A Table of Specifications consists of a two-way chart or grid relating instructional objectives to the instructional content. The columns of the chart list the objectives or "levels of skills" (Gredler, 1999) to be addressed; the rows list the key concepts or content the test is to measure.

PURPOSE OF THE TOS

• Ensures content validity.

• Ensures a fair, representative sample of items.

• Keeps the test focused on the important content.

• Determines the weighting / time to be allocated in lectures.

PURPOSE OF THE TOS

• The TOS can also guide lecturers in planning: deciding which topics are more important, how much time a given topic needs, and which assignments / projects can help students learn the topic more meaningfully.

PURPOSE OF THE TOS

According to Bloom et al. (1971), "We have found it useful to represent the relation of content and behaviors in the form of a two dimensional table with the objectives on one axis, the content on the other."

PURPOSE OF THE TOS

2. A Table of Specifications identifies not only the content areas covered in class; it also identifies the performance objectives at each level of the cognitive domain of Bloom's Taxonomy.

Teachers can be assured that they are measuring students' learning across a wide range of content and readings as well as cognitive processes requiring higher order thinking.

PURPOSE OF THE TOS

3. A Table of Specifications is developed before the test is written. In fact it should be constructed before the actual teaching begins.

PURPOSE OF THE TOS

The cornerstone of classroom assessment practices is the validity of the judgments about students' learning and knowledge.

A TOS is one tool that teachers can use to support their professional judgment when creating or selecting tests for use with their students.

PURPOSE OF THE TOS

In order to understand how best to modify a TOS to meet your needs, it is important to understand the goal of this strategy: improving the validity of a teacher's evaluations based on a given assessment. Validity is the degree to which the evaluations or judgments we make as teachers about our students can be trusted based on the quality of evidence we gathered (Wolming & Wikström, 2010).

PURPOSE OF THE TOS

A Table of Specifications helps to ensure that there is a match between what is taught and what is tested. Classroom assessment should be driven by classroom teaching, which itself is driven by course goals and objectives.

Tables of Specifications provide the link between teaching and testing. (University of Kansas, 2013)

FORMULA

Formula A

Relative weight for the importance of content =
(number of TLOs or class periods for one topic ÷ total number of TLOs or class periods) × 100%

Example: (3 ÷ 10) × 100 = 30%

Content                                                TLO / Hours spent    Relative weight of the subject
Topic 1                                                3                    30%
Topic 2                                                1                    10%
Topic 3                                                1                    10%
Topic 4                                                2                    20%
Topic 5                                                1                    10%
Topic 6                                                2                    20%
Total TLO / class periods for teaching the unit        10                   100%
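Formula A can be checked with a few lines of Python. This is a minimal sketch that uses the six topics and class periods from the table; the names `periods` and `relative_weights` are illustrative and not part of the workshop material.

```python
# Class periods (or TLO counts) per topic, copied from the Formula A table.
periods = {
    "Topic 1": 3, "Topic 2": 1, "Topic 3": 1,
    "Topic 4": 2, "Topic 5": 1, "Topic 6": 2,
}


def relative_weights(periods_by_topic):
    """Formula A: (periods for one topic / total periods) * 100."""
    total = sum(periods_by_topic.values())
    if total <= 0:
        raise ValueError("total number of periods must be positive")
    return {topic: n * 100 / total for topic, n in periods_by_topic.items()}


weights = relative_weights(periods)
for topic, pct in weights.items():
    print(f"{topic}: {pct:.0f}%")
```

The weights should always add up to 100%, which is a quick sanity check before moving on to Formula B.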

FORMULA

Formula B

Relative weight for the item =
(% of weight in each Bloom level × total number of items in the test)

Example: (0.3 × 20) = 6

Topics              Knowledge and          Application and     Evaluation and      Totals
                    Comprehension (30%)    Analysis (50%)      Synthesis (20%)     (100%)
Topic 1 (30%)
Topic 2 (10%)
Topic 3 (10%)
Topic 4 (20%)
Topic 5 (10%)
Topic 6 (20%)
Weight for item     6                      10                  4                   20
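The same calculation for Formula B, as a minimal sketch. It assumes the 20-item test and the 30% / 50% / 20% Bloom-level split shown on the slide; the function name `items_per_level` is illustrative.

```python
TOTAL_ITEMS = 20

# Percentage of the test assigned to each group of Bloom levels.
bloom_pct = {
    "Knowledge and Comprehension": 30,
    "Application and Analysis": 50,
    "Evaluation and Synthesis": 20,
}


def items_per_level(total_items, pct_by_level):
    """Formula B: weight of the Bloom level * total number of items."""
    return {level: total_items * pct / 100 for level, pct in pct_by_level.items()}


per_level = items_per_level(TOTAL_ITEMS, bloom_pct)
for level, n in per_level.items():
    print(f"{level}: {n:g} items")
```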

FORMULA

Formula C

Number of questions in each topic for each level of objectives =
(total number of test items × relative weight of the topic × relative weight of the Bloom level)

Example: (20 × 0.3 × 0.3) = 1.8

Topics                 Knowledge and          Application and     Evaluation and      Totals
                       Comprehension (30%)    Analysis (50%)      Synthesis (20%)     (100%)
Topic 1 (30%)          1.8 (2)                3 (3)               1.2 (1)             6
Topic 2 (10%)          0.6 (1)                1 (1)               0.4 (0)             2
Topic 3 (10%)          0.6 (1)                1 (1)               0.4 (0)             2
Topic 4 (20%)          1.2 (1)                2 (2)               0.8 (1)             4
Topic 5 (10%)          0.6 (0)                1 (1)               0.4 (1)             2
Topic 6 (20%)          1.2 (1)                2 (2)               0.8 (1)             4
Number of questions    6                      10                  4                   20
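The cell values of Formula C can be generated in the same way. The sketch below assumes the weights from the table and uses illustrative names (`raw_cells`, the short level codes `K&C`, `A&A`, `E&S`). It also shows that rounding every cell to the nearest whole number gives column totals of 7, 10 and 3 rather than the planned 6, 10 and 4. That is presumably why the table rounds Topic 5 by hand (0.6 to 0 and 0.4 to 1): the rounded cells still have to add up to the planned row and column totals.

```python
TOTAL_ITEMS = 20

topic_pct = {
    "Topic 1": 30, "Topic 2": 10, "Topic 3": 10,
    "Topic 4": 20, "Topic 5": 10, "Topic 6": 20,
}
# K&C = Knowledge and Comprehension, A&A = Application and Analysis,
# E&S = Evaluation and Synthesis.
level_pct = {"K&C": 30, "A&A": 50, "E&S": 20}


def raw_cells(total_items, topics, levels):
    """Formula C: total items * topic weight * Bloom-level weight."""
    return {
        (topic, level): total_items * t_pct * l_pct / 10000
        for topic, t_pct in topics.items()
        for level, l_pct in levels.items()
    }


raw = raw_cells(TOTAL_ITEMS, topic_pct, level_pct)

# Nearest-integer rounding of every cell, with no manual adjustment.
rounded = {cell: round(value) for cell, value in raw.items()}
column_totals = {
    level: sum(rounded[(topic, level)] for topic in topic_pct)
    for level in level_pct
}

print(raw[("Topic 1", "K&C")])
print(column_totals)
```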

RELIABILITY

Test-retest reliability.

• Reliability coefficient is obtained by administering the same test twice and correlating the scores.

• An excellent measure of score consistency as one is directly measuring consistency from administration to administration.
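As an illustration of "correlating the scores", the sketch below computes a Pearson correlation between two administrations of the same test. The five pairs of scores are hypothetical, and `pearson_r` is written out by hand so that the example does not depend on any particular library version.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of the same five students on two administrations.
first_administration = [55, 62, 70, 48, 81]
second_administration = [58, 60, 73, 50, 79]

r = pearson_r(first_administration, second_administration)
print(f"test-retest reliability coefficient: {r:.2f}")
```

A coefficient close to 1 means students kept roughly the same relative standing across the two administrations.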

RELIABILITY

Split Half Test

• Coefficient is obtained by dividing a test into halves, correlating the scores on each half, and then correcting for length (longer tests tend to be more reliable).

• The split can be based on: odd versus even numbered items, randomly selecting items, or manually balancing content and difficulty.

• Advantage: only requires a single test administration.

• Weaknesses:
  - the resultant coefficient will vary as a function of how the test was split;
  - not appropriate for tests where speed is a factor.
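The split-half procedure can be sketched as follows. Two assumptions are made here that the slide does not state: the split is odd versus even item numbers, and the correction for length is the Spearman-Brown formula, r_full = 2r / (1 + r), which is the usual choice. The response matrix is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """item_scores holds one row per student, one 0/1 score per item."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    # Spearman-Brown correction (assumed): estimate for the full-length test.
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical responses: 5 students, 6 items (1 = correct, 0 = wrong).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
]

r_half, r_full = split_half_reliability(responses)
print(f"half-test correlation: {r_half:.2f}, corrected: {r_full:.2f}")
```

Splitting the same responses differently (for example, first half versus second half) would give a different coefficient, which is the weakness noted above.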

RELIABILITY

Alternate Form Reliability

Most standardized tests provide equivalent forms that can be used interchangeably.

These alternative forms are typically matched in terms of content and difficulty.

Scores on pairs of alternative forms for the same examinees are correlated to provide a measure of consistency or reliability.

RELIABILITY

CORRELATION = RELIABILITY

Item Reliability

A 98% repetition of the results can be expected if the test is administered to another group of students.

Activity


WHAT CAN BE CONCLUDED FROM THE GIVEN DIAGRAM?

Validity and Reliability




TEST DURATION

Carey (1988) pointed out that the time available for testing depended not only on the length of the class period but also on students' attention spans.

TEST DURATION

Linn & Gronlund (2000):

1. A true-false test item takes 15 seconds to answer unless the student is asked to provide the correct answer for false questions. Then the time increases to 30-45 seconds.

2. A seven item matching exercise takes 60-90 seconds.

TEST DURATION

3. A four response multiple choice test item that asks for an answer regarding a term, fact, definition, rule or principle (knowledge level item) takes 30 seconds. The same type of test item at the application level may take 60 seconds.

TEST DURATION

4. Any test item format that requires solving a problem, analyzing, synthesizing information or evaluating examples adds 30-60 seconds to a question.

TEST DURATION

5. Short-answer test items take 30-45 seconds.

6. An essay test takes 60 seconds for each point to be compared and contrasted.
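These per-item times can be turned into a rough estimate of total test duration. The sketch below is only an illustration: the item-type keys and the sample paper are hypothetical, and the extra 30-60 seconds for problem-solving items (guideline 4) is not modelled.

```python
# (minimum seconds, maximum seconds) per item, from the guidelines above.
SECONDS_PER_ITEM = {
    "true_false": (15, 15),
    "true_false_with_correction": (30, 45),
    "matching_seven_items": (60, 90),
    "mcq_knowledge": (30, 30),
    "mcq_application": (60, 60),
    "short_answer": (30, 45),
    "essay_point": (60, 60),  # per point to be compared and contrasted
}


def estimate_duration(item_counts):
    """Return (low, high) estimates in seconds for a paper."""
    low = sum(SECONDS_PER_ITEM[kind][0] * n for kind, n in item_counts.items())
    high = sum(SECONDS_PER_ITEM[kind][1] * n for kind, n in item_counts.items())
    return low, high


# Hypothetical paper: number of items (or essay points) of each type.
paper = {
    "true_false": 10,
    "mcq_knowledge": 10,
    "mcq_application": 5,
    "short_answer": 4,
    "essay_point": 6,
}

low, high = estimate_duration(paper)
print(f"estimated answering time: {low / 60:.1f} to {high / 60:.1f} minutes")
```

The estimate covers answering time only; reading instructions and checking answers would need extra allowance.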

TEST DIFFICULTY LEVEL

Difficulty level

• Ensures that the items constructed suit the students' ability level.

• Verifies the item difficulty levels set in the TOS.

• Difficulty analysis is carried out to fix the level of each item before it is stored in the item bank.

• Reviews the difficulty levels assigned to items when the Table of Specifications was written.

• Analysis using classical test theory (CTT) can serve as the basis for item analysis.

Item Difficulty Level: Definition

The percentage of students who answered the item correctly.

Difficulty level      Percentage of students answering correctly
High (Difficult)      ≤ 30%
Medium (Moderate)     > 30% and < 80%
Low (Easy)            ≥ 80%

(Scale: 0 to 100%)
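The definition and the three bands translate directly into code. This is a minimal sketch; the function names are illustrative and the student counts in the usage lines are hypothetical.

```python
def difficulty_index(n_correct, n_students):
    """Percentage of students who answered the item correctly."""
    if n_students <= 0:
        raise ValueError("n_students must be positive")
    return 100 * n_correct / n_students


def difficulty_band(pct_correct):
    """Classify an item using the thresholds given above."""
    if pct_correct <= 30:
        return "High (Difficult)"
    if pct_correct < 80:
        return "Medium (Moderate)"
    return "Low (Easy)"


# Hypothetical class of 40 students.
for n_correct in (12, 20, 32):
    pct = difficulty_index(n_correct, 40)
    print(f"{n_correct}/40 correct = {pct:.0f}% -> {difficulty_band(pct)}")
```

Note that a high percentage means an easy item, so the "High" band refers to high difficulty, not to a high index value.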

• Determining the difficulty index for objective items: the calculation can be set out in table form.

[Table: CTT difficulty index calculation for objective items]

• Determining the difficulty index for subjective items: the calculation is set out in table form.

[Table: CTT difficulty index calculation for subjective items]

Item Difficulty Level: Discussion

• Is a test that nobody failed too easy?

• Is a test on which nobody got 100% too difficult?

• Should items that are “too easy” or “too difficult” be thrown out?

INSTRUMENT QUALITY

Item Discrimination Index

Used to ensure that the items constructed function well. It can be analysed using CTT and IRT. A good item should be able to distinguish between high-achieving and low-achieving students. The discrimination index helps decide which items are discarded and which are kept in the item bank.

How do you analyse the item difficulty index?

What is a “good” value?

If the item has            Ratio of students who answered the item correctly
Positive discrimination    High achievers > low achievers
Negative discrimination    High achievers < low achievers
No discrimination          High achievers = low achievers

What is a “good” value?

Discrimination Index    Item Evaluation
0.40 and above          Very good
0.30-0.39               Good, can be improved
0.20-0.29               Marginal, needs improvement
0.19 and below          Bad, cannot be accepted and needs proper checking

• Example: If there are 40 pupils in a class, divide them into two groups: 20 high-achieving pupils and 20 low-achieving pupils. Suppose that for item 8, 16 pupils from the high-achieving group answered correctly, while only 4 pupils from the low-achieving group answered it correctly.

Then:

K_t = 16/20 = 0.8 (80%)

K_r = 4/20 = 0.2 (20%)

D = K_t - K_r = 0.8 - 0.2 = 0.6

(Conclusion: item 8 is a good item.)
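The worked example for item 8 can be reproduced with a short function. This is a sketch under the same assumption as the example (two groups of equal size); the function names are illustrative, and the labels follow the evaluation table for the discrimination index.

```python
def discrimination_index(correct_high, correct_low, group_size):
    """D = proportion correct in the high group - proportion correct in the low group."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (correct_high - correct_low) / group_size


def evaluate_item(d):
    """Map a discrimination index to the evaluation bands in the table."""
    if d >= 0.40:
        return "Very good"
    if d >= 0.30:
        return "Good"
    if d >= 0.20:
        return "Marginal"
    return "Bad"


# Item 8: 16 of 20 high achievers and 4 of 20 low achievers answered correctly.
d = discrimination_index(16, 4, 20)
print(f"D = {d:.1f} -> {evaluate_item(d)}")
```

A negative value of D means the low-achieving group outperformed the high-achieving group on that item, which falls in the lowest band.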

TEST THEOREM

If an individual can perform the most difficult aspects of the objective, the instructor can "assume" the lower levels can be done.

However, if testing the lower levels, the instructor cannot "assume" the individual can perform the higher levels.

ALTERNATIVE ASSESSMENT

• Goes beyond traditional, psychometrically driven testing. Designed to assess learning tasks that stimulate critical thinking skills and require students to produce or demonstrate knowledge rather than simply recall information provided to them by others.

• Alternative assessments are used to determine what students can and cannot do, in contrast to what they know or do not know.

WHEN TO USE?

• As a substitute for pencil-and-paper tests.

• To measure higher level skills or other skills that cannot be measured by pencil and paper.

• E.g. acting skills, balancing, counting, drawing, experimenting, interviewing, musical skills, physical education skills, speaking skills, writing skills.

Characteristics of Alternative Assessment

• HUMAN JUDGMENT IN SCORING

• REAL WORLD APPLICATIONS

• MEANINGFUL INSTRUCTIONAL TASK

• HIGHER LEVEL OF THINKING

• STUDENT PERFORMANCE

ASSESSMENT

CONVENTIONAL / ALTERNATIVE

AUTHENTIC / PERFORMANCE BASED

LITERATURE REVIEW ANALYSIS

ALTERNATIVE ASSESSMENT LADDER

The Ladder of Alternative Assessment

Examples of authentic assessment

• Research Project

• Debate

• Writing Speech / summary

• Studio Work

• Portfolio

• Article Review

• Writing Journal / proposal

• Case Study

AUTHENTIC ASSESSMENT

New Academia Learning Innovation

Not only performance based, but also takes place in a real setting.

Emphasizes process more than product

Soft skills development

Holistic assessment

Rubric

TEACHING PRACTICES

CAPSTONE PROJECT

SERVICE LEARNING

2U2I PROGRAMME

WORK BASED LEARNING

JOB CREATION

SCORING ALTERNATIVE ASSESSMENT

METHOD

• CHECKLIST

• RATING SCALE

• RUBRIC: HOLISTIC, ANALYTIC

GRADED ASSIGNMENT

• Develop one table of specification for your final exam, using the formula given in this workshop.

• Submit a soft copy (in Excel format).

• Individual/ Group Assignment according to your course.

GRADED ASSIGNMENT

• Analyze your final examination items using classical test theory.

• Submit a soft copy (in Excel format).

• Individual/ Group Assignment according to your course

“Students can escape bad teaching but they cannot escape bad assessment.” Boud, 1995


Give full measure and weight with justice


Give just measure and weight

Look at the measure, not the score.

Emphasize the outcome, not the output.

THANK YOU