A Policy Maker's Guide to International Achievement Studies

Margaret Forster, ACER, [email protected]

Australian Council for Educational Research (ACER), ACEReSearch
Policy Maker's Guides, ACER Archives, 2000

Follow this and additional works at: https://research.acer.edu.au/policy_makers_guides

Part of the Educational Assessment, Evaluation, and Research Commons, and the International and Comparative Education Commons

Recommended Citation
Forster, M. (2000). A Policy Makers Guide to International Achievement Studies. https://research.acer.edu.au/policy_makers_guides/1

This Book is brought to you by the ACER Archives at ACEReSearch. It has been accepted for inclusion in Policy Maker's Guides by an authorized administrator of ACEReSearch. For more information, please contact [email protected].


The ACER Core-Funded Research Program

The Australian Council for Educational Research conducts a core program of research funded by an annual grant from the States and Territories and the Commonwealth.

This annual grant allows research to be undertaken into issues of general importance in Australian education and complements research projects commissioned from time to time by individual States, Territories and the Commonwealth.

Priorities for the ACER core research program are reviewed every three years. The three-year program under which this work was completed focused on an over-arching question, ‘What can be done to improve learning outcomes?’, and addressed five priority areas:

• assessment and reporting to improve learning

• improving literacy and numeracy learning

• improving outcomes for Indigenous students

• teaching practices to improve learning

• vocational outcomes and lifelong learning


This publication is the result of research that forms part of a program supported by a grant to the Australian Council for Educational Research by State, Territory and Commonwealth governments. The support provided by these governments is gratefully acknowledged.

The views expressed in this publication are those of the author and not necessarily those of the State, Territory and Commonwealth governments.

First published 2000 by the Australian Council for Educational Research Ltd
19 Prospect Hill Road, Camberwell, Melbourne, Victoria 3124, Australia

Copyright © 2000 Australian Council for Educational Research

All rights reserved. Except as provided for by Australian copyright law, no part of this book may be reproduced without written permission from the publisher.

ISBN 0-86431-360-8

Printed in Australia by RossCo Print

Table of Contents

Introduction

What are the purposes of international achievement studies?

Why the interest in international achievement studies?

What are some examples of international achievement studies?

Who reports international achievement data and how are these data reported?

In what ways are international achievement data used and with what impact?

What concerns have been raised?

International achievement data and informed decision making

Useful websites

Endnotes

This guide provides policy makers with research-based information about international achievement studies.

Good decision making at all levels of an education system is facilitated by easily accessible, relevant, and reliable information.

Many indicators provide useful input to educational decision making; but the most important indicators are those which address the central concern of education: the promotion of student learning.

Education systems monitor student learning, with the fundamental intention of promoting learning, by collecting, analysing and reporting student achievement data. Given that state, national and international achievement studies are both time consuming and expensive, it seems prudent to reflect on this effort:

What are the purposes of these studies?

How are data reported and used?

What concerns have been raised about these studies?

How can we ensure that data will provide evidence for informed decision making?

INTRODUCTION

International achievement studies focus on the collection and analysis of cross-national information. Two kinds of information usually are collected:

• data on student achievement in particular subject areas at particular ages and stages of schooling; and

• background information including characteristics of students, teachers and schools.

In some studies curriculum information is central also.

Achievement data are collected through tests administered to samples of students selected to be representative of national populations. Background information is collected by means of questionnaires completed by students, their teachers and principals.

Two main agencies direct international achievement studies:

• IEA, the International Association for the Evaluation of Educational Achievement; and

• OECD, the Organisation for Economic Co-operation and Development.

The US Educational Testing Service (ETS) also has conducted several studies (International Assessment of Educational Progress, IAEP studies), as has SACMEQ, the Southern Africa Consortium for Monitoring Educational Quality.

WHAT ARE THE PURPOSES OF INTERNATIONAL ACHIEVEMENT STUDIES?

The studies aim to

• provide policy makers and educational practitioners with information about their education system in relation to other systems; and

• assist policy makers and educational practitioners to understand the reasons for observed differences in the achievements of students from different educational systems.

To achieve these aims the studies examine the impact and effect on educational systems of policies that are applied consistently (in general) within nations but which may vary across nations:

The understandings we obtain from cross-national comparisons of such policies as age of school entry, hours and methods of instruction, and teacher training, can provide us with new insights into the performance of our own educational system in general, and of the relationship between student performance and its antecedents and consequences in particular.[1]

In international studies, the world is viewed as a global educational laboratory where different national policies and practices yield different educational outcomes. The underlying assumption is that differences in student performance between countries can be linked to characteristics of particular education systems. It is recognised that these characteristics need to be understood in their broader cultural and economic contexts.

WHY THE INTEREST IN INTERNATIONAL ACHIEVEMENT STUDIES?

Governments are convinced that future economic competitiveness depends upon high levels of knowledge and skills in the working population: better performance is seen as essential to a nation’s future standing in the global economy. The link between future economic performance and the current achievement of school populations, particularly in literacy, numeracy, and science, is assumed.

Researchers are interested in using the information provided to explore associations between policies, instructional approaches and achievement that might assist policy makers and teachers to improve student learning. Researchers also recognise that international studies provide a context for the development of research methodologies and analytical approaches that can be of benefit in other educational contexts.

Is this interest well-founded?

Evidence for a causal link between the achievement of a country’s school population and economic performance is inconclusive.

According to OECD research, there is consistent evidence from multiple sources of a strong positive relationship between educational attainment levels and productivity growth at a national level.[2] The relationship is strongest when comparing less developed with more developed countries. However, there is a debate about the direction of causality.[3]

There is also evidence at an individual level of a relationship between educational attainment and employment and earnings.[4]

IEA publications indicate that there is ‘very little firm evidence to support the widely-held view that there is a strong and direct causal connection between mean student test scores for nations and their economic competitiveness’.[5] Some recent research studies conclude that there is no evidence to demonstrate this link.[6]

Setting aside considerations of economic performance, it is clear that international achievement studies do provide information with the potential to improve student learning by informing decision making.[7] The ways in which international achievement data are reported and used to assist policy makers are the focus of a large part of this guide.

It also is evident that international studies do provide a context for the development of analytical techniques that can be applied in other educational contexts. For example, pioneering work in the use of regression and multivariate analyses was undertaken in IEA studies.[8]

WHAT ARE SOME EXAMPLES OF INTERNATIONAL ACHIEVEMENT STUDIES?

IEA reading, mathematics and science studies

IEA, an international non-government co-operative organisation comprising research centres and ministries of education, has conducted several assessments of reading, mathematics and science, as well as other subjects, since 1959.

The recent Third International Mathematics and Science Study (TIMSS) is the largest international comparative study of educational achievement ever undertaken. The 1994–5 testing included 45 countries, more than 15 000 schools, and more than half a million students. Testing at five grade levels (3rd, 4th, 7th, 8th and final year of secondary school) was conducted in more than 30 different languages. Data from student, teacher, and principal questionnaires were collected also.

OECD/PISA: the Programme for International Student Assessment

PISA assesses how far students approaching the end of compulsory schooling (defined as 15 year olds) have acquired some of the knowledge and skills essential for full participation in society. Assessments take place every three years (beginning in 2000), in three domains: reading literacy, mathematical literacy and scientific literacy. Thirty-three countries participated in the first PISA cycle.

International Assessment of Educational Progress (IAEP) mathematics and science studies

Educational Testing Service (ETS) conducted international assessments of mathematics and science achievements of 13 year olds in 1988 and in 1990–1. Seven countries participated in the first study, 20 in the second.

SACMEQ studies of mathematics and reading

SACMEQ, a network of ministries of education in the Southern African sub-region, has conducted two assessments (1995 and 1999) of the mathematics and reading achievement of Grade 6 students. Fourteen countries participated in the 1999 study.

WHO REPORTS INTERNATIONAL ACHIEVEMENT DATA AND HOW ARE THESE DATA REPORTED?

Data are reported by

• agencies directing the studies (official study reports);

• national agencies co-ordinating the studies within participating countries (official national reports);

• independent educational researchers; and

• the press.

Data usually are reported showing

• rank ordering of countries (sometimes called ‘league tables’) by whole test and sub-test including item level;

• the spread of student achievement (distributions);

• subgroup differences in achievement; and

• relationships between achievement and background variables.

A description of the scales against which student achievement is reported sometimes accompanies official study reports.

Examples of the ways in which data are reported are shown in Examples 1 to 13.

Rank ordering of countries

1. Countries are shown in rank order according to the average (mean) achievement of their students on the complete set of test items.

Where league tables appear in newspaper articles they are presented in their most basic form as tables or graphs without qualifiers (see Example 1). Tables and graphs of this kind are simple and visually dramatic but are open to misinterpretation (see ‘What concerns have been raised?’).

Where league tables appear in official reports they usually are presented with qualifiers including years of formal schooling, average age of participants, mean and confidence intervals, sampling irregularities, and statistical significance of country differences (see Examples 2 and 3). Tables and graphs of this kind are complex, but provide information that assists readers to interpret the relative positions of countries.

Example 1 Reporting student achievement: The press (league table)

(The Age, Melbourne, Australia, 12 June 1997)

Graphs of this kind are simple and visually dramatic but are open to misinterpretation (see ‘What concerns have been raised?’).

Example 2 Reporting student achievement: Official study reports (league table with qualifiers)

TIMSS Science achievement [9]

Tables and graphs of this kind are complex, but provide information that assists readers to interpret the relative positions of countries. For example, without a qualifier related to sample participation rates, Bulgaria and the Netherlands would have appeared towards the top of the country rankings. Slovenia would have appeared in fifth place without a caveat related to age/grade specifications: in Slovenia a high percentage of older students participated.
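The ‘mean and confidence interval’ qualifier can be illustrated with a short calculation. The Python sketch below assumes the interval is the mean plus or minus two standard errors, as labelled in the TIMSS display, and uses the means and standard errors reported for the four highest-ranked countries that met the sampling guidelines. The function name is illustrative only.

```python
def confidence_interval(mean, standard_error, multiplier=2.0):
    """Return (lower, upper) bounds of mean +/- multiplier * standard error."""
    margin = multiplier * standard_error
    return (mean - margin, mean + margin)


# Means and standard errors as reported in the TIMSS eighth-grade science table.
reported = {
    "Singapore": (607, 5.5),
    "Czech Republic": (574, 4.3),
    "Japan": (571, 1.6),
    "Korea": (565, 1.9),
}

intervals = {
    country: confidence_interval(mean, se)
    for country, (mean, se) in reported.items()
}

for country, (low, high) in intervals.items():
    print(f"{country}: {low:.1f} to {high:.1f}")
```

On these figures the intervals for the Czech Republic, Japan and Korea overlap, while the interval for Singapore lies above all three, which is one reason rank alone can overstate differences between neighbouring countries.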

Distributions of Achievement in the Sciences – Upper Grade (Eighth Grade*)

The original display plots, for each country, the 5th, 25th, 75th and 95th percentiles of performance and the mean with its confidence interval (±2SE) on the Science Achievement Scale Score axis (200 to 800). International Average = 516 (Average of All Country Means).

Country                 Mean   (SE)    Years of Formal Schooling   Average Age
Singapore               607    (5.5)   8                           14.5
Czech Republic          574    (4.3)   8                           14.4
Japan                   571    (1.6)   8                           14.4
Korea                   565    (1.9)   8                           14.2
Hungary                 554    (2.8)   8                           14.3
England                 552    (3.3)   9                           14.0
Belgium (Fl)            550    (4.2)   8                           14.1
Slovak Republic         544    (3.2)   8                           14.3
Russian Federation      538    (4.0)   7 or 8                      14.0
Ireland                 538    (4.5)   8                           14.4
Sweden                  535    (3.0)   7                           13.9
United States           534    (4.7)   8                           14.2
Canada                  531    (2.6)   8                           14.1
Norway                  527    (1.9)   7                           13.9
New Zealand             525    (4.4)   8.5-9.5                     14.0
Hong Kong               522    (4.7)   8                           14.2
Switzerland             522    (2.5)   7 or 8                      14.2
Spain                   517    (1.7)   8                           14.3
France                  498    (2.5)   8                           14.3
Iceland                 494    (4.0)   8                           13.6
Latvia (LSS)            485    (2.7)   8                           14.3
Portugal                480    (2.3)   8                           14.5
Lithuania               476    (3.4)   8                           14.3
Iran, Islamic Rep.      470    (2.4)   8                           14.6
Cyprus                  463    (1.9)   8                           13.7

Countries Not Satisfying Guidelines for Sample Participation Rates (See Appendix A for Details):
Australia               545    (3.9)   8 or 9                      14.2
Austria                 558    (3.7)   8                           14.3
Belgium (Fr)            471    (2.8)   8                           14.3
Bulgaria                565    (5.3)   8                           14.0
Netherlands             560    (5.0)   8                           14.3
Scotland                517    (5.1)   9                           13.7

Countries Not Meeting Age/Grade Specifications (High Percentage of Older Students; See Appendix A for Details):
Colombia                411    (4.1)   8                           15.7
Germany                 531    (4.8)   8                           14.8
Romania                 486    (4.7)   8                           14.6
Slovenia                560    (2.5)   8                           14.8

Countries With Unapproved Sampling Procedures at Classroom Level (See Appendix A for Details):
Denmark                 478    (3.1)   7                           13.9
Greece                  497    (2.2)   8                           13.6
Thailand                525    (3.7)   8                           14.3

Unapproved Sampling Procedures at Classroom Level and Not Meeting Other Guidelines (See Appendix A for Details):
Israel                  524    (5.7)   8                           14.1
Kuwait                  430    (3.7)   9                           15.3
South Africa            326    (6.6)   8                           15.4

* Eighth grade in most countries; see Table 2 for information about the grades tested in each country.
~ Met guidelines for sample participation rates only after replacement schools were included (see Appendix A for details).
1 National Desired Population does not cover all of International Desired Population (see Table A.2). Because coverage falls below 65%, Latvia is annotated LSS for Latvian Speaking Schools only.
2 National Defined Population covers less than 90 percent of the National Desired Population (see Table A.2).
() Standard errors appear in parentheses. Because results are rounded to the nearest whole number, some totals may appear inconsistent.
Source: IEA Third International Mathematics and Science Study (TIMSS), 1994-95

Example 3 Reporting student achievement: Official study reports (showing statistical significance of country differences)

TIMSS Science achievement [10]

Although they are visually demanding, reports of this kind assist readers to understand the significance of reported differences in the achievements of countries. For example, this report shows that although the Czech Republic, Japan and Korea were ranked second, third and fourth (Example 2) there was no significant difference in their scores.
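The reasoning behind a chart of this kind can be sketched in a few lines of Python. The sketch assumes the country samples are independent, that a normal approximation applies, and that the adjustment for multiple comparisons is a Bonferroni-style division of the .05 level by the 40 comparisons each of the 41 countries takes part in. The exact procedure used in the TIMSS report is not described here, so the function names and the adjustment are illustrative assumptions. The means and standard errors are those reported in Example 2.

```python
from math import sqrt
from statistics import NormalDist


def critical_value(alpha=0.05, comparisons=1):
    """Two-sided normal critical value with a Bonferroni-style adjustment."""
    adjusted_alpha = alpha / comparisons
    return NormalDist().inv_cdf(1 - adjusted_alpha / 2)


def differs_significantly(mean_a, se_a, mean_b, se_b, alpha=0.05, comparisons=1):
    """Compare two independent country means using their standard errors."""
    z = abs(mean_a - mean_b) / sqrt(se_a ** 2 + se_b ** 2)
    return z > critical_value(alpha, comparisons)


# (mean, standard error) pairs as reported in Example 2.
singapore, czech, japan, korea = (607, 5.5), (574, 4.3), (571, 1.6), (565, 1.9)

# Assumption: each country is compared with the 40 other countries.
print(differs_significantly(*japan, *korea))                   # unadjusted
print(differs_significantly(*japan, *korea, comparisons=40))   # adjusted
print(differs_significantly(*singapore, *czech, comparisons=40))
```

Under these assumptions the six-point gap between Japan and Korea would be flagged as significant without the adjustment but not with it, whereas the gap between Singapore and the Czech Republic remains significant either way. This is consistent with the report's finding of no significant difference among the countries ranked second to fourth.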

Multiple Comparisons of Achievement in the Sciences – Upper Grade (Eighth Grade*)

Instructions: Read across the row for a country to compare performance with the countries listed in the heading of the chart. The symbols indicate whether the mean achievement of the country in the row is significantly lower than that of the comparison country, significantly higher than that of the comparison country, or if there is no statistically significant difference between the countries (see note 1).

[Chart: grid of pairwise comparisons, with the same countries listed across the heading and down the rows. Countries are ordered by mean achievement across the heading and down the rows: Singapore, Czech Republic, Japan, Korea, Bulgaria, Netherlands, Slovenia, Austria, Hungary, England, Belgium (Fl), Australia, Slovak Republic, Russian Federation, Ireland, Sweden, United States, Germany, Canada, Norway, New Zealand, Thailand, Israel, Hong Kong, Switzerland, Scotland, Spain, France, Greece, Iceland, Romania, Latvia (LSS), Portugal, Denmark, Lithuania, Belgium (Fr), Iran, Islamic Rep., Cyprus, Kuwait, Colombia, South Africa.]

Symbols used in the chart:
• Mean achievement significantly higher than comparison country
• No statistically significant difference from comparison country
• Mean achievement significantly lower than comparison country

* Eighth grade in most countries; see Table 2 for information about the grades tested in each country.
1 Statistically significant at .05 level, adjusted for multiple comparisons.
Because coverage falls below 65%, Latvia is annotated LSS for Latvian Speaking Schools only.
Countries shown in italics did not satisfy one or more guidelines for sample participation rates, age/grade specifications, or classroom sampling procedures (see Appendix A for details).
Source: IEA Third International Mathematics and Science Study (TIMSS), 1994-95.

2. Countries are shown in rank order according to the average (mean) achievement of their students on subsets of test items.

For example, country rankings in a science study may be reported separately on subsets of items addressing earth science, life science, physics, chemistry, environmental issues, and the nature of science. Displays of this kind usually are found in national reports and independent research reports.

Given that the means rarely are reported with qualifiers, and that the items may not have been selected to adequately represent a curriculum area, the information needs to be interpreted with caution. Nevertheless, these breakdowns have the advantage of providing information that is masked by overall means. Students may perform particularly well, or poorly, in certain content areas and this may inform curriculum review.

Examples 4 and 5 illustrate increasingly detailed content-level reporting.

Example 4 Reporting student achievement: Independent research reports (league tables showing mean scores for content areas of the test)

Average mathematics scores for specific content areas for fourth grade students (national per cent correct in each area).[11]

Common Fractions              Geometry: position and shapes
Singapore         71.8        Australia         71.9
Hong Kong         65.3        England           70.6
Japan             62.5        Netherlands       70.3
Korea             61.5        Hong Kong         70.1
Hungary           61.2        Canada            69.4
Netherlands       60.2        USA               67.7
USA               53.7        Czech Republic    67.4
Israel            53.3        Japan             66.7
Australia         52.0        Singapore         65.3
Canada            50.4        New Zealand       65.3
International     50.0        Korea             65.1
Czech Republic    49.7        Hungary           62.3
England           49.1        International     61.7
Thailand          46.4        Israel            59.1
New Zealand       45.3        Norway            58.3
Norway            39.2        Thailand          47.6

3. Countries are shown in rank order according to the percentage of students scoring in the top ten per cent of students internationally.

Reports of this kind usually are found in independent research articles. They provide a different perspective on relative performance. Example 6 shows the percentage of students in each country scoring in the top 10 per cent of students in the world. About 45% of the participating students from Singapore scored in this category.

The highest scoring ten per cent contained about 13% of Australian students. ‘Perhaps we [in Australia] are not doing as much for our best students as we could, to challenge them to achieve their potential,’ concludes the researcher who presented these data.[13]
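A measure of this kind can be sketched as follows. The Python example pools all students' scores, finds the score that marks the top ten per cent, and reports the share of each country's students at or above it. The function name and the data are hypothetical, and the sketch ignores sampling weights, which a real analysis of survey data would need to apply.

```python
from math import ceil


def top_decile_shares(scores_by_country):
    """Percentage of each country's students in the pooled top ten per cent."""
    pooled = sorted(
        (score for scores in scores_by_country.values() for score in scores),
        reverse=True,
    )
    top_count = max(1, ceil(len(pooled) / 10))
    cutoff = pooled[top_count - 1]  # lowest score still inside the top group
    return {
        country: 100.0 * sum(score >= cutoff for score in scores) / len(scores)
        for country, scores in scores_by_country.items()
    }


# Hypothetical scores, for illustration only (not TIMSS data).
example = {
    "Country A": [700, 650, 600, 550],
    "Country B": [500, 450, 400, 350, 300, 250],
}
shares = top_decile_shares(example)
print(shares)
```

Students tied at the cutoff score are counted as being in the top group, so with tied scores the pooled top group can be slightly larger than ten per cent.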

Example 5 Reporting student achievement: Independent research reports (league tables showing mean scores for each item within areas of the test)

This table shows the mathematics scores for 13 year olds on each item for one of the 12 tasks (Dice) used in the TIMSS performance assessment.[12] The ‘mean percentage’ score is the per cent of total possible points on each item averaged over students.

Mean percentage scores on items within task

Items: Q1 Complete table (2 marks); Q2 Describe pattern (1 mark); Q3 Apply algorithm (2 marks); Q4 Count frequencies (2 marks); Q5A Identify most frequent number (1 mark); Q5B Explain findings (1 mark).

Country               Overall mean for task   Q1    Q2    Q3    Q4    Q5A   Q5B
Singapore             84                      97    90    95    84    95    44
England               79                      97    83    93    73    90    38
Switzerland           79                      91    86    94    69    86    45
Canada                77                      92    84    90    75    88    31
Netherlands           76                      97    82    96    72    87    21
Scotland              76                      93    73    93    70    87    41
Sweden                74                      94    65    92    71    81    44
Czech Rep.            73                      93    75    83    73    78    39
United States         71                      89    76    88    69    77    29
International Mean    73                      90    71    90    71    83    33

Showing distributions of student achievement

Box and whisker displays sometimes are used to illustrate the distributions of student achievement for each participating country. Displays of this kind, which are found in official study reports, illustrate the spread of results between the lowest and highest five per cent of students in any country (see Example 7).

Example 6 Reporting student achievement: Independent research reports (league tables illustrating achievements of high scoring students)

Country percentages of students in world’s top ten per cent, TIMSS population 2.[13]

[Bar chart: percentage of students in the world’s top ten per cent, shown separately for Mathematics and Science, for Singapore, Korea, Japan, Hong Kong, Czech Rep., Australia, Hungary, Canada, England, NZ and USA. Horizontal axis: 10 to 40 per cent.]

Example 7 Reporting student achievement: Official study reports (illustrating the distributions of student achievement)

Lower grade mathematics achievement in TIMSS [14]

[Box and whisker chart: distribution of scores and mean score for each country, listed with its average age (between 8.6 and 9.9 years), on a scale from 250 to 800. The international average is 470 (average of all country means). Each country is marked as higher than Australia, equivalent to Australia, or lower than Australia. Countries shown: New Zealand, Singapore, Korea, Japan, Hong Kong, Czech Rep., *Netherlands, *Slovenia, *Australia, *Austria, *Hungary, *Latvia, *Scotland, Ireland, Canada, *England, *Thailand, Cyprus, Greece, Portugal, Norway, Iceland, Iran, Isl. Rep., United States.]

*These countries did not meet all the sampling criteria.

Showing subgroup differences

Subgroup differences are shown in tabular form or illustrated graphically. Displays of this kind usually are found in official reports. They reveal between- and within-country differences that can be masked by country means. Example 8 shows the achievements of male and female students in the 1988 IAEP mathematics study.

Example 8 Reporting student achievement: Official study reports (illustrating sub-group achievement)

Displays of this kind can illustrate within-country differences that can be masked by country means.

Average mathematics proficiency by gender, age 13, IAEP 1988 [15]

[Chart: average proficiency of males and females for each population, plotted on a scale from 0 to 1000, with a marker for a statistically significant difference between groups at the 0.05 level.]

Population                  Males          Females
Korea                       576.7 (3.4)    558.0 (3.9)
Quebec (French)             546.3 (4.2)    539.3 (3.0)
British Columbia            539.6 (2.8)    541.3 (2.6)
Quebec (English)            534.1 (3.0)    537.3 (2.3)
New Brunswick (English)     526.6 (4.3)    529.0 (3.4)
Ontario (English)           517.8 (4.4)    514.6 (3.3)
New Brunswick (French)      516.8 (4.2)    513.4 (3.6)
Spain                       523.2 (5.3)    499.9 (5.0)
United Kingdom              507.0 (5.0)    512.5 (3.9)
Ireland                     508.2 (5.7)    499.6 (3.4)
Ontario (French)            480.6 (3.4)    482.7 (3.0)
United States               474.6 (6.4)    473.2 (5.1)

Jackknifed standard errors are presented in parentheses.

Background data are missing from 31% of the New Brunswick (English) students. The resultant effect on the differences between groups from this particular province is estimated to be less than one standard error.

Reprinted by permission of Educational Testing Service, the copyright owner.

Describing achievement scales

An elaboration of the scales against which student achievement is reported sometimes accompanies results in official reports. Different positions on the scale are described in words (see Example 9) and/or illustrated with tasks (see Example 10). These displays help readers to understand the underlying dimension against which achievement is being monitored.

Example 9 Reporting achievement: Official study reports (describing positions on achievement scales)

The first IAEP Science Scale.[16]

Descriptions of this kind help readers to understand the underlying dimension against which achievement is being monitored.

Level 700 Integrate scientific information and experimental evidence

Students at this level can interpret experimental data that involve several variables. They also can interrelate information represented in a variety of forms: text, graphs, figures, and diagrams. Students can make predictions based on data and observations and are aware of limitations of extrapolations. Students demonstrate a growing understanding of more advanced scientific knowledge and concepts, such as the definition of a calorie or the concept of chemical change.

Level 600 Understand and apply intermediate scientific knowledge and principles

Students at this level demonstrate an understanding of intermediate scientific facts and principles and can apply this understanding in designing experiments and interpreting data. They also can interpret figures and diagrams used to convey scientific information. Students at this level can infer relationships and draw conclusions by applying facts and principles, particularly from physical science.

Level 500 Use scientific procedures and analyse scientific data

Students at this level have a grasp of experimental procedures used in science, such as designing experiments, controlling variables, and using equipment. They can identify the best conclusions drawn from data on a graph and the best explanation for observed phenomena. Students also understand some concepts in a variety of science content areas, including the Life Sciences, Physical Sciences, and Earth and Space Sciences.

Level 400 Understand and apply simple scientific principles

Students at this level exhibit growing knowledge in the Life Sciences, particularly human biological systems, and can apply some basic principles from the Physical Sciences, including force. They also display a beginning understanding of some of the basic methods of reasoning used in science, including classification and interpretation of statements.

Level 300 Know everyday science facts

Students at this level know some general science facts of the type that can be learned from everyday experiences. For example, they exhibit some rudimentary knowledge concerning the environment and animals.

Reprinted by permission of Educational Testing Service, the copyright owner.

Page 21: A Policy Makers Guide to International Achievement Studies

Example 10 Reporting achievement: Official study reports—illustratingpositions on achievement scales

TIMSS international science scale17

Displays of this kind help readers to understand the underlying dimension againstwhich achievement is being monitored.

Page

16

[Figure: International Difficulty Map for Earth Science Example Items, Lower and Upper Grades (Seventh and Eighth Grades*). Vertical axis: TIMSS international science scale, marked at 250, 500 and 750.]

Example items shown on the map, from highest to lowest scale value, with the international average per cent correct:

Example 5 (item 012): Gases in air. Scale value = 750. Eighth Grade = 27%, Seventh Grade = 22%
Example 4 (item W02): Diagram of Earth's water cycle. Scale value = 659. Eighth Grade = 32%, Seventh Grade = 27%
Example 1B (item W01B): River on the plain: Bad place for farming. Scale value = 632. Eighth Grade = 42%, Seventh Grade = 38%
Example 3 (item R04): Ozone layer. Scale value = 583. Eighth Grade = 53%, Seventh Grade = 43%
Example 2 (item K15): Fossil fuels. Scale value = 526. Eighth Grade = 62%, Seventh Grade = 55%
Example 1A (item K01A): River on the plain: Good place for farming. Scale value = 383. Eighth Grade = 79%, Seventh Grade = 76%

* Seventh and Eighth grades in most countries; see Table 2 for information about the grades tested in each country.

NOTE: Each item was placed onto the TIMSS international science scale based on students' performance in both grades. Items are shown at the point on the scale where students with that level of proficiency had a 65 per cent probability of providing a correct response.
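The 65 per cent placement rule can be illustrated with a short sketch. It assumes a one-parameter (Rasch) item response model on a logit scale; that is an assumption made here for illustration only, since this guide does not say which scaling model was used. The function names and the item difficulty value are hypothetical.

```python
import math

def p_correct(theta, b):
    """Rasch model (assumed): probability that a student of proficiency
    theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def map_location(b, rp=0.65):
    """Proficiency at which the probability of a correct response equals rp.
    This is where the item would be drawn on a difficulty map."""
    return b + math.log(rp / (1.0 - rp))

b = 0.0  # hypothetical item difficulty in logits
location = map_location(b)
print(f"map location: {location:.3f} logits")
print(f"probability of success there: {p_correct(location, b):.2f}")
```

Under this assumption an item sits a little above its 50 per cent point on the scale, so a student located at the item's position is more likely than not to answer it correctly.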


Showing relationships with background variables
The relationships between student achievement and background variables are shown either in correlation tables (see example 11 below) or graphically (see examples 12 and 13). Displays of this kind are found in official study reports and analyses by independent researchers.


                                  Population 1        Population 2
                                  Maths   Science     Maths   Science
Number of books in home            .18     .21         .27     .28
Family size                       -.14    -.17        -.10    -.11
Parents’ education status           –       –          .28     .30
Parents’ occupation status         .27     .27         .30     .29
Home background composite          .36     .37         .40     .39
Language background composite      .11     .15         .10     .15
Word knowledge                     .61     .60         .47     .48
Like mathematics                   .16     .15         .22     .10
Like science                       .14     .16         .19     .25
Attribute success to luck         -.32    -.30        -.29    -.28
Self-efficacy belief               .16     .15         .32     .28

Example 11 Reporting student achievement: Official study reports—correlations between TIMSS written tests and student background variables for Australian students18

This table shows the relationship between student achievement on TIMSS written tests in mathematics and science and a number of background variables. For example, there is a low correlation between students’ attraction to the subject (‘like maths’, ‘like science’) and their achievement.
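One common way to judge how low or high a correlation is involves squaring it, which gives the proportion of variance in achievement that is linearly associated with the background variable. The sketch below applies this to four Population 1 mathematics coefficients copied from the table; the function and variable names are illustrative only.

```python
# Population 1 mathematics correlations, copied from the Example 11 table.
correlations = {
    "Like mathematics": 0.16,
    "Number of books in home": 0.18,
    "Attribute success to luck": -0.32,
    "Word knowledge": 0.61,
}

def shared_variance(r):
    """Proportion of variance shared by two variables with correlation r."""
    return r * r

for name, r in correlations.items():
    print(f"{name}: r = {r:+.2f}, shared variance = {shared_variance(r):.1%}")
```

Read this way, liking mathematics is associated with under 3 per cent of the variance in Population 1 mathematics scores, compared with roughly 37 per cent for word knowledge. The sign of the coefficient is lost in squaring, so the direction of each relationship still has to be read from the table.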


Example 12 Reporting achievement: Official study reports—the relationship between mathematics and science TIMSS achievement and time spent watching TV for Australian students19


[Figure: Achievement and daily TV watching, Population 1. Horizontal axis: daily TV watching (No time, < 1 hr, 1-2 hrs, 3-4 hrs, > 4 hrs). Vertical axis: achievement score, 460 to 550. Series: Maths, Science.]


Example 13 Reporting achievement: Official study reports—correlations between reading achievement and the size of school libraries, IEA reading literacy study

This graph illustrates the regular increase in average score with increases in school library size.20 This finding was across all countries participating in the study and within most of them. There was also a difference between wealthier and poorer countries as defined by the Composite Development Index (CDI).


[Figure: Overall score by school library size (Lowest Quarter, 2nd Quarter, 3rd Quarter, Highest Quarter). Vertical axis: overall score, 400 to 540. Series: High CDI countries, All countries, Low CDI countries.]

IN WHAT WAYS ARE INTERNATIONAL ACHIEVEMENT DATA USED AND WITH WHAT IMPACT?

International achievement data are used to

• motivate ‘improvement’;
• confirm performance expectations;
• inform policy making; and
• initiate further within-country analyses.

Comparisons and analyses of different kinds are made by the media, in official study reports, in participating country reports and by independent educational researchers. Examples of the impact of reports are provided below.

Motivating ‘improvement’ by comparing international rankings
Countries compare their ranking with other countries and decide whether they are happy with their position (sometimes called the ‘description’ or ‘mirror’ function of the studies, or the ‘cognitive Olympics’).

Comparisons of this kind are made by researchers

…where would we want to be placed in the next major cross-national study? What would we need to do to ensure that we can achieve that goal?21

and the press. For example, widespread media publicity was given to the results of 10 of the 23 countries participating in TIMSS. Minor media publicity was given in a further six.22

What is the impact of this kind of reporting?

Reporting international rankings can have a major impact in generating public engagement, in effecting policy decisions at ministerial level, and in motivating schools to change practice. Examples 14 and 15 indicate the impact of reporting Second International Mathematics Study (SIMS) rankings in Sweden and TIMSS rankings in the United States.

Example 14 Motivating ‘improvement’ by comparing international rankings: Sweden

The relatively low mean score for 13 year old students in Sweden in the Second International Mathematics Study (SIMS) was publicised widely in the press: ‘Sweden at the Level of Developing Countries’, ‘Sweden close to Swaziland’.

A task force was appointed by the minister of school education to investigate mathematics education, and a study of the competence of teachers in the middle grades was undertaken.23 Changes in resource allocation to in-service mathematics training in response to SIMS have been credited with the improvement in TIMSS results.24


Example 15 Motivating ‘improvement’ by comparing international rankings: United States

One of the six goals of education proclaimed at federal level in the United States in 1990 was that American youth would perform at the top of the competency ladder in mathematics and science by the year 2000.

One consortium of 17 school districts located in the north suburbs of Chicago and the Illinois Math and Science Academy aims to become first in the world in mathematics and science achievement.25

The consortium has created a forum for dialogue with business and government leaders to clarify standards for being first in the world, and has established a network of learning communities involving math and science staff, research and development personnel, parents and community leaders.

The consortium begins with the question: How do students in consortium schools perform in comparison to students in countries around the world on international tests? Other questions to be answered include: Do school programs in the consortium reflect a world-class curriculum? How do the preparation and instructional practices of consortium teachers compare with those of teachers from countries that successfully prepare their students for the global market place? Does curriculum in consortium schools ‘fit’ with international standards? How do the social and cultural contexts differ between consortium schools and countries around the world? What lessons can be learned from analysing standardised test data that can be used to improve science and mathematics education in consortium schools?

Teachers in participating schools work with specialists from the North Central Regional Educational Laboratory (NCREL) in four learning networks:

1 Curriculum analysis—making explicit, comparing, and aligning grade-level and district-level curriculum with national and international standards.

2 Assessment strategies—using assessment data to establish school improvement plans and to integrate assessment (including alternative assessment techniques) and instruction.

3 Instructional practices—promoting practices that engage learners, including problem based learning, hands-on science and activity centred teaching.

4 Technology—using technology to support learning.


Motivating ‘improvement’ by disseminating information widely
Countries disseminate information widely to encourage stakeholders at all levels of the education system to reflect on the findings. Example 16 below describes TIMSS dissemination strategies in Sweden.

Comparing international expectations: ‘Benchmarking’ to confirm performance expectations
The achievements of students in other countries are used as a ‘benchmark’ for the development of a particular country’s performance standards or expectations.

For example, during the development of the Australian Year 3, 5, 7 and 9 numeracy ‘benchmarks’, the Commonwealth (Federal) Government commissioned a study to compare the draft performance standards with international achievement.26 Similarly, the Victorian (State) Board of Studies commissioned a 1999 study to compare the English, science and mathematics expectations contained in its revised Curriculum Standards Framework with international achievement data.27 In both instances the intention was to confirm the level of expectation with reference to actual student achievement.


Example 16 Motivating ‘improvement’ by disseminating information widely: Sweden

The TIMSS results were disseminated widely in Sweden to gain public attention and to encourage schools to make the greatest possible use of information. The National Agency for Education issued national reports at the same time as the international results were released. These reports were advertised on the Agency’s home page and sent to all participating schools.

Other reports including all released items with gender separated statistics for both international and Swedish students were sent to all schools in the country to encourage them to use data from TIMSS for comparison in their classrooms.28


Examining factors likely to influence achievement to inform policy
In international studies, five broad clusters of factors likely to influence educational achievement usually are explored: home background, school characteristics, teacher characteristics, teaching conditions and practices, and student motivation.

Both bivariate analyses (e.g. amount of homework assigned and achievement in a particular subject) and multivariate analyses, which include assessments of the joint effects of background and potential causal factors on achievement, usually are undertaken. More recently, as the statistical techniques have become available, assessments of contextual and potential causal factors at multiple levels (e.g. class performance, school performance, country performance) also have been carried out.

What is the impact of these analyses?

Results of these kinds of analyses have led to extensive policy debate and to changes in educational policy. For example, Hungary’s participation in IEA studies has been credited with curriculum reform in reading: the finding that home factors accounted for more variance in student achievement than school factors has been credited with undermining the Marxist-Leninist curricular ideologies.29

Some key research findings from 35 years of IEA research and their implications for policy are shown on page 25. Example 17, below, illustrates the impact of international achievement results on policy debate and policy decisions in Germany, Japan and Norway. Example 18 illustrates the impact in the United States.


Example 17 Influencing policy debate and policy decisions: FIMS Germany, SIMS Japan30 and TIMSS Norway31

In the Federal Republic of Germany, vigorous debate followed the publication of the mean scores in Science for the various Länder (states). Different Länder have different school structures and conclusions were drawn about the effectiveness of the different systems.

In Japan, the relatively higher achievement of students in mechanical operations than in the higher mental processes required in problem solving became the focus of commissions composed of mathematics teachers and specialists and led to a curriculum revision.

In Norway, the TIMSS results have been used extensively by the Ministry of Education to inform policy on three fronts:

• Curricula for upper secondary education have been revised.

• New curriculum guidelines have been adopted in response to greater gender differences in achievement than expected.

• A revision of curriculum guidelines for teacher education has been undertaken in response to findings that primary teachers’ backgrounds in mathematics and science needed to be strengthened.


Initiating further within-country analyses
Countries undertake their own analyses to expose within-country variations that are obscured by international reports which typically focus on the achievements of the country as a whole. Analyses of this kind are conducted by national agencies responsible for coordinating the international studies and by independent researchers. State by state comparisons are common (see examples 19 and 20) as are population sub-group analyses (example 21) and analyses by curriculum area (example 22).

Example 18 Influencing policy debate—TIMSS United States

Three characteristics, in particular, of the US education system have been debated by independent researchers as a result of international comparisons drawn from TIMSS: curriculum, student instruction, and ongoing teacher professional development.32

Science curricula and textbooks are unfocused and contain too many topics. They are ‘a mile wide and an inch deep’. As a result, US teachers cover more topics, spend less time on topics, and provide more teaching activities per lesson than high achieving countries.33 Curricula and textbooks also emphasise routine procedures rather than challenging concepts.

Student instruction does not provide sufficient opportunity for student engagement. Teachers instruct students in a concept or skill, solve example problems with the class, and then have students practise on their own while the teacher assists individuals. In high achieving countries, teachers pose a complex thought provoking problem, students struggle with the problem, various students present ideas or solutions to the class, the class discusses the various solutions and methods, the teacher summarises the class’s conclusions, and students practise similar problems.

US teachers lack the long and carefully mentored introduction to teaching that Japanese and German teachers receive. Nor do they have the rich informal opportunities to learn from each other and to share questions about teaching related issues that Japanese teachers enjoy.34


Example 19 Within country analyses—US: state by state comparisons

Although in TIMSS the US as a whole ranked below the international mean in mathematics achievement of 13 year-olds, and although the US did not sample in order to report state differences, secondary analyses claim that one group of districts tied for first ranking.

The US Department of Education is encouraging school districts to use TIMSS at the district level to evaluate how local students are doing compared with their international peers.


Some key research findings from 35 years of IEA research35

Effects of curricula time

Student achievement in Mathematics, French as a foreign language and Science is positively related to the time given to the study of the subject at school, both in comparisons across countries and between students within countries.

Implication: School curriculum design must take into account the level of achievement sought and relative emphasis given to each subject area.

Homework

Student achievement is related to the time spent on homework after other factors influencing achievement have been taken into account.

Implication: Careful consideration needs to be given to the amount of time assigned to homework—to support instruction without prejudicing motivation.

Sex differences

Sex differences in achievement are found to vary in size and direction across countries, school subjects and over time. Programs can be effective in reducing the gender gap in science achievement.

Implication: Programs to reduce the gender gap in achievement need to be maintained where they exist and introduced where they do not exist.

Learning conditions

Although the effects of home background variables are similar across subject areas, the effects of learning conditions in schools differ between subject areas, and in some subject areas are equivalent to, or greater in size than, the effects of the home.

Implication: Learning conditions in schools within a country should be raised and equalised.

Opportunity to learn

The average level of student achievement across countries is positively related to the opportunity that students had to learn the content of the items tested.

Implication: The content and skills considered important must be identified in the curriculum and students must be provided with the opportunity to learn that content and skills.


Example 21 Within country analyses—New Zealand: ethnic group analysis

Findings from previous New Zealand research have shown that students identifying themselves as Pakeha/European achieved significantly better, on average, than students from other ethnic groupings.37 The New Zealand TIMSS report investigated the achievements of students from different ethnic groupings.

Mean mathematics scores (mean per cent) for form 2 students, by ethnic grouping

Student gender   Pakeha/European   NZ Maori   Pacific Islands   Asian   Other ethnic groups
Girls                  50             38            35            52            49
Boys                   50             36            32            57            52
Total                  50             37            34            55            51

Note: Standard errors (by gender) range from 1.0% to 1.3% for Pakeha/European and Maori students and 1.8% to 2.8% for Pacific Islands and Asian students.

Example 20 Within country analyses—Australia: state by state comparisons

In Australia, there is no national curriculum; education is under the jurisdiction of each State. The Australian TIMSS report investigates the science achievements of students in each Australian State.36


State   Best estimate of mean score   Best estimate of average age   Average years of full-time school
WA            577 ± 5                       10.0 ± .01                      4.25 ± .01
SA            562 ± 7                       10.4 ± .02                      5.20 ± .03
QLD           555 ± 6                       10.0 ± .04                      4.28 ± .02
ACT           552 ± 9                        9.7 ± .08                      4.35 ± .07
NT            545 ± 7                       10.2 ± .09                      5.14 ± .14
TAS           524 ± 8                        9.6 ± .03                      4.03 ± .03
VIC           521 ± 9                        9.6 ± .04                      4.22 ± .02
NSW           520 ± 6                        9.5 ± .02                      4.25 ± .01

[Multiple-comparison chart: each state (rows) is compared with every other state (columns). Key: no statistically significant difference from comparison state; mean achievement significantly higher than comparison state; mean achievement significantly lower than comparison state.]
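A comparison of two state means can be sketched as follows. This is a minimal illustration under two assumptions that the source does not confirm: that the margin printed beside each mean is one standard error, and that the state samples are independent. The published chart may also adjust for multiple comparisons, which this sketch does not. The function names are hypothetical.

```python
import math

def z_for_difference(mean_a, se_a, mean_b, se_b):
    """z statistic for the difference between two independent means."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)

def significantly_different(mean_a, se_a, mean_b, se_b, critical=1.96):
    """True if the two means differ at roughly the 5 per cent level."""
    return abs(z_for_difference(mean_a, se_a, mean_b, se_b)) > critical

# Highest and lowest reported means (577 and 520, margins 5 and 6).
print(significantly_different(577, 5, 520, 6))
# Two adjacent means (524 and 521, margins 8 and 9).
print(significantly_different(524, 8, 521, 9))
```

The point of the sketch is that a gap of a few scale points between neighbouring states can sit well inside the sampling error, while the gap between the top and bottom of the table does not.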


Example 22 Within country analyses—South Africa: curriculum analysis

The TIMSS South African Project Team analysed variations in achievement across the science content areas compared with the international average of achievement. The graph below illustrates the findings.38

[Figure: Average per cent correct by the science content areas for Standard 6. Vertical axis: per cent correct, 0 to 60.]

Content area                   International Average   SA Average
Earth Science                           55                  26
Life Science                            59                  27
Physics                                 55                  27
Chemistry                               51                  26
Environmental Science Issues            53                  26

WHAT CONCERNS HAVE BEEN RAISED?

A number of concerns have been raised about the quality of data on which international analyses are based, and about the ways in which data are reported. The main concerns are discussed below.

Comparability of translated assessments
The problem: Translation may produce items that differ in difficulty across languages, and the validity of international comparisons of achievement depends on tasks maintaining their relative difficulties across countries. For example, an easy question in English may become a difficult question when translated into French. Differences in student performance on the tests then could be due to differences in the difficulty of the language tests rather than differences in the achievements of the groups.

A solution: This concern is addressed at the test development and data analysis stages of the studies. At the test development stage, in IEA studies for example, translations are made from the source language to the target language, then tests are back-translated to the source language. The original and the back-translations are then compared. PISA has introduced even more rigorous procedures. Two source versions, English and French, are used. Translations into the third language from both source versions are compared and reconciled. At the analysis stage, statistical techniques are used to detect items which, relative to other items, are unusually difficult or unusually easy in particular countries. These items are not included in the study.
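The analysis-stage check described above can be sketched in simplified form. The studies themselves use more sophisticated statistical techniques; this sketch only compares per cent correct figures, and every number, country name, item name and the threshold of 20 points is hypothetical.

```python
# Hypothetical per cent correct for four items in three countries.
percent_correct = {
    "Country A": {"item1": 80, "item2": 60, "item3": 40, "item4": 70},
    "Country B": {"item1": 70, "item2": 50, "item3": 30, "item4": 60},
    "Country C": {"item1": 75, "item2": 55, "item3": 35, "item4": 25},
}

def flag_items(data, threshold=20.0):
    """Return (country, item, residual) for items that are unusually easy or
    hard in one country relative to the other items and other countries."""
    items = sorted(next(iter(data.values())))
    # Centre each country's results on its own mean so that overall
    # differences in achievement between countries drop out.
    centred = {
        country: {i: scores[i] - sum(scores.values()) / len(scores) for i in items}
        for country, scores in data.items()
    }
    flagged = []
    for item in items:
        for country in data:
            others = [centred[c][item] for c in data if c != country]
            residual = centred[country][item] - sum(others) / len(others)
            if abs(residual) > threshold:
                flagged.append((country, item, round(residual, 1)))
    return flagged

print(flag_items(percent_correct))
```

In this invented data set, item4 is about as easy as the other items in Countries A and B but much harder in Country C, which is the pattern a translation problem might produce.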

Match to curriculum (validity of achievement measures)

The problem: Are the tests measures of curriculum knowledge? If they are, do the tests address each participating country’s curriculum? Have students in participating countries had the opportunity to learn what is being assessed?

Where tests aim to measure curriculum knowledge, the validity of comparisons depends on the degree to which the tests used in a particular study reflect the curriculum of each country in the study. Because there are large numbers of participating countries, there is compromise over the content and coverage of the tests, and tasks may match the curriculum of some countries better than others.

A solution: In order to address this concern, tests in IEA focus on a central or key body of knowledge and skills agreed by participating countries. In the more recent IEA studies, subject matter experts have met on several occasions to establish the content of the tests. This central body may not represent everything that is taught in a particular country but questions can be asked about how well this body of skills is being taught and learned in different countries.

It is interesting to note that the scoring of students on either items appropriate to their own curricula, or items appropriate to the curricula in different countries, did not substantially change a country’s position in international standings in the TIMSS study.39

In the PISA study, the aim is to go beyond the curriculum and to focus on more generalised skills in reading, mathematical and scientific literacy. The content domain of the test is defined by international expert opinion, not curriculum.

Comparability of target populations

Alignment of populations

The problem: The validity of comparisons may depend on students having the same exposure to schooling. Countries have different policies with regard to school age entry, grade repetition, promotion and graduation. Both length of time in school (grade level) and age can be expected to influence achievement, particularly at primary level. Countries also have differences in school retention rates and enrolment in particular courses.

A solution: Studies address age/grade concerns by clarifying the relationship between age and grade. For example, in TIMSS three groups of students were selected for the study: students midway through elementary (grades containing most nine year olds), students midway through secondary (grades containing most 13 year olds), and students completing secondary (regardless of age).

The problem: If one aim of the study is to examine the effect of curriculum exposure, a difficulty remains. The first group studied, for example, includes both third and fourth graders in some countries and second and third in others, depending on which grades contained the greatest percentage of nine year olds. One indicator of curricular effect may be the performance differences between students in the lower and upper grades assessed.

A solution: In order to address this problem, it has been suggested that studies provide scores that allow the separation of grade status from growth, rather than providing single mean scores of achievement. It has been suggested also that comparative tables provide information indicating retention rates and enrolment in particular courses where relevant, as was done when reporting the TIMSS results for the final year of secondary schooling.

Exclusions

The problem: Sometimes countries exclude sections of the defined target population, making comparisons less valid. For example, different definitions of disability (physical, emotional and intellectual) may result in the exclusion of different groups of students.

A solution: In order to address this problem, it has been suggested that there should be clear standards to regulate the implementation of population exclusions.

Response rates

The problem: The validity of comparisons also will depend on the degree to which the selected samples are representative samples. Although samples are defined to be representative, in practice, response rates vary. Comparisons between countries with widely differing response rates need to be treated cautiously.

A solution: Countries which do not meet sample requirements are excluded from reports or their results are reported with caveats as in TIMSS and PISA.

Comparisons over time
The problem: Concern has been raised about the limitations of international surveys for making causal inferences about patterns of student achievement over time. Concerns raised include: studies are not adequately linked, countries can change relative positions as a result of chance, a country can rise or fall in its relative position as a result of changes another country has made.

A solution: Studies need to be carefully designed to ensure that comparisons over time are justified. The extent and quality of links made in the achievement measures used are critical. PISA, for example, has been designed from the outset to give trend data for each country—each country can use its own previous performance as a basis for comparison.

Misleading reporting – league tables

The problem: There is concern that reporting overall (mean) results, particularly in overall league tables, encourages readers to draw inappropriate conclusions about the differences between countries and the strength of correlations between achievement and background variables.

A solution: Where league tables appear, they need to include qualifiers which assist readers to interpret the relative positions of countries.

Data need to be disaggregated in ways that provide information otherwise masked by overall means. For example, within country correlations between achievement and background variables need to be reported. Variables that are positively correlated with achievement across countries may be negatively correlated with achievement in any one country.
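The last point can be made concrete with a small, entirely invented data set. In the sketch below, three hypothetical countries each show a negative relationship between a background variable and achievement among their own students, yet the country means show a positive relationship. None of these numbers comes from any study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented (background variable, achievement score) pairs for students
# in three hypothetical countries.
countries = {
    "P": [(1, 420), (2, 410), (3, 400)],
    "Q": [(4, 510), (5, 500), (6, 490)],
    "R": [(7, 600), (8, 590), (9, 580)],
}

# Correlation among students within each country.
within = {name: pearson(*zip(*pairs)) for name, pairs in countries.items()}

# Correlation between country means.
mean_x = [sum(x for x, _ in pairs) / len(pairs) for pairs in countries.values()]
mean_y = [sum(y for _, y in pairs) / len(pairs) for pairs in countries.values()]
across = pearson(mean_x, mean_y)

print("within countries:", {k: round(v, 2) for k, v in within.items()})
print("across country means:", round(across, 2))
```

A league table built only from the country means would suggest that more of the background variable goes with higher achievement, which is the opposite of what holds for students inside each country.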


INTERNATIONAL ACHIEVEMENT DATA AND INFORMED DECISION MAKING

For a country, knowledge about student performance is of great importance, not in the form of ranking lists, but as structured information that can be used for internal improvement.

I see great challenges nationally in creating a working link from both the international and national level down to the classroom level. How can we—nationally and internationally—manage to make the results from international surveys to be useful for both teachers and schools in such a way that both instruction and practice are improved?40

Data will be most useful to participating countries if they provide sufficient reliable information to inform debate and decision-making in a meaningful way. The debate about the quality of different school structures in different German Länder (example 17) was conducted in a context where the number of schools drawn for each Land was generally too small to permit inferences—a point overlooked in the debate.41

The final section of this guide provides a checklist of considerations for ensuring that international achievement data provide participating countries with evidence for informed decision-making.

Are the aims of the study clear?
The aims of the study need to be clear and to address national as well as international policy concerns. (Countries that join studies at the planning stage have an influence on what is addressed and how.)

1 Clear study aims and research questions

The aims of the study and the research questions to be addressed need to be clearly stated. Ideally, the research questions should address important policy and theory-oriented issues for all participating countries.

2 Study design that addresses research questions

The study design needs to allow the research questions to be answered. For example, if data are to be used to track changes in achievement over time, then, where possible, each new international study needs to be linked to a previous international study.

3 National research extensions

Research extensions to the main study to meet national objectives should be considered; for example, the collection of data to facilitate linking the study to earlier studies allowing achievement changes over time to be investigated, or oversampling so that internal comparisons can be made. For example, Germany has a large sample for PISA to ensure that comparisons between Länder can be made.

Will the data be valid and reliable?

4 Valid instruments

If the instruments are to assess curriculum, then they need to address the common intended curriculum of participating countries. Ideally, the common elements will be important or ‘core’ elements of each country’s curriculum. The tasks will be fair (that is, they will allow both male and female students from different ethnic, cultural, social, and economic backgrounds to demonstrate what they know and can do), translated appropriately, and meaningful to students. Ideally they also will be sensitive to instruction (that is, effective instruction will produce improvements in performance).

Questions in the background questionnaires that address policy issues outlined in the study will be included.

5 Reliable data and sampling procedures

Procedures need to be in place for collecting reliable (comparable) achievement data, including the trial testing of all instruments, and the development of uniform collection and recording procedures.

Sampling needs to be conducted so that the standard errors are acceptable in relation to the policy decisions that will be based on results. Recent standards require sampling precision the same as, or better than, that of a simple random sample of 400 students for educational outcome measures.42
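What this precision standard implies for sample size can be sketched as follows. School samples are clustered, so more than 400 students are usually needed to match a simple random sample of 400. The sketch uses the standard Kish design effect approximation, deff = 1 + (m - 1) × rho; the standard deviation, the cluster size of 20 students per school and the intraclass correlation of 0.25 are hypothetical values chosen for illustration, not figures from any study.

```python
import math

def srs_standard_error(sd, n):
    """Standard error of a mean from a simple random sample of size n."""
    return sd / math.sqrt(n)

def design_effect(cluster_size, icc):
    """Kish approximation: variance inflation from sampling whole clusters."""
    return 1.0 + (cluster_size - 1) * icc

def required_sample(target_effective_n, cluster_size, icc):
    """Students needed so that a clustered sample is as precise as a
    simple random sample of target_effective_n students."""
    return math.ceil(target_effective_n * design_effect(cluster_size, icc))

# Hypothetical values: score SD of 100, 20 students per school, ICC of 0.25.
print(srs_standard_error(100, 400))
print(design_effect(20, 0.25))
print(required_sample(400, 20, 0.25))
```

Under these assumed values the benchmark standard error is 5 score points, and a design that tests 20 students per school would need 2300 students, or 115 schools, to reach it.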

Will like comparisons be facilitated?

6 Target population

When interpreting comparisons, like needs to be compared with like. The extent of school and student-level exclusions needs to be detailed and the impact of these on comparisons of means and distributions assessed.

Will the reporting be comprehensive?

7 Accessible and useful reporting at all levels

When results are reported, policy issues need to be addressed directly and analyses described clearly. Arguments about the interpretation of analyses should be reported also. Reports should attempt to make the results useful for teachers and schools so that instruction and practice can be improved.

8 Comprehensive reporting

Analyses need to be comprehensive to reflect the varied nature and complexity of education systems.

League tables should be used with caution and interpreted carefully.

Is the complexity of the data considered before policy conclusions are drawn?

9 Responses to findings

When promoting policy change on the basis of study findings, it is important to consider the complexity of findings. For example, it could be misleading to argue on the basis of TIMSS results in favour of formal teaching, because it is a characteristic of Japanese education, without arguing for mixed ability groupings, which are also a characteristic of Japanese education.

Are there procedures to monitor the usefulness of the study findings?

10 Collect data on the impact of international studies

As part of the commitment to participating in international studies, research programs at a national level to monitor the usefulness/impact of findings should be considered.


Achieve—a not for profit organisation created in1996 by American governors andcorporate leaders to provide advice andassistance to states on educationalreform. Emphasises strengtheningacademic expectations by benchmarkingstudents to the highest performingnations, promotes cross-statecollaboration on curriculum,accountability and assessment.

www.achieve.org

Eisenhower National Clearinghouse—a clearinghouse for mathematics andscience education located at the OhioState University and funded by the USDepartment of Education’s Office ofEducational Research and Improvement.The clearinghouse has an extensiveTIMSS site.

http://timss.enc.org/

First in the World Consortium: a consortium of 17 school districts located in the north suburbs of Chicago and the Illinois Math and Science Academy. They aim to become first in the world in math and science achievement.

www.ncrel.org/fitw/homepage.htm

IEA: International Association for the Evaluation of Educational Achievement

www.iea.nl/publications.htm

NCES: National Center for Education Statistics, US Dept of Education TIMSS site. Contains a comprehensive list of TIMSS publications.

http://nces.ed.gov/timss/publist.html

PISA 2000: Programme for International Student Assessment

www.pisa.oecd.org

Regional Alliance Network: resources on the web related to TIMSS

http://ra.terc.edu/alliance/TEMPLATE/regional_networks/CIA/Assessment/timss.cfm

Thomas B. Fordham Foundation: a private foundation that supports research, publications, and action projects in elementary/secondary education reform at a national level and in the Dayton area.

www.edexcellence.net

TIMSS: The Third International Mathematics and Science Study. This site contains international results, technical reports, international databases, achievement items, TIMSS publications, and links to related TIMSS sites.

http://timss.bc.edu

USEFUL WEBSITES

Endnotes

1 Plomp, T. (1999) The relevance of IEA type international comparative assessments of educational achievement. Paper presented at the 40th General Assembly of IEA, Oslo, 30 August 1999.

2 Organisation for Economic Cooperation and Development (1989). Employment Outlook. Paris: OECD.

Organisation for Economic Cooperation and Development (1996). Lifelong Learning for All. Paris: OECD.

3 McKenzie, P. & Wurzburg, G. (1997) Lifelong learning to maintain employability. Paper prepared as background for theme 3 of the OECD Analytical Report.

4 OECD (1989). See note 2.

5 Beaton, A., Postlethwaite, T. N., Ross, K., Spearitt, D. & Wolf, R. (1999) The benefits and limitations of international educational achievement studies. France: International Institute for Educational Planning/UNESCO.

6 Beaton, A., Martin, M., Mullis, I., Gonzalez, E., Smith, T. & Kelly, D. (1996) Science achievement in the middle school years. IEA: TIMSS International Study Center, Boston.

Levin, H. (1988) Educational performance standards and the economy. Educational Researcher, 27 (4) 4-10.

Robinson, P. The tyranny of league tables in Alexander, R., Broadfoot, P. & Phillips, D. (1999) Learning from comparing – new directions in comparative educational research, Vol. 1 Contexts, classrooms and outcomes. UK: Symposium Books.

7 Keeves, J. ‘Science education: the contribution of IEA research to a world perspective’ in Postlethwaite, T. N. ed. (1986) International educational research: papers in honor of Torsten Husen. Oxford: Pergamon Press.

Keeves, J. (1995) The world of school learning: selected key findings from 35 years of IEA research. The Hague: IEA.

Purves, A. & Levine, D. eds (1975) Educational policy and international assessment. Berkeley: McCutchan.

Kotte, D. (1992) Gender differences in science achievement in ten countries. Frankfurt: Peter Lang.

Australian Council for Educational Research (1997) Raising Australian standards in mathematics and science: insights from TIMSS. Camberwell: ACER.

8 Peaker, G. (1975) An empirical study of education in 21 countries: a technical report. Stockholm: Almqvist and Wicksell.

9 Beaton, A., Martin, M., Mullis, I., Gonzalez, E., Smith, T. & Kelly, D. (1996). See note 6.

10 Beaton, A., Martin, M., Mullis, I., Gonzalez, E., Smith, T. & Kelly, D. (1996). See note 6.

11 Schmidt, W., McKnight, C., Cogan, L., Jakwerth, P. & Houang, R. (1999) Facing the consequences: Using TIMSS for a closer look at US mathematics and science education. Dordrecht: Kluwer Academic Publishers.

12 Harris, S., Keys, W. & Fernandes, C. (1997) Third international maths and science study Third national report. Berkshire: NFER.

13 Lokan, J. (1997) Overview of the third international mathematics and science study (TIMSS) in Australia in Raising Australian standards in mathematics and science: insights from TIMSS. Melbourne: Australian Council for Educational Research.

14 Lokan, J., Ford, P. & Greenwood, L. (1997) Maths and science on the line. Melbourne: Australian Council for Educational Research.

15 Lapointe, A., Mead, N. & Phillips, G. (1989) A world of differences: an international assessment of mathematics and science. IAEP 1988: Educational Testing Service. Reprinted by permission of Educational Testing Service, the copyright owner.

16 Lapointe, A., Mead, N. & Phillips, G. (1989). See note 15.

17 Beaton, A., Martin, M., Mullis, I., Gonzalez, E., Smith, T. & Kelly, D. (1996). See note 6.

18 Lokan, J., Ford, P. & Greenwood, L. (1997). See note 14.

19 Lokan, J., Ford, P. & Greenwood, L. (1997). See note 14.

20 Elley, W. (1992) How in the world do students read? IEA study of reading literacy. Netherlands: The International Association for the Evaluation of Educational Achievement.

21 Lokan, J. (1997). See note 13.

22 MacNab, D. (2000) Forces for change in mathematics education: the case of TIMSS in Education Policy Analysis Archives Vol 8, No. 15.

23 Husen, T. (1987) Policy impact of IEA research in Comparative Education Review Vol 31 (1) February 1987.

24 Robitaille, D., Beaton, A. & Plomp, T. eds (2000) The Impact of TIMSS on the Teaching and Learning of Mathematics and Science. Vancouver, Canada: Pacific Educational Press.

25 First in the World Consortium www.ncrel.org/fitw/homepage.htm

26 Lokan, J. & Ainley, J. (1998). The Third International Mathematics and Science Study: Implications for the development of numeracy benchmarks. Unicorn, 24 (2) 97-109.

Lindsey, J., Pearn, C., Lokan, J., Doig, B. & O’Connor, G. (2000). Comparison of Australia’s Revised Draft Year 7 Numeracy Benchmarks and International Standards. ACER and Commonwealth Department of Education, Training and Youth Affairs (mimeo).

O’Connor, G., Doig, B., Lindsey, J., Pearn, C. & Lokan, J. (1999). Comparisons: Australia’s Revised Draft Year 3 and Year 5 Numeracy Benchmarks and International Standards. ACER and Commonwealth Department of Education, Training and Youth Affairs (mimeo).

27 Lowe, L., Anderson, P. & Farkota, R. (1999) Examining the Curriculum and Standards Framework CSF II. Paper commissioned by the Board of Studies Victoria. Camberwell: ACER.

28 Robitaille, D. (2000). See note 24.

29 Greaney, V. & Kellaghan, T. Monitoring the learning outcomes of education systems. Washington, DC: The World Bank.

30 Husen, T. (1987). See note 23.

31 Jahr, H. (1999) Welcome address to the 40th IEA General Assembly, 30 August 1999, Holmen Fjordhotell.

32 Peake, L. (1996) Pursuing excellence: A study of US 8th grade mathematics and science teaching, learning, curriculum, and achievement in international context. Washington, DC: US Department of Education.

33 Schmidt, W. H. ed. (1997) A splintered vision: an investigation of US science and mathematics education. Dordrecht, Netherlands: Kluwer Academic Publishers.

34 Stigler, J. W. (1992) The learning gap: why our schools are failing and what we can learn from Japanese and Chinese education. New York: Summit Books.

35 Keeves, J. (1995). See note 7.

36 Lokan, J., Ford, P. & Greenwood, L. (1997). See note 14.

37 Chamberlain, M. Achievement in mathematics 1 in Garden, R. ed. (1996) Mathematics performance of New Zealand form 2 and form 3 students: National results from New Zealand’s participation in the third international mathematics and science study. Wellington: Ministry of Education.

38 Howie, S. (1997) Mathematics and science performance in the middle school years in South Africa. Pretoria: Human Sciences Research Council.

39 Beaton, A., Postlethwaite, T. N., Ross, K., Spearitt, D. & Wolf, R. (1999) The benefits and limitations of international educational achievement studies. France: International Institute for Educational Planning/UNESCO.

40 Jahr, H. (1999). See note 31.

41 Husen, T. (1987). See note 23.

42 Beaton, A., Postlethwaite, T. N., Ross, K., Spearitt, D. & Wolf, R. (1999). See note 5.


This guide provides policy makers with research-based information about international achievement studies.

Good decision-making at all levels of an education system is facilitated by easily accessible, relevant, and reliable information.

Many indicators provide useful input to educational decision-making, but the most important indicators are those which address the central concern of education: the promotion of student learning.

Education systems monitor student learning, with the fundamental intention of promoting learning, by collecting, analysing and reporting student achievement data. Given that state, national and international achievement studies are both time consuming and expensive, it seems prudent to reflect on this effort:

What are the purposes of these studies?

How are data reported and used?

What concerns have been raised about these studies?

How can we ensure that data will provide evidence for informed decision-making?