CHAPTER 15 Standardized Tests and Teaching © 2006 The McGraw-Hill Companies,...

CHAPTER 15

Standardized Tests and Teaching

Learning Goals

1. Discuss the nature of standardized tests.
2. Compare aptitude and achievement testing and describe current uses of achievement tests.
3. Identify the teacher’s role in standardized testing.
4. Evaluate some key issues in standardized testing.

The Nature of Standardized Tests
• What Is a Standardized Test?
• The Purposes of Standardized Tests
• Criteria for Evaluating Standardized Tests

The Nature of Standardized Tests

Standardized Tests

• Have uniform procedures for administration and scoring.

• Allow comparison of student scores by age, grade level, local and national norms.

• Attempt to include material common across most classrooms.

Enter the Debate: Should students have to pass a test to earn a high school diploma?

YES NO

Purposes of Standardized Tests
• Contribute to accountability
• Provide information about student progress and program placement
• Diagnose students’ strengths and weaknesses
• Provide information for planning and instruction
• Help in program evaluation

The Nature of Standardized Tests

Standards-based tests assess skills that students are expected to have mastered before they can be permitted to move to the next grade or be permitted to graduate.

High-stakes testing is using tests in a way that will have important consequences for the student, affecting major educational decisions.

Evaluating Standardized Tests

Norms – Does the normative group represent all students who may take the test?

Reliability – Are test scores stable, dependable and relatively free from error?

Validity – Does the test measure what it is purported to measure?

Correlation

A correlation coefficient is a statistical measure of the relationship between two variables.

Correlation coefficient: r = +0.37
• The sign indicates the direction of the relationship (positive or negative)
• The number indicates the strength of the relationship (0.00 to 1.00)

Pearson correlation coefficient

• r = the Pearson coefficient

• r measures the amount that the two variables (X and Y) vary together (i.e., covary), taking into account how much they vary apart

• Pearson’s r is the most common correlation coefficient; there are others.

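Computing the Pearson correlation coefficient

• To put it another way:

$$ r = \frac{\text{degree to which } X \text{ and } Y \text{ vary together}}{\text{degree to which } X \text{ and } Y \text{ vary separately}} $$

• Or

$$ r = \frac{\text{covariability of } X \text{ and } Y}{\text{variability of } X \text{ and } Y \text{ separately}} $$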

Sum of Products of Deviations

• Measuring X and Y individually (the denominator):
– compute the sums of squares for each variable

• Measuring X and Y together: Sum of Products

– Definitional formula

$$ SP = \sum (X - \bar{X})(Y - \bar{Y}) $$

– Computational formula

$$ SP = \sum XY - \frac{\sum X \sum Y}{n} $$

• n is the number of (X, Y) pairs

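Correlation Coefficient:

• the equation for Pearson’s r:

$$ r = \frac{SP}{\sqrt{SS_X \, SS_Y}} $$

• expanded form:

$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} $$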
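The sum-of-products and sums-of-squares route to Pearson's r can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the two small score lists are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def sum_of_products(xs, ys):
    """Definitional formula: sum of (X - mean_X) * (Y - mean_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

def sum_of_products_computational(xs, ys):
    """Computational formula: sum(XY) - (sum(X) * sum(Y)) / n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

def sum_of_squares(values):
    """Sum of squared deviations from the mean for one variable."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def pearson_r(xs, ys):
    """r = SP / sqrt(SS_X * SS_Y)."""
    return sum_of_products(xs, ys) / math.sqrt(sum_of_squares(xs) * sum_of_squares(ys))

# Hypothetical data: five (X, Y) pairs, not taken from the slides.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]

sp = sum_of_products(x_scores, y_scores)
sp_comp = sum_of_products_computational(x_scores, y_scores)
r = pearson_r(x_scores, y_scores)
print(f"SP = {sp}, SP (computational) = {sp_comp}, r = {r:.4f}")
```

Both SP formulas give the same value for the same data, which is a quick way to check a hand calculation.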

Correlation Coefficient Interpretation

Coefficient      Strength of Relationship
0.00 - 0.20      Practically None
0.20 - 0.40      Low
0.40 - 0.60      Moderate
0.60 - 0.80      High Moderate
0.80 - 1.00      Very High

Reliability

Test-retest: The extent to which a test yields the same score when given to a student on two different occasions

Alternate-forms: Two different forms of the same test are given on two different occasions to determine the consistency of the scores

Split-half: The test items are divided into two halves, and scores on the two halves are compared to determine test score consistency
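As a rough sketch of the split-half idea, the two halves can be scored separately and the half scores correlated. The item responses below are hypothetical (1 = correct, 0 = incorrect), and splitting by odd and even item position is just one possible way to form the halves; neither comes from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical responses: one row per student, one column per test item.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
]

# Score each half separately: items in positions 1, 3, 5 versus 2, 4, 6.
first_half = [sum(row[0::2]) for row in responses]
second_half = [sum(row[1::2]) for row in responses]

split_half_r = pearson_r(first_half, second_half)
print(first_half, second_half, round(split_half_r, 4))
```

A correlation near 1.00 would suggest the two halves rank students consistently; a value near 0.00 would suggest they do not.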

Methods of Studying Reliability

Interrater Reliability - The consistency of a test to measure a skill, trait, or domain across examiners.

This type of reliability is most important when responses are subjective or open-ended.

Terry Overton, Assessing Learners with Special Needs, 5e

Types of Validity…

Content: Test’s ability to sample the content that is being measured

Criterion-related:

1. Concurrent: The relationship between a test’s score and other available criteria

2. Predictive: The relationship between a test’s score and future performance

Construct: The extent to which there is evidence that a test measures a particular construct

Factor Analysis

A statistical technique that uses the correlations between observed variables to estimate common factors and the structural relationships linking factors to observed variables.

[Diagram: two observed variables can correlate because of their relationships with a common factor.]
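The claim that two observed variables can correlate because they share a common factor can be illustrated with a small simulation. This is only a sketch under stated assumptions: the loading of 0.8, the sample size, and all variable names are hypothetical, and the code simulates data rather than performing a factor analysis.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

random.seed(0)            # fixed seed so the run is repeatable
n_students = 20000        # assumed sample size
loading = 0.8             # assumed strength of each variable's link to the factor
noise_sd = math.sqrt(1 - loading ** 2)  # keeps each observed variable at variance 1

# One unobserved common factor per student.
common_factor = [random.gauss(0, 1) for _ in range(n_students)]

# Two observed variables: each is the factor times its loading plus independent noise.
observed_1 = [loading * f + random.gauss(0, noise_sd) for f in common_factor]
observed_2 = [loading * f + random.gauss(0, noise_sd) for f in common_factor]

r_observed = pearson_r(observed_1, observed_2)
print(f"correlation between the two observed variables: {r_observed:.3f}")
print(f"product of the two loadings: {loading * loading:.3f}")
```

The two observed variables never influence each other directly, yet their correlation should land near the product of the loadings (0.64 here) because both depend on the same factor.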

Aptitude and Achievement Tests
• Comparing Aptitude and Achievement Tests
• Types of Standardized Achievement Tests
• High-Stakes State-Mandated Tests
• District and National Tests

Aptitude vs. Achievement Tests

Aptitude Tests
Predict a student’s ability to learn a skill or accomplish a task.
(Stanford-Binet, Wechsler, SAT when used to predict success)

Achievement Tests
Measure what the student has learned or mastered.
(California Achievement, Iowa Basic Skills, SAT when used to determine what has been learned)

High-Stakes State-Mandated Tests

Possible Advantages
- Improved student performance
- More teaching time
- Higher student expectations
- Identification of poor-performing schools/teachers
- Improved confidence in schools

Criticisms
- “Dumbing down” and more emphasis on rote memorization
- Less time for problem-solving and critical thinking skills
- Teachers “teaching to the test”
- Discrimination against low-SES and ethnic minority children

National Assessment of Educational Progress

A federal “census-like” exam of students’ knowledge, skills, understanding, and attitudes

Reading
1992–2000: 4th grade, no improvement
1992–1998: 8th and 12th grades, no improvement

Math
1990–2000: 4th and 8th grades, improvement
1990–2000: 12th grade, decline

Science
1996–2000: 4th and 8th grades, no change
1996–2000: 12th grade, decline

The Teacher’s Role
• Preparing Students to Take Standardized Tests
• Administering Standardized Tests
• Using Standardized Test Scores to Plan and Improve Instruction
• Understanding and Interpreting Test Results

The Don’ts of Standardized Testing

DON’T

• Teach to the test

• Use the standardized test format for classroom tests

• Describe tests as a burden

• Tell students that important decisions will be made solely on the results of a single test

• Use previous forms of the test to prepare students

• Convey a negative attitude about the test

Counting the Data-Frequency

Look at the set of data that follows on the next slide.

A tally mark was made to count each time a score occurred

Which number most likely represents the average score?

Which number is the most frequently occurring score?

Descriptive statistics are the mathematical procedures that are used to describe and summarize data.

Frequency Distribution

Score   Tally        Frequency
100     1            1
99      1            1
98      11           2
94      11           2
90      11111        5
89      1111111      7
88      1111111111   10
82      111111       6
75      11           2
74      1            1
68      1            1
60      1            1

Average score? Most likely 88
Most frequent score? 88

This frequency count closely approximates a normal distribution.
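The tallying step can be sketched with Python's `collections.Counter`. The score and frequency pairs below are read from the Scores and Frequency columns of the slide; the variable names are illustrative only.

```python
from collections import Counter

# Score -> frequency, as read from the slide's Scores and Frequency columns.
slide_frequencies = {
    100: 1, 99: 1, 98: 2, 94: 2, 90: 5, 89: 7,
    88: 10, 82: 6, 75: 2, 74: 1, 68: 1, 60: 1,
}

# Expand into one entry per student, then tally the raw scores.
raw_scores = [score for score, freq in slide_frequencies.items() for _ in range(freq)]
tally = Counter(raw_scores)

# Print a plain-text frequency table, highest score first.
for score in sorted(tally, reverse=True):
    print(f"{score:>3}  {'1' * tally[score]:<10}  {tally[score]}")

most_frequent_score, highest_count = tally.most_common(1)[0]
mean_score = sum(raw_scores) / len(raw_scores)
print("most frequent score:", most_frequent_score)
print("mean score:", round(mean_score, 2))
```

For these data the most frequent score and the mean fall close together, which is what a roughly bell-shaped tally would suggest.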

Descriptive Statistics

Frequency Polygons

Data: 100, 99, 98, 98, 94, 94, 90, 90, 90, 90, 90, 89, 89, 89, 89, 88, 88, 75, 75, 74, 68, 60

[Frequency polygon; x-axis: Scores (60, 68, 74, 75, 88, 89, 90, 94, 98, 99, 100)]

Measures of Central Tendency

Measures of central tendency provide information about the average or typical score in a data set

Mean: The numerical average of a group of scores

Median: The score that falls exactly in the middle of a data set

Mode: The score that occurs most often

Mean - To find the mean, simply add the scores and divide by the number of scores in the set of data.

98 + 94 + 88 + 75 = 355
Divide by the number of scores: 355/4 = 88.75

Central tendency = representative or typical value in a distribution

Mean
• Same thing as an average
• Computed by
– Summing all the scores (sigma, Σ)
– Dividing by the number of scores (N)
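The two steps for the mean (sum the scores, then divide by N) look like this in Python, using the four scores from the worked example. The variable names are illustrative.

```python
# Scores from the worked example on the slide.
scores = [98, 94, 88, 75]

total = sum(scores)            # step 1: sum all the scores
n_scores = len(scores)         # N, the number of scores
mean = total / n_scores        # step 2: divide the sum by N

print(f"sum = {total}, N = {n_scores}, mean = {mean}")
```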

Measures of Central Tendency
• Steps to computing the median
1. Line up scores from highest to lowest
2. Count up to middle score
• If there is 1 middle score, that’s the median
• If there are 2 middle scores, median is their average

Median - The middlemost point in a set of data

Data Set 1: 100, 99, 99, 98, 97, 96, 90, 88, 85, 80, 79
Median: 96

Data Set 2: 100, 99, 98, 97, 86, 82, 78, 72, 70, 68
The median is 84 for this set. 84 represents the middlemost point in this set of data.
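The median steps (line the scores up, find the middle, average the two middle scores when the count is even) can be written as a short function. This is a minimal sketch; the function name is illustrative, and the two lists are Data Set 1 and Data Set 2 from the slide.

```python
def median(scores):
    """Middle score of the ordered list; average of the two middle scores if N is even."""
    ordered = sorted(scores, reverse=True)   # line up scores from highest to lowest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]               # one middle score
    return (ordered[middle - 1] + ordered[middle]) / 2   # two middle scores

data_set_1 = [100, 99, 99, 98, 97, 96, 90, 88, 85, 80, 79]   # 11 scores
data_set_2 = [100, 99, 98, 97, 86, 82, 78, 72, 70, 68]       # 10 scores

median_1 = median(data_set_1)
median_2 = median(data_set_2)
print(median_1, median_2)
```

Data Set 1 has an odd number of scores, so the sixth score is the median; Data Set 2 has an even number, so the median is the average of 86 and 82.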

Mode - The most frequently occurring score in a set of data.

Find the modes for the following sets of data:

Data Set 3: 99, 89, 89, 89, 89, 75
Mode: 89

Data Set 4: 99, 88, 88, 87, 87, 72, 70
88 and 87 are both modes for this set of data. This is called a bimodal distribution.
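Finding the mode, including the bimodal case, can be sketched by counting each score and keeping every score that reaches the highest count. The function name is illustrative; the lists are Data Set 3 and Data Set 4 from the slide.

```python
from collections import Counter

def modes(scores):
    """Return every score that occurs the maximum number of times."""
    counts = Counter(scores)
    highest = max(counts.values())
    return [score for score, count in counts.items() if count == highest]

data_set_3 = [99, 89, 89, 89, 89, 75]
data_set_4 = [99, 88, 88, 87, 87, 72, 70]

modes_3 = modes(data_set_3)
modes_4 = modes(data_set_4)
print(modes_3, modes_4)
```

Returning a list rather than a single value is a deliberate choice here, so that a bimodal set reports both of its modes.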

Measures of Variability (Dispersion)

Range- Distance between the highest and lowest scores in a set of data.

100 - 65 = 35

35 is the range in this set of scores.

Variance - Describes the total amount that a set of scores varies from the mean.

1. Subtract the mean from each score.

When the mean for a set of data is 87, subtract 87 from each score.

100 - 87 = 13
98 - 87 = 11
95 - 87 = 8
91 - 87 = 4
85 - 87 = -2
80 - 87 = -7
60 - 87 = -27

2. Next, square each difference (multiply each difference by itself).

13 x 13 = 169
11 x 11 = 121
8 x 8 = 64
4 x 4 = 16
-2 x -2 = 4
-7 x -7 = 49
-27 x -27 = 729

3. Sum these squared differences.

1,152 (sum of squares)

4. Divide the sum of squares by the number of scores.

1,152 divided by 7 = 164.5714

This number represents the variance for this set of data.

5. To find the standard deviation, find the square root of the variance. For this set of data, find the square root of 164.5714.

The standard deviation for this set of data is 12.83, or about 13.

Standard Deviation - Represents the typical amount that a score is expected to vary from the mean in a set of data.
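The five steps can be checked with a short script that uses the seven scores from the worked example. The variable names are illustrative. Note that this follows the slide in dividing by the number of scores (N), which gives the population variance.

```python
import math

# Scores from the worked example; their mean is 87.
scores = [100, 98, 95, 91, 85, 80, 60]
n_scores = len(scores)
mean = sum(scores) / n_scores

deviations = [score - mean for score in scores]   # step 1: subtract the mean
squared = [d * d for d in deviations]             # step 2: square each difference
sum_of_squares = sum(squared)                     # step 3: sum the squares
variance = sum_of_squares / n_scores              # step 4: divide by N
std_dev = math.sqrt(variance)                     # step 5: square root

print("mean:", mean)
print("sum of squares:", sum_of_squares)
print("variance:", round(variance, 4))
print("standard deviation:", round(std_dev, 2))
```

Dividing by N - 1 instead of N would give the sample variance, which is slightly larger for the same scores.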

Ceiling and Floor Effects
• Ceiling effects
– Occur when scores can go no higher than an upper limit and “pile up” at the top
– e.g., scores on an easy exam, as shown on the right
– Cause negative skew
• Floor effects
– Occur when scores can go no lower than a lower limit and pile up at the bottom
– e.g., household income
– Cause positive skew

Skewed Frequency Distributions
• Normal distribution (a)
• Skewed right (b)
– Fewer scores right of the peak
– Positively skewed
– Can be caused by a floor effect
• Skewed left (c)
– Fewer scores left of the peak
– Negatively skewed
– Can be caused by a ceiling effect

Understanding Descriptive Statistics

The Normal Distribution: A “bell-shaped” curve in which most of the scores are clustered around the mean; the farther from the mean, the less frequently the score occurs.

Bell Curve

Commonly Reported Test Scores Based on the Normal Curve

Z Scores
• When values in a distribution are converted to Z scores, the distribution will have
– Mean of 0
– Standard deviation of 1
• Useful
– Allows variables to be compared to one another even when they are measured on different scales, have very different distributions, etc.
– Provides a generalized standard of comparison

Z Scores
• To compute a Z score, subtract the mean from a raw score and divide by the SD

$$ Z = \frac{X - M}{SD} $$

• To convert a Z score back to a raw score, multiply the Z score by the SD and then add the mean

$$ X = (Z)(SD) + M $$
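Both conversions can be sketched as a pair of small functions. The function names are illustrative, and the scores are reused from the variance example earlier in the deck (mean 87), with the SD computed by dividing by N as on that slide.

```python
import math

def to_z(raw, mean, sd):
    """Z = (X - M) / SD."""
    return (raw - mean) / sd

def to_raw(z, mean, sd):
    """X = (Z)(SD) + M."""
    return z * sd + mean

scores = [100, 98, 95, 91, 85, 80, 60]
n_scores = len(scores)
mean = sum(scores) / n_scores
sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n_scores)

z_scores = [to_z(x, mean, sd) for x in scores]

# After conversion, the Z scores should have mean 0 and standard deviation 1.
z_mean = sum(z_scores) / n_scores
z_sd = math.sqrt(sum((z - z_mean) ** 2 for z in z_scores) / n_scores)

# Converting back should recover the original raw scores.
recovered = [to_raw(z, mean, sd) for z in z_scores]

print([round(z, 2) for z in z_scores])
print(round(z_mean, 6), round(z_sd, 6))
print([round(x, 6) for x in recovered])
```

Scores above the mean get positive Z scores and scores below it get negative ones, so a Z score shows both the direction and the size of the difference in SD units.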

Issues in Standardized Testing
• Standardized Tests, Alternative Assessments, and High-Stakes Testing
• Diversity and Standardized Testing

Issues in Standardized Testing

Alternative Assessments
• Assessments of oral presentations
• Real-world problems
• Projects
• Portfolios

Diversity and Standardized Tests
• Gaps on standardized tests have been attributed to environmental rather than hereditary factors
• Special concern in creating culturally unbiased tests

Crack the Case: Standardized Tests

1. What are the issues involved in this situation?

2. Examine Ms. Carter’s testing procedures. What does she do incorrectly? How might this reduce the validity of the students’ scores?

3. How would you answer each of the parents’ questions?

Reflection & Observation

Reflection:

What standardized tests have you taken?

How have these tests affected your perceptions of competence?

Observation:

What are some of the mother’s concerns regarding her son’s standardized test scores?

What error does the teacher make in interpreting one of the test scores? How would you explain this score?
