Assessing Teaching Quality Student Assessments

Charles A. Burdsal and Sandra Ranney, Wichita State University

Has evolved over a 30+ year period.
39-item questionnaire (+ 2 validity items).
Not self-administered.
Scales derived using factor analysis.
Biases dealt with.

Decide the intent of the instrument. Is it primarily for summative or formative purposes?

If the purpose is formative, almost anything a faculty member finds useful is fine.

If summative, i.e., used for such things as tenure, promotion, and salary adjustments, then we have a very different situation. Note that a good summative instrument may also have formative value.

While anyone who teaches may offer valuable content for such a questionnaire, it is important to involve someone with a significant background in questionnaire development.

Scales are more useful for evaluations than single items.

While there are other methods, factor analysis is a very useful tool to develop scales from items.
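As a rough illustration of the idea, here is a minimal sketch of deriving scales from item-level responses with exploratory factor analysis. The data, the number of factors, the loading cutoff, and the library choice are assumptions for illustration only, not the SPTE's actual development procedure.

```python
# Illustrative sketch only: grouping questionnaire items into scales with
# exploratory factor analysis. Data, factor count, and the .40 loading
# cutoff are hypothetical assumptions.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# 500 students x 39 items, 1-7 responses (made-up data)
responses = rng.integers(1, 8, size=(500, 39)).astype(float)

fa = FactorAnalysis(n_components=5, rotation="varimax", random_state=0)
fa.fit(responses)

# Assign each item to the factor(s) on which it loads at least |.40|
loadings = fa.components_.T  # shape: (39 items, 5 factors)
for factor in range(loadings.shape[1]):
    items = np.where(np.abs(loadings[:, factor]) >= 0.40)[0]
    print(f"Factor {factor + 1}: items {items + 1}")
```

With real item data, each factor's items would then be summed or averaged into a named scale (Rapport, Course Value, and so on).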

Course Difficulty (72%)
Course Workload (60%)
Class Size (60%)
Instructor Popularity (63%)
Student Interest in Subject Before Course (55%)
Reason for Taking the Course
Grading Leniency (68%)
Student’s GPA

“Perhaps for each large, representative, well designed study, there is another study, comment, or electronic bulletin-board message that relies on an atypical anecdote or an appeal to popular myth for its impact.” Marsh & Roche (2000)

We believe that three of Marsh’s biases can be combined in the concept of a priori motivation:

Instructor Popularity
Student Interest in Subject Before Course
Reason for Taking the Course

1. Prior to enrolling in this class, I expected it to be of (little or no value . . . . great value)

2. When I enrolled in this class, I wanted to take this course (very much . . . . not at all)

3. I took this course because I had to. (strongly agree . . . . strongly disagree)

4. I took this course because I was interested in the subject. (very much . . . . not at all)
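The slides do not show how these four items are combined, so the following is only a minimal sketch: it assumes a 1-7 response format and that items 2 and 4 are reverse-keyed so that a higher score always means greater a priori motivation. The function name and the example responses are hypothetical.

```python
# Sketch of forming an a priori motivation composite from the four items.
# The 1-7 format and the reverse-keying of items 2 and 4 are assumptions;
# the SPTE's actual scoring is not specified in the slides.
import numpy as np

def apriori_motivation(item1, item2, item3, item4, scale_max=7):
    item1 = np.asarray(item1, dtype=float)
    item3 = np.asarray(item3, dtype=float)          # already keyed toward motivation
    item2_r = scale_max + 1 - np.asarray(item2, dtype=float)  # "very much ... not at all"
    item4_r = scale_max + 1 - np.asarray(item4, dtype=float)  # "very much ... not at all"
    return (item1 + item2_r + item3 + item4_r) / 4.0

print(apriori_motivation(6, 2, 5, 1))  # one student's hypothetical responses -> 6.0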

Correlation of a priori motivation with each scale:

Scale                      r (All Classes)    r (Locally Normed)
Rapport                    +.343              +.292
Course Value               +.643              +.569
Course Design              +.322              +.289
Grading Quality            +.358              +.315
Difficulty                 -.113              +.002
Workload                   -.236              -.142
Perceived Quality Index    +.447              +.389
Course Demands             -.152              -.035

A priori motivation is not in the control of the instructor.

Its effect should be removed from other evaluation scales.

S.P.T.E. does so using regression techniques.
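The slides do not detail the regression model itself, so here is a minimal sketch of the general technique: regress a scale on a priori motivation and keep the residual as the corrected score. The data, the chosen scale, and the library are hypothetical.

```python
# Minimal sketch of a regression-based correction: remove the part of a scale
# score predictable from a priori motivation, keep the residual.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
motivation = rng.normal(0, 1, 400)                   # a priori motivation (hypothetical)
rapport = 0.5 * motivation + rng.normal(0, 1, 400)   # raw Rapport scale (hypothetical)

model = LinearRegression().fit(motivation.reshape(-1, 1), rapport)
rapport_corrected = rapport - model.predict(motivation.reshape(-1, 1))

print(np.corrcoef(motivation, rapport)[0, 1])            # sizable before correction
print(np.corrcoef(motivation, rapport_corrected)[0, 1])  # near zero after correction
```

The corrected (residual) scores are what would then be normed and reported.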

Marsh & Roche (2000) and Greenwald (1997) review concerns about the relation of Course Demands & Difficulty to student evaluations.

The harder the course, the worse the ratings.

Scale                      r (Uncorrected, All Classes)    r (Corrected, Locally Normed)
Rapport                    -.209                           -.137
Course Value               -.078                           +.038
Course Design              +.027                           +.121
Grading Quality            -.126                           -.064
Perceived Quality Index    -.038                           +.059

It is frequently believed that the larger the class, the poorer the ratings.

Correlation of class size with each scale:

Scale                      r (All Classes)    r (Local Norms)
Rapport                    -.200              -.180
Course Value               -.234              -.212
Course Design              -.075              -.059
Grading Quality            -.151              -.125
Difficulty                 +.009              -.039
Workload                   -.100              -.110
Perceived Quality Index    -.176              -.154
Course Demands             -.040              -.066

As most student evaluation instruments are administered before grades are assigned, we are really talking about expected grades.

A great deal of concern as to the relation of expected grades to student evaluations is seen in the literature – Greenwald & Gillmore (1997), Greenwald (1997), Marsh & Roche (2000), etc.

1. Teaching effectiveness influences both grades and ratings.

2. Students’ general academic motivation influences both grades and ratings.

3. Students’ course-specific motivation influences both grades and ratings.

4. Students infer course quality and own ability from received grades.

5. Students give high ratings in appreciation for lenient grades.

Correlation of expected grade with each scale:

Scale                      r (Uncorrected, All Classes)    r (Corrected, Locally Normed)
Rapport                    +.415                           +.216
Course Value               +.495                           +.157
Course Design              +.252                           +.059
Grading Quality            +.427                           +.232
Difficulty                 -.416                           -.288
Workload                   -.196                           -.040
Perceived Quality Index    +.401                           +.156
Course Demands             -.371                           -.232

The relationship of expected grades & teaching evaluations is probably not one-dimensional.

By removing a priori motivation from the NORMED scales, one noticeably reduces their correlation with expected grades.

This probably supports the third of Greenwald’s models: students’ course-specific expectations affect the ratings of that course.

The remaining correlations seem rather small and hopefully are related to learning.
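One way to see the reasoning above is the standard partial-correlation formula: the scale-by-grade correlation with a priori motivation partialled out of both variables. The sketch below uses made-up correlation values purely for illustration.

```python
# Sketch: correlation of a scale (x) with expected grade (y) after partialling
# out a priori motivation (z), computed from the three pairwise correlations.
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z partialled out of both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical values: removing motivation noticeably shrinks the correlation.
print(partial_r(r_xy=0.50, r_xz=0.60, r_yz=0.45))  # about .32, down from .50
```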

We decided to examine open-ended comments to see if they could help.

The questions asked on the comment sheet:

What could the instructor do to improve the course?
What did you like about the course and/or instructor?
Please comment on the effectiveness of computer-aided instruction, if it was used in the course.
Any additional comments?

250 sections sampled randomly.

24 opted not to participate.

13 excluded because of various problems.

Two excluded because we forgot to enter them.

The 250 sections were sampled from the Fall 2002 administration of the SPTE at WSU.

The Social Science Lab staff copied the comments and returned the originals to the faculty member.

They then read all comments eliminating anything in a comment that might identify the faculty member.

Spring & Summer 2003: data unitized, entered in Excel, and rated.

Fall 2003: data analyzed.

Comments were unitized and entered in Excel.

14,313 comments in all after eliminating “not applicable” responses.

After entering, comments reviewed for proper unitizing.

Valences of 1 to 5 were assigned to each comment.

1 the most negative; 5 the most positive.

Examples of each follow:

You suck!

I wish I never came to this University

She makes me feel foolish in front of class.

Slow down a little bit.

Try and set more deadlines and stick to them.

Let us know what our grades are at least by mid-semester.

Bring treats!

Blackboard was used in the course.

I type my papers on the computer.

His knowledge of the subject is very good.

I like the fact that the homework was relevant to the exams.

The instructor is enthusiastic.

You gave me so much personal attention – THANK YOU!

Best accounting class I’ve had at WSU so far!

He’s one of the best instructors I’ve had at WSU!

Average valence was computed for each section.

The mean was 3.4.

Mean valences for each section ranged from 2.2 to 4.2.

Standard deviation was 0.377.

Correlations of section mean valence with the overall SPTE score (PQI):

Uncorrected University PQI    .803
Uncorrected Local PQI         .758
Corrected University PQI      .790
Corrected Local PQI           .746
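A minimal sketch of the computation behind these correlations: average the 1-5 comment valences within each section, then correlate the section means with the sections' PQI scores. The DataFrame, column names, and values are hypothetical.

```python
# Sketch: per-section mean comment valence correlated with per-section PQI.
import pandas as pd
from scipy.stats import pearsonr

comments = pd.DataFrame({
    "section": ["A", "A", "A", "B", "B", "C", "C", "C"],
    "valence": [4, 5, 3, 2, 3, 4, 4, 5],
})
pqi = pd.Series({"A": 0.6, "B": -1.1, "C": 0.9})  # PQI z-scores per section (made up)

mean_valence = comments.groupby("section")["valence"].mean()
r, p = pearsonr(mean_valence, pqi.loc[mean_valence.index])
print(round(r, 3))
```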

There is a strong positive relationship between the valence of comments and the overall SPTE score (PQI).

Comments do go with the scales.

Yet, it still doesn’t feel right.

Mean Valence    Mean PQI Z-Score (Scale Score)
2.2 to 2.7      -2.348 (.80)
2.7+ to 3.2     -.679 (4.14)
3.2+ to 3.7     +.199 (5.90)
3.7+            +.871 (7.24)

People are probably being too negative in interpreting SPTE (or probably any SETE).

SPTE results need some interpretive help so that they reflect the comments.

Adjective Based on Comments    PQI Scale Score Range
Low                            Below -1.52 (2.47)
Good                           -1.51 (2.48) to -.255 (5.02)
Very Good                      -.254 (5.03) to .535 (6.57)
High                           Above .535 (6.57)
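A small sketch of applying the interpretation table above: map a section's PQI z-score to an adjective. The cutoffs come from the table (treated as contiguous bins); the function itself and its name are only an illustration.

```python
# Sketch: comment-anchored adjective labels for PQI z-scores, using the
# cutoffs from the table above (boundaries approximated as contiguous bins).
def pqi_adjective(pqi_z):
    if pqi_z < -1.52:
        return "Low"
    elif pqi_z <= -0.255:
        return "Good"
    elif pqi_z <= 0.535:
        return "Very Good"
    else:
        return "High"

for z in (-2.0, -0.7, 0.2, 1.1):      # hypothetical section scores
    print(z, pqi_adjective(z))
```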

Some of us are really trying and it seems like it's never ever good enough.

I am sorry you don't seem to enjoy what you teach. It must make life rough.

I felt I was robbed from this class and what I could gain if another had taught.

I believe that I would of gotten more out of it if my dog taught the class.

I fear that because of the poor teaching of the instructor, that I am going to struggle through many classes in the future.

I felt that the instructor was rude most of the time.

She/he's nice

Make her lectures more interesting

Very friendly

When they accuse someone of academic dishonesty have proof!

She/he could have graded more effectively.

It challenged me to be one step ahead, instead of one step behind.

Have more than one major paper.

XXXX was very nice and understanding.

She/he was fun

Be clear about what is on the tests!

Well organized

Instructor is very well educated with the subject.

The instructor helped the students understand the material through his classroom lectures.

Wrote very good notes.

I couldn't get detailed enough answers to some of my questions.

That she/he was cool

The instructor covered the topics supposed to be covered very well with what the topics were suppose to be.

The instructor made learning fun and understandable.

I truly appreciated your ideas and real-life experience.

His mastery of the subject matter was superb.

(What would you change?) Nothing, she/he did a great job!!

I really enjoyed this class and was upset when I had to miss it.

She/he is exceptionally clear about what she/he expects from her students.

The instructor was wonderful.

He/she is enthusiastic and passionate about this subject which helps in focusing the students on the subject of XXX.

I don't think there is anything he/she could do to improve the course.

Get people skilled in questionnaire development (not just people good with surveys) involved in producing your instrument.

All results should be norm-based, with the emphasis on scales rather than single items (a brief norming sketch follows these recommendations).

Correct for known sources of bias beyond the instructor’s control.

Deal with the issue of negative skewness of quality scales.
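As referenced in the norming recommendation above, here is a minimal sketch of norm-based reporting: the same raw scale mean expressed as a z-score against all classes and against a local norm group. All data and group labels are hypothetical.

```python
# Sketch: norm-based reporting of a scale mean against two norm groups.
import numpy as np

def z_score(raw, norm_group):
    norm_group = np.asarray(norm_group, dtype=float)
    return (raw - norm_group.mean()) / norm_group.std(ddof=1)

all_classes = [5.1, 5.6, 6.0, 4.8, 5.9, 6.3, 5.4]   # Course Value means, all classes (made up)
local_norm = [5.8, 6.0, 6.2, 5.9, 6.1]              # same scale, one department (made up)

raw = 5.9
print(z_score(raw, all_classes))  # relative to the whole university
print(z_score(raw, local_norm))   # relative to the local norm group
```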
