I love student evaluations. I hate student evaluations.
Every semester, at the end, I’m required to go through this rigmarole where we give students an opportunity to evaluate our teaching by handing out a standardized form with a Likert scale for telling us how wonderful or awful we are. It’s useless. They get to color in little dots that put us on a scale of quality, and most students don’t seem to enjoy it. I’ve also noticed that the way they score the teacher reflects how well they’re doing in the class more than how well they were taught. I could easily boost my score by giving out more A grades.
And, unfortunately, these scores are taken way too seriously by our review committees. I’ve seen committees split hairs over a hundredth of a point, or compare faculty on the basis of sample sizes of fewer than 10 students. Worst of all, I’ve been in meetings where faculty seriously insist that every instructor ought to be getting above-average scores on student evaluations. And you can’t speak out against the evaluations, because then the committees will get revenge by carefully scrutinizing your scores.
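To put a rough number on the hair-splitting, here is a minimal sketch in Python. The two sets of ratings are invented for illustration (they are not real evaluation data), and `mean_and_standard_error` is just a helper name I made up; the only point is to show how uncertain an average is when it comes from eight students on a 1-to-5 scale.

```python
import math
import statistics


def mean_and_standard_error(scores):
    """Return the mean of the scores and the standard error of that mean."""
    n = len(scores)
    mean = statistics.mean(scores)
    # Sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(scores)
    return mean, sd / math.sqrt(n)


# Hypothetical ratings from two small sections, on a 1-5 Likert scale.
section_a = [5, 4, 4, 3, 5, 4, 2, 5]
section_b = [4, 4, 5, 3, 4, 5, 4, 4]

for name, scores in (("A", section_a), ("B", section_b)):
    mean, se = mean_and_standard_error(scores)
    print(f"Section {name}: mean = {mean:.2f}, standard error = {se:.2f}")
```

With these made-up numbers the two averages differ by about an eighth of a point, while the standard error of each average is roughly a quarter to a third of a point. A gap of a hundredth of a point between two instructors is far inside that noise. Treating Likert responses as numbers you can average is itself an assumption, which is one more reason not to read much into the decimals.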
In their defense, though, people have argued for years that student evaluation scores are positively correlated with academic effectiveness. Only that turns out not to be necessarily true.
A new study suggests that past analyses linking student achievement to high student teaching evaluation ratings are flawed, a mere “artifact of small sample sized studies and publication bias.”
“Whereas the small sample sized studies showed large and moderate correlation, the large sample sized studies showed no or only minimal correlation between [student evaluations of teaching, or SET] ratings and learning,” reads the study, in press with Studies in Educational Evaluation. “Our up-to-date meta-analysis of all multisection studies revealed no significant correlations between [evaluation] ratings and learning.”
These findings “suggest that institutions focused on student learning and career success may want to abandon SET ratings as a measure of faculty’s teaching effectiveness,” the study says.
Oh, please, yes, make it so. Kill these things. Not only would it stop wasting our time, but it would end pointlessly innumerate conversations in faculty meetings.
But wait, I also said I love student evaluations. I do! But not the numbers. Our forms also have an open space for free-form student comments, and those are often very useful. They’re also abused: one year a group of students colluded to write the same thing on every form, “This class taught me to love Jesus even more”, because of my reputation as an atheist. I hadn’t mentioned anything, pro or con, about Christianity in the course; it was a cell biology class, although I had brought up evolution quite a bit. But the comments also tell me what students found memorable or problematic. That’s good to know, and I try to reduce the problems and use the memorable strategies more in subsequent classes.
Also, believe it or not, grades aren’t just a way of punishing and rewarding students. I have goals for my courses, and they also tell me if I’m getting essential concepts across. So, for instance, the first exam in my cell bio course this term was intended to evaluate whether students had a good grasp of basic general chemistry; if they didn’t, I would have to go over redox reactions yet again before I plunged into oxidative phosphorylation. There’s no point in pushing on into more complex topics if they don’t have a good grip on the basics. (I’m relieved to say they did surprisingly well on the first exam, so our general chemistry course has clearly prepared them well.)
There are better ways of assessing whether a course is accomplishing its goals than handing students who don’t see the big picture a Likert scale and asking them to state whether the course and teacher are good or not. And do I even need to go into the superficial biases that color SETs? It matters whether you are good-looking or not, and students are nests of gender biases. I know that I benefit from being male (I’m not judged on appearance as much) but suffer a bit from being older and less attractive. But those are things that shouldn’t matter at all in judging teaching effectiveness.