I have a feeling I’ll just be ignored if I mention this to the administration

Student evaluations suck, mostly.

Imagine that you’re up for a promotion at your job, but before your superior decides whether you deserve it, you have to submit the comments section of an internet article that was written about you for assessment.

Sound a little absurd?

That’s in essence what we ask professors in higher education to do when they submit their teaching evaluations in their tenure and promotion portfolios. At the end of each semester, students are asked to fill out an evaluation of their professor. Typically, they are asked both to rate their professors on an ordinal scale (think 1–5, 5 being highest) and provide written comments about their experience in the course.

We’ve repeatedly seen studies that show that student evaluations are skewed to favor popularity and attractiveness of the professor (damn, I lose), and this article points out that there is a gender bias as well: male professors tend to get higher ratings than female professors (so that’s how I’ve managed to get along), and that means these evaluations are discriminatory. And therefore illegal. Cool.

Now I said that student evals suck, mostly. They’re sometimes helpful — not the goofy numerical scores, and I ignore comments that whine about how hard the class is — but the productive, thoughtful comments can be very helpful. If a student says “X worked for me, Y didn’t”, I’ll seriously reconsider X and Y.

With the last batch of evaluations I got back, I just ignored the numerical scores and browsed through the comments for practical concerns. I got one: there’s a lot of grade anxiety out there, and they really wanted the gradebook available online, so they could see exactly where they stand, point by point. OK, I can do that. Not with our existing software, Moodle, in which the gradebook is a confusing nightmare, but we’re switching to new courseware next year, so I’ll look into it.

I guess it’s good that that was the biggest problem they had with the course. But it’s the stupid numbers that the administration will care about.


  1. MAJeff says

    Administrators are innumerate enough to think that the overall averages have mathematical meaning. Educators recognize that the comments, and patterns of answers, are FAR more useful than the nonsense of those averages.

    Managerial expedience and innumeracy make for useless faculty evaluations.

  2. weylguy says

    I retired from teaching before the social media craze took effect, so I never had to submit any formal evaluations from students. Myers sees the process as discriminatory and illegal. I completely agree, but I also feel it’s just an excuse to pretend that this rampant social media nonsense is a useful educational tool. In reality, most social media today is worthless garbage, another example of the dumbing-down of our society.

  3. Azkyroth, B*Cos[F(u)]==Y says

    Worse; everywhere I’ve been they make you fill them out by hand. They’re supposed to be anonymous but some students have very distinctive handwriting… :/

  4. says

    It’s a widespread problem. If I’m happy I’ll rate someone a 7 or 8; anything higher has to be really extraordinary. But from what some service people have said, only a 10 would help them. It is crazy if that is the case.

  5. whywhywhy says

    This is just another example of lazy management. Similar is the model where the lowest-achieving 10% of employees are fired, which leads to less cooperation within a group and more stress (and of course the ranking is determined in part by the supervisor’s opinion). Where is the understanding of whether an employee provides value in any of these approaches?

  6. Raucous Indignation says

    4&5 @robertbaden Evaluations are a joke. They are not on a scale although it would seem like they are. They are binary. A five point scale is seen as 0-0-0-0-1. A 10 point is 0-0-0-0-0-0-0-0-0-1. Every time you give a thoughtful measured evaluation that’s less than the maximum score, you’re actually giving someone or something a zero. I learned that more than 20 years ago during my medical training. Giving less than a 5 to any medical student would get me called in to the program director’s office to defend myself. “He is a lazy, indifferent and arrogant medical student not interested in doing the work and wholly lacking in compassion and self-awareness,” was NOT the correct answer. I only had to be taught that lesson once. All future evaluations were straight 5s with a single token 4 thrown in as an act of defiant raucous indignation. Never wrote another comment that was less than glowing.
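
A rough sketch of the "0-0-0-0-1" reading described in this comment, assuming the reviewer only counts maximum scores (sometimes called a "top-box" score). The ratings and the helper name `top_box_share` are made up for illustration, not taken from any real evaluation system:

```python
from statistics import mean


def top_box_share(ratings, scale_max=5):
    """Fraction of ratings equal to the maximum; every other rating counts as zero."""
    return sum(1 for r in ratings if r == scale_max) / len(ratings)


# Made-up ratings on a 1-5 scale, for illustration only.
ratings = [5, 4, 4, 5, 3, 4, 5, 4]

average = mean(ratings)           # treats a 4 as "almost a 5"
top_box = top_box_share(ratings)  # treats a 4 exactly like a 1

print(f"mean rating: {average}, top-box share: {top_box}")
```

Under the top-box reading, a thoughtful 4 contributes exactly as much as a 1, which is the complaint above.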

  7. Ganner says

    I always tried to be helpful and give constructive feedback on my course evaluations. But I can definitely see how they are a seriously flawed instrument to use for tenure evaluations. The only place I can see that they may have any use is if the same legitimate, constructive complaints appear term after term. But students are going to give bad scores to teachers who are “too tough” or are foreign and maybe not as easy to understand. And I can also definitely see where gender bias can creep in.

  8. says

    I had a few truly appalling professors back in the day, so bad that they still stick in my head an embarrassing number of years later.

    Those evaluations were the only time we students ever had even the slightest chance to be listened to. Mostly we didn’t believe them anyway, because they were obviously going to be either handed to the bad professor or just thrown in the trash anyway.

    I know, because the one time I actually had a more urgent problem with a particular course being problematic and spoke to the faculty’s dean of students about it, he responded by accusing me of being a communist and sending me for a psychiatric evaluation.

    yeah, i WISH that was an exaggeration…

  9. Nentuaby says

    Weylguy: I don’t understand the association you’re making with “social media?” I was completing course evaluations in pencil on paper back when a facebook was a paper phonebook with pictures.

  10. williamhyde says

    The first course I taught was a disaster. And I knew it was.

    But the terrible ratings were useful in that they let me know that this was no lesser disaster. Very motivating (we academics don’t like to get low marks!). The comments didn’t help with specific criticisms – but I had a pretty good idea of the various things that hadn’t worked.

    The next time I taught that course I got exceptionally good reviews. These didn’t help me improve the course, but were useful in dealing with the administration.

    It was commonly believed that if class reviews were over 4.5, the class was too easy. I did later get a 4.6, but argued that due to the small sample size, this wasn’t significant.

  11. Callinectes says

    As a student I hated filling them out as well. I didn’t feel qualified to evaluate the performances of these professionals.

  12. says

    I found the written comments on student evaluations very useful. It is also useful to compare the numerical scores from year to year — you do see trends, some of them encouraging. On our evaluations there were about 18 areas on which you got evaluated, from “student confidence in instructors knowledge” to “answers to student questions”. Which of those I got higher scores on stayed remarkably the same from year to year, sending a very clear message on what needed more attention.

  13. jrkrideau says

    “I have a feeling I’ll just be ignored if I mention this to the administration”

    First human rights case will get their attention.

  14. says

    weylguy, I’m really confused about your comment, I just do not see what this issue has to do with social media. Evaluations like this pre-date social media, administrators have been overly relying on them for a long time. When I was a student we filled out evaluations that are just as this article describes and social media really wasn’t a thing back then.

  15. Raucous Indignation says

    @15 Travis, it’s okay; weylguy is just upset ’cause some kids are on his lawn.

  16. Tualha says

    Regarding the issue of an online gradebook: Taylor Mali’s book What Teachers Make mentions something similar he did. Not sure, but the system he developed may be available for others to use. It’s a great read for its own sake anyway.

  17. dbinmn says

    Careful what they [students] wish for. Here in high school land, we use grade software that keeps students informed practically down to the minute, and their anxiety is through the roof! Some have alerts set on their cell phones to tell them when I have added a new grade, and a few will rush to my room freaking out about the score within minutes of my hitting save. If I grade late at night, I will have emails before I log out to head to bed. They demand that online quizzes be set to send them results seconds after they hit submit, and they then expect that merely clicking buttons is all they need to learn (“can I take it again?”). These smart phones with their instagrams, and snapchats and likes and streaks and alerts really are only anxiety-enhancing devices.

  18. microraptor says

    Of course, the other issue with really bad teachers was that students tended to drop the class before the end so they didn’t get evaluated. I’ve had more than a few classes that I had to drop due to failing grades that were 100% due to the professor’s teaching style, since when I retook the class from a different teacher I got an A.

  19. gorobei says

    “Imagine that you’re up for a promotion at your job, but before your superior decides whether you deserve it, you have to submit the comments section of an internet article that was written about you for assessment.”

    That sounds like the promotion process for most mid-level jobs in industry, and perhaps more fair in some ways. The only way the industry promotions process might be better is that the promotion committee might be large and somewhat diverse, with an HR person sitting in as a non-voter trying to keep the process reasonably bias-free.
    Oh, and each candidate is assigned a due diligencer – an independent person who interviews each comment provider to ensure it’s somewhat objective and not propaganda. Maybe industry does do it a little better?

  20. colinday says

    I’ve seen means of such data, which would make them at least interval, if not ratio.
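
A small sketch of why averaging ordinal ratings is shaky, using made-up classes and a hypothetical order-preserving relabeling of the five categories. If the ratings are only ordinal, any relabeling that keeps the order is equally valid, yet it can flip which class has the higher mean while leaving the medians alone:

```python
from statistics import mean, median

# Made-up ratings for two classes on a 1-5 ordinal scale.
class_a = [3, 3, 3, 3]
class_b = [2, 2, 2, 5]

# Hypothetical relabeling: same order of categories, different spacing.
relabel = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}
class_a_relabeled = [relabel[r] for r in class_a]
class_b_relabeled = [relabel[r] for r in class_b]

# Does class A beat class B? Compare before and after relabeling.
a_wins_by_mean_before = mean(class_a) > mean(class_b)
a_wins_by_mean_after = mean(class_a_relabeled) > mean(class_b_relabeled)
a_wins_by_median_before = median(class_a) > median(class_b)
a_wins_by_median_after = median(class_a_relabeled) > median(class_b_relabeled)

print(a_wins_by_mean_before, a_wins_by_mean_after)      # the mean comparison flips
print(a_wins_by_median_before, a_wins_by_median_after)  # the median comparison does not
```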

  21. says

    Damn, abbey @9, that’s for real. Back in the day my dad got beat up for suspected communism by a teacher. Wander home with a concussion style. Never had a fascist teacher or administrator that I knew of myself.

    I’ve seen student feedback possibly cost an adjunct professor their job before once. Teacher was being rude with the students and erratic to the extent people reasonably suspected she was on drugs. One little student essay on the subject, sent the right way, and the lady was gone next year. May have just been a coincidence tho. Adjuncts are like kleenex to schools.

  22. emergence says

    Hey PZ, I don’t know if you’re going to see this, but I found a news story that I think you should know about. In a nutshell, a shithead right wing law professor at UPenn wrote an op ed where she attacked affirmative action and claimed that barely any of her black students have passed in the top half of her classes.

    Here’s Vox’s coverage of this.

    I know this is a little off topic for this thread, but since you’re also a college professor I figured that you’d be able to provide some counterexamples to the claims that Wax is making. You might want to address this in a future post.

  23. emergence says

    abbeycadabra @9

    What the fuck?! What year was that? If this didn’t take place in the 50s or 60s, I’m going to be very depressed.

  24. methuseus says

    “If a student says “X worked for me, Y didn’t”, I’ll seriously reconsider X and Y.”

    I’m sure you already know this, PZ, but be careful implementing that, as Y might work for other students, so blending styles works best for the majority of students. As with anything, you need to take the mean and median of the answers to see how you can be most effective.

  25. Azkyroth, B*Cos[F(u)]==Y says

    “I’m sure you already know this, PZ, but be careful implementing that, as Y might work for other students”

    In particular, consider the Opera Problem.

  26. vole says

    Inappropriate metrics again, one of the great curses of present day society. The assumption that the things you can measure easily must be the important things. Hence Thatcherism, and administrative and political idiocies everywhere.

  27. davidw says

    Class evaluations are like having prisoners evaluate the guards. They gotta be taken with a HUGE grain of salt. It’s worse now that my institution has gone on-line – with written, in-class evals, you at least had a captive audience and most students filled them out (for better or worse). Now they have to actively access the evaluations, which means only the motivated students will do it – most of them motivated by the desire to trash the instructor, a lesser amount motivated to praise the instructor.

    That said, when we had paper evals, only the numerical averages were distributed to the administration. With on-line evals, they get to see the comments, too. WOO BOY, some of my (truly incompetent in the classroom, they are) colleagues are *finally* getting the attention they deserve! Our state recently went to a funding formula based on graduation rate instead of enrollment count, so now the administration wants students to be SUCCESSFUL, and if they’re not, guess whose fault it is??? Yes, some schadenfreud (sp?) on my part, and yes, I recognize it’s an improper response to an improper action by the admin to an imperfect process that too many constituencies take too seriously.

    I’m only glad that there’s light at the end of my tunnel and that I plan to be out soon.

  28. evodevo says

    “Now I said that student evals suck, mostly. They’re sometimes helpful — not the goofy numerical scores, and I ignore comments that whine about how hard the class is — but the productive, thoughtful comments can be very helpful.”
    Yes. This. I taught zoology non-major labs at UK for 20 years, and this is JUST what I found. “Why don’t you grade on the curve?!!11!!!” did NOT register with me, but a thoughtful critique of a certain element of instruction was appreciated. However, MBA pencil pushers demand METRICS, so you have the idiot scale of approval. However, I was so lowly a peon, I don’t think my approval ratings even pinged the sonar lol

  29. wcorvi says

    To show just how much the scores mean, here’s an example. When I taught at University of Northern Iowa, the evals were a word, and then five choices as to how much the student agreed with it. One word was ‘liverality’. I did fairly well on that one, but decided to look it up; it wasn’t in the dictionary. I asked the chair what it meant, and he didn’t know, but he found out that it was a typo, should have been ‘liberality’. For many years the faculty was being evaluated on something that didn’t even exist, and NO ONE asked about it – students, faculty, administration!

    That’s how meaningful the process is. The student responses are based on how much they like you. To get good evals, make them like you.

  30. wcorvi says

    Another couple of comments. At Northern Arizona University, the faculty would take the forms into lecture, and near the end, would pass them out, telling the students that they should put the filled forms into the envelope, and the last one should take it to the secretary. I know of cases where faculty realized some students wouldn’t be there that day, so they would fill out the extra forms ahead of time, and put them in the envelope before they left the room. Nothing like stacking the deck.

    But, just as bad, the maths department’s statistician made the case that the means of different faculty, when compared in light of the ‘standard deviation (SD)’, should use the ‘SD of the mean’, i.e. the SD divided by sqrt(N), N being the number of students filling out the form. That is a lot smaller than the simple SD. The problem is, it’s the wrong SD in the first place. To compare my score with other faculty, we need to know the SD of the FACULTY AVERAGES! The SD given by the computer tells how narrow the consensus is in the class, not how the score compares to other faculty! And this was the university’s statistician!
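
A minimal sketch of the three quantities being distinguished here, using made-up ratings for three hypothetical instructors (the names "A", "B", "C" and all numbers are assumptions for illustration):

```python
from math import sqrt
from statistics import mean, stdev

# Made-up 1-5 ratings for three hypothetical instructors.
ratings_by_instructor = {
    "A": [5, 4, 4, 5, 3, 4],
    "B": [3, 3, 4, 2, 3, 3],
    "C": [4, 5, 5, 4, 4, 5],
}


def within_class_sd(ratings):
    """Spread of opinion inside one class (what the computer printout reports)."""
    return stdev(ratings)


def sd_of_the_mean(ratings):
    """Within-class SD divided by sqrt(N): uncertainty of that one class average."""
    return stdev(ratings) / sqrt(len(ratings))


# The quantity needed to compare instructors: the spread of the class averages.
instructor_means = {name: mean(r) for name, r in ratings_by_instructor.items()}
between_instructor_sd = stdev(list(instructor_means.values()))

for name, ratings in ratings_by_instructor.items():
    print(name, instructor_means[name], within_class_sd(ratings), sd_of_the_mean(ratings))
print("SD of the faculty averages:", between_instructor_sd)
```

The first two numbers describe a single class; only the last one says how far apart instructors typically are from each other.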

  31. rietpluim says


    “make them like you”

    Note the possible alternative meaning in this sentence…

  32. says

    emergence @25

    “What the fuck?! What year was that? If this didn’t take place in the 50s or 60s, I’m going to be very depressed.”

    Grab your SSRIs, this was 2004. Jeez, my PARENTS weren’t old enough for university in the 50s and 60s.

    It was the business faculty, though. So, not really your quality thinkers, with double the arrogance of anyone else.

  33. npb596 says

    I’m surprised that someone who wrote this quote (“Next step: make your parents and school officials intensely uncomfortable, throw off the chains, and fight for changes they dislike. Vote. March in the streets. Say rude words to old white men in power. Flip the bird at the president of the United States — he does not deserve respect. Question everything.”) about a week ago is now saying that he ignores his students comments when they “whine” about his class and that their own evaluations “suck”. Although I guess they only “mostly” suck so really there’s no issue, right?

  34. paxoll says

    In med school we had a nice excel form handed out each semester where we could input our grades and it would calculate our current grade based on our current scores as well as calculate what we needed for the remaining grades to receive a given grade in the subject. It was very nice knowing that, for example, you had a B going into the final exam, you had to score over a 35% to keep your B and needed a 98% to bump up to an A. It allowed us to focus our studying to subjects where we needed help or where we could make the best improvement.
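
A minimal sketch of the kind of calculation such a spreadsheet might do, assuming a simple weighted-average grading scheme. The function name `required_final_score`, the weights, and the grade cutoffs are all hypothetical, not the actual med school form:

```python
def required_final_score(earned, final_weight, target):
    """Percent needed on the final exam to finish the course at `target` percent.

    earned: list of (score_percent, weight) pairs for completed work.
    final_weight: weight of the final exam; all weights are assumed to sum to 1.
    """
    points_so_far = sum(score * weight for score, weight in earned)
    return (target - points_so_far) / final_weight


# Hypothetical course: two midterms worth 30% each, final worth 40%.
earned = [(88, 0.3), (84, 0.3)]

needed_for_b = required_final_score(earned, final_weight=0.4, target=80)  # roughly 71
needed_for_a = required_final_score(earned, final_weight=0.4, target=90)  # roughly 96

print(f"Final exam score needed for a B: {needed_for_b:.1f}%")
print(f"Final exam score needed for an A: {needed_for_a:.1f}%")
```

A result above 100 would mean the target is out of reach, and a result at or below 0 would mean it is already locked in.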

  35. blf says

    Like others have mentioned, when I was a student, I always tried to fill them in honestly, critically, but with what I hoped was useful feedback.


    In my last job at BigDummieCo, they had a programme which every other(?) year sent someone to do a face-to-face interview with each person. Whilst there was one special topic each round, almost any topic could be raised during the interview. The results were supposedly anonymized in a written report.

    The interviews were in English. Oops! This is France. Whilst most people can speak English (at the site where I worked), the degree of competence varies, so from the start interviewees would be (unconsciously?) given different weight, simply due to the language. (I do not recall if any interpreters could be present (who in any case would be other people scheduled for an interview).) Plus the interviewer knew the job title / position of each person, albeit the interviewer insisted this didn’t affect the report.

    The thing which particularly irked me was people whose native language is French, even if they are fluent in English, will still, to varying degrees, use words and phrasing — and spelling, grammar, punctuation, etc mistakes — very different from native English speakers. That is, a native English speaker can often make a very good guess if some unknown / anonymous speaker or author is a native French speaker fluent in English, or a native English speaker. (This also applies vice versa: ignoring accent and (mis-)pronunciation and poor spelling and typos, it’s still immediately obvious French is not my native language, and very probably that English is.) I suppose that if the interview results were truly anonymized, this might not matter.

    But they weren’t truly anonymized: First, the site was known. (For a very small site such as the one I worked at, this means there is no “hiding in the crowd”.) Second, the interviewer admitted to me when I asked that they did sometimes use (unattributed (anonymous)) quotes from the interviews in the report. As per above, that means the words and phrasing used could suggest the quoted person, perhaps especially in my case, as I was the only native English speaker. I not only had no protection “in the crowd”, but because any “anonymous” quotes from me probably wouldn’t sound like they were made by a native French speaker, I stood out.

    Third, the interviewer knew who said what. Even if they managed to avoid leaking identities via the “anonymous” quotes, other hints could easily sneak into the report. That is, the process was not “double-blinded” (albeit I admit I have no idea how to “double-blind” interviews).

    And fourth, the above-mentioned possible bias due to job title; that is, e.g., a manager might be weighed differently than a rookie, or whatever. Again, the lack of “double-blinding”…



    One corporate department in this same company would send out surveys asking for feedback on their services. This was not only not anonymized in any fashion whatsoever, they were usually just multiple-choice with three possible answers: Excellent, Very Good, and Good.

    Seriously! There was no Poor, or Terrible, or even a catch-all Other-fill-in-the-blank.

    I complained about this several times via different channels, including the above pseudo-anonymous interviews, but it never changed.