Generalizing Behavior Between Species – Part 1


Observe the dog; it enjoys bacon. We might be able to conclude that a wolf would enjoy bacon, too. But “might” is the key word there – can we really? I’d say it’s highly likely, but that’s based on other things I already know about wolves and dogs – namely, that they are omnivores.

Why, then, would I run a complex and abusive experiment to determine whether dogs like bacon? That would be a waste of time, right?

Popular psychology is full of experiments that don’t make much sense, because they don’t teach us anything. If we could be sure that experiments run on a mouse would produce the same (or nearly the same) results if they were run on a human, then we’d have discovered something valuable: mice are a reliable proxy for human behavior. That’s absurd, though.

Let’s imagine a scenario: our hypothesis is that chimpanzees are a proxy for human behavior. We put groups of chimpanzees in a mildly stressful situation – they watch a violent movie – and we measure whether they are more likely to demonstrate conflict afterward. Let’s say that our results are mostly positive, so we publish them somewhere. The next week some popular media website runs an article entitled “chimp study shows that gamers get more violent after playing violent games!” I’d say that, obviously, the website got it wrong (a lot of science reporting is terrible like that) but what was our study really going to show? OK, disclosure, I cheated: I didn’t say whether the chimps were bonobos or Pan troglodytes, or a mix. What if we re-ran the experiment with only bonobos, and discovered that grooming behaviors and blowjobs increased, and that Pan troglodytes actually do show more aggression? I’m just making this up, of course. I don’t know. And neither does anyone else.

That’s one of the big problems (to me) with psychology and how it is presented to the public. For the sake of argument, let me loosely define “pop psychology” as “the generally over-interpreted and inaccurate public understanding of psychology theories and research” as opposed to “psychology as a science” which is the part that is trying to do research based on evidence and the scientific method. Several times, when I’ve attacked psychology, people have attempted to draw a line between “pop psychology” and “science psychology” so let’s just assume that line exists and that we can mostly accurately distinguish pop psychology from science psychology. In other words, I am willing to imagine that there are Real Scientists who clutch their temples and cringe when they see “chimp study shows that gamers get more violent after playing violent games!” but they are too busy writing their next research grant to actually contact the writer at the website and tell them to correct the article.

A wolf is given bacon treats at Wolf Park. [indy]

The problem I am raising is basic epistemology – the branch of philosophy that deals with knowledge, and how we know when we know things. The scientific method is an epistemological tool that is used to establish knowledge by varying cause and effect (experiment) in order to demonstrate the relationships between causes and effects. Elsewhere I have referred to results in experiments as “generalizing,” which is really a sloppy way to put it – what’s really going on is that we’re trying to establish cause and effect so we can predict what is going to happen, using induction. If one wolf likes bacon, that’s interesting, but if every wolf we ever test likes bacon, we can say we have learned something about wolves. When we say a creature “is an omnivore” that is also a way of predicting by induction: if you tell me a cat is an “omnivore” that means I can fairly confidently say it will like bacon.

Where it gets interesting is when the results are not perfect (which is most of the time) – what if our experiments with bacon and wolves allowed us to determine that 65% of wolves like bacon? Then we start subjecting our results to typical epistemological challenges: is bacon-eating a learned behavior, or is it inherent in “omnivore”? Because “omnivore” does not work with “65% of wolves like bacon” – “omnivore” means “can eat anything,” not “will always eat anything” – but why do we even have a word like “omnivore” unless we are trying to predict and understand behaviors? If we don’t try to establish knowledge by induction, we’re left in some sort of Pyrrhonian skeptical hell in which we can only discuss the appearances of things right now – i.e.: “I can’t speculate about wolves in general, but that wolf right there seems to like bacon at this particular moment. It may not at any other time. In fact I can’t speculate as to whether or not the wolf just choked that bacon down because it thought it was doing you a favor.” I’m being silly; the point is that we can’t speculate, we can only observe, which traps us in the here and now. The entire value of the scientific project is mapping cause and effect and generalizing knowledge by induction – if we have to repeat every wolf-and-bacon experiment, every time, we can’t claim to know anything about wolves except that we are serving them a lot of bacon. When we start talking about p-values, what we are doing is acknowledging that we are dealing with a probability that one of our induction rules is true – we want to say that if our experiment showed 65% of the wolves like bacon, then for any given wolf, there’s a similar chance. We could use that rule to build a Monte Carlo simulation of the disappearance rate of a hypothetical bacon supply around a hypothetical pack of wolves.
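To make that last idea concrete, here is a minimal sketch (in Python) of the kind of Monte Carlo simulation I mean. Every number in it – pack size, bacon supply, the 65% bacon-liking rate, consumption per wolf – is invented purely for illustration:

```python
import random

def simulate_pack(n_wolves=8, bacon_strips=100, p_likes_bacon=0.65,
                  strips_per_wolf_per_day=2):
    """Simulate one hypothetical pack; return days until the bacon supply is gone."""
    # Apply the induction rule probabilistically: each wolf independently
    # has a 65% chance of being a bacon-eater.
    eaters = sum(1 for _ in range(n_wolves) if random.random() < p_likes_bacon)
    if eaters == 0:
        return None  # a (rare) pack of abstainers never finishes the bacon
    days = 0
    while bacon_strips > 0:
        bacon_strips -= eaters * strips_per_wolf_per_day
        days += 1
    return days

def average_days(trials=10000):
    """Monte Carlo estimate of how long the bacon supply lasts, on average."""
    results = [simulate_pack() for _ in range(trials)]
    finished = [d for d in results if d is not None]
    return sum(finished) / len(finished)

if __name__ == "__main__":
    print("average days until the bacon is gone:", round(average_days(), 2))
```

Run enough simulated packs and you get a distribution of “days until the bacon is gone” – which is exactly the kind of prediction an induction rule like “65% of wolves like bacon” is good for, and nothing more.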

I’ll go a step further and say that what I outlined above is, I believe, a simple version of the scientific method and empiricism. If I’m wrong about that, I’m sure someone will correct me. But if I’m substantially right, then I think my point about popular psychology ought to stand pretty well: if we start extrapolating too far from our results, we are committing scientific malpractice – we are poisoning civilization’s foundations of knowledge. So if the website ran an article saying “Study shows there is no ‘vegan’ wolf” I’d expect that to induce cringes in the skeptical community, and emails to the editor reading, “no, the study showed that 65% of wolves in a certain circumstance ate bacon. No less, and certainly no more.”

If you’re still with me, you can now see how a skeptic might immediately have a problem with IQ testing, or damn near any other psychometric test, if it’s used to generalize results across individuals. Imagine if someone gives me a test intended to detect suicidal ideation, and I score 65% on it. What does that mean? First off, it does not mean that “all people named Marcus are 65% suicidal.” It does not even mean “this particular Marcus is 65% suicidal” – or, if it does, so what? It’s not as if I am going to roll percentile dice and say “oops, ’00,’ I’m outta here.” The test is, however, useful as a diagnostic tool for comparing how this Marcus scored today versus last year. I’d say it is entirely legitimate that a medical practitioner might give me such a test for comparative purposes, so they could learn something about what I report over time. It is entirely legitimate, for example, for a psychiatrist to suggest anti-depressants if I score in the low 20%s for 5 years and suddenly spike up to 98%. We understand that all these things do not guarantee a fixed outcome – that’s why the memeosphere is full of stories about “98 year-old mine-working grandma smokes a pack a day and washes it down with Jack Daniels, says she wants to run for president in 2030.” On the other hand, as a diagnostic tool, we need to be able to generalize some results in order to understand the world and make good recommendations: if your car’s motor stops extremely suddenly and oil starts pouring out of the oil-pan, you have almost certainly thrown a rod and you should not expect that motor to run any more.

When scientists start doing experiments, it immediately raises an alarm for me if their experiment doesn’t have a clear data-point that they can confirm/disconfirm. I also get suspicious when I see a result that looks sort of obvious being presented as a big discovery. If someone actually did an experiment to see what percentage of wolves like bacon, I’d kind of think they just enjoyed making wolves happy, or were doing the experiment as an excuse to hang out and dance with wolves, or something. I also get suspicious of psychometric tools that are not diagnostic: most specifically IQ tests. [*] That’s a long-form answer to voyager’s question on my earlier posting, “what do you think of MMPI?” – as a diagnostic tool it’s OK, but if it’s being used comparatively across individuals then I think it’s a party trick. Since it’s all self-reported data, why not just ask, “do you feel less suicidal than you did last time we talked?” Forcing a patient to answer a bunch of questions doesn’t make the test more accurate; it makes it longer.

Here, I am specifically thinking about this sort of thing [psych] –

Two recent studies by Harvard psychologists deliver promising data from 2 tests that may help clinicians predict suicidal behavior. The markers in these new tests involve a patient’s attention to suicide-related stimuli and the measure of association with death or suicide. In the first study, lead investigator Matthew K. Nock, PhD and colleagues adapted the Stroop test and measured the speed at which subjects identified the color of words appearing on a computer screen. It was found that suicidal persons focused more on suicide-related words than neutral words. Suicide Stroop scores predicted 6-month follow-up suicide attempts well over traditionally accepted risk factors such as clinicians’ insight into the likelihood of a patient to attempt suicide, history of suicide attempts, or patient-reporting methods.

I would like to see an effectiveness comparison between those tests and just asking the patient how they feel that day. Maybe these tests are more effective – and if they are, I’d expect that to be easily shown. There are huge problems with any test where a subject is reporting their own assessment of their inner states. It’s one thing to ask, “do you feel more depressed than you did last week?” and another to ask, “do you feel more depressed than Fred, over there?” If you have a personality inventory, that’s the epistemological problem it is undertaking: it is trying to put everyone in the room on a common scale for depression, or IQ, or whatever. I understand why people would want to try to do that, but it’s an absurdly hard problem, and when I see someone trying, I immediately suspect their motives or I wonder if they have studied the scientific method at all. [**]
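For the record, the comparison I have in mind isn’t exotic. Here is a minimal sketch, in Python, with entirely made-up numbers, of what “show that the test beats just asking” looks like: score both screening methods against the same follow-up outcomes and compare. The scores, outcomes, and threshold below are fabricated for illustration only; the real studies have their own measures and analyses.

```python
def fraction_correct(scores, outcomes, threshold=0.5):
    """Fraction of patients whose follow-up outcome matches the screen's prediction."""
    correct = sum((score >= threshold) == bool(outcome)
                  for score, outcome in zip(scores, outcomes))
    return correct / len(outcomes)

# Entirely made-up follow-up data: 1 = a later attempt, 0 = none.
outcomes    = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
test_scores = [0.2, 0.1, 0.8, 0.3, 0.7, 0.2, 0.4, 0.9, 0.1, 0.3]  # hypothetical test result
just_asking = [0.1, 0.2, 0.4, 0.3, 0.9, 0.1, 0.2, 0.5, 0.2, 0.1]  # "how do you feel today?"

print("test, fraction correct:       ", fraction_correct(test_scores, outcomes))
print("just asking, fraction correct:", fraction_correct(just_asking, outcomes))
```

If the test’s predictions line up with the outcomes better than the plain question’s do, great – that is the kind of result I would expect to be easy to show.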

The nature/nurture problem rears its head in all of this, because when we’re trying to measure something about a person, we have to deal with the question of whether we are measuring some attribute of the person, or some attribute of the society they grew up in. Someone might score radically differently on a test depending on whether they were raised with access to certain kinds of training, or not. Simply knowing a few test-taking techniques can shift a test-taker’s IQ score, which means that the test is measuring (to some degree) the subject’s education and (to some degree) something innate about the subject. That “to some degree” means, in my opinion, that the entire enterprise ought to be eyed with suspicion – especially when psychometrics are used to make decisions that will affect a subject’s life.

I tried to phrase that very carefully, and I hope I succeeded. There is a big difference, in my mind, between when society tries to use survey techniques to make decisions about people, and when a subject participates willingly with a medical professional to try to diagnose themselves so they can influence their own outcomes. If a psychiatrist uses a subjective depression test to help a patient assess whether they are depressed, that’s fine – it’s part of the tools of the field – but what happens if that same depression test is used to shunt students into a “special school for depressed kids” that reduces their opportunity in the world? Conversely, it may improve their opportunity – but all of that needs to be under the subject’s control, because society has a terrible history of hiding racism, sexism, and xenophobia behind these things. In fact that is happening today, with IQ tests.

------ divider ------

I’ll do Part 2 if this posting survives the criticism and shredding I expect it to get.

[*] “I also get suspicious of psychometric tools that are not diagnostic: most specifically IQ tests.” Usually someone chimes in and says “but that’s not how they are supposed to be used! They are a good tool for measuring cognitive decline, for example!” – that’s true. But they are also being used to determine which high schools kids are allowed to attend, which has very serious consequences for their lives, and the social tropes embedded in an IQ test are a serious problem in that case. Look, I understand that a screwdriver is not a chisel, but you need to acknowledge that IQ tests are being used wrong all the time, in ways that continue to harm people.

[**] Richard Feynman’s portrayal of psychology as a cargo cult, and his take-down of a certain mouse-in-maze-running experiment, have had a great influence on me, although I didn’t read it until about 1991, and I had already gotten my psych degree and rejected the field by 1986. [feyn]

Regarding MMPI: I would say it’d be interesting if my psychiatrist told me, “remember that MMPI you took 15 years ago? Between then and now, you are scoring a lot higher on psychopathic deviate,” as I ate his brains with some Chianti and fava beans.

Comments

  1. Ieva Skrebele says

    Several times, when I’ve attacked psychology, people have attempted to draw a line between “pop psychology” and “science psychology” so let’s just assume that line exists and that we can mostly accurately distinguish pop psychology from science psychology. In other words, I am willing to imagine that there are Real Scientists who clutch their temples and cringe when they see “chimp study shows that gamers get more violent after playing violent games!” but they are too busy writing their next research grant to actually contact the writer at the website and tell them to correct the article.

    I noticed this pattern already some time ago while reading your blog posts where you criticize psychology, and the comments say, again and again, that “this is just pop psychology, the real deal is a lot better.” This defense of “real psychology” bothers me. I understand that reporters don’t always understand what they are writing about; they also seek sensational news. Thus they misrepresent the findings of some study. I believe that in such situations it’s the researcher’s duty to contact the journalist and demand they fix or retract their news article. A failure to do so has very harmful consequences. Let’s assume some researcher does a study about the differences between men and women. Their findings get misrepresented by the press. Afterwards there are very real negative consequences—some men and women are denied jobs because potential employers assume them to be incompetent due to their gender. It can also be much more subtle—due to the placebo effect a girl may get low scores in a school math test or some boy might be discouraged from pursuing a stereotypically feminine career. The harm is very real and prevalent, therefore “real researchers” cannot just ignore and dismiss the existence of pop psychology.

    The very fact that a person like me, who doesn’t have a degree in psychology, believes that the crap I have read in media articles (the pop psychology) represents what real psychology is means that real psychologists have failed to communicate with the press. For a comparison, when some snake oil salesman comes up with a theory that water has memory, real chemists crack down on these claims and debunk them as pseudoscience and fraud. Thus no reputable newspaper publishes claims that water has memory. I believe that psychologists (if they want the public to see them as a legitimate science) have a duty to do the same and debunk pop psychology.

  2. dashdsrdash says

    Thesis: most people like learning.

    Antithesis: many people are not good at critically evaluating evidence, hypotheses, or falsifiability.

    Synthesis: science and pop science are treated as the same thing by a large segment of the population.

  3. kestrel says

    I so agree that first, not all animals are the same – even within the same species, there are a great many individual differences. And second, yes, it does not therefore follow that “mice do X, therefore humans will do X”. Nor does it work the other way. As an example of that, people don’t like chickens to be raised in confinement; they think, well, **I** would not like to be raised in confinement, therefore, chickens will not like being raised in confinement. But it turns out chickens have agoraphobia, and don’t like big open spaces. https://www.nature.com/news/2003/030807/full/news030804-7.html I like what the researcher says at the end: “It’s our fault for not thinking of it from the bird’s point of view.” (That doesn’t mean we should raise them in confinement; it means they have different needs than we do, and we need to take that into account when working with chickens.)

  4. lochaber says

    This kinda reminds me of the current mess regarding computer algorithms. People insisting that they are unbiased and objective, because computers can’t be racist/sexist, etc.
    One of the criticisms I’ve heard about psychology is that a majority of the experiments use college students, which is a small part of the population, and not necessarily representative of non-college students.
    My experience with psychology is pretty much limited to me dropping out of an intro course, so I don’t have much perspective on the field itself.

  5. Enkidum says

    Here to shred, I guess. This has been a pretty awful series of Gish Gallop posts. It’s very disappointing (and has been on previous occasions) when I read someone whose writing I mostly respect sounding exactly like some of the more coherent creationists who used to pop up on PZs comment section.

    There are a good twenty or thirty points that you’ve made in this and the previous two posts on the subject, some loosely-connected, some almost entirely unrelated to each other. It’s very hard to provide a single criticism because you’re jumping all over the damn place. I’ll pick one, super simple issue:

    I would like to see an effectiveness comparison between those tests and just asking the patient how they feel that day. Maybe these tests are more effective – and if they are, I’d expect that to be easily shown.

    The previous sentence you quoted reads:

    Suicide Stroop scores predicted 6-month follow-up suicide attempts well over traditionally accepted risk factors such as clinicians’ insight into the likelihood of a patient to attempt suicide, history of suicide attempts, or patient-reporting methods.

    I have no idea of the validity of this study, and neither do you, because as with seemingly every study you comment on here, you don’t appear to have actually read it, which, seriously, jesus fucking christ that’s lazy. However, what do you think a “patient-reporting method” is? Hint: it includes things like asking the patient how they feel that day.

    Really, before you respond, please take the time to actually consider this. Can you think of a single “effectiveness comparison” of the type that you suggest that wouldn’t be a variant of what they’ve already done? Are you sure they haven’t already done it? If not, don’t you think you’re being more than a little irresponsible here? You have something of a presence on the internet, with quite a few people who take you very seriously indeed. You should do better than this.

    […]

    I understand why people would want to try to do that, but it’s an absurdly hard problem and when I see someone trying, I immediately suspect their motives or I wonder if they have studied the scientific method at all.

    Is it hard, or is it impossible? Most of what you’ve said suggests the latter, but you seem to be walking that back quite a bit in these posts. Science is hard. Particularly when, as with the cognitive sciences in general, the science is still figuring out its conceptual underpinnings. Sure, there’s plenty of handwavium in the field. But there was a lot of gibberish being written in biology a hundred years ago and in physics and chemistry two hundred years ago. (Still is, for that matter.)

    Human minds (and other minds) are absurdly complex, arguably the most complicated thing we have ever studied. So there’s going to be a lot of mistakes on the way. It’s a really, really hard problem, which has to be attacked from a lot of different angles. One of those angles is going to be the systematic study of behaviour in experimentally-controlled settings – you know, psychology.

    […]

    I’m definitely biased here. I’m a working research psychologist, who does some cross-species stuff (but mostly restricted to humans). So yeah, these posts are extra annoying. But I promise you it’s not because I’m terrified that you’ve figured out our scam, or that I’m having some uncomfortable realizations about my field.

    There are some interesting criticisms of psychometric testing and cross-species translational research out there. This was not one of them.

  6. Dentist Sock says

    This is a good article that gave me a lot to think about, but I have to criticise one line.

    Since it’s all self-reported data, why not just ask, “do you feel less suicidal than you did last time we talked?”

    Depression/anxiety impede your recall of past emotional states. When I was depressed, I knew in theory that I hadn’t always felt that way, but I couldn’t call to mind any past emotional states. There are also proto-suicidal thoughts that the patient may not recognise as red flags, but a longer survey could tease out (e.g. my family are better off without me). In other words, some self-reported statements are more useful than others.

  7. jrkrideau says

    @ 1 Ieva Skrebele

    Thus they misrepresent the findings of some study. I believe that in such situations it’s the researcher’s duty to contact the journalist and demand they fix or retract their news article.

    Best of luck, lots of scientists have tried without much success. It is not easy to get an amendment or retraction. Plus, research seems to say that just printing or verbally retracting something, say in a TV or radio program, is seldom very successful, for any number of reasons.

    I believe that psychologists (if they want the public to see them as a legitimate science) have a duty to do the same and debunk pop psychology.

    See above. It often can be much, much harder than one would wish. People are still using that damned Myers–Briggs Type Indicator.

    Believe it or not, psychologists, among other disciplines, have been studying the problem and how to attack it, with very limited success. That loud knocking sound is psychologists and other scientists—climate scientists in particular—pounding their heads against the wall.

    I am not sure, but I think the “water has memory” dust-up was partly because the thesis was being advanced by Luc Montagnier, a recipient of the Nobel Prize in Physiology or Medicine. This type of thing is often referred to as going Nobel or, more generally, as going emeritus.

    I suspect that having a Nobel winner advancing such an idea was attention-getting and embarrassing enough that a lot of scientists in relevant fields reacted like scalded cats.

  8. jrkrideau says

    The information about the Stroop test and suicides comes from Harvard’s PR department. The first rule of thumb in reading about scientific research is never to trust press releases or newspapers to get it right. Their record for accuracy – excluding a very small number of reporters – is very spotty. Essentially, one must read the original paper.

    I would like to see an effectiveness comparison between those tests and just asking the patient how they feel that day.

    I think you may have missed the outcome measurement here, or I am misunderstanding your point:

    Suicide Stroop scores predicted 6-month follow-up suicide attempts well over traditionally accepted risk factors

    Having done a quick Google search to remind myself what a Stroop test (and the underlying Stroop effect) was, I have a problem believing the researchers’ results, but I have not read the paper(s).

    That “to some degree” means, in my opinion, that the entire enterprise ought to be eyed with suspicion – especially when psychometrics are used to make decisions that will affect a subject’s life.

    Ah, do you really think that properly trained psychologists are not aware of these problems, despite some of these problems and issues having been subjects of research for the last 50 or 75 years? As I have mentioned before, the real problem is almost always when some lay person who knows nothing about how any test works uses it for completely inappropriate purposes, jiggers around with the scoring, or completely misunderstands what the results mean in a specific situation.

    The Flynn Effect is a good example of this. Ever since Flynn identified the phenomenon, I think psychologists have grasped the issue very easily. It is the layperson user of I.Q. tests who lacks the background knowledge that would let them understand the issue, if they have even heard of the Flynn Effect. Understanding it would tell them that they may be badly misusing test results, and lives can depend on these results.

    While not a psychometric test, the value-added analysis that is used to evaluate many teachers in the USA is a case in point. Whatever blithering idiots came up with the idea had no idea what they were doing. The theory is amazingly stupid and the measurement is crazy. Still they seem to have managed to sell it to any number of credulous school boards and state education authorities. Currently teachers are losing their jobs based on totally bogus test scores.

    BTW where did you get the idea that I.Q. tests are not diagnostic? They often are used as one component of a diagnostic package.

    Re Feynman Physicists

  9. says

    Re: Dentist Sock @6

    Depression/anxiety impede your recall of past emotional states.

    QFT. One big problem with diagnoses and tracking in psych* fields is that a lot of mental conditions – even before we consider the hardcore ‘can’t coherently communicate’ types – interfere with the process, and can make the self-reporting of even cooperative, motivated, honest patients unreliable.

  10. says

    … which is to say… there is a weird Orwellian “We have always been at war with Eastasia” quality to depression… it always feels like it’s been forever, and will always be forever…

  11. Ieva Skrebele says

    jrkrideau @#9

    Once a story is out there it is amazingly hard to kill it.
    The Procter & Gamble + Satan story is a good example in the extreme.

    I understand the issues with the press and uneducated journalists. My main problem is that I have seen university professors teaching pop psychology to their students. About 6 years ago, while studying for my bachelor’s degree in linguistics, I took a course that was introduction to psychology. My professor had a doctor’s degree in this field, yet she taught us borderline pop psychology. About 3 years ago, while studying for my master’s degree in linguistics, I took a course about literary analysis and my professor (her degree was in literature) taught us about Freud’s Oedipus complex with a straight face. Why is it so hard to pass around a memo among university professors that they ought to stop teaching pop psychology to their students? What I know about pop psychology I didn’t learn from newspapers, I learned it in universities.* And, in my opinion, that shouldn’t be happening. Alright, I understand that my introduction to psychology course was aimed at students who studied other subjects and this course was only a single semester with a single lecture per week, I understand that you cannot include anything advanced in such a course. But you shouldn’t include pop psychology bullshit either.


    * I studied in University of Latvia as well as in Johannes Gutenberg University Mainz (that’s in Germany). I cannot make any claims about what’s happening in other universities.

  12. voyager says

    A thoughtful reply, thanks. I agree with just about all of that, especially the discussion around test results being generalized across populations as tools of economic and social oppression.
    As well, I agree that personality testing can be useful for an individual if repeated over time, but I also think that a single test has merit.

  13. Enkidum says

    Marcus @10: No worries, I saw from your other posts you’ve got real life things going on. Heat is more important than arguing on the internet.

  14. jrkrideau says

    @ 13 Ieva Skrebele
    I took a course that was introduction to psychology. My professor had a doctor’s degree in this field, yet she taught us borderline pop psychology.

    I wish I could say I am surprised. There seem to be some places that are still heavily psychoanalytic or otherwise nuts. One does not see a lot of them in the English-speaking world, though our distinguished Jordan Peterson from the University of Toronto is an outstanding example of the bat-guano crazy.

    I have read reports suggesting that France and Argentina are heavily psychoanalytic/Freudian in some of the clinical areas. It is unlikely to have any effect on, say, studies in perception and cognition or developmental language diagnostics.

    About 3 years ago, while studying for my master’s degree in linguistics, I took a course about literary analysis and my professor (her degree was in literature) taught us about Freud’s Oedipus complex with a straight face.

    Damn lay people. She probably knows just about zero about psychology as a whole and is teaching what she was taught 20 or 30 years ago by a linguist who learned it 40 or fifty years ago.

    Why is it so hard to pass around a memo among university professors that they ought to stop teaching pop psychology to their students?

    Because the professors “obviously” know better, particularly the older ones? It is particularly hard to break the bad habit if the nonsense they are spouting is not in their area of expertise. And it would be a lot of work to revamp the course.

    As Max Planck apparently said, “A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it”, usually paraphrased as “Science advances one funeral at a time”.

    I understand that first-year economics courses in many universities still teach utility theory, which was shown to be about as nonsensical as Freudian theory years ago.

    Many professors have bought into pop culture or the prevailing myths and it can be maddeningly difficult to get them to look at the current state of knowledge. When they do, they seem to ignore it.

    I remember a few years ago talking to a Ph.D. candidate in history who was going to use a Freudian framework for her dissertation. I was a bit horrified and mentioned that Freudian theory was completely discredited. She said she knew that! Arrgh!

    I follow a couple of history of science blogs and the blog owners often seem to be tearing their hair out when they hear or read a scientist discussing the history of science. The Galileo Gambit story about the reactionary Church persecuting Galileo [1] and Christopher Columbus proving the world was round are two of the really bad ones.

    [1] By our standards the Church did persecute him a little bit but not for the reasons that are normally stated.

  15. Ieva Skrebele says

    jrkrideau@#16

    My current opinion about psychology as a science is pretty low. I’m mostly blaming university professors for this, rather than journalists.

    That being said, I’m willing to believe when you and Enkidum say that there is a vast difference between pop psychology and real psychology and that the real deal is a lot better. I might as well try giving psychology another chance. Do you have any book recommendations for something that would cover the basics? Topics I’m particularly interested in would be:
    -current state of psychology as a science, what topics are being researched, what methods are used for this research;
    -what precautions researchers implement in order to ensure that their experiments and especially the conclusions drawn from those are sound;
    -where exactly the line between psychology as a science and pop psychology is drawn, what is considered the former and what is seen as the latter;
    -which old theories are now discredited.
    I assume there ought to be some psychology textbooks aimed at first-year students that basically say “everything you heard about psychology from journalists is wrong, here’s what you will be learning instead.” Do you have any recommendations for something like this?

    About 3 years ago, while studying for my master’s degree in linguistics, I took a course about literary analysis and my professor (her degree was in literature) taught us about Freud’s Oedipus complex with a straight face.

    Damn lay people. She probably knows just about zero about psychology as a whole and is teaching what she was taught 20 or 30 years ago by a linguist who learned it 40 or fifty years ago.

    Actually no, not this time. In some ways, it was even worse. My professor was quite young, probably in her thirties. During that lecture, when she started talking about Freud’s theory, I immediately raised my hand and started asking questions that disrupted the lesson. “Are you aware that modern psychology distances itself from Freud’s theories, which are discredited by now?” “Do you know that Freud simply invented theories out of thin air with no evidence whatsoever?” “Where is the proof that a significant portion of children really experience the Oedipus complex?” Since other students in the room said that Freud’s theory sounded plausible, I ended up asking, “Am I really the only person in this room who doesn’t want to have sex with one of my parents?” At this point my literature professor finally admitted that she herself doesn’t believe in Freud’s theories and she is also aware that these theories are subject to lots of criticism. She didn’t make the program for this literary analysis course. Another professor had created the course, and now she simply had to teach it the way that other professor had designed it. At this point I finally shut up and stopped disrupting the lesson. Honesty is one thing I can appreciate in professors, and there was no point demanding that my professor justify a theory she herself didn’t even believe in. As you can see, in this case even the retirement of an old professor wasn’t enough to get Freud out of university lectures.

  16. says

    Enkidum@#5:
    I’m going to respond to your comment on two levels – first, I need to make some meta-commentary about honesty, then I’ll dig into the details of your observations on my posting.

    Here to shred, I guess. This has been a pretty awful series of Gish Gallop posts. It’s very disappointing (and has been on previous occasions) when I read someone whose writing I mostly respect sounding exactly like some of the more coherent creationists who used to pop up on PZs comment section.
    and
    because as with seemingly every study you comment on here, you don’t appear to have actually read it, which, seriously, jesus fucking christ that’s lazy
    or
    you seem to be walking that back quite a bit in these posts

    A “gish gallop” is a dishonest debating tactic, in which the person using the tactic knowingly attempts to manipulate the audience by over-extending the honest person’s ability to refute their points. Implying that I’m doing a “gish gallop,” or a series of them, is an accusation of intellectual dishonesty. “Walking back” a prior posting is also an accusation of intellectual dishonesty – in honest discussion, when you make a mistake you acknowledge it, stop putting forward that view, and adjust your beliefs. So, that’s two accusations or implications that you feel I am not being honest with you and the commentariat here; that I am being manipulative – which would imply that I am serving some agenda. You further try to imply I am serving some agenda or ideology by likening me to “some of the more coherent creationists.” And, of course, implying that I haven’t done my research (“you don’t appear to have actually read it”) is also an accusation of dishonesty: that I am cherry-picking sections from studies.

    Why would I do that? For one thing, I hope you’ve noticed that I have never made any recommendation for any behavior at all – I don’t advocate anything except for (in the broad sense) skepticism about IQ tests and pop psychology. As far as “walking back” any claims, I’ve been more careful to delineate “pop psychology” from psychology (and of course psychiatry) in general. If I were trying to manipulate people’s opinions with a “gish gallop” or quote-mining, I’d be doing it in service of some kind of agenda.

    My interest in this stuff is a result of my experience while I was getting my BA in psychology. I was particularly interested in the sections on testing methodologies and I still am. It’s still a trained response in me that, when I see a study, I look for self-selected samples, self-reporting, and unrepresentative samples. Perhaps I see that sort of thing because that’s what I’m looking for – but I see it all over the place. It’s odd that you accuse me of being scattershot and not reading studies carefully enough when what I am doing is homing in on the parts of the study that make me suspect a methodological problem. By the way, if anyone has an intellectual honesty problem, it’s the people publishing the studies, not me. Please don’t shoot – I’m only the messenger.

    Where I have “walked back” my opinion is that I’m being more careful not to dismiss all of psychology as bullshit because, in fact, that’s not fair. We all know that; if you want to accuse me of hyperbole, go ahead. In the meantime you should probably see my postings as aligned somewhat with the great big cloud of articles in the last couple years about psychology’s WEIRD problem, the replication crisis, etc. I’m sure you’ve noticed them, they’re pretty hard to miss. I don’t consider my postings about psychology to be any more or less provocative (or inaccurate!) than those; perhaps my writing’s not as good. In fact, I’m certain it’s not. Oh, and, by the way, when you see those other articles about psychology and its WEIRD problem or the replication crisis, they are also jumping around among studies and pointing at the parts that are broken – not surveying the field as a whole. As you know it’s a huge field and there’s plenty of good work being done in it; I don’t know how many times I have said that. What I’m pointing out are the naughty bits, just like I think a proper skeptic ought to. It’s not my fault that there are plenty of them. I don’t have the time or ability to fully paint in what a great big shitshow popular psychology is (see, there I differentiate it from “real psychology”) or where the good work is being done. My interest is on the boundary-zone between the two and where the one bleeds into the other. I think a lot of my problem, here, is that I spent too much time studying philosophy (specifically epistemology) and the scientific method, and it’s easy for me to get lost and uncomfortable in that boundary-zone. That’s why I went to the trouble to try to sketch out the knowledge problem in experiments: how do we know what someone is experiencing?

    I think it’s quite legitimate to voice concern with psychology’s epistemology, when it relies so heavily on self-reported checkbox inventories. If you’re a practicing psychologist, I expect you to be, as well. A huge amount depends on that.

    Anyhow, I find it funny that I feel like I’m trying to point out where psychology maybe has some honesty problems, and I get accused of intellectual dishonesty. Perhaps someday I will graduate to being merely wrong, in your eyes.

    Now, let me move to the substance of your complaint:
    Particularly when, as with the cognitive sciences in general, the science is still figuring out its conceptual underpinnings

    That’s part of the issue I am exploring. I agree with you that the science is still figuring out its underpinnings. In many respects, it’s my view that when psychology looks for its underpinnings it discovers that they’re not there. That’s awkward – for psychology, not me. I assume you’re aware that many of psychology’s famous studies, upon which others are built, have failed to replicate. Some of those studies aren’t just well-publicized, they are foundational. For example, “ego depletion” appears to not be real. That’s a problem for every subsequent study that assumes “ego depletion” is a reality. That’s how science works – you don’t get to say “OK we discovered phlogiston does not exist but we’re not retracting any subsequent papers about phlogiston because they’re very popular with the public.” I agree with you that “figuring out its conceptual underpinnings” is a serious mission for psychology and, as I’ve said elsewhere, the field would have moved forward better if it had been able to ditch Freud, Jung, Maslow, Pavlov, and all those other hacks a lot sooner. What you mean by establishing those conceptual underpinnings is, really, turning psychology into a science. It would have been easier to do that had psychologists spent less time playing at being scientists, from the 1900s to the 1980s, and had started being more rigorous about teaching experimental methods rather than teaching students how to write surveys and give them to other college undergraduates. That’s not my problem, though. Nor is it any of ours. We don’t owe the field a fair hearing – it has done a lot of damage with its half-baked ideas getting popularized and never withdrawn. I don’t think it’s appropriate to attack people for pointing that sort of thing out; critiques of psychology’s scientific basis have to sting right now, but the way to deal with that is to move psychology to a sound scientific epistemology.

    That’s happening, I know. That’s why most of the problems I write about are from the heyday of pop psychology. I oughtn’t need to put disclaimers around that, but I do.

    I have no idea of the validity of this study, and neither do you, because as with seemingly every study you comment on here, you don’t appear to have actually read it, which, seriously, jesus fucking christ that’s lazy. However, what do you think a “patient-reporting method” is? Hint: it includes things like asking the patient how they feel that day.

    So you did read it? I didn’t, because I’m familiar with the Stroop test and know that it has been replicated successfully many times. Jezus fucking christ, that was lazy of me; I was able to understand the results from just the reporting on the study. I was not challenging the study, anyway; the point I was trying to make is that I think it’d be an interesting question to see if it was more effective than just asking. You appear to have completely missed the point and started playing my ball when you say “it includes things like asking the patient how they feel that day.” Then how do you know which works, the test, or just asking? Or is it a bit of both? If “just asking” works fine almost all of the time, then why do we need the test? Or, if it doesn’t, then don’t ask and make all patients take a survey/test. It’s a minor point, but thank you for illustrating the problem I was raising so well.

    Can you think of a single “effectiveness comparison” of the type that you suggest that wouldn’t be a variant of what they’ve already done?

    Right! And if what they’ve already done is working, then definitely look for incremental improvements. First, establish that it’s working. How do you do that? In evidence-based medicine you can tell how effective an intervention is because there are effectiveness measures based on outcomes. Of course they are variants on what’s always been done. I’m not complaining about that. Being able to say “we used to just ask our patients ‘how do you feel?’ but we discovered that it’s not actually a very good indicator 90% of the time, so we recommend this test/survey instead” – that’d be great. But not “here’s a test some doctors use and others just ask ‘how do you feel?'”

    I think that all fits under the rubric of what you were talking about earlier, that it’s a field which is trying to establish its conceptual underpinnings.

    Human minds (and other minds) are absurdly complex, arguably the most complicated thing we have ever studied. So there’s going to be a lot of mistakes on the way. It’s a really, really hard problem, which has to be attacked from a lot of different angles. One of those angles is going to be the systematic study of behaviour in experimentally-controlled settings – you know, psychology.

    We violently agree! And doing that using good science to put psychology on a sound epistemological basis is the starting-point. It’s unfortunate that half-baked ideas have been promoted as “pop psychology” and a great deal of time and effort has been wasted. That’s what I am criticizing.

  17. says

    dashdsrdash@#2:
    Thesis: most people like learning.
    Antithesis: many people are not good at critically evaluating evidence, hypotheses, or falsifiability.
    Synthesis: science and pop science are treated as the same thing by a large segment of the population.

    I think you’re right, and that’s the problem. How do we distinguish pop science from science? It’s an epistemological problem, and the scientific method’s default response to epistemological challenges is: evidence.

    As a civilization we have a problem where someone dumps a hypothesis to the media as though it were a fact (“putting jade eggs in one’s vagina promotes health!”) and suddenly the idea gets promoted into the memeosphere and people begin acting on it. When scientists try to explain “no, that’s a bad idea!” they’ve got to uproot an established lie which some people are personally and financially vested in. (Can I say “sunk cost fallacy” and “confirmation bias” or is that pseudoscience too?)

    As much as everyone with a clue seems to agree “people’s life-outcomes should not be affected by a Myers-Briggs Type Indicator,” there are people, right now, being given those surveys by Human Resources staff. When I was a kid I was given a similar test and it said I’d make a good auto mechanic. That was true, of course, but I made a better computer programmer; the test did not have that option or I’m sure I’d have scored terribly in that department because I can barely add and a lot of people mistake programming for algorithms.

    Everyone with a clue seems to agree that IQ tests should not be used to determine what school a child gets into, yet exactly that is happening – and if you have any doubt that IQ testing is valid (you should!), then we are witnesses to a crime being perpetrated under cover of pop psychology.

  18. says

    jrkrideau@#9:
    Once a story is out there it is amazingly hard to kill it.

    Carl Sagan said something, when the whole Velikovsky phenomenon was big, about how it was much harder to stop a wrong idea once it had escaped.

    What do we do about this? I don’t know. It’s particularly egregious when someone deliberately profits from it – like the cigarette companies calling into doubt long-term outcome studies on smoking, or the AGW denialist program. Deliberate lies are the worst, but well-intentioned ignorance is damaging as well.

    And the Andrew Wakefield scandal, where he claimed that vaccines caused autism, not only cannot be killed, but his vile claims have killed a lot of children (and probably more than a few adults).

    Also, it has cost a tremendous amount of money by requiring additional infrastructure and legislation. It’s a lot harder to manage “parents can opt their children out of shots” than “everyone line up and you get a shot,” like they did for us in basic training. (And they didn’t tell us what was in the shot because, presumably, it contained experimental inoculations against classified bioweapons.)

  19. says

    Dentist Sock@#6:
    Since it’s all self-reported data, why not just ask, “do you feel less suicidal than you did last time we talked?”

    Depression/anxiety impede your recall of past emotional states. When I was depressed, I knew in theory that I hadn’t always felt that way, but I couldn’t call to mind any past emotional states. There are also proto-suicidal thoughts that the patient may not recognise as red flags, but a longer survey could tease out (e.g. my family are better off without me). In other words, some self-reported statements are more useful than others.

    That’s a perfect answer to my question, then. In some cases a survey is better than just asking. Then, I would say, don’t ask – everyone gets the survey/test, every time, always.

    In other fields of medicine, when it’s determined that a certain intervention really has benefit, it gets pushed pretty ubiquitously. I believe that in psychiatry that is finally starting to happen, and that’s a very very good thing. For one thing, if you push a particular intervention ubiquitously, then you can gather long-term outcomes! Imagine if every patient who talks to a psychiatrist or psychologist were taking the same test, first. Then it would be “easy” to see if the test were effective by mapping outcomes to test answers. But if we just have outcomes and don’t know who took the test, or the answers, we just have to fly by the seat of our pants.

    When I presented at the ER at Johns Hopkins Hospital with a broken jaw, the nurse yelled “head injury” and I was immediately in a small room having a basic neurological assessment done (pupil response, stuff like that) – out here in Clearfield, Pennsylvania, the night before, the hospital sent me home and never checked for anything except that my jaw was broken, and they gave me some painkillers. When I woke and there was a yellow smear of dried CSF on the pillow, I went to a real hospital. The point is: interventions are most effective if – once they are determined to be valuable – they are applied ubiquitously. On the other hand, if they are not, then don’t give everyone who walks in the door a cranial CT scan.

  20. says

    jrkrideau@#16:
    One does not see a lot of them in the English-speaking world though our distinguished Jordan Peterson from the University of Toronto is an outstanding example of the bat-guano crazy.

    Perfect example.
    I was choking down some video of him talking about IQ tests, and he’s doing exactly what you should not do: using IQ tests to justify the belief that there are inherent differences in “intelligence” (and that they are tied to race and gender, not education and experience).

    https://www.youtube.com/watch?v=IxQyroQLr1Q

    Is Psychology Today real psychology or pop psychology? I submit that most of the public might mistake it for being about the more scientific side of psychology, whether it is or not. Furthermore, it directs readers toward therapists and diagnoses, so it’s trying to be a trusted source. Yet it runs an article about IQ testing and promotes, within it, the notion that there is a general intelligence that is measured with the test: [pt]

    Are you a logical thinker? A numerical whiz? A verbal genius? Or are you spatially inclined? Are you looking for intellectual stimulation? Find out how smart you are (and increase your IQ) with our classical IQ test. This classical IQ test measures several factors of intelligence, namely logical reasoning, math skills, language abilities, spatial relations skills, knowledge retained and the ability to solve novel problems. (Please note that it doesn’t take into consideration emotional intelligence).

    Oh, there’s a thing called “emotional intelligence” now too? [Insert epistemological challenge here]

    Is that pop psychology or real psychology? Peterson’s got the credentials, and ought to know better, but there he is nattering on about Trump’s IQ exactly as though there is a baseline normal “intelligence” (whatever that is). People are going to believe that. People are going to believe what they read in Psychology Today because it says “Psychology” right there on the top!

    You and I have discussed IQ before and if I recall correctly we agree that IQ tests are a good tool for measuring cognitive decline in an individual such as an Alzheimer’s patient. That seems to make sense to me. Last time I checked, I seem to recall you also did not think that IQ tests are a useful comparative tool between individuals.

    I wish to continue to attack what I consider scientific fraud from people like Peterson, because it does real damage, both by poisoning the well of public knowledge, and by serving the agendas of racists and xenophobes. I’m not trying to hack down all of psychology – just the 50% that most of the public believe is real.

  21. Enkidum says

    Well now I owe you a thoughtful (and less grumpy) response, as you’ve been more than fair to me. Unfortunately anything substantive is going to have to wait as I’ve got a grant deadline tomorrow and am then spending the weekend in the woods. But thanks for your patience, and for not responding in kind.

    I think we have a lot to disagree about. I also think that at the end of the day, we agree on more than we disagree, and that it’s on me to delineate the boundaries.

    I will say one thing: I wasn’t accusing you of being deliberately dishonest, but I was (and honestly still am) accusing you of being lazy/sloppy in your thinking and articulation of your thought, to the point of it resulting in an ethical problem when you write about these issues, because the picture is so skewed. Admittedly, this is far less of an ethical issue than, say, lobotomizing unruly patients or using shitty tests to determine children’s futures, but it’s a real one nonetheless.

    I will also add: I am a notorious curmudgeon about these kinds of issues, and I certainly don’t think that the sins I’m accusing you of are mortal, or unique to you. I commit similar ones myself when talking about other things. And clearly, given my job, I’m biased as hell.

    Argh… I started writing about Pavlov, but I will resist the temptation for now. Here, last thing:

    You appear to have completely missed the point and started playing my ball, when you say “it includes things like asking the patient how they feel that day.” Then how do you know which works, the test, or just asking? Or is it a bit of both? If “just asking” works fine almost all of the time, then why do we need the test? Or, if it doesn’t, then don’t ask and make all patients take a survey/test. It’s a minor point but thank you for illustrating the problem I was raising so well.

    Again, this is essentially what the summary of the article claims the authors did.

    Now… in your defence I’m almost certain that you’d have legitimate objections to the way they did this, which would likely be a comparison with previously-reported effect sizes. There are issues with this. But this isn’t the point – the point is that what you asked for is something that is utterly commonplace in the field (indeed you’d never get a paper like this past peer review without addressing precisely this point), and something that is described in the summary of the article, in the sentence you quoted!

    I cannot see any way of interpreting what you wrote as anything other than a putative criticism of the field. But this is like criticizing the banking industry for not putting enough emphasis on taking other people’s money. It’s a central part of the whole enterprise.

    Does this not strike you as a problem, a real problem, with what you’ve written? If not I’m stumped – I don’t know how else to convey to you how very, very far from ok it is to do this, and how it very quickly causes those of us in the field to have serious issues with taking the rest of your writing seriously.
