Sometimes, scientists abuse these terms, too


It’s good to see popular abuse of scientific concepts called out in this article, 10 Scientific Ideas That Scientists Wish You Would Stop Misusing, but sometimes I’m more concerned about scientists who perpetuate the abuse. The one on statistical significance, for instance…the whole idea of setting up “p<0.05” as some kind of Holy Grail for publication has led to a lot of bad science.

I also very much liked Marlene Zuk’s comment on learned vs. innate:

One of my favorite [misuses] is the idea of behavior being "learned vs. innate" or any of the other nature-nurture versions of this. The first question I often get when I talk about a behavior is whether it’s "genetic" or not, which is a misunderstanding because ALL traits, all the time, are the result of input from the genes and input from the environment. Only a difference between traits, and not the trait itself, can be genetic or learned — like if you have identical twins reared in different environments and they do something different (like speak different languages), then that difference is learned. But speaking French or Italian or whatever isn’t totally learned in and of itself, because obviously one has to have a certain genetic background to be able to speak at all.

Comments

  1. colnago80 says

    The one on statistical significance, for instance…the whole idea of setting up “p<0.05” as some kind of Holy Grail for publication has led to a lot of bad science.

    I can’t speak as to biology but in the physics community, 5 standard deviations is now required for publication in reputable journals like the Physical Review (.05 is approximately 2 standard deviations).
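
    For anyone who wants to check that correspondence, here is a minimal sketch (Python with scipy, not from the comment itself; the 0.05 figure is conventionally quoted two-tailed, the 5 sigma “discovery” figure one-tailed):

    ```python
    from scipy.stats import norm

    # Two-sided p-value at the ~2 sigma level mentioned above
    print(2 * norm.sf(1.96))    # ~0.05

    # One-sided p-value usually quoted for a 5 sigma "discovery"
    print(norm.sf(5.0))         # ~2.9e-7

    # And back again: the sigma equivalent of a two-sided p = 0.05
    print(norm.isf(0.05 / 2))   # ~1.96
    ```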

  2. says

    My favorites are “natural” and “organic”. Mercury and radium are both completely natural; that doesn’t mean they’re any good for you. And botox, the most powerful toxin in the world, is both natural and organic.

  3. julial says

    Perhaps I take it too far, but I don’t even like the phrase “artificial selection”, or the idea that there is something unnatural about “design” when looking at, say, a pocket watch.
    I claim that all selection is natural selection, and that yes, eons of evolution by natural selection can produce a 747, because, as a product of humans who are themselves products of natural selection, it has.

  4. twas brillig (stevem) says

    in the physics community, 5 standard deviations is now required for publication in reputable journals like the Physical Review (.05 is approximately 2 standard deviations).

    Maybe so, good to know, BUT. Why then did the MSM report that the searchers for the Higgs boson detected something, but were waiting for a 3 sigma (standard deviation, in statistics-speak) signal to declare it found? 5 sigma seems rather extreme, but definitely Significant.

    We all know, from reading PZ’s blog, how common it is (for THEM) to fumble the use of the word “theory”. [too easy to comment].

    “Natural”/“Artificial” :: Those two’s meanings have wandered all over. My way of dealing with them is to ask: what is the difference between natural vitamin C and artificial vitamin C? Is ascorbic acid artificial? Got any natural Naugahyde? Who raises Naugas for their hyde?
    “Organic” :: Is this tomato organic or inorganic?
    So on and so on.
    I chuckle whenever I see science words misused (so frequently) as attempts to reassure the ‘cautious’ or impress the gullible.

  5. says

    Deciding whether you’ve demonstrated the existence of the Higgs boson is one thing, and it is not my area of expertise. However, most of the discussion of statistical significance in public discourse has to do in one way or another with human health — medical interventions, environmental or behavioral risks. Here, for the most part, the p value, at whatever level you want to set the threshold for significance, is really a poor guide to inference. This is a long and somewhat arcane story, but to put it in a nutshell: prior plausibility should have a powerful place in our thinking, as should the incorporation of results of multiple trials into our inference engine, along with convergence of various lines of evidence, study quality, and other factors. We don’t need to do a randomized controlled trial and compute a p value to know that you are better off jumping out of an airplane with a parachute than without one; whereas a trial showing that homeopathy is effective with p<.0001 is almost certainly a meaningless fluke.
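
    As a rough illustration of that prior-plausibility point, a back-of-the-envelope sketch in Python; the prior, power, and threshold numbers here are invented purely for illustration:

    ```python
    # Illustrative arithmetic only: P(effect is real | "significant" result),
    # treating alpha as the false-positive rate and power as the true-positive rate.
    def prob_effect_is_real(prior, alpha, power):
        true_pos = prior * power
        false_pos = (1 - prior) * alpha
        return true_pos / (true_pos + false_pos)

    # Parachutes: overwhelmingly plausible before any trial.
    print(prob_effect_is_real(prior=0.99, alpha=0.05, power=0.8))   # ~0.999

    # Homeopathy: essentially implausible beforehand, even with p < 0.0001.
    print(prob_effect_is_real(prior=1e-6, alpha=1e-4, power=0.8))   # ~0.008
    ```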

  6. mikeyb says

    p<0.05 is intended to be a minimum threshold. Even the p<0.05 depends to some extent on the statistical test you use. Of course you have to look at a whole host of other factors to decide whether you believe the results: repeatability, sample size, whether the sampling is random or representative, use of controls, etc., plus scientific judgment (the eye test) to see if there is something peculiar to the experiment or setup, e.g. non-specific binding, which could account for the results. Biological results are fuzzier than physics results because a lot more factors are beyond a researcher's control (e.g. in rodents, even in inbred strains you don't really know whether peculiar genetic, environmental, or subtle behavioral effects are going on), so going to the other extreme and demanding, say, p<0.001 for publication would lead to a lot of good science being thrown out with the bad.
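
    On the point that the p-value depends on the test chosen, a small sketch (Python with scipy; the data are simulated, so the exact numbers will vary) showing two standard tests giving different p-values on the same samples:

    ```python
    import numpy as np
    from scipy.stats import ttest_ind, mannwhitneyu

    rng = np.random.default_rng(42)

    # The same two (skewed) samples, analysed with two different tests
    a = rng.exponential(scale=1.0, size=20)
    b = rng.exponential(scale=1.6, size=20)

    print(ttest_ind(a, b).pvalue)       # parametric t-test
    print(mannwhitneyu(a, b).pvalue)    # non-parametric rank test; generally differs
    ```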

  7. Rob Grigjanis says

    twas brillig @6:

    Why then did the MSM report that the searchers for the Higgs boson detected something, but were waiting for a 3 sigma (standard deviation, in statistics-speak) signal to declare it found?

    I don’t remember what the MSM reported, but CERN didn’t report ‘new particle observed’ until 5 sigma signals were seen in CMS and ATLAS, and even then, they only said it was a boson ‘consistent with’ the Higgs.

  8. Howard Bannister says

    Cannot tell if comment 5 is a pithy self-demonstration, done humorously, or just an unwitting demonstration.

  9. Anthony K says

    p<0.05 is intended to be a minimum threshold. Even the p<0.05 depends to some extent on the statistical test you use. Of course you have to look at a whole host of other factors to decide whether you believe the results: repeatability, sample size, whether the sampling is random or representative, use of controls, etc., plus scientific judgment (the eye test) to see if there is something peculiar to the experiment or setup, e.g. non-specific binding, which could account for the results. Biological results are fuzzier than physics results because a lot more factors are beyond a researcher's control (e.g. in rodents, even in inbred strains you don't really know whether peculiar genetic, environmental, or subtle behavioral effects are going on), so going to the other extreme and demanding, say, p<0.001 for publication would lead to a lot of good science being thrown out with the bad.

    My understanding is that different alphas are appropriate for different fields. Of course, simply using any alpha as a hard and fast cutoff is problematic: a p-value of 0.049 isn’t much different than a p-value of 0.051. More troubling—I know it happens though I can’t say how widespread it is—is the practice of doing multiple comparisons without adjusting the alpha to compensate. I saw one published paper in which the researcher(s) performed 100 separate t-tests to compare 10 different treatments, using an alpha of 0.05 for each test. The joke was on them, though; only 4 of their tests produced p-values less than 0.05.
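
    To put numbers on the multiple-comparisons problem, a simplified sketch in Python: 100 independent comparisons of pure noise rather than the actual 10-treatment design described above, so every “significant” result here is a false positive.

    ```python
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)

    # 100 comparisons between groups drawn from the SAME distribution,
    # so the null hypothesis is true for every single test.
    false_positives = 0
    for _ in range(100):
        a = rng.normal(size=30)
        b = rng.normal(size=30)
        if ttest_ind(a, b).pvalue < 0.05:
            false_positives += 1

    print(false_positives)   # typically around 5, just from chance
    print(0.05 / 100)        # Bonferroni-adjusted per-test alpha: 0.0005
    ```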

  10. Anthony K says

    Correction to my above: a p-value of 0.049 isn’t necessarily much different than a p-value of 0.051.

  11. dmgregory says

    I’m surprised you didn’t give them a hard time for this bit, though:

    …it wasn’t long ago that we thought that most of our DNA didn’t do anything at all. We called it “junk DNA”, but we’re discovering that much of that junk has purposes that weren’t immediately obvious.

    This seems to advance the myth that this DNA was labelled “junk” purely by assumption, based solely on a lack of obvious function/coding, and that recent evidence is calling the idea of junk itself into question. :(

  12. parasiteboy says

    Anthony K@13&14
    You’re right about just reporting whether a variable is above or below a cutoff threshold. I have often wondered why journals don’t require exact p-values for a statistical test. I think the small amount of extra space they would take up is well worth the extra information when evaluating the authors’ interpretation of their data.

  13. Trebuchet says

    I can’t believe they didn’t include “Energy” in the list.

    And “vibration”.

  14. twas brillig (stevem) says

    from another thread:

    That is the nature of abstractions. They only describe specific parts of the system being studied.

    Exactly; that is what “abstract” means: “not the whole picture, just bits and pieces”.
    But only a few scientists hedge their words by starting with “Abstractly …” or “In the abstract …”; the implication (or at least my inference) is that the abstraction is otherwise being presented as the whole explanation. Their papers are even required to begin with an “Abstract”, a rough synopsis of the whole paper. And then laymen will read an abstract and reach their final conclusion, based only on the abstract.

  15. Rob Grigjanis says

    michael @19:

    I wish ‘paradigm’ would die a horrible painful death

    I like ‘paradigm’. How many words have the ‘gm’ combo at the end? I’d retire ‘heuristic’, which seems to have become ‘I will now wave my hands furiously whilst signifying nothing’.

  16. mnb0 says

    “the whole idea of setting up ‘p<0.05’ as some kind of Holy Grail”
    Ah yes, I remember bringing up that idea in the presence of my physics teacher (actually I said p<0.01). He countered with “On what is that number based?” It was a good lesson.

    In addition to #3: I'd like to replace "Uncertainty Principle" with "Probabilism Principle". If I got a dime every time I read "uncertainty (à la Heisenberg, so the implication goes) doesn't rule out causality, it just means we are not certain about the causes" or something like that, I wouldn't have to work for the rest of my life.

  17. Anton Mates says

    in the physics community, 5 standard deviations is now required for publication in reputable journals like the Physical Review (.05 is approximately 2 standard deviations).

    Of course, for researchers who are dissatisfied with the concept of p-values, the sigma standard has all the same limitations. It’s basically a slightly sloppier p-value where you assume normality for the statistic in question. (If the statistic isn’t normally distributed, 2 standard deviations could translate to any p-value from 0 to 0.25 or so.)

    So far as I can tell, it’s just a quirk of history that particle physicists ended up using sigmas and life scientists ended up using p-values. I guess physicists are less worshipful of Fisher, or something.
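
    A quick check of the parenthetical above (a sketch in Python with scipy): under normality, 2 sigma corresponds to roughly p = 0.05 two-tailed, but the distribution-free Chebyshev bound only guarantees 0.25.

    ```python
    from scipy.stats import norm

    k = 2.0  # "2 sigma"

    # Two-sided tail probability if the statistic really is normal
    print(2 * norm.sf(k))   # ~0.046

    # Chebyshev bound for an arbitrary distribution with finite variance:
    # P(|X - mu| >= k*sigma) <= 1/k**2
    print(1 / k**2)         # 0.25
    ```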

  18. David Marjanović says

    And then laymen will read an abstract and reach their final conclusion, based only on the abstract.

    They kind of have to when the rest of the paper is behind a paywall… that’s not something scientists are to blame for, though. Quite the opposite.

    I’d retire ‘heuristic’, which seems to have become ‘I will now wave my hands furiously whilst signifying nothing’.

    “Heuristic search” is a well-defined technical term, for example in phylogenetics.

    I’d like to replace “Uncertainty Principle” with “Probabilism Principle”.

    Fun fact: in the original German, it’s called “blurriness” instead of “uncertainty”. “Blurriness” fits as well as anything could!

  19. madscientist says

    I’ve often wondered how many scientists out there actually understand even the basics of statistics. From my point of view, Myth #1 is that everything has a unimodal distribution, and that small but real subpopulations are ‘noise’ because they don’t fit the Right Curve.
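
    One way to see that point (a throwaway sketch in Python with invented numbers): a small but real subpopulation gets flagged as outlier “noise” when the whole sample is summarised as a single bell curve.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # 95% of the sample from one population, 5% from a genuinely different one
    majority = rng.normal(loc=0.0, scale=1.0, size=9500)
    minority = rng.normal(loc=6.0, scale=1.0, size=500)
    sample = np.concatenate([majority, minority])

    # A unimodal summary describes neither group well...
    mean, sd = sample.mean(), sample.std()
    print(mean, sd)                          # ~0.3 and ~1.6

    # ...and most of the minority group ends up looking like extreme "noise"
    print(np.sum(sample > mean + 3 * sd))    # a few hundred points, all real
    ```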

  20. WhiteHatLurker says

    Of course, for researchers who are dissatisfied with the concept of p-values, the sigma standard has all the same limitations. It’s basically a slightly sloppier p-value where you assume normality for the statistic in question.

    It is not necessarily a normal distribution that is used. I agree, they are both frequentist measures, but the sigma measure can have a stronger argument made for it.

    There is an anti-p-value movement out there. Please support it where possible.

  21. WhiteHatLurker says

    Then I went to the linked site – Orphan Black? Anybody that references Orphan Black has to be on the correct side B-)

  22. Anton Mates says

    I agree, they are both frequentist measures, but the sigma measure can have a stronger argument made for it.

    I’d be interested (no sarcasm) to hear what that argument is. I don’t have a lot of experience using sigmas myself, and I don’t know what statisticians say about their utility.