Better detection of false positive results

I recently wrote about an article by Ed Yong describing a new technique that could help catch research papers whose authors have massaged their data to get positive results. Yong has now identified and interviewed the hitherto anonymous developer of the technique: social psychologist Uri Simonsohn of the University of Pennsylvania.

In an interview with Yong, Simonsohn explains how he stumbled onto this problem and outlines the basic idea behind his method, which he plans to publish. He also highlights what might be considered a problem: science operates largely on trust. Scientists take the work of other scientists at face value unless there is reason for suspicion, and this makes it easier for less scrupulous researchers to sail a little too close to the wind.

What’s systemic is the lack of defences. Social psychology — and science in general — doesn’t have sufficient mechanisms for preventing fraud.

If there’s a tool to detect fake data, I’d like people to know about it so we can take findings that aren’t true out of our journals. And if it becomes clear that fabrication is not an unusual event, it will be easier for journals to require authors to publish all their raw data. It’s extremely hard for fabrication to go undetected if people can look at your data.

A university’s reputation suffers a lot when people fake data, but they don’t have tools for preventing that — journals do. Journals should be embarrassed when they publish fake data, but there’s no stigma. They’re portrayed as the victims, but they’re more like the facilitators, like a government that looks the other way. I’d like journals to take ownership of the problem and start working towards stopping it.

I think that is a very good idea. If the self-correcting nature of science is to be maintained, journals must start policing themselves much more: by scrutinizing submitted papers more closely before publication, by being more willing to publish findings that contradict previously published work, or even by retracting flawed papers.
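Simonsohn's actual method was unpublished at the time of this post, so the details are not available here. But the general flavor of such tools can be sketched: given summary statistics reported in a paper, ask how often honest random sampling would produce numbers that look like the reported ones. The sketch below (my own illustration, not Simonsohn's procedure; the function name, inputs, and the choice of "spread of group standard deviations" as the test statistic are all assumptions) estimates, by simulation, how often independent samples would yield group standard deviations at least as similar to one another as the reported ones. Implausibly similar SDs across supposedly independent groups yield a tiny probability, which is a red flag worth a closer look, not proof of fraud.

```python
import random
import statistics

def similarity_pvalue(reported_sds, n_per_group, pooled_sd,
                      trials=10_000, seed=0):
    """Estimate how often random sampling produces group SDs at least
    as similar as the reported ones.

    reported_sds : list of the standard deviations reported per group
    n_per_group  : sample size of each group (assumed equal here)
    pooled_sd    : the true SD assumed common to all groups
    """
    # Spread of the reported SDs: smaller means the groups' SDs
    # are more alike.
    observed_spread = statistics.pstdev(reported_sds)
    rng = random.Random(seed)
    k = len(reported_sds)
    hits = 0
    for _ in range(trials):
        # Draw k independent groups from N(0, pooled_sd) and record
        # each group's sample SD.
        sim_sds = [
            statistics.stdev([rng.gauss(0.0, pooled_sd)
                              for _ in range(n_per_group)])
            for _ in range(k)
        ]
        if statistics.pstdev(sim_sds) <= observed_spread:
            hits += 1
    return hits / trials
```

For example, three groups of n = 15 reporting SDs of 2.00, 2.01, and 1.99 would be strikingly uniform: sample SDs at that sample size fluctuate far more than that, so the simulated probability comes out very small. The same logic extends to other reported statistics (means, effect sizes) whenever the reporting is detailed enough to simulate against, which is one reason requiring raw data makes fabrication so much harder to hide.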

A longer version of Yong’s interview with Simonsohn can be read here.


  1. says

    I found it odd the first time I heard that most journals do not require publication of the raw data, and this is still largely the case today. Without the data, researchers could be committing elementary mathematical errors and no one would really know.

    Perhaps there are some researchers afraid that someone else will get the jump on them if they publish data in advance, but that’s easy to fix. Wait until the analyses are finished (or a little later) to publish everything at once.

  2. tbell says

    The best protection seems to me to be independent replication. However, there is a premium placed on novel results, and a lot of barriers placed in front of failures to replicate.

  3. FedUp(OrJustFed) says

    what about all the vanity journals that will publish any crap you offer as long as you pay their exorbitant fees?

  4. beth says

    I found it odd too. I can understand not wanting to waste print space on large datasets, but I don’t understand why the raw data files can’t be made available on the web.
