Science relies on honest observation


Elisabeth Bik is getting mad. She has spent the better part of a decade finding examples of scientific fraud, and it seems to be easy pickings.

Although this was eight years ago, I distinctly recall how angry it made me. This was cheating, pure and simple. By editing an image to produce a desired result, a scientist can manufacture proof for a favored hypothesis, or create a signal out of noise. Scientists must rely on and build on one another’s work. Cheating is a transgression against everything that science should be. If scientific papers contain errors or — much worse — fraudulent data and fabricated imagery, other researchers are likely to waste time and grant money chasing theories based on made-up results….

But were those duplicated images just an isolated case? With little clue about how big this would get, I began searching for suspicious figures in biomedical journals…. By day I went to my job in a lab at Stanford University, but I was soon spending every evening and most weekends looking for suspicious images. In 2016, I published an analysis of 20,621 peer-reviewed papers, discovering problematic images in no fewer than one in 25. Half of these appeared to have been manipulated deliberately — rotated, flipped, stretched or otherwise photoshopped. With a sense of unease about how much bad science might be in journals, I quit my full-time job in 2019 so that I could devote myself to finding and reporting more cases of scientific fraud.

Using my pattern-matching eyes and lots of caffeine, I have analyzed more than 100,000 papers since 2014 and found apparent image duplication in 4,800 and similar evidence of error, cheating or other ethical problems in an additional 1,700. I’ve reported 2,500 of these to their journals’ editors and — after learning the hard way that journals often do not respond to these cases — posted many of those papers along with 3,500 more to PubPeer, a website where scientific literature is discussed in public….

Unfortunately, many scientific journals and academic institutions are slow to respond to evidence of image manipulation — if they take action at all. So far, my work has resulted in 956 corrections and 923 retractions, but a majority of the papers I have reported to the journals remain unaddressed.

I’ve seen some of the fraud reports, and it amazes me how stupid the scientists committing these fakes must be. It’s as if they don’t realize that JPEG artifacts exist and become an obvious fingerprint when chunks of an image are duplicated, or that you can reveal the cheating just by tweaking a LUT and watching all the duplicated edges light up. The only reason to do it is to adjust your data to make it look the way you expected it to look, which is an obvious act against the most basic scientific principle: you’re supposed to use science to avoid fooling yourself, not to make it easy to fool others.
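
For the curious, the LUT trick amounts to something like this: stretch a narrow brightness window to the full range, and copied regions and their compression seams jump out visually. Here is a minimal sketch in Python with Pillow and NumPy; the filename and window values are hypothetical, chosen only for illustration.

    import numpy as np
    from PIL import Image

    def stretch_lut(low=100, high=140):
        # 256-entry lookup table: map [low, high] onto [0, 255] and clip the rest,
        # exaggerating small differences so duplicated patches and seams stand out.
        table = np.clip((np.arange(256) - low) * 255.0 / (high - low), 0, 255)
        return table.astype(np.uint8).tolist()

    img = Image.open("figure_panel.png").convert("L")            # hypothetical figure file
    img.point(stretch_lut()).save("figure_panel_stretched.png")  # inspect by eye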

This behavior ought to be harshly punished. If image fakery became an issue when one of my peers came up for tenure or promotion, I’d reject them without hesitation. It’s not even a question: this behavior is a deep violation of scientific and ethical principles, and would make all of their work untrustworthy.

Also, this is a problem with the for-profit journal publication system. Those scientists paid money for those pages; how can we possibly enforce honesty? The bad actors wouldn’t pay us for journal articles anymore!

But guess what happens when Elisabeth Bik takes a principled stand?

Most of my fellow detectives remain anonymous, operating under pseudonyms such as Smut Clyde or Cheshire. Criticizing other scientists’ work is often not well received, and concerns about negative career consequences can prevent scientists from speaking out. Image problems I have reported under my full name have resulted in hateful messages, angry videos on social media sites and two lawsuit threats….

Things could be about to get even worse. Artificial intelligence might help detect duplicated data in research, but it can also be used to generate fake data. It is easy nowadays to produce fabricated photos or videos of events that never happened, and A.I.-generated images might have already started to poison the scientific literature. As A.I. technology develops, it will become significantly harder to distinguish fake from real.

Science needs to get serious about research fraud.

How about instantly firing people who do this? Our tenure contracts generally have a moral turpitude clause, you know. This counts.

Comments

  1. says

    One of the most common human mental characteristics (I don’t use the term ‘human nature’; it is self-contradictory) is dishonesty. Also, look at the world around us: almost none of the rampant political, social-media and now AI fraud is being held accountable. WTF.

  2. robro says

    As A.I. technology develops, it will become significantly harder to distinguish fake from real.

    Indeed, which is why the Biden admin is talking to OpenAI, Meta, Google, and other tech companies exploring LLMs about identifying (aka “watermarking”) fake images. However, I doubt their recent agreement will curtail the practice significantly. While AI can help identify doctored images at scale, it will also be used to circumvent detection systems.

  3. says

    A few years ago the ‘big tech companies’ with names starting with A and G stole millions of other people’s published works, and after all the legal dust settled they weren’t punished; they still make money off that stolen property and never had to get rid of it. My organization was a potential victim of that theft, so we took extreme measures to protect our intellectual and artistic creations. Now we face AI ‘ingesting’ (stealing) everyone’s work. There is even an article stating that DeathSantis and/or affiliates created a phony AI voice of tRUMP! I don’t even know if our hidden, encrypted copyright notices, watermarking and steganography will adequately protect our works from AI.

  4. Rob Grigjanis says

    What’s great about doing theoretical physics: it’s hard to sneak in ‘2+2=5’ without someone noticing.

  5. Rich Woods says

    @Rob #4:

    “2+2=5? Your universe will collapse twenty minutes before its Inflationary Era.”

  6. wzrd1 says

    robro, as I’ve long joked, “As soon as we invent a better mousetrap, some SOB comes along with a smarter mouse”.

    Rob Grigjanis, I seem to recall Einstein got gravitational lensing wrong at first, correcting it in 1915 before the 1919 eclipse observations confirmed it. He also got it wrong on gravitational waves, initially claiming they couldn’t exist because the coordinate system broke down, and had to correct that as well. Then, there was that whole cosmological constant thing…
    Of course, the only way to avoid getting things wrong is to do nothing, which in and of itself is just a different way of being wrong.
    The trick is to catch the 2+2=5 before anyone else notices and issue a correction yourself, making everyone else feel sheepish for not noticing. ;)

  7. Allison says

    The thing is, this is the lazy fraud. People who don’t want to get caught do the faking long before it ever gets to an image.

    Graphs and other visual representations of numeric data are dead easy — just massage the numbers that go into whatever program generates the graph.

    Science — like civic life — depends upon a culture of honesty and trust, but the incentives to produce “good” results put pressure on to lie.

    I’m confident that this is just the tip of the ****berg.

  8. chrislawson says

    I suspect that fraudsters using AI will make it easier to identify dodgy data because the AI will leave unintended artefacts. And you don’t need AI to fudge data. Anyone with a passing knowledge of coding can write a program to generate fake data that follows any desired function with enough noise to make it statistically indistinguishable from real data. Of course, this is not what is happening in most scientific fraud. Usually the researchers do actual experiments but fudge the data retrospectively to make it fit what they want to find. Sometimes it isn’t a specific hypothesis they care about, just generating that magic p<0.05 in some subanalysis.
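
    A rough sketch of how little that takes (the linear trend, noise level and sample size here are arbitrary, purely for illustration, not taken from any real case):

        import numpy as np

        rng = np.random.default_rng(1)
        x = np.linspace(0, 10, 30)                       # fake "measurement" points
        y = 2.0 * x + 1.0 + rng.normal(0, 1.5, x.size)   # desired trend plus noise
        np.savetxt("fake_results.csv", np.column_stack((x, y)), delimiter=",")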

    The worrying thing here is not just that Bik is finding lots of image manipulation and getting predictably horrible blowback for doing the scientific world a great service. To me the most worrying thing is that she is finding so many obvious manipulations. Half of her discoveries are blatant: duplicated compression artefacts, rotated images, and so on. This suggests that there is an even larger pool of smarter scientific fraud that Bik may not have the resources to uncover. (Not a criticism of her work, btw.) Basically the only way a clever fraud can be found out is through access to their research logs (not 100% perfect as these can also be faked), through whistleblowing colleagues (this is what killed William McBride’s career), or through failures of independent replication that raise serious questions about the original data.

  9. siwuloki says

    @6 wzrd1, or the corollary, “Foolproof solutions lead to the evolution of new fools.”

  10. Rob Grigjanis says

    wzrd1 @6: I was talking about deliberate fraud, not honest mistakes. Theoretical physics is chock full of the latter (made a couple myself).

    chrislawson @8: I specified theoretical physics for a reason. And again, failure is not fraud.

  11. robro says

    Incidentally, here’s an interesting interview with Sultan Moezali Mehbhji on the Yahoo! Finance page. Mehbhji is a Duke University professor and former Chief Innovation Officer for the FDIC. I think he does a decent job of summarizing the state and the risks of AI in a measured and realistic way.

  12. wzrd1 says

    siwuloki @9, one of my favorites, “Nothing is foolproof, for fools are far too ingenious”.

  13. Le Chifforobe says

    I have looked at fraud reports on PubPeer, too, and not all of the “detectives” are bringing integrity to their work. Some are just conspiracy theorists, seeing evidence of photo manipulation in the jaggedy lines produced by compression in article PDFs. I don’t think these hair-raising numbers will actually hold up. Science requires us to be skeptical, not cynical.

  14. says

    Wouldn’t making cheating an instant firing offense just raise the stakes for the cheaters? The death penalty doesn’t stop murders…

  15. flange says

    Elisabeth Bik can spot amateurish Photoshopped images, as could I. The tell is usually cloning a region repeatedly to extend an area or to cover up some part of the image; it shows up as repeated patches forming a pattern.
    If the “scientist” hired a Photoshop pro to help with the fraud, even an expert might not spot the image corruption.
    Cheating and lying well requires some skill and money. We can only hope that the bullshit that’s revealed isn’t the tip of the iceturd.
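
    For what it’s worth, the amateur version can be caught with something as crude as hashing small blocks and flagging exact duplicates. A minimal sketch, with the block size and filename as placeholders (real forensics tools also cope with resampling, noise and recompression):

        from collections import defaultdict
        import numpy as np
        from PIL import Image

        img = np.asarray(Image.open("figure_panel.png").convert("L"))
        block, seen = 16, defaultdict(list)
        for r in range(0, img.shape[0] - block + 1, block):
            for c in range(0, img.shape[1] - block + 1, block):
                patch = img[r:r + block, c:c + block]
                if patch.std() > 2:                      # skip flat background
                    seen[patch.tobytes()].append((r, c))
        duplicates = [locs for locs in seen.values() if len(locs) > 1]
        print("possible cloned blocks at:", duplicates)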

  16. chrislawson says

    Rob, that list I provided you included many examples of fraud and research misconduct in physics.

  17. Rob Grigjanis says

    chrislawson @16: Do you not understand the difference between ‘experimental’ and ‘theoretical’?

  18. chrislawson says

    Rob, all of the examples on that list were experimental. That is, they were published as experimental results in journals and later investigated and retracted. Every single one of them. Did you even look at the list?

  19. chrislawson says

    Oops, I am sorry Rob. I realise that I got your comment 100% the wrong way around in my head. I agree that theoretical physics doesn’t really lend itself to data fraud and any misconduct would be plagiarism or some such. I sincerely apologise.

  20. Kenneth Hodge says

    Someone found a duplicated image in supplemental material from my own paper. It was a silly oversight on the part of a student and me. Hardly ill intent.

  21. chrislawson says

    Thanks for your understanding, Rob. One of the benefits of commenting here is it reminds me I can be an idiot sometimes.

  22. lotharloo says

    @Rob Grigjanis:
    Do people actually check the complicated math in lesser-known papers? In theoretical computer science, long pages of math and technical material often don’t get carefully checked, at least not immediately. For us, though, conferences are what count, and few people bother making journal versions of their results. In fact, we used to accept “extended abstract” submissions that never got full paper versions, and many of them had claims that were not so easy to verify. Eventually that tradition was dropped, and now all papers must be full versions.

  23. wzrd1 says

    lotharloo @22, really? They’re still doing papers, rather than just going by press release? ;)

  24. Rob Grigjanis says

    lotharloo @22: Not sure what you mean by “lesser known” papers. If there’s anyone else working in your field (and there are many fields or sub-fields), they’d definitely check your work*. And that was (I’ve been out of the biz for 30ish years) often done before publishing: informal communications after a lecture or conference, preprints, etc.

    *Finding a mistake in someone else’s work is a kick, and someone finding a mistake in yours is appreciated, especially if you haven’t been published yet.

  25. sparc says

    Not all duplications have been done wilfully with bad intentions. Some just happen by mistake. Unfortunately, most universities don’t provide tools that would help to identify duplications prior to publication. The following software detects problems with figures (rotation, scaling, flipping, cropping, full overlap, partial overlap, cloning) in one’s own drafts for publication:
    https://www.proofig.com/
    https://imagetwin.ai/
    Unfortunately, neither is freely available.