A good use for AI


You can use AI to spy out AI!

GPTZero, the startup behind an artificial intelligence (AI) detector that checks for large language model (LLM)-generated content, has found that 50 peer-reviewed submissions to the International Conference on Learning Representations (ICLR) contain at least one obvious hallucinated citation—meaning a citation that was dreamed up by AI. ICLR is the leading academic conference that focuses on the deep-learning branch of AI.

The three authors behind the investigation, all based in Toronto, used their Hallucination Check tool on 300 papers submitted to the conference. According to the report, they found that 50 submissions included at least one “obvious” hallucination. Each submission had been reviewed by three to five peer experts, “most of whom missed the fake citations.” Some of these citations named non-existent authors, were attributed to the wrong journals, or matched no real publication at all.
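
If you’re wondering what checking for a hallucinated citation even looks like, here is a minimal sketch of the general idea: ask a public bibliographic database whether a cited work actually exists. To be clear, this is only an illustration, assuming the Crossref REST API and citations to journal articles; it is not GPTZero’s Hallucination Check, and the example citation is invented purely for demonstration.

    # A rough sketch of the general idea, NOT GPTZero's actual tool: take a
    # suspect citation string, ask the public Crossref API for bibliographic
    # matches, and flag it for human review if nothing plausible comes back.
    # (A miss doesn't prove fraud -- books and preprints may not be indexed.)
    import requests

    def crossref_candidates(citation_text, rows=3):
        """Return the closest bibliographic matches Crossref finds for a citation string."""
        resp = requests.get(
            "https://api.crossref.org/works",
            params={"query.bibliographic": citation_text, "rows": rows},
            timeout=10,
        )
        resp.raise_for_status()
        return [
            {
                "title": (item.get("title") or ["<untitled>"])[0],
                "authors": [a.get("family", "?") for a in item.get("author", [])],
            }
            for item in resp.json()["message"]["items"]
        ]

    if __name__ == "__main__":
        # A made-up reference, purely for demonstration.
        suspect = "Smith, J. (2023). Emergent reasoning in sparse transformers."
        for hit in crossref_candidates(suspect):
            print(hit)
        # If no returned title or author list resembles the citation,
        # a human should take a look.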

The report notes that without intervention, the papers were rated highly enough that they “would almost certainly have been published.”

It’s worse than it may sound at first. One sixth of the papers in this sample had citations invented by an AI…but the citations are the foundation of the work described in those papers. The authors of those papers apparently didn’t do the background reading for their research, and just slapped on a list of invented work to make it look like they were serious scholars. They clearly aren’t.

The good news is that GPTZero got a legitimate citation out of it!

Comments

  1. robro says

    While I’ve seen some good uses of AI, mostly in the sciences, this could be one of the best I’ve heard about. I can often tell AI-generated video content, particularly of living creatures, but some things, like text, are harder to discern. Getting some help would be useful if we could rely on the source of the help, but that’s also a problem.

    A lot of the focus on AI in the public domain is on generated content: kids getting the Google bot to write their report. That’s fairly innocuous in a way…except for the child and their parents…but imagine if the AI produces something for a customer that’s preposterous or rude. Or incorrectly tells the customer they are not eligible for a service when they are, or vice versa. Now we’re talking lawyers, perhaps. So, for AI-generated content representing business positions or policies to customers or clients, hallucinations are potentially costly problems.

  2. antigone10 says

    There has always been citation fraud: normally citing a real book with made-up information, knowing your professor is not going to check 60+ people’s 10+ citations. Heck, I’m not going to lie: I’ve cited a work knowing that there was something sort of similar in there that I was talking about but didn’t feel like, or have time, to read the whole book looking for the exact quotation. Or I’ve done the “Okay, the info I need is in this 20-year-old paper, but I’m not supposed to cite anything over 3 years old, so what papers in the last 3 years have cited this 20-year-old paper, and was it with this information? Great, I’ll cite that” which isn’t academic malpractice, but is a little borderline.

    But this? This is just muddying the waters at a time we REALLY don’t need the waters muddied.

  3. cheerfulcharlie says

    It’s not just the sciences. AI will happily hallucinate non-existent case-law citations in filings submitted to courts during trials. Judges hate that. And careless lawyers can get expensive sanctions for that stunt. Too bad science journal editors can’t fine AI hallucinations as heavily as angry judges do.

  4. vinnievidivici says

    It’s a problem in medicine, too.

    Back in the day, physicians and other practitioners relied on what they learned in their primary and graduate training, and kept up to date by attending conferences and reading relevant journals. These sources weren’t free from fraud, of course, but were quite a bit less likely to have outright hallucinations as their underpinning. The main flaw in this scheme was that these sources were by their nature very out of date.

    Now, of course, things have changed. Medical practitioners are trained in and routinely use bedside, point-of-care searches of the most recent evidence-based medicine to guide their medical decision-making. And here we are; LLMs are making these new ways and sources unreliable and downright dangerous.

    Bad science coming out of an AI is bad. Bad law, likewise. But if I were to prescribe a medication or therapy based on an LLM hallucination, I could seriously harm or kill my patient. I can’t think of a situation more fraught than that!

  5. vinnievidivici says

    Furthermore…

    This is really a failure of the peer-review process. As antigone10 said, it’s really hard for a professor to check 60+ authors in 10+ citations, multiplied by 30+ students in 3+ classes, 3+ times a semester. But one referee in a team of them, reviewing 1-2 papers submitted at a time, could be assigned that onerous task and leave the rest of the peer review to others. That’s a much more reasonable burden, and would be relatively easy to implement, now that the problem has been identified.

    But a better solution seems to me to be the one the OP hit on: get an AI to check the work of other AIs, and verify the validity of citations. That way, we only have to worry when the AIs start to collude with one another. Then, we’ll probably be fucked in so many other ways it won’t matter anymore.
