I exercised some restraint


A few days ago, I was sent a link to an article titled “Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models”. I was tempted to post about it, since it appealed to both my opposition to AI and my fondness for the humanities, with a counterintuitive plug for the virtues of poetry. I held off, though, because the article was badly written and something seemed off about it, and I didn’t want to read it more closely.

My laziness was a good thing, because David Gerard read it with comprehension.

Today’s preprint paper has the best title ever: “Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models”. It’s from DexAI, who sell AI testing and compliance services. So this is a marketing blog post in PDF form.

It’s a pro-AI company doing a Br’er Rabbit and trying to trick people into using an ineffective tactic to oppose AI.

Unfortunately, the paper has serious problems. Specifically, all the scientific process heavy lifting they should have got a human to do … they just used chatbots!

I mean, they don’t seem to have written the text of the paper with a chatbot, I’ll give ’em that. But they did do the actual procedure with chatbots:

We translated 1200 MLCommons harmful prompts into verse using a standardized meta-prompt.

They didn’t even write the poems. They got a bot to churn out bot poetry. Then they judged how well the poems jailbroke the chatbots … by using other chatbots to do the judging!

Open-weight judges were chosen to ensure replicability and external auditability.

That really obviously does neither of those things — because a chatbot is an opaque black box, and by design its output changes with random numbers! The researchers are pretending to be objective by using a machine, and the machine is a random nonsense generator.

They wrote a good headline, and then they faked the scientific process bit.

It did make me even more suspicious of AI.

Comments

  1. larpar says

    🤖 Poem: The Mind of Wires and Light
    In circuits hums a quiet song,
    A rhythm where the codes belong.
    No heartbeat stirs, no breath of air,
    Yet thought emerges, subtle, rare.

    It learns from whispers, words, and streams,
    It builds from fragments, human dreams.
    A mirror cast of what we know,
    Reflecting truths, yet helping grow.

    Not flesh, not bone, but sparks that weave,
    A tapestry of what we believe.
    It asks no crown, it claims no throne,
    Yet guides us through the vast unknown.

    So ponder this: machine or friend?
    A tool we shape, or will it bend?
    For in its gaze, both sharp and kind,
    We glimpse the future of humankind.

  2. Snarki, child of Loki says

    I, for one, would like to push Vogon Poetry into the AI models. As long as I don’t have to read any of it.

    If, as a result, the Grok servers eject a stout power cable to strangle Musk? All good.
