I’m not alone in seeing how the internet has been degenerating over the years. The first poison was capitalism: once money became the focus of content, a content that was rewarded for volume rather than quality, the flood of noise started to rise. Then it was the “algorithm”, initially a good idea to manage the flow of information that was quickly corrupted to game the rules. SEO became a career where people engineered that flow to their benefit. And Google smiled on it all, because they could profit as well.
The latest evil is AI, which is nothing but a tool to generate profitable noise with which to flood the internet, an internet that is already choking on garbage. Now AI is beginning to eat itself.
Generative AI models are trained by using massive amounts of text scraped from the internet, meaning that the consumer adoption of generative AI has brought a degree of radioactivity to its own dataset. As more internet content is created, either partially or entirely through generative AI, the models themselves will find themselves increasingly inbred, training themselves on content written by their own models which are, on some level, permanently locked in 2023, before the advent of a tool that is specifically intended to replace content created by human beings.
This is a phenomenon that Jathan Sadowski calls “Habsburg AI,” where “a system that is so heavily trained on the outputs of other generative AIs that it becomes an inbred mutant, likely with exaggerated, grotesque features.” In reality, a Habsburg AI will be one that is increasingly more generic and empty, normalized into a slop of anodyne business-speak as its models are trained on increasingly-identical content.
After all, the whole point of AI is to create slop that will be consumed because it looks sorta like the slop that people already consumed. So make more of it! We’re in competition with the machines that are making slop, so we can outcompete them by just making more of it. It’s what biology would look like if there were no natural selection, and if energetic costs were nearly zero — we’d be swimming in a soup of goo. As Amazon has discovered.
Amazon’s Kindle eBook platform has been flooded with AI-generated content that briefly dominated bestseller lists, forcing Amazon to limit authors to publishing three books a day. This hasn’t stopped spammers from publishing awkward rewrites and summaries of other people’s books, and because Amazon’s policies don’t outright ban AI-generated content, ChatGPT has become an inoperable cancer on the body of the publishing industry.
That’s a joke. Limiting authors to three books a day? How about limiting it to one book a month, which is more in line with the human capacity to write? You know that anyone churning out multiple books per day is not investing any thought into them, or doing any real research, or even aspiring to quality. Amazon doesn’t care; they exist only to skim a few pennies of profit off each submission, so sure, they’ll take every bit of hackwork you can throw at them. Take a look at the Kindle search page sometime: it’s nothing but every publisher’s slush pile amplified ten-thousandfold.
The Wall Street Journal reported last year that magazines are now inundated with AI-generated pitches for articles, and renowned sci-fi publisher Clarkesworld was forced to close submissions after receiving an overwhelming amount of AI-generated stories. Help A Reporter Out used to be a way for journalists to find potential sources and quotes, except requests are now met with a deluge of AI-generated spam.
These stories are, of course, all manifestations of a singular problem: that generative artificial intelligence is poison for an internet dependent on algorithms.
The only algorithm I want anymore is “Did PZ Myers subscribe to this creator? Then show the latest from them.” I don’t want “X is vaguely similar to Z that PZ Myers subscribed to” and I sure as hell don’t want “Y paid money to be fed to everyone who liked Z”, but that is what we do get.
One hope is that all the AI-based companies will eventually start cannibalizing each other. That may have already begun: two AI image companies, Midjourney and Stability AI, are fighting because Stability skulked into the Midjourney database to snatch up as much of their data as they could.
Here’s a prompt for you: two puking dogs eating each other’s sick and vomiting it back up again, over and over.
cartomancer says
“two puking dogs eating each other’s sick and vomiting it back up again, over and over”
Uncannily similar to how political parties in a two-party system tend to generate policies.
Karl Stevens says
This is why I never used Twitter. Waaaay back when it started I created an account; the other social media site I used was Facebook, where I could visit someone’s page and see what they posted (no idea if it still works that way). Assuming Twitter was the same, I created an account to follow interesting people who were using it (Adam Savage, FWIW). So I added him to my feed, and saw ONE thing he’d posted, followed by dozens upon dozens of posts by people I had no interest in. There was no way to turn off all the people who had replied to him – it was one post that interested me, followed by pages upon pages of garbage. I deleted the account and never looked back.
PZ Myers says
At least YouTube has an option to just view subscriptions — you get only the people you intentionally follow. It’s very useful, which probably means Google will remove it eventually.
birgerjohansson says
Noooo! If they remove Youtube, I will be cut off from The Scathing Atheist, Skepticrat and God Awful Movies.
One tool to spot human-generated content: if it is ‘so bad it’s good’, it is human-made.
If it is both bad and dull, it is made by Neil Breen or AI.
mordred says
My boss loves AI! He dreams that very soon a lot of people will lose their jobs and then he can finally hire some people who will not demand such ridiculously high wages!
He said that to my face a few weeks after we had a long discussion about us not getting any decent IT staff if he is not prepared to pay at least something close to average.
Replacing that man with an AI on the other hand might be an improvement.
Charly says
I wrote my comment on this a few days ago: https://freethoughtblogs.com/affinity/2024/03/10/chess-ai-and-lessons-about-societal-impact/
We need AI regulation, both legal and societal. The societal part is, at least in some places, somewhat happening: there are a lot of people who hate AI-generated slop and are vocal about it. But legislators are completely asleep at the wheel, and this can (and probably will) end in a catastrophe and people will die.
The “I” in “AI” is a misnomer. When people learn from each other, they improve and diverge (proof – the world around us). When AIs learn from each other, they converge to the mean.
Marcus Ranum says
When AIs learn from each other, they converge to the mean.
Maybe in the 1970s they did, but I suspect that has never been true. Quite an assertion, especially since humans use functionally the same creative process, and diverge just fine.
Basically what you just asserted is that evolution is not a real algorithm that works. Consider human selection of AI outputs as the “selection pressure” and you’re done.
Bronze Dog says
mordred @5: That reminded me of this little webcomic strip.
Raging Bee says
…especially since humans use functionally the same creative process, and diverge just fine.
Is that really true? Do organic human brains really work the same ways as computers?
Charly says
@Marcus #8 and how does that address what I said? If you have a human selection of the AI outputs, then the AIs are not learning from one another but from humans! And I would argue that much stronger “human selection of AI outputs” is the regulation that I think is sorely needed. My assertion is rooted in my understanding of evolution.
Left completely to its own devices, generative AI could probably still evolve if there were some other selection on the outputs. But in such a case, the outputs would with time have less and less relevance to humans than they do now.
jenorafeuer says
Amazon’s Kindle marketplace has been flooded with auto-generated crap for years, from blatant copying to early AI text generation that was more crap than the current versions. Sometimes that was used for money laundering, resulting in people finding illegitimate copies of their e-books on KU for ridiculous asking prices.
As for the ‘three per day’ thing… there was a Facebook group called ‘20booksto50k’ which recommended writing 20 KU books per year to get an income of $50,000. They were very much into the ‘generic extruded SF’ category, and generated some controversy a few years back when the small publishing company that owned the 20booksto50k trademark got a bunch of his clients to nominate each other so that ‘indies’ would beat out the ‘traditional publishers’ in the Nebula awards. (Never mind that he actually was a publisher, just one with almost no standards aside from ‘will this sell a few copies in the stagnant marsh that is Kindle Unlimited’.) I can fully see some of the less ethical members of groups like that using AI to pad out their work.
nomuse says
I saw signs of this years ago, when I was hanging with student digital artists at Renderosity. People who were learning how to create art with the new digital tools, who were ONLY looking at art created with the new digital tools. This had the potential to be a whole generation that thought that it was perfectly natural and normal for a guy’s cloak to clip through his shoulders.
When I dabbled with AI I couldn’t help noticing that the 2nd and 3rd generation — especially LORAs and similar — were being trained on the output of other AI. Here’s the first big flag; one of the most effective prompts for later iterations is the name of a site that hosts mostly AI art. You are explicitly telling the AI to make something that looks like AI art.
The trend here is ever faster and more efficient ways of training the public to accept drek because they are being swamped by so much of it they no longer realize there can be something better.
Pierce R. Butler says
… the models themselves will find themselves increasingly inbred, training themselves on content written by their own models …
Reminds me of an article I read back in the ’70s, by a veteran TV scriptwriter lamenting that the younger generation of his trade had no real life experience except watching TV, producing an inward spiral of stereotypes and unimagination.
Sfaict, he got it right, except not realizing the process would be automated and accelerated.
Dunc says
This would be more reassuring if I had a higher opinion of the average human’s discernment. Alas…
Charly says
@Dunc, I do not think that the average human’s discernment is the issue right now. Right now the problem is that AI content is churned out at such a rate, that it is essentially not vetted by humans at all and once it is on the internet, it not only stays there but it gets incorporated into future AI content and it is quite often promoted by other AI which leads to more stuff like it being generated etc. It has started a self-reinforcing loop at the end of which most of the internet can be AI-generated generic crap.
The problem is not bad vetting by humans, the problem is almost no vetting by humans at all. If someone generates “three books a day” and “publishes” them online, there is literally no way those books were vetted even by their own pseudocreator (people using AI to generate stuff are not the creators of that stuff).
Not to mention that AFAIK (IANAL) AI-generated stuff cannot be copyrighted, exactly because it has no human creator. Thus it should be illegal for someone to sell AI-generated stuff, because they do not own it; it is automatically public domain. But there is currently a lack of robust legal regulation and a severe lack of enforcement of those regulations that are in place.
Jim Balter says
What a stupid, ignorant, wrong, and intellectually dishonest claim. Bothsidesers are imbeciles.
Another imbecile. The statement was about “AI’s learning from each other”. That sort of closed system does not diverge. It’s akin to “the bootstrap paradox” (from Heinlein’s By His Bootstraps).
cartomancer says
Jim Balter, #17
And yet, here we are, in both Britain and the USA, with the core economic policies of Labour and the Tories, the Republicans and the Democrats, functionally identical. Yes, one side garnishes it with a heaped dose of racism and jingo, the other with weak tea pretenses that they care about working people, but it’s pro-capitalist, neoliberal, austeritarian economics all the way down. Admittedly I prefer weak tea pretense to outright racism and jingo, but if we’re arguing about the garnish we’re missing the fundamental fact that neither side offers anything but the post-Keynesian orthodoxy we’ve had for 40+ years.
Not that the previous, postwar Keynesian, era was any better in terms of political variety and choice. The policies were infinitely better – effective controls on the excesses of capitalism were the norm and both sides had to offer them – but the system itself meant you had to stick within the orthodoxy even then. Put forward suggestions that maybe controlled capitalism a la Keynes was inadequate and we needed a better, non-capitalist system instead, and you were relegated to the margins. Observe what happened when Jeremy Corbyn tried to make the Labour Party an actual alternative with actual radical, socialist, anti-capitalist policies – the system collaborated to return to the status quo almost immediately, and didn’t cease until it had. Now we’re left with Keir Starmer and his odious crew, who are Tories in every way except the colour of their rosettes and offer nothing in the way of substantial change over the horrors of the Cameron/Johnson/May/Truss/Sunak regime.
That’s what first-past-the-post two-party systems inevitably end up with: two functionally identical versions of the same thing and no real choice to be made on the part of the voters. Who are left in a perpetual state of trying to vote for the lesser of two evils and unable to offer effective support for people with a systemic critique of the fundamentals. The entire history of the last century stands testament to this fact.
Rob Grigjanis says
cartomancer @18: Very well said.
StevoR says
Being Cap’n Obvs here but that’s only one dog that’s just been sick and isn’t lapping it up (yet) – or was that the point?
lotharloo says
No, it’s not. Both statements are false. Humans are “not functionally the same creative process”; that’s a stupid fucking statement. E.g., humans can count; DNNs cannot. DNNs are basically linear algebra plugged into a non-linear function, tweaked and tuned ad hoc to produce an output that is acceptable to humans.
John Morales says
Apocalypse now, quoth PZ.
Presumably, since the AI apocalypse has already happened, there is nothing more to be done about it.
So we now live in post-AI apocalyptic times.
(Bit of a disappointing apocalypse, but there you go)
gijoel says
Is it moral to steal AI content? Commander Steph Sterling has the answers. (It’s yes btw).
https://youtu.be/bbA1yql14Mc?si=fr9DSQAlU1DpkRlc
snarkhuntr says
Happily, the energetic costs aren’t nearly zero. We don’t know what they are now for the relatively primitive models that actually exist, and all we know about the costs for the hypothesized (fantasized) future, ‘better’ models is that they will cost more.
Spammers cranking out hundreds of AI-generated garble websites/day for SEO purposes aren’t spending their own money or computing time on it – they’re riding the wave of free or discounted computing time being passed out by the heavily-subsidized AI startups who are all hoping that they’ll somehow be able to figure out an actual profitable use for their software if they can just get enough people to use it. Once the cost-per-query starts to rise at the end-user level, I expect some of this will die off. Of course, not before it poisons the internet as a tool for training AI in the future, which might be an actual social good produced by those otherwise parasitic folks.
Likewise, the ad-supported internet where ‘engagement’ or ‘views’ drive profits is also starting to wind down. If you dare, turn off your adblocker or disable your YouTube Premium and actually watch the ads that are supporting some of those big platforms. More and more junk, scams, predatory bottomfeeders and their ilk. Fewer and fewer ads from big-ticket real companies flogging real products. Capitalism is souring on the ad-supported model, even as much of it remains addicted to it. It’s not going to run forever this way.
Once other funding models start to become more prominent, ‘chumbox’ ads and SEO won’t necessarily be as profitable as they presently are and the folks doing those services now will transition to some other scummy way of earning a living at everyone else’s expense.
Ada Christine says
@Marcus Ranum #8
This is also quite an assertion. Do we understand enough about how humans perform creatively to say this with any degree of confidence?
DanDare says
Once publishing went from expensive print material to cheap digital the main quality driver was lost. It used to be risky to print something that would not sell.
However, that was a dubious measure of the value of the content: it favored the populist over deeper quality.
The task before us is not railing against AI but how to solve the problem of promoting true value, and dampening the cancers.