On the issue of AI, FreethoughtBlogs has gone Point, Point, Counterpoint, Another Consideration, and … OK, I’ll give my thoughts on that, briefly. LLMs and AI image generators are fundamentally different, so I’ll give each a brief look.
LLMs memorizing and leaking personal info: It’s been demonstrated, and it’s a problem that should get sorted ASAP. I’d say if any business or agency were found to have revealed personal info through use of an AI – and at this point, OpenAI surely has – then it should incur the same legal penalties as a non-AI leak. I don’t know enough about LLMs to hazard a guess at the best way to address these issues, but I’ll reiterate a few things I’ve said about them, about which my opinions have not been swayed:
LLMs, like all of this new generation of AI tech, have genuine usefulness, which leftist discourse completely ignores. Their various problems need to be addressed, but the usefulness should never be dropped from that conversation, and the idea of going full Ludd on the tech is abominable to me, because what I regard as the most important use of LLMs is not something I’m willing to lose ground on. Also, they will quickly be better at many human jobs than humans are, and that saves money, saves humans the humiliation of working jobs where their intellectual shortcomings are thrown into sharp focus, and can definitely save lives.
Regarding the idea that it will steal somebody’s writing: that’s a risk human authors take every time they hit a fucking keyboard. Who did JKR rip off the most? That Worst Witch lady? Rapist Neil Gaiman? She ripped both of them off, to an extent.
I’m not saying she did it on purpose. Human minds unknowingly rip off other human minds all the damn time. How closely we want to prosecute these things is a matter for intellectual property law of various flavors, but the more strictly those laws are interpreted, the worse things will be for the flourishing of art, and especially for independent artists. Be careful what you push for.
I still abso-fucking-lutely am not the slightest bit convinced that AI art generators are reproducing images from their training sets to an actionable extent, any worse than human artists do every time they look at a reference or aim toward a given style. You got it to reproduce one of the most reproduced images in existence, like the Mona Lisa or the Coke logo? Ooh. You got it to reproduce something at all more obscure? I’m betting you fed that image directly into it as an “image prompt,” ran the prompt a hundred times, and picked the closest result.
This has never happened to me in the entire time I’ve been doing AI art. I’ve asked for certain styles or images in a thousand ways, even fed in images of a particular artist’s style, and it still did not come back with anything like their original images. I get smushy signaturesque things in the corner of pics sometimes. Derivative works may be covered by X and X laws, but if a snippet of a cloud or an eye happens to look 85% like artist Y’s work, on an image that is 95% nothing like that work, it is not fucking derivative. Don’t insult my intelligence. The leftists pushing the case online have demonstrably used bad information and outright fabrications to make their arguments, and the Asswipe Corporate Stooges using this as an excuse to push expansion of copyright law in court? They are enemies of every artistic freedom you can imagine.
Did AI art generators successfully create a file compression system many orders of magnitude better than any that ever existed before, where they can take a couple of bytes of data and recreate your 1.5 megabyte .png from it? Sounds like the “zoom and enhance” cliché. Sounds like sci-fi bullshit and magical thinking to me.
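The arithmetic behind that point is easy to check with rough public figures. This sketch uses approximations (a weights file of a few gigabytes, a LAION-scale training set of a couple billion images), not exact counts:

```python
# Back-of-envelope: how much model capacity exists per training image?
# Both figures below are rough public approximations, not exact values.

model_size_bytes = 4 * 10**9       # ~4 GB checkpoint of weights (approx.)
training_images = 2.3 * 10**9      # LAION-scale dataset, ~2.3 billion images

bytes_per_image = model_size_bytes / training_images
bits_per_image = bytes_per_image * 8

print(f"~{bytes_per_image:.2f} bytes (~{bits_per_image:.1f} bits) of weights per training image")

# What verbatim storage of a typical 1.5 MB PNG would imply:
png_bytes = 1.5 * 10**6
compression_ratio = png_bytes / bytes_per_image
print(f"Verbatim recall would imply a ~{compression_ratio:,.0f}x compression ratio")
```

Roughly a byte or two of weights per training image, so recreating a specific 1.5 MB file from that budget would require a compression ratio in the hundreds of thousands, which is the point being made above.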
Have exploit hunters been able to tease personal data out of these programs? Yes. How? It’s literally impossible for it all to be stored in the image data itself. It has to live in some other aspect of the system’s architecture, which can absolutely be fixed, and should be.
As to the issue of consent that was brought up in comments here: I think that’s fair and fine. I think it’s based on feelings rather than on the tech that’s actually in front of us right now and what it’s actually doing, but feelings are a legit consideration. We should develop a new generation of AI models trained without any data from non-consenting artists. People on both sides might tell you this cannot be done, but they are wrong. It can be. It might take a while, it will certainly cost a lot more, and it will involve some greenhouse gas excess during the training phase.
But I want this done, more than the anti-AI people do, because I want this part of the conversation to fucking stop. You know what would be a hilaribad way to retrain LLMs without the personal info? Tell an LLM to say everything it knows except the personal info, and retrain on that output. That’s silly, but I hope it shows that there must be a way to do this.
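That throwaway idea is basically distillation with a filter on the teacher’s output. Here’s a toy sketch of just the filtering step — the regex patterns, the sample outputs, and the `scrub` function are all invented for illustration, and real PII detection is a far harder problem than a few regexes:

```python
import re

# Toy sketch: drop any generated output containing a PII-shaped string,
# keeping the rest as a candidate retraining corpus. Illustration only.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),            # SSN-shaped
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),      # email-shaped
    re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),  # US-phone-shaped
]

def scrub(generated_outputs):
    """Keep only outputs that match none of the PII-shaped patterns."""
    return [
        text for text in generated_outputs
        if not any(p.search(text) for p in PII_PATTERNS)
    ]

# Stand-ins for model output; invented examples.
sample_outputs = [
    "The mitochondria is the powerhouse of the cell.",
    "Contact Jane at jane.doe@example.com for details.",
    "Water boils at 100 degrees Celsius at sea level.",
    "Her SSN is 123-45-6789.",
]

clean_corpus = scrub(sample_outputs)  # this would become the retraining set
print(clean_corpus)
```

In practice you’d want actual PII detection tooling rather than regexes, and filtering whole outputs loses data that redaction would keep, but it makes the shape of the idea concrete.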
For my part, I hope this is the last time I feel compelled to make a post on the subject. Because I’d like my personal part in the conversation to stop. Yes, I should be able to control myself and just bow out. Maybe I will get the hang of that someday. But for now? The only reason I have a blog is because I don’t have that sense of restraint.
I know you’re bored of this too. I’ll shut up about it as soon as I’m able. I’m workin’ on it, man!
–