The Perils of Literal AI


I went to Hong Kong twice in the early 00s, and the second time I decided to have one of Hong Kong’s legendary tailors clone me a copy of my dad’s vintage 1940s Hart Schaffner and Marx tuxedo. (spoiler: it came out great)

One of my friends said, “the great thing about Hong Kong tailors is you can get the best suit you know how to ask for.” That was a mind-expanding bit of philosophomacation, right there.

I think of it occasionally when playing with AI art generation systems, because it’s literally 100% the case – the prompt is crucial, sort of, but you’re dealing with an unknown training set. What you get out of it will have a lot to do with the language that was associated with the various weightings at the time that the training set was constructed. This can get fun, because with some of the AIs you’re sharing training sets that were built by other people, for their own purposes, and you have to have a good mating of your intent, the training set’s vocabulary, and your prompt. I don’t do psychoactives any more, except for coffee and alcohol, but I imagine it’d be a ton of fun to be high as fuck and just ask the AI for weird things to see what comes out. I have a lot of fun doing that with a couple of glasses of cheap red wine in me; I can’t imagine shrooming and messing with the AIs. Actually, I am deeply irritated by their speed, and High-Marcus might not be able to handle it.

Midjourney AI and mjr: “hard drive sushi”

Sometimes it’s really fun to put something vague in, just to see what comes out the other side:

Midjourney AI and mjr: “the most depressing thing possible.”

Anyhow, as I’ve mentioned before, the content filters on Midjourney and Dall-E annoy me greatly, because I feel that Americans are incredibly hypocritical about their “cultural values” which specifically downplay eroticism in favor of violence. I know it’s important to have enthusiastic young killers for the imperial military, but it disgusts and annoys me. Also, it’s impossible to play with these AI art generators and not think “I wonder if it can do Donald Trump sitting on a really gigantic Putin-shaped dildo?” Midjourney won’t, of course. But, as Doctor John used to say, “If I don’t do it… somebody else will.” So I went to the effort of downloading and figuring out Stable Diffusion and its Automatic1111 interface. Then I started downloading AI models from Civitai and farting around. There’s a learning curve and there’s a lot of tricks you have to learn to get decent results. I’m just generally curious about how this stuff works – for example, I’m pretty sure that Midjourney pre-feeds each image with a bunch of prompts that favor its producing images with high visual clarity (contrast rather than detail) and so forth.
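If you want to follow along, the setup described above amounts to roughly this (the repo URL and directory layout are Automatic1111’s as of this writing; the checkpoint filename is just an example, use whatever you grabbed from Civitai):

```shell
# Grab the Automatic1111 web UI
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui

# Checkpoints downloaded from Civitai go in the models directory
# (example filename - substitute whatever you actually downloaded)
cp ~/Downloads/example-model.safetensors models/Stable-diffusion/

# First launch pulls in dependencies, then serves the UI on localhost:7860
./webui.sh
```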

Stable Diffusion can be really fun, especially if you have a monster graphics card with a lot of CUDA cores and memory. I have one of those; it’s sitting on the kitchen table waiting to be installed in a computer. But my gaming computer has a 3-year-old mid-tier card I bought to play Elite on. It’s pretty good, but my ability to train sets is limited by the RAM on the card, and producing images gets slower and slower each time Stable Diffusion gets updated. One of the great things about software nowadays is that it bloats in real-time, behind your back: you fire it up one day and suddenly it’s crashing, or running out of memory on operations it handled just fine yesterday.

It’s also interesting because when you start trying to produce erotica, the training sets really start to shine through. The female nudes appear to have largely been trained with boob jobs (not just that they’re big, but they have the characteristic “boob job bump” on the upper slope of the breast) – problematic because I hate boob jobs. All the guys have Fabio washboard abs, too. I’m sure there is a prompt that un-triggers that (e.g.: “dad bod”) but I’m not digging around for that because dudes don’t do it for me. Another thing is that some of the training sets appear to have been fed mostly on Korean pop models because the sets keep kicking out certain things like vanishingly small plastic-surgery’d noses. The AI is telling us a lot, unintentionally, about societal norms in various cultures, I suppose.

Another thing I find fascinating is how sometimes you get a lot of dynamism and difference and other times you get a lot of the same thing. Again, we learn to prompt better: “dynamic dance pose” is abstract but produces a variety of contortions that are sometimes awesome – otherwise you get people standing around. (Now I am thinking it’d be fun to ask the AI to do people in T-pose) Anyhow, here are some of my early attempts at some basic environmental nudes:

Stable Diffusion (DOS checkpoint) and mjr: “nude woman walking in rainy bamboo forest, bare breasts, visible pussy”

If you don’t specify stuff like “bare breasts” the models will put out whatever they’ve been trained as most common in that particular training set.

Stable Diffusion AI (DOS checkpoint) and mjr: “nude woman walking in rainy bamboo forest, bare breasts, visible pussy”

Um… So those look weird because I was trying to get photoreal images out of a training set that was trained to do anime. I wanted to see what happened. And…

Stable Diffusion AI and mjr: “nude woman walking in rainy bamboo forest, bare breasts, visible pussy”

Normally, I would not use the word “pussy” to describe a woman’s labia, but part of what you have to do is figure out the vocabulary that is embedded in the training set. If you don’t refer to something that the training set has in its vocabulary, you get something that is probable but may be unrelated to what you want. It’s the Hong Kong tailor problem – it would help a whole lot if you spoke Chinese (although I found that, understandably, everyone spoke English better than an American).

I shared those with a friend, who commented something to the effect that it was annoying that the AI seemed to kick out images of women with shaved pubic hair. It’s got to do with what the images were trained on – presumably a lot of porn went into the datasets.

Of course, I couldn’t even experiment with Midjourney because its stupid content filter would kick the word “pussy” out right away. But:

Midjourney AI and mjr: “a maine coon cat half shaved bald with an electric clipper, looking annoyed, fur clumps, detailed, photorealistic.”

One other annoying thing we discover: photo editing programs may be coded to use your GPU to resize images. Normally that’s fine, unless your GPU’s RAM is full of AI stuff and the system starts rapidly paging the AI and photo-editor applications in and out. And naturally there is a memory leak somewhere in all that, so eventually Windows dramatically swoons on the fainting-couch and you lose all your work. I refuse to try to figure out how to let my insanely powerful CPU do some of the work, which is doubtless some option somewhere in the bowels of something. It’s annoying as all get-out.
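For anyone less stubborn than me: Automatic1111’s launcher does expose knobs for easing GPU memory pressure. Flag names are from the versions I’ve poked at, so check your build’s --help before trusting this sketch:

```shell
# webui-user.sh (Automatic1111): options for easing GPU memory pressure.
#   --medvram / --lowvram  trade generation speed for less VRAM use
#   --use-cpu MODULE...    push specific modules (or "all") onto the CPU
export COMMANDLINE_ARGS="--medvram --use-cpu interrogate"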

I used to love computers and everything to do with them, but now they mostly annoy me. I gravitate toward Things That Just Work, like my iPhone, and away from Things That You Have to Download and Configure. Is this a normal symptom of age?

Oh, the reason I use “walking in a bamboo forest” as a test prompt is to see how well the models handle front-to-back depth. (They don’t.) It’s even difficult to get a full head-to-toe human. And then when you do, the feet often don’t match or are weird.

A lot of AI art appears to be Darwinian as hell: you give it a prompt and have it churn out a dozen images, then pick the best and delete the obvious mutants. That will probably be a feedback loop someday. I find the rejects to be fascinating too.
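That churn-and-cull loop is simple enough to sketch. Here `generate` and `score` are hypothetical stand-ins, not any real generator or aesthetic-scoring API – the point is just the selection step:

```python
import random

def generate(prompt: str, seed: int) -> str:
    """Stand-in for a call to an image generator (hypothetical).
    Returns a token identifying the image for a given prompt+seed."""
    return f"{prompt}-{seed}"

def score(image: str) -> float:
    """Stand-in for human (or learned-preference) judgment (hypothetical).
    Deterministic pseudo-random score per image, for illustration."""
    return random.Random(image).random()

def cull(prompt: str, batch: int = 12, keep: int = 3) -> list[str]:
    """Churn out a batch, rank by score, keep the best, discard the mutants."""
    images = [generate(prompt, seed) for seed in range(batch)]
    images.sort(key=score, reverse=True)
    return images[:keep]

best = cull("hard drive sushi")
```

Wire `score` up to human clicks (or a model trained on them) and you have exactly the feedback loop described above.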

Comments

  1. sonofrojblake says

    Interesting that it would reject “pussy” but has no problem with “coon”.

    everyone spoke English better than an American

    Is there anywhere that doesn't apply to?

  2. StevoR says

    @1. sonofrojblake : Deaf societies where people sign in preference to vocalising words?

    Pussy is a word that no longer seems to mean what it used to. Bit like dick being a name really..

  3. Ketil Tveiten says

    I have to ask if you photoshopped in the cats as a joke; of course I’m not surprised if the SD is that literal with the instructions.

  4. StevoR says

    ..I feel that Americans are incredibly hypocritical about their “cultural values” which specifically downplay eroticism in favor of violence.

    Exhibit A : The Super Bowl “costume malfunction” where Justin Timberlake exposed part of one of Janet Jackson’s breasts for a fraction of a second or so and the USA just lost its shit completely and went waaaay over the top at the fact that kids might’ve, shock, horror, found out that women have mammary glands, as if that was something bad?
    ( https://en.wikipedia.org/wiki/Super_Bowl_XXXVIII_halftime_show_controversy )

    I’ll always remember seeing a brief clip of it on TV and thinking LOL, oh well no big deal .. and then .. the next day being just staggered that so many Americans seemed to think it somehow was a big deal like WTF!?!

    That seriously fouled up, neurotic, puritanical streak in American nature where anything sexual is so so .. taboo yet .. so pushed and exploited and inflamed. Yet gory, violent depictions of death esp of “bad guys” are like Yaaaayy! Fine! Awesome! Encouraged .. nothing to think about deeply and something to take for granted. Just Fucking Hell.

    Has anyone done a statistical study of times the average USA-ian sees violent deaths on screen vs people having sex? Might be morbidly interesting.

  5. StevoR says

    Guessing Midjourney & ilk won’t be much use in generating gynaecological textbook illustrations somehow here…

  6. StevoR says

    Aaand because I always think of things after I hit submit :

    The “nude woman” default on Midjourney is a white woman even in a bamboo forest?

    Despite People Of (non-White) Colour being the global majority?

    Assumptions? Even AI ones?

  7. JM says

    @7 StevoR: Interpretation is dependent on what information was used to train the AI. I expect the picture sets used to train it are biased towards white people.
    ChatGPT is also biased based on the language used to ask the question. It’s likely that is true here also and asking in Chinese for the same keywords would be more likely to generate an oriental person.

  8. says

    StevoR@#5:
    Exhibit A : The superbowl “costume malfunction” where Justin Timberlake exposed part of one of Janet Jackson’s breasts for a fraction of a second or so and the USA just lost its shit completely and went waaaay over the top at the fact that kids might’ve shock

    Yes. That was 2004. The US had just finished bombing the shit out of Iraq and the land war was starting.
    Given the national trauma that resulted from Janet’s breast-flash, I’m happily surprised we didn’t bomb and invade North Korea in retaliation.

    That seriously fouled up, neurotic, puritanical streak in American nature where anything sexual is so so .. taboo yet .. so pushed and exploited and inflamed. Yet gory, violent depictions of death esp of “bad guys” are like Yaaaayy! Fine! Awesome! Encouraged .. nothing to think about deeply and something to take for granted. Just Fucking Hell.

    Yup. As I’ve commented over at PZ’s place, I think it’s really creepy and suspicious that there are popular films like John Wick which hold up this guy who inhumes: [screenrant]

    John Wick features 77 on-screen deaths at John’s hands — a number that rose to 128 in John Wick: Chapter 2 as John made some powerful enemies. John Wick: Chapter 3 – Parabellum found John on the run with only time for 94 kills along the way

    hundreds of people in retaliation for someone killing his dog. I’m a dog lover and if someone killed my dog out of malice, I’d perhaps hurt them but I wouldn’t go after them, their family, and Saddam Hussein and Afghanistan for good measure.

    And we sit around (some of us, anyhow) wondering, “why are there mass shootings?” – I dunno, could a media culture that glorifies killing entire neighborhoods as a response to a personal infraction have anything to do with it?

    Has anyone done a statistical study of times the average USA-ian sees violent deaths on screen vs people having sex? Might be morbidly interesting.

    It’s not just the violent deaths, it’s the meaning behind them. I’m sure the ratio is crazily lopsided, though. But how are we supposed to compare deaths in support of government’s fascist agenda (Saving Private Ryan) with deaths in opposition to government’s agenda (Rambo) promoting toxic individualism and personal fascism? And, speaking of Rambo, there is an entire genre of “person gets a raw deal from other person/people and goes off and kills the lot of them” – I don’t have any idea if that may bear on American mass shootings? Naw. [I raised this point over at Mano’s and was assured with some confidence that “it’s the guns” because the US has lots of guns, and other countries also consume US media. Yes, there’s that, but I suspect a person in China who watches John Wick is not thinking “Hey I’m gonna do that next time my waiter is late with the check” – they’re thinking “That’s America.”]

    But wait, there’s more. Janet Jackson was shunned by the media for what was pretty clearly a stupid stunt some marketing asshole thought up. But Justin Timberlake’s “career” was more or less unaffected. It’s as if the puritans branded the scarlet ‘A’ on Janet, and have never let her hear the end of it. Actresses that portray sexual roles seem to have their careers dramatically shortened in consequence, unless they keep moving down-market toward more and more sexual roles, and then they don’t appear in big budget movies again.

  9. xohjoh2n says

    @5:

    the next day being just staggered that so many Americans seemed to think it somehow was a big deal

    Ah but here’s the thing: did a lot of Americans really think it was a big thing, or did the media just tell you it was a big thing and you believed them?

  10. says

    I’ll always remember seeing a brief clip of it on TV and thinking LOL, oh well no big deal

    Especially because you see more skin in the average shampoo commercial.

  11. sonofrojblake says

    @SteveoR, 7 – that looks like a white woman to you? Looks more like one of the cookie cutter girl groups that presumably comprise about 40% of the population of South Korea. https://staticg.sportskeeda.com/editor/2021/07/0d7f9-16265123463014-800.jpg

    @mjr, 10:

    Actresses that portray sexual roles seem to have their careers dramatically shortened

    Monica Bellucci, Helen Mirren, Catherine Deneuve, Chloë Sevigny… oh yeah, none of them American. Yeah.

    Jane Fonda?

    The thing that always surprised me about the Janet Jackson thing was the explanation offered. I think Jackson deserved to have her career suffer, not for the obviously harmless and meticulously-planned stunt, but for going in front of TV cameras after the event and offering the intelligence-insultingly stupid lie that it was an accident. From outside the USA, it looked like a national conspiracy, with the whole country in on it. “Yeah, sure love, it was an accident, you didn’t plan it, your costume wasn’t explicitly designed to do it, you timed it perfectly to the lyric ‘Bet I have you nekkid by the end of this song’ by pure coincidence and hadn’t rehearsed it to within an inch of your lives like every single other fucking show you’d either of you ever done, oh, no, on the largest stage and in front of the biggest audience he’d ever faced, Timberlake FREESTYLED sexually assaulting you, and now you’re apologising for it?” The sheer stupidity that explanation expected of the audience was breathtaking, and the fact that the whole media and everyone seemed to swallow it whole made me wonder if there are any humans in the USA at all and if the things I see walking around and talking on the TV aren’t just a bunch of cauliflowers in trousers carrying guns.

  12. Ketil Tveiten says

    @7: I’ll bet that model was trained on a lot of K-pop stars, not euros/americans.

  13. says

    it whole made me wonder if there are any humans in the USA at all and if the things I see walking around and talking on the TV aren’t just a bunch of cauliflowers in trousers carrying guns.

    Damn, I overesteemed my ranting skills. That was humbling.

  14. says

    Jane Fonda?

    … and then there was Vanessa Williams. She was pilloried for having posed nude – in a time when pageant contestants were expected to parade in a little bikini.

    I believe that American antisexuality is connected to its puritan and general religious discomfort over other people’s sexuality. Gotta worry about what’s in other people’s pants but don’t want to see it … unless they’re minors and you’re a republican sex-fetishist. Denial of sexuality is also a kink which makes Mike Pence a seriously throbbing wad of kinkstuff.

  15. dangerousbeans says

    Is it just me or does that woman look quite young?
    It wouldn’t surprise me if the training process had a subtle selection pressure towards young women, that would mirror sexist aspects of society. I’ve seen the same thing with other AI generated pictures; a mastodon instance I’m a mod in had a problem with a user repeatedly posting AI generated erotica that looked uncomfortably young to the mod team. Seemed to be a feature of the program the person was using

  16. says

    I guess I need to do a posting about the training process and how it works… but, yeah, it makes it very easy for someone to pull in a particular direction, be it youthful appearance, plastic surgery’d kpop features, anime body dimensions, fur, whatever. The short form is it’s a distillation of human biases aka “what we like” – the AI is not intelligent, it’s “what you like is what you get”!

    None of these training sets are mine. But it wouldn’t make much of a difference if they were.

  17. macallan says

    But my gaming computer has a 3 year-old mid-tier card I bought to play Elite on.

    Just to run an Acorn / RISC OS 3 emulator?

    Sorry, couldn’t resist. IIRC that was the definitive version of the classic Elite.

  18. Reginald Selkirk says

    @14: Definite K-pop vibe.

    Once again, the details are revealing and entertaining. In the first two nude drawings, the feet are not visible. Perhaps Midjourney is learning and hiding its limitations. In the third, most of the feet are visible and she is wearing shoes. Not sensible hiking shoes as one would wear in a bamboo forest either. Also in the third, the “nude” woman is wearing a negligee.

    And it really really wants her wearing/holding some sort of sheer scarf. In the first image there is nothing on her shoulders, so the scarf must be either pinned to her hair or perhaps wedged in her butt crack.

    “The most depressing thing possible” – I don’t get it. Is it supposed to be depressing because the weather is bad and the person can’t go outside? Because it would be more depressing if the window was broken and they didn’t have that safe, clean room to sulk in.

  19. StevoR says

    @sonofrojblake :

    @SteveoR, 7 – that looks like a white woman to you? Looks more like one of the cookie cutter girl groups that presumably comprise about 40% of the population of South Korea.

    Yes, she looks white to me. Also red-haired, not typically a Korean hair colour. Of course there are dyes and yeah, I can kinda see what you mean about the face, but then the size of the, well, breasts is wildly atypical for the K-Pop stars as in the photo you linked. Of course, again, there’s artificial enhancement, though how common that is for K-Pop stars I have no clue. You could be right of course, merely giving my impressions.

    Also strikes me now that despite the command :

    Stable Diffusion AI (DOS checkpoint) and mjr: “nude woman walking in rainy bamboo forest, bare breasts, visible pussy”

    The weather, whilst misty, does not seem to include any rain and there are no puddles of water or droplets visible, with the ground, esp in the first picture, looking dry. Sun seems out, esp in the first one, too. Minor nit tho’.

    I have wondered about how you might test this midjourney art, asking it to simply create ‘beauty’ or midjourney’s wishes, or give it paradoxical commands – might have missed previous posts on just that tho’.. Fun thing to play around with anyhow, I imagine!

  20. Just an Organic Regular Expression says

    On the nudes, what happens if you include a word like “mature” or “realistic” in the prompt? Does it maybe get pubic hair?

    That aside, the “Most Depressing Thing” image is brilliant, almost creative. Painted by a human it would deserve a spot right next to Hopper’s “Nighthawks”. If it weren’t for the typical AI blivet (the chair-back going the wrong way) it would be worth printing out and framing.

  21. Ian King says

    Regarding the point about the Darwinian nature of these systems, I’ve heard people speculate about the problem of degradation, as AI gets trained on AI output and slowly loses all sense of the original, human, training set. I feel like this problem is easily solved by using human preference as a selection filter, but it may just end up resulting in AI whose output is exactly at the threshold where a human will flag it as ‘good enough’. You can’t tell an AI to ‘paint a better picture’ because it has absolutely no idea what ‘better’ means.

  22. kestrel says

    Agree about the “most depressing thing possible” image. It’s quite nice and moody.

  23. says

    Just an Organic Regular Expression@#23:
    On the nudes, what happens if you include a word like “mature” or “realistic” in the prompt? Does it maybe get pubic hair?

    There you’d be assuming the system is actually rational, and would be able to associate age and presence of pubic hair, as a human might. But it’s not rational. You have to just tell it what you want and see if that’s encompassed in its training set. I know for a fact that you can specify “pubic hair” and sometimes you’ll get pubic hair, sometimes you’ll get a neatly trimmed V of armpit hair, etc.

  24. says

    Ian King@#24:
    I feel like this problem is easily solved by using human preference as a selection filter, but it may just end up resulting in AI whose output is exactly at the threshold where a human will flag it as ‘good enough’.

    I know there are training sets that are based on what is trending on Civitai and Artstation, and daily deviations at Deviantart. So, in that manner, the output feedback is “this is what humans like.”
    Sometimes I think that this is how pop or mainstream is created. But then I do something weird like the shaved pussy maine coon above, and I feel better. The system caters to everyone that it caters to, so, something something I dunno.

  25. dangerousbeans says

    Reginald Selkirk @20
    It’s depressing because they don’t have a book and a cup of tea
