So much effort spiraling down the drain of AI


Google has come up with a new tool for generating video called Veo: feed it some detailed prompts, and it will spit back realistic video and audio. David Gerard and Aron Peterson decided to put it through its paces and see whether it produces output that is useful commercially or artistically. It turns out to be disappointing.

The problems are inherent to the tools. You can’t build a coherent narrative and structured sequence with an algorithm that just uses predictive models based on fragments of disconnected images. As Gerard says,

Veo doesn’t work. You get something that looks like it came out of a good camera with good lighting — because it was trained on scenes with good lighting. But it can’t hold continuity for seven seconds. It can’t act. The details are all wrong. And they still have the nonsense text problem.

The whole history of “artificial intelligence” since 1955 is making impressive demos that you can’t use for real work. Then they cut your funding off and it’s AI Winter again.

AI video generators are the same. They’re toys. You can make cool little scenes. In a super limited way.

But the video generators have the same problems they had when OpenAI released Sora. And they’ll keep having these problems as long as they’re just training a transformer on video clips and not doing anything with the actual structure of telling a visual story. There is no reason to think it’ll be better next year either.

So all this generative AI is good for is making blipverts, stuff to catch consumers’ attention for the few seconds it’ll take to sell them something. That’s commercially viable, I suppose. But I’ll hate it.

Unfortunately, they’ve already lost all the nerds. Check out Council of Geeks’ video about how bad Lucasfilm and ILM are getting. You can’t tell an internally consistent, engaging story with a series of SIGGRAPH demos spliced together, without human artists to provide a relevant foundation.

Comments

  1. robro says

    I’ve seen quite a few shorts/reels…whatever…that are autogenerated in some way. You can tell that even the more realistic ones are autogenerated because the movements of the characters are sometimes implausible. They come with producers’ names like Club Cranium, Kelly Eldridge Boesch, and Crumb Hill. Some are steampunk, some Dali-esque, some clownish. Crumb Hill veers into dark horror. The characters sometimes kind of dance to the soundtrack as they morph from one shape to another. They are usually short…fortunately. A few can hold my interest for a minute or two, mostly because of the soundtrack, but it gets boring fairly quickly.

    I’ll refrain from predicting the future about this use of AI…who knows. As for Lucasfilm, ILM, Disney, etc., I haven’t seen a movie in years because they are so predictable. In fact, the last movie I saw in a theater was Peter Jackson’s documentary They Shall Not Grow Old…an amazing work which owes something to the technology.

    Despite all the carping about AI…and there is plenty of hype and a lot of dreck in that arena…it does have its uses. As Science reported in 2020: ‘The game has changed.’ AI triumphs at solving protein structures. I don’t pretend to fully understand this but Derek Muller (aka Veritasium) did a piece about it that made it seem impressive.

  2. IX-103, the ■■■■ing idiot says

    I think many would quote Faraday and say “Madam, of what use is a baby?”

    Now, I’m not a die-hard AI fan (I’ll write my own code, thank you very much) and I agree that the current iteration is more hype than substance, but you have to admit that there is some substance there. What they currently call “AI” isn’t intelligent – it’s ignorant of reality and has no volition of its own, but it is good at repeating patterns, and that makes it a useful tool.

    An expert working with an LLM can be more productive than one without. The only issue is where LLMs compete against some entry-level roles; if those roles were eliminated, there would be a shortage of experts in the future.

  3. JM says

    The real goal of the Hollywood studios is getting to the point they can digitize the actors and own them forever. Churn out a Mission Impossible movie every other year for the next few decades. Cycle another Star Wars trilogy every decade or so, with some other movies in between. Get rid of one of the expensive and erratic parts and replace it with something reliable and cheap.
    They know the technology isn’t there yet, but they think AI will develop the same way special effects have, where every decade has produced a new generation of technology allowing a tenfold advance.
    What Hollywood doesn’t realize is that when AI video starts to work, their entire industry will be eaten out by individual artists making their own short TV shows and movies. It’s already happening a bit with stuff on YouTube, but it’s hard and slow. When an artist can sketch the major characters, spell out the plot, and have AI fill in the background and generate video, there will be a storm of this stuff.

  4. robro says

    JM @ #3 — “What Hollywood doesn’t realize is that when AI video starts to work, their entire industry will be eaten out by individual artists making their own short TV shows and movies. It’s already happening a bit…” which is, I think, what I’m seeing in these reels, although the reels don’t have a story line, just images, and they don’t do much. I believe these reels are made by individual artists or small collectives.

    However, I would hesitate to predict these things eating out Hollywood. It’s a very resilient business, although it could alter the industry a lot…kind of like the demise of the studio system with the advent of TV. But note that the death of the studio system did not kill those businesses: MGM, Paramount, Warner Bros., 20th Century Fox, even RKO, as well as some of the “little” guys…Universal, United Artists…still exist and are familiar names. Their model evolved. In-theater movies are practically dead thanks to cable, and COVID of course, so owning the exhibition rights means getting them into the intertubes and on home screens where people can sit in their PJs, make their own popcorn in the nuker without the phony butter, have a beer or glass of wine, and relax not worrying about the person next to them coughing.

  5. says

    …but it is good at repeating patterns, and that makes it a useful tool. An expert working with an LLM can be more productive than one without…

    Which specific field(s) are you talking about?

  6. Raging Bee says

    The real goal of the Hollywood studios is getting to the point they can digitize the actors and own them forever…

    Does this mean all our favorite porn stars will be young forever (asking for a friend)?

    Another longer-term goal of Hollywood studios will be, not only to own the images of actors, but to “perfect” them: to keep them young (or age them up or down as desired), to erase the odd scar or other skin imperfection (or change their skin color altogether), to make their hair longer, shorter, bouncier or flowier, to make their hips wider or their waists narrower, to make their boobs bigger or smaller according to current tastes (subject to change without notice), whatever else (they think) viewers want to see (subject to change without notice). So producers get to show “reliably” appealing character faces and bodies every time, and men and women (but I suspect mostly women) get even more ridiculously unrealistic and unhealthy ideals of “beauty” and “perfection” to strive for (subject to change without notice).

  7. Walter Solomon says

    JM @3

    Churn out a Mission Impossible movie every other year for the next few decades.

    That would backfire. Audience fatigue is real as we’ve seen with both Marvel and Star Wars.

  8. robro says

    Raging Bee @ #6 — If the past is any indication, it will be the porn industry that develops and first deploys such technologies in their films. They have a reputation for being cutting edge.

    Walter Solomon @ #7 — Even though Hollywood has a strong tendency to go with the tried-and-true, it’s possible a digital film generator could figure out what elements of the story need to change to avoid audience fatigue.

  9. John Morales says

    “The whole history of “artificial intelligence” since 1955 is making impressive demos that you can’t use for real work.”

    Heh heh heh. Since 1955, eh?

    Automated facial recognition, it’s a thing. But not AI.
    Computational photography, also a thing. Also not AI.
    OCR, same thing. Hell, “the algorithm” used to be a form of AI.

    (That is, once an application becomes commonplace, it stops getting called AI)

    Oh yeah, and a poor worker blames their tools.

  10. robro says

    John Morales @ #9 — As I’ve pointed out, every time you type something and get type-ahead suggestions, you’re looking at “AI” in the current sense, i.e. statistically based probabilities about what the next letter or whole word will be. This has proven invaluable to me personally when I’m doing text messages on my phone because it saves having to hit all those tiny little buttons for every letter. It ain’t perfect, of course, and sometimes it drives me nuts, but on the whole it’s useful. I’ve also used DuckDuckGo’s generative AI Assist to answer quick questions. It’s relatively faster than generic web searching followed by reading several pages of content, and at least for simplistic, straightforward things it seems reasonably accurate.
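
    To make that idea concrete, here is a minimal sketch of what “statistically based probabilities about the next word” can mean: a toy bigram counter in Python. The corpus, names, and function here are invented for illustration and are not how any actual phone keyboard or vendor autocomplete is implemented; it just shows next-word suggestion by frequency.

        # Toy illustration only: suggest the next word from bigram counts
        # gathered over a tiny sample corpus (hypothetical example, not any
        # real keyboard's code).
        from collections import Counter, defaultdict

        corpus = "the cat sat on the mat and the cat slept on the sofa".split()

        # Count how often each word follows each other word.
        following = defaultdict(Counter)
        for current, nxt in zip(corpus, corpus[1:]):
            following[current][nxt] += 1

        def suggest(word, k=3):
            # Return up to k words most frequently seen after `word`.
            return [w for w, _ in following[word].most_common(k)]

        print(suggest("the"))  # -> ['cat', 'mat', 'sofa']
        print(suggest("on"))   # -> ['the']

    A real predictive keyboard conditions on much more context and far larger data, but the underlying move is the same: rank candidate continuations by how often they followed similar input before.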

  11. seachange says

    I have a nearly two-decade investment in my education where I am very good at grammar, spelling, and how to do research at the library with the see-alsos and card catalogs. I am much, much better than any AI and deeply regret the loss of the AltaVista and Northern Light search engines. I find myself using “-ai” and -ai as search terms every single time, and my typing speed of 70 WPM has gone down to 55 just going back and getting rid of all the fucked-up inserts.

    If there was a way to obliterate autocorrect and predictive searching everywhere in my computer I would.

  12. graham2 says

    It’s easy, and fun, to rip into the AI-generated stuff, but it was also easy to mock the Commodore computer, etc. What I find amazing is how good (not perfect, but bloody good) the AI stuff is, and with such a short development time. If you project forward just a few years, the critics may be eating their words.

  13. John Morales says

    Apropos of nothing: https://www.rfcafe.com/references/radio-news/behind-giant-brains-january-1957-radio-television-news.htm

    Radio & Television News ran a two-part article on the state of the art of computers in the late 1950s (this is part 1). It had only been since ENIAC’s (Electronic Numerical Integrator And Computer) debut in 1946 at Massachusetts Institute of Technology (MIT) that the public (or science community for that matter) was getting used to regularly hearing about computers in the news. By 1957 there were many companies popping up with electronic computer offerings. Originally the exclusive purview of university research labs and defense installations, the size and cost of computers was moving into the realm of affordability by corporations that used them for accounting and bookkeeping, and in some cases even rented idle time to outside users. Desktop PCs and notebook computers were still the realm of crazy dreamers.

    Quite a good read.
    Begins thus:

    By Frank Leary

    [Part 1. Historical development and principles of electronic computers. Here’s the story about the devices that are now beginning to shape our lives. To be concluded next month.]

    On a raw afternoon in February, 1946, officials of the Federal government and the University of Pennsylvania, several luminaries of the world of science, and representatives of the press met at the Moore School of Electrical Engineering, on the University of Pennsylvania campus in Philadelphia.

    Norbert Wiener, the MIT math professor who was to start a whole cross-section of America using the term cybernetics, arrived characteristically without an overcoat. Others parked their wraps and were shown into a large room at the back of the building. Racks of electronic apparatus surrounded them. They were told they were inside an electronic calculator which could solve complex differential equations – such as an equation in external ballistics – faster than most people could state the problem. Some were excited, others politely interested, a few were bored. They watched the electronic gadgetry being put through its paces: punched cards with problem data were fed in, cards with answers were punched seconds later. Someone checked the results; they were correct. The press asked some questions, got some answers, and then everybody went to dinner.

    These men had been summoned to witness the first public showing of the Moore School’s electronic numerical integrator and calculator (a mouthful of description shortened by Army Ordnance officers into the acronym ENIAC). It was not an occasion that seemed particularly world-shaking, but the outgrowths from this machine have been giving the world its share of shudders ever since.

    Yeah, yeah… you’d think. Old-timey shit.

    But it ends thus:

    It is an error to romanticize, humanize, or personify these devices. They are completely unimaginative servants; they can do exactly what they are told, provided a tube doesn’t burn out, and provided also that what they are told is consistent with what they can do; but they can do no more. They are controlled by the men who make them, the men who operate them, and the men who program them. They are especially at the mercy of the men who turn them off when the day is through.

    Any time a computer seems to show imagination, it is because someone used imagination in designing its program. If a “giant brain” solves a problem, it is because someone (a) knew exactly how to go about solving that problem, and (b) knew precisely how to instruct the equipment in the procedures for solving that problem. If anyone ever gets one of these computers to write a symphony, for example, it will be because that person knows the laws of melody and harmony, counterpoint, orchestral placement, musical structure, and scoring, and knows what limits to set, and knows further how to translate all these laws, maxims, and principles into an abecedarian lingo that the simpleminded “brain” can follow. Anyone who can do that could write the symphony himself, in less time than it would take to get the computer to do it. The only advantage would be that the computer could turn out an infinitude of remarkably similar symphonies at an extremely rapid rate.

    Progress, eh? :)

    Anyway, good article, that.

  14. KG says

    I should add that I agree with David Gerard about the attempt to use generative AI to tell coherent stories (either in text or visual form). ChatGPT and Veo type systems can’t do this, because they know nothing about the world, only about language or images – and then only in a statistical sense. They are, in fact, examples of how much can be done without intelligence – and how much can’t. Specialised systems such as those John Morales mentions, and fully autonomous killer drones, don’t need much knowledge of the world to do what they are designed to do. Real General AI will do – and in my view, is still decades away, because “deep learning” on its own will never get there.

  15. John Morales says

    BTW, wars are great for these sorts of things. Motivated development.

    From nearly a year ago now:

    And let’s not forget ground vehicles

  16. says

    If you wanted to do the job properly you’d have to put some effort into understanding what a movie is.
    Making an AI that can do photography isn’t enough.
    You need an AI scriptwriter, an AI director, an AI designer, an AI animator, an AI editor, and probably a half-dozen other things I’ve not thought of. Most of all, you need an AI producer to coordinate and integrate all those parts so they can do their jobs.

    In almost all cases, when people look at the limitations of these systems, the things they point out are a consequence of the particular system they’re looking at not being trained to do the specific thing they’re pointing to. There are systems that can do text; they’re just different to the ones that do images. You really can train one of these systems to do almost anything a person can do, at enormous expense and the use of massive amounts of power and clean water. The question we should always be asking is – why do we want a machine to do something people can already do perfectly well?

  17. John Morales says

    “The question we should always be asking is – why do we want a machine to do something people can already do perfectly well?”

    Productivity. That’s the very premise of mechanisation.

    Think John Henry.

    Ever seen modern workmen at work, say, building a house?

    All done with power tools, not plain hammers and saws.

    Same thing, but for things that take “intelligence” instead of “muscle”.
