Google has come up with a new tool for generating video called Veo — feed it some detailed prompts, and it will spit back realistic video and audio. David Gerard and Aron Peterson decided to put it through its paces and see whether it produces output that is useful commercially or artistically. It turns out to be disappointing.
The problems are inherent to the tools. You can’t build a coherent narrative and structured sequence with an algorithm that just uses predictive models based on fragments of disconnected images. As Gerard says,
Veo doesn’t work. You get something that looks like it came out of a good camera with good lighting — because it was trained on scenes with good lighting. But it can’t hold continuity for seven seconds. It can’t act. The details are all wrong. And they still have the nonsense text problem.
The whole history of “artificial intelligence” since 1955 is making impressive demos that you can’t use for real work. Then they cut your funding off and it’s AI Winter again.
AI video generators are the same. They’re toys. You can make cool little scenes. In a super limited way.
But the video generators have the same problems they had when OpenAI released Sora. And they’ll keep having these problems as long as they’re just training a transformer on video clips and not doing anything with the actual structure of telling a visual story. There is no reason to think it’ll be better next year either.
So all this generative AI is good for is making blipverts, stuff to catch consumers’ attention for the few seconds it’ll take to sell them something. That’s commercially viable, I suppose. But I’ll hate it.
Unfortunately, they’ve already lost all the nerds. Check out Council of Geeks’ video about how bad Lucasfilm and ILM are getting. You can’t tell an internally consistent, engaging story with a series of SIGGRAPH demos spliced together, without human artists to provide a relevant foundation.
I’ve seen quite a few shorts/reels…whatever…that are autogenerated in some way. You can tell that even the more realistic ones are autogenerated because the movements of the characters are sometimes implausible. They come with producer names like Club Cranium, Kelly Eldridge Boesch, and Crumb Hill. Some are steampunk, some Dali-esque, some clownish. Crumb Hill veers into dark horror. The characters sometimes kind of dance to the soundtrack as they morph from one shape to another. They are usually short…fortunately. A few can hold my interest for a minute or two, mostly because of the soundtrack, but it gets boring fairly quickly.
I’ll refrain from predicting the future about this use of AI…who knows. As for Lucasfilm, ILM, Disney, etc., I haven’t seen a movie in years because they are so predictable. In fact, the last movie I saw in a theater was Peter Jackson’s documentary They Shall Not Grow Old…an amazing work which owes something to the technology.
Despite all the carping about AI…and there is plenty of hype and a lot of dreck in that arena…it does have its uses. As Science reported in 2020: ‘The game has changed.’ AI triumphs at solving protein structures. I don’t pretend to fully understand this, but Derek Muller (aka Veritasium) did a piece about it that made it seem impressive.
I think many would quote Faraday and say “Madam, of what use is a baby?”
Now, I’m not a die-hard AI fan (I’ll write my own code, thank you very much) and I agree that the current iteration is more hype than substance, but you have to admit that there is some substance there. What they currently call “AI” isn’t intelligent – it’s ignorant of reality and has no volition of its own, but it is good at repeating patterns, and that makes it a useful tool.
An expert working with an LLM can be more productive than one without. The only issue is where LLMs compete against some entry-level roles; if those roles were eliminated, there would be a shortage of experts in the future.
The real goal of the Hollywood studios is getting to the point they can digitize the actors and own them forever. Churn out a Mission Impossible movie every other year for the next few decades. Cycle another Star Wars trilogy every decade or so, with some other movies in between. Get rid of one of the expensive and erratic parts and replace it with something reliable and cheap.
They know the technology isn’t there yet, but they think AI will develop the same way special effects have, where every decade has produced a new generation of technology allowing a tenfold advance.
What Hollywood doesn’t realize is that when AI video starts to work, their entire industry will be eaten out by individual artists making their own short TV shows and movies. It’s already happening a bit with stuff on YouTube, but it’s hard and slow. When an artist can sketch the major characters, rough out the plot, and have AI fill in the background and generate video, there will be a storm of this stuff.
JM @ #3 — “What Hollywood doesn’t realize is that when AI video starts to work, their entire industry will be eaten out by individual artists making their own short TV shows and movies. It’s already happening a bit…” which is, I think, what I’m seeing in these reels, although the reels don’t have a story line, just images, and they don’t do much. I believe these reels are made by individual artists or small collectives.
However, I would hesitate to predict these things eating out Hollywood. It’s a very resilient business, although it could alter the industry a lot…kind of like the demise of the studio system with the advent of TV. But note that the death of the studio system did not kill those businesses: MGM, Paramount, Warner Bros, 20th Century Fox, even RKO, as well as some of the “little” guys…Universal, United Artists…still exist and are familiar names. Their model evolved. In-theater movies are practically dead thanks to cable, and COVID of course, so owning the exhibition rights means getting them onto the intertubes and home screens where people can sit in their PJs, make their own popcorn in the nuker without the phony butter, have a beer or glass of wine, and relax without worrying about the person next to them coughing.
…but it is good at repeating patterns, and that makes it a useful tool. An expert working with an LLM can be more productive than one without…
Which specific field(s) are you talking about?
The real goal of the Hollywood studios is getting to the point they can digitize the actors and own them forever…
Does this mean all our favorite porn stars will be young forever (asking for a friend)?
Another longer-term goal of Hollywood studios will be, not only to own the images of actors, but to “perfect” them: to keep them young (or age them up or down as desired), to erase the odd scar or other skin imperfection (or change their skin color altogether), to make their hair longer, shorter, bouncier or flowier, to make their hips wider or their waists narrower, to make their boobs bigger or smaller according to current tastes (subject to change without notice), whatever else (they think) viewers want to see (subject to change without notice). So producers get to show “reliably” appealing character faces and bodies every time, and men and women (but I suspect mostly women) get even more ridiculously unrealistic and unhealthy ideals of “beauty” and “perfection” to strive for (subject to change without notice).
JM @3
That would backfire. Audience fatigue is real as we’ve seen with both Marvel and Star Wars.
Raging Bee @ #6 — If the past is any indication, it will be the porn industry that develops and first deploys such technologies in their films. They have a reputation for being cutting edge.
Walter Solomon @ #7 — Even though Hollywood has a strong tendency to go with the tried-and-true, it’s possible a digital film generator could figure out what elements of the story need to change to avoid audience fatigue.
“The whole history of “artificial intelligence” since 1955 is making impressive demos that you can’t use for real work.”
Heh heh heh. Since 1955, eh?
Automated facial recognition, it’s a thing. But not AI.
Computational photography, also a thing. Also not AI.
OCR, same thing. Hell, “the algorithm” used to be a form of AI.
(That is, once an application becomes commonplace, it stops getting called AI)
—
Oh yeah, and a poor worker blames their tools.
John Morales @ #9 — As I’ve pointed out, every time you type something and get type-ahead suggestions, you’re looking at “AI” in the current sense, i.e., statistically based probabilities about what the next letter or whole word will be. This has proven invaluable to me personally when I’m doing text messages on my phone because it saves having to hit all those tiny little buttons for every letter. It ain’t perfect, of course, and sometimes it drives me nuts, but on the whole it’s useful. I’ve also used DuckDuckGo’s generative AI Assist to answer quick questions. It’s relatively faster than generic web searching followed by reading several pages of content, and at least for simplistic, straightforward things it seems reasonably accurate.
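For what it’s worth, the statistical idea behind word-level type-ahead can be sketched in a few lines: count which words tend to follow which, and offer the most frequent followers as suggestions. The Python below is only an illustration, with a made-up toy corpus and function names invented for the example; real phone keyboards use far larger models and smarter ranking, but the underlying principle is the same.

from collections import Counter, defaultdict

def train_bigram_model(text):
    # Count how often each word follows each other word.
    words = text.lower().split()
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def suggest_next(follows, prev_word, k=3):
    # Return the k words most often seen after prev_word.
    counts = follows.get(prev_word.lower())
    if not counts:
        return []
    return [word for word, _ in counts.most_common(k)]

# Toy corpus purely for illustration.
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram_model(corpus)
print(suggest_next(model, "the"))   # e.g. ['cat', 'mat', 'sofa']
print(suggest_next(model, "on"))    # ['the']

Swap the toy corpus for everything you have ever typed and you get a crude picture of why the suggestions feel personalised, and also why they go wrong when you start a sentence the model has never seen.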
I have a nearly two-decade investment in my education where I am very good at grammar, spelling, and how to do research at the library with the see-alsos and card catalogs. I am much, much better than any AI and deeply regret the loss of the AltaVista and Northern Light search engines. I find myself using “-ai” and -ai as search terms every single time, and my typing speed of 70 WPM has gone down to 55 just from going back and getting rid of all the fucked-up inserts.
If there was a way to obliterate autocorrect and predictive searching everywhere in my computer I would.
It’s easy, and fun, to rip into the AI-generated stuff, but it was also easy to mock the Commodore computer, etc. What I find amazing is how good (not perfect, but bloody good) the AI stuff is, and with such a short development time. If you project forward just a few years, the critics may be eating their words.
Apropos of nothing: https://www.rfcafe.com/references/radio-news/behind-giant-brains-january-1957-radio-television-news.htm
Quite a good read.
Begins thus:
Yeah, yeah… you’d think. Old-timey shit.
But it ends thus:
Progress, eh? :)
Anyway, good article, that.
John Morales@9,
Indeed so. And: “Killing machines: how Russia and Ukraine’s race to perfect deadly pilotless drones could harm us all.” Once a fully autonomous drone blows the first political leader’s head off, it won’t be AI.
I should add that I agree with David Gerard about the attempt to use generative AI to tell coherent stories (either in text or visual form). ChatGPT and Veo type systems can’t do this, because they know nothing about the world, only about language or images – and then only in a statistical sense. They are, in fact, examples of how much can be done without intelligence – and how much can’t. Specialised systems such as those John Morales mentions, and fully autonomous killer drones, don’t need much knowledge of the world to do what they are designed to do. Real General AI will do – and in my view, is still decades away, because “deep learning” on its own will never get there.
O brave new world
That has such weapons in’t!
BTW, wars are great for these sort of things. Motivated development.
From nearly a year ago now:
And let’s not forget ground vehicles
If you wanted to do the job properly you’d have to put some effort into understanding what a movie is.
Making an AI that can do photography isn’t enough.
You need an AI scriptwriter, an AI director, an AI designer, an AI animator, an AI editor, and probably a half-dozen other things I’ve not thought of. Most of all you need an AI producer to coordinate and integrate all those parts so they can do their jobs.
In almost all cases, when people look at the limitations of these systems, the things they point out are a consequence of the particular system they’re looking at not being trained to do the specific thing they’re pointing to. There are systems that can do text; they’re just different to the ones that do images. You really can train one of these systems to do almost anything a person can do, at enormous expense and with the use of massive amounts of power and clean water. The question we should always be asking is – why do we want a machine to do something people can already do perfectly well?
“The question we should always be asking is – why do we want a machine to do something people can already do perfectly well?”
Productivity. That’s the very premise of mechanisation.
Think John Henry.
Ever seen modern workmen at work, say, building a house?
All done with power tools, not plain hammers and saws.
Same thing, but for things that take “intelligence” instead of “muscle”.