Ai Video Making

The Latest Ai for Film Making

by Jim Reed.

Last year my videos ‘An Editors Struggle’ and ‘Not Real’ used a lot of Ai, with images from Flux, lip-synced avatars from HeyGen, and audio from Google NotebookLM, Eleven Labs and Suno. The results were very impressive at the time, but a great deal has changed since then.

Jim Reed - 'An Editors Struggle'
Jim Reed - 'Not Real...'
Note: Every image in the feature below is a screen grab taken from Ai-generated Veo 3 videos.

At the turn of the year many professional commentators were talking about Ai running out of steam and ‘hitting the wall’, where nothing new could be produced. There was some logic to their reasoning, but they were wrong.

Suffice to say that Ai developments continue to confound those who thought that the rate of improvement couldn’t last, and we video makers continue to benefit from newer releases.

It’s now getting increasingly hard to keep up with all the changes in Ai video making. This year, new video generators from companies such as Haiper, Tencent, Luma, Minimax and Kling were released, pushing forward a new era in video creation, and a vast range of Ai utilities are now helping video creators in all areas of post production:

Editing assistance includes automatic cutting at scene changes or key moments, and smart trimming based on dialogue, motion or silence (Descript or Wisecut). Auto framing can convert footage from 16:9 to 9:16 (or the reverse), or create missing background so that the subject can be better positioned in the frame.

Speech can easily be transcribed for automatic subtitle generation (Whisper, Eleven Labs, Kapwing, VEED.io), cloned and dubbed, or enhanced by removing background noise. Products such as HeyGen, DeepDub or Papercup enable language translation, auto dubbing and lip-sync generation, which is ideal for correcting errors or enabling animated characters to speak.

Backgrounds, objects or features can be masked or removed, and slightly out-of-focus recordings corrected. Videos can be upscaled to higher resolutions, and frame rates changed cleanly using frame interpolation (Flowframes, Topaz). Motion tracking and Ai-generated keyframes become straightforward.

Videos can mimic specific cinematic looks using filters and automatic grading together with Ai-assisted LUT generation. Original footage can be restyled using style transfer (Cartoonise) or used to create new stylised footage (Runway Act-One).

Audio generation for background scores (Suno, Udio, AIVA, Soundraw) can produce much better and more appropriate music tailored to the film’s mood, with auto-generated music matched to video cuts or beats.

All these post production utilities make the work less arduous for the editor, and make the saying ‘we can fix it in post’ a realistic possibility.

Ai provides pragmatic, sensible, time-saving tools, and those who work with it quickly realise the added potential it gives to release their own creativity, not only by freeing up their time but also, for example, by enabling those who have never composed music before to find a new way to express themselves.

All these Ai utilities are on a seemingly unending cycle of upgrade and improvement, with new releases almost weekly. And now we are very close to the point where Ai can not only assist video makers with the post production tools, but can be utilised in the actual making of a film.

Which neatly brings me to Google’s Veo 3, announced in the US during May, and released in the UK in June.

A year ago we were pretty happy if Ai could produce a video with the correct number of fingers on a hand and a person could walk properly. Now we are very close to the point where Ai can produce video clips that are stunningly accurate, and could possibly pass as genuine footage.

Google’s Veo 3 is significant in that it not only generates high quality video from text prompts, but it also generates the speech and background sounds.

This link to Google’s I/O event features a 1:22 min introduction entirely generated with Veo 3.

And this spoof advert for Puppramin was created by P J Ace Films (click the image to open on YouTube).

Below, he explains the prompt used for the first 5 seconds and how the video was made:

‘[Muted colors, somber muted lighting. A woman, SARAH (50s), sits on a couch in a cluttered living room. She speaks (melancholic, slightly trembling voice) “I tried everything for my depression, nothing worked.” ]

I then worked with Grok/ChatGPT on the rest of the script (I wrote most of it but it helps me come up with the ideas).
Once the script was done, I then had it create a shot list based on that prompt structure. 13 shots. 5-10 gens per shot to get right. About $500 in credits.’

But like most things, there is a catch, and for us club video makers this is it: the cost of $500 in credits to make this 1:12 min video.

The simple fact is that someone has to start paying for the billions of dollars invested in Ai, and the days of free or inexpensive software are fast disappearing. Low-cost products are still available, but the really good capabilities are beginning to come at a price.

Not only do these better quality versions require a monthly subscription, but they also need tokens to pay for the generation. As an example, the only way to access Veo 3 is through Google’s AI Ultra monthly subscription at $250 per month, and although this includes 12,500 tokens, it only amounts to about 11 minutes of video. Each 8 second clip consumes 150 tokens, and extra tokens can be purchased at $1 per 100.
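For anyone who wants to check the sums, here is a rough sketch of the arithmetic in Python. The figures are simply the ones quoted above (a snapshot that may well have changed), and the function names are my own invention for illustration.

```python
# Figures as quoted in this article; treat them as a snapshot, not current pricing.
MONTHLY_TOKENS = 12_500      # tokens included in the monthly subscription
TOKENS_PER_CLIP = 150        # tokens consumed by one clip
CLIP_SECONDS = 8             # length of one clip in seconds
EXTRA_TOKENS_PER_DOLLAR = 100  # extra tokens cost $1 per 100


def included_clips(tokens=MONTHLY_TOKENS, per_clip=TOKENS_PER_CLIP):
    """Whole clips that the included tokens will pay for."""
    return tokens // per_clip


def included_minutes(tokens=MONTHLY_TOKENS, per_clip=TOKENS_PER_CLIP,
                     clip_seconds=CLIP_SECONDS):
    """Total running time of those clips, in minutes."""
    return included_clips(tokens, per_clip) * clip_seconds / 60


def extra_clip_cost_usd(per_clip=TOKENS_PER_CLIP,
                        tokens_per_dollar=EXTRA_TOKENS_PER_DOLLAR):
    """Cost in dollars of one additional clip bought with extra tokens."""
    return per_clip / tokens_per_dollar


if __name__ == "__main__":
    print(f"Clips included per month: {included_clips()}")
    print(f"Minutes of video included: {included_minutes():.1f}")
    print(f"Cost of each extra clip: ${extra_clip_cost_usd():.2f}")
```

On these figures the subscription covers 83 clips, or roughly 11 minutes of raw output before any retakes, and each extra clip works out at $1.50.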

Given the variability of output, several generations would be needed for each scene, and so more tokens. The advert above required 5 to 10 generations for each of the 13 shots. If we were professional users (for example creating adverts such as the spoof above) then compared to a conventional production spend of tens, or even hundreds, of thousands of dollars, the cost would be negligible. However, as a club, the cost could easily become prohibitive for individual members.
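To get a feel for how retakes multiply the bill, the sketch below estimates the tokens and cost for a project of a given number of shots. It assumes every generation is a single clip at 150 tokens, and that all tokens are bought at the extra-token rate of $1 per 100, ignoring the subscription itself. Those are my simplifying assumptions, so the result is only indicative and will differ from real-world spend such as the roughly $500 quoted by the advert’s maker.

```python
def project_tokens(shots, gens_per_shot, tokens_per_clip=150):
    """Tokens needed if every generation is one clip (an assumption)."""
    return shots * gens_per_shot * tokens_per_clip


def project_cost_usd(shots, gens_per_shot, tokens_per_clip=150,
                     tokens_per_dollar=100):
    """Cost in dollars if all tokens are bought at the extra-token rate."""
    return project_tokens(shots, gens_per_shot, tokens_per_clip) / tokens_per_dollar


if __name__ == "__main__":
    shots = 13  # number of shots in the spoof advert
    for gens in (5, 10):  # generations needed per shot, low and high
        tokens = project_tokens(shots, gens)
        cost = project_cost_usd(shots, gens)
        print(f"{gens} generations per shot: {tokens} tokens, about ${cost:.2f}")
```

Even at the low end of 5 generations per shot, a 13-shot film would use 9,750 tokens, which is most of a month’s allowance.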

It is increasingly apparent that, in the same way that tape replaced film and then digital replaced analog, Ai will play a major role in future film making. I believe that there is an exciting future ahead of us, and Ai is not only going to get better over the coming months, but it will also become far more integrated, such that it can’t be ignored.

But, based on today’s evidence, it’s likely to come with a significant dollar price tag too.

All of which poses the question: how best should video clubs such as ours embrace the potential (and the cost) of Ai in their movie making activities?

Now that should be an interesting discussion!