Black Forest Labs – Flux1.0 / 1.1 image generation
How to generate your own high quality Ai images using Flux – Easily!
Background
Black Forest Labs, a German-based company, released FLUX on 1st August 2024. It was released without the usual build-up and hype that has sometimes accompanied other Ai launches. Flux is, however, a real game changer.
Without doubt its launch put it head and shoulders above any of the competition for creating images, and already an improved version has just been released. Flux1.1 addressed limitations in user control, speed, and output refinement, meaning that the output more closely matches the text prompts, and is quicker. In Ai land, quicker usually also means less expensive.
Why isn’t Flux a standalone application?
Flux can’t be downloaded directly onto a standard computer and run with an easy to use interface. Like many other Ai developments, it requires a platform to run on. The reason for this is that Flux is the engine that makes things work, and isn’t intended as a standalone system. For example, Flux is integrated into xAI’s Grok-2, Elon Musk’s rival to ChatGPT, giving Grok an image generator in much the same way that DALL-E is part of OpenAI’s ChatGPT.
The platform’s task is to act as an intermediary handling all the technical stuff; creating links to the servers; setting up the APIs, and so on, such that users only have to manage a simplified front end without the need to dig deep into the system or learn anything technical.
A simple analogy is this; imagine Flux to be like a car engine – it’s very high performance – but it’s still only the engine. It requires a vehicle body and some custom tuning to use it effectively.
And that car body takes the form of a variety of platforms, of which there are many options. Best known are Hugging Face, Replicate, AWS (Amazon SageMaker), and Google Cloud AI (Vertex AI), but there are many more.
To stick with the car analogy, Hugging Face is probably like a comfortable SUV, Replicate like a sports car, and both Amazon and Google very heavy duty trucks with giant wheels.
A heavy duty truck is not the kind of vehicle that’s of much use to us small video makers – there’s a steep learning curve to control them, they are expensive to run for small jobs, and parking at the supermarket can be a problem!
Part of the added value of these platforms is to tune the engine to some degree, so Flux outputs can be slightly different in the same way that engines might be tuned to perform differently in different vehicles.
Replicate’s interface is really easy to use, and it was my choice of platform for Flux.
It’s straightforward, has good guides, and produces the type of output that I want. Whilst there is some flexibility in ‘tuning’ the style and type of images, I’m very happy with those that it produces without needing any extra tweaks. And I know that it still has the potential to allow me to create my own fine-tuning if I need to later.
How to easily produce Flux images using Replicate.com
This is a very easy (non technical) guide to get started with any of the Flux models, and at the same time get access to many other utilities and Ai tools that are available on Replicate.
Firstly open replicate.com, click ‘Get started’, and then sign in with GitHub.
If you don’t have an account, simply create one:
You’ll need to enter your email, create a password, create a username, choose an email preference option, and finally verify your account. Then you are all set and will be returned to the Replicate page.
Whilst it’s possible to use Replicate without payment (some of the options are free), if you want to do anything useful you’ll need to add a credit card to the account. You’ll be prompted to add a payment method.
How much does it cost and how am I invoiced?
Each month you will only pay for what you use, so it’s not a standing subscription. If you don’t generate any output there is no charge. You are also encouraged to put a cap on your account to avoid any possible overspending. Mine is set at a miserly $5/month, and so far it’s been more than sufficient for the several hundred images I’ve produced!
In Replicate, you’ll see many other programs and utilities on the platform. Don’t be afraid to experiment. The estimated cost per image will appear at the top and bottom of the relevant page. Check it out before using the model you’ve selected.
Which Flux version is best to use to generate images?
Black Forest Labs launched 3 models on 1st August 2024, and a fourth followed later:
- Flux 1.0 Schnell (@ 0.3 cents/image) (basic, but about 1/8 of the price of the Dev version)
- Flux 1.0 Dev (@ 2.5 cents/image)
- Flux 1.0 Pro (@ 5.5 cents/image)
- Flux 1.1 Pro was added in October 2024 @ 4 cents/image.
My preference is for Flux 1.0 Dev. It suits my purposes and isn’t too pricy, but test out alternatives to find your own preferred output quality. Simply select the option from the range available on the ‘Explore’ Tab.
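For anyone who likes to see the sums, here is a minimal sketch of how far a monthly cap stretches at the per-image prices listed above. The function name is made up for illustration, and the prices are simply the ones quoted in this article, so check the model pages for current figures.

```python
# Per-image prices in US cents, as quoted in this article.
PRICE_CENTS = {
    "Flux 1.0 Schnell": 0.3,
    "Flux 1.0 Dev": 2.5,
    "Flux 1.0 Pro": 5.5,
    "Flux 1.1 Pro": 4.0,
}


def images_within_cap(cap_dollars: float, price_cents: float) -> int:
    """Return how many whole images fit under a monthly spending cap."""
    # Work in tenths of a cent so the division is done on whole numbers.
    cap_units = round(cap_dollars * 1000)
    price_units = round(price_cents * 10)
    return cap_units // price_units


if __name__ == "__main__":
    for model, price in PRICE_CENTS.items():
        print(f"{model}: about {images_within_cap(5, price)} images for $5")
```

Changing the first argument lets you try a different cap, such as $10 or $20 a month.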
How to generate an image using Flux1.0 dev on Replicate
It’s really easy to generate an image.
Under the ‘Explore’ tab, look for ‘Flux1-Dev’ and click on it.
This takes you to this input page.
To start generating there are only 4 actions required:
- Enter your prompt – it can be something very simple – ‘spotted fluffy cat, riding a skateboard watched by lots of spectators. Colorful modern 3d animation style’.
- Select the image size – 16:9 is widescreen HD
- Select the image output – It defaults to webp, so change it to either jpg or png (either is fine)
- Press ‘RUN’.
When it’s generated, click ‘Download’ to save it locally.
All the other settings can be left at default, ready to play with another time!
(Btw this same prompt was used for another ‘cat on a skateboard’ creation shown below, except the last sentence was replaced with ‘Photo Realistic’)
If you want to go back to see them later, older images can be found under the ‘Dashboard’-‘Predictions’ tab.
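For the techies who would rather use a script than the web page, the same four actions can be sketched with Replicate's Python client. Treat this as a rough sketch under assumptions, not the official recipe: the model name `black-forest-labs/flux-dev` and the input names `prompt`, `aspect_ratio` and `output_format` are my reading of the model page, and the helper functions `build_input` and `generate` are invented for this example. Check the model's API tab before relying on any of it.

```python
import os

# Assumed model identifier on Replicate; confirm it on the model page.
MODEL = "black-forest-labs/flux-dev"


def build_input(subject: str, style: str) -> dict:
    """Assemble the same three settings that the web form asks for."""
    return {
        "prompt": f"{subject}. {style}",
        "aspect_ratio": "16:9",  # widescreen (assumed field name)
        "output_format": "png",  # the form defaults to webp (assumed field name)
    }


def generate(subject: str, style: str):
    """Send one billable request to Replicate and return the client's result."""
    # Imported here so the rest of the script runs without the package installed.
    import replicate  # pip install replicate

    if not os.environ.get("REPLICATE_API_TOKEN"):
        raise RuntimeError("Set REPLICATE_API_TOKEN to your Replicate API token first.")
    return replicate.run(MODEL, input=build_input(subject, style))


if __name__ == "__main__":
    subject = "spotted fluffy cat, riding a skateboard watched by lots of spectators"
    # Same subject, two different styles, as in the two cat images.
    print(build_input(subject, "Colorful modern 3d animation style"))
    print(build_input(subject, "Photo Realistic"))
    # Uncomment to make a real request (this costs money):
    # print(generate(subject, "Photo Realistic"))
```

Keeping the subject and the style separate makes it easy to reuse one description with several looks. What `generate` returns depends on the version of the client installed, so print it first to see what you get.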
Why does it cost?
One thing to be clear about with Ai is that the really good stuff isn’t free! Someone has to pay for the programming, the compute, and the power (energy) to drive it all. The providers aren’t simply using a top of the range $1,500 Nvidia RTX 4090 GPU in a powerful PC; instead they might access banks of Nvidia A100 GPUs at about $10,000 each, or even the much more powerful H100 at about $40,000 each.
How does it work?
Contrary to popular misunderstanding, Ai generated images are not simply copied from the internet and pasted into a composite image. Each image is generated by the Ai model using its understanding of the world and how it works. Because Ai image generation is in its infancy, Ai can sometimes (or even often) get details wrong. For example hands, and fingers in particular, have posed a problem for Ai, and even today they’re not always perfect. As an illustration these fingers produced by Flux1.0 (the image on the right) aren’t quite correct – which might also explain his sorrowful expression. The real challenge will come when they are perfect. Then how will we know what is real?
How can I produce exactly the images that I want?
The key to all text to image is the quality of the prompts. Indeed, it will clearly create a small industry for competent ‘Prompt Writers’.
However, to be honest, I use relatively simple prompts and am happy to accept the variability of the output. Some of my prompts are shown, but even simpler ones such as ‘man using laptop computer on his desk’ produce excellent results.
Dealing with Ai generated output has a lot to do with setting realistic personal expectations of the Ai. Hoping that Ai will produce exactly the image that you hold in your head is a guaranteed way to create massive frustration. Believe me, I’ve been there! The best way is not to hold any preconceived ideas of exactly how it might look, and utilize the output in whatever form it appears.
Ai is still improving, and it’s easy to forget that just two years ago if you said that life-like photo realistic images could be created from simple text prompts you’d be labeled as delusional. Ai will only get better, but for now my advice is to accept what it can do today, rather than fighting it hoping for perfection.
In some cases I’ve simply tweaked my storyline to suit the image output. The creative part for us is to utilize the images produced in ways that can be used to enhance a story in a video (or document such as this) without becoming frustrated that the man was wearing glasses…
Images were generated using Black Forest Labs Flux1.x
What about Ai Videos?
I’m sure we are all longing to get Ai generated videos, and Black Forest Labs are working on video creation, but it might be a bit of a longer wait for us amateurs. Even if it does come soon, it probably won’t suit our pockets.
Using the price of 2.5 cents per image for a video at 25 frames per second means that each second costs 62.5 cents and each minute $37.50. So a 5 minute video at these prices would cost $187.50.
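The arithmetic above can be checked with a few lines of Python. This is only a back-of-envelope sketch that assumes every frame is generated and billed as a separate image, which is not necessarily how a real video model would be priced; the function name is made up for illustration.

```python
def video_cost_dollars(seconds: float, fps: int = 25, cents_per_frame: float = 2.5) -> float:
    """Cost in dollars of generating every frame of a clip as a separate image."""
    frames = seconds * fps
    return frames * cents_per_frame / 100


if __name__ == "__main__":
    for label, secs in [("1 second", 1), ("1 minute", 60), ("5 minutes", 300)]:
        print(f"{label}: {video_cost_dollars(secs)} dollars")
```

Passing a different `cents_per_frame` (for example 0.3 for Schnell) shows how quickly the total changes with the model chosen.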
Even a fraction of that price could still be very steep when we realize that the output at first is likely to be somewhat unpredictable, and several regenerations might be required to get the results needed.
I can imagine Ai video generation being widely used in advertising studios, where short, sharp, and often slightly disjointed clips of only a few seconds might be ideal. And certainly the cost wouldn’t be an issue for them – indeed it would likely be a massive cost saving.
Meanwhile, for us hobbyists, I suspect that other more creative ways to generate Ai videos might be more accessible first. Perhaps using avatars would be a start?
More about that in later editions of Ai Corner…
So, while you wait for the video versions, get started with the images!
Enjoy!
Now how it works for the techies reading this.
Flux is a transformer-based diffusion model rather than a Generative Adversarial Network (GAN). Black Forest Labs describes the FLUX.1 models as a hybrid architecture of multimodal and parallel diffusion transformer blocks, scaled to 12 billion parameters and trained using a technique called flow matching. The model has been trained on vast datasets containing a wide variety of images, styles, and concepts, allowing it to generate images with a high level of realism, creativity, and detail.
- Natural Language Processing (NLP): Text encoders allow Flux to interpret complex prompts written in plain English. This makes it easier for users to generate images simply by describing what they want, removing the need for technical expertise or specialized knowledge.
- Generative Network: This is responsible for producing the image based on the user’s prompt. Flux starts from random noise and refines it over a number of steps until it matches the prompt, which is how diffusion-style models differ from GANs such as StyleGAN2.
- Advanced Neural Networks: The model uses a deep, multi-layered transformer architecture. A Variational Autoencoder (VAE) converts between full-size images and the compact internal form that the transformer works on, and decodes the finished result into the final picture.
Ai Closing Comment for this Month
Ai at this level has only been around for a very short time, and during that period the developers have learned a great deal. Some 40 years ago it was thought that Neural Networks were a dead end for Ai, but thanks to the perseverance and persistence of people such as Geoffrey Hinton (Nobel Prize for Physics, October 2024) Neural Networks have become central to Ai evolution. And in the process they have taught, and continue to teach us, much more about the inner workings of the human brain. In more recent times (only months ago) people were suggesting that Large Language Models had reached their limits, that the Ai wall was being hit, and that an Ai ‘winter’, where nothing happens, was imminent. As is often the case with Ai, the unpredictable occurs, and today the rate of Ai growth is faster than ever.
The next development will be personal Ai Agents. Watch out for more about these in future editions.
Buckle up, it’s going to be one heck of a ride!
Jim Reed