A peek at the future of product photography
👋 Welcome to New Vintage, a weekly taste of tech for wine professionals to level up with no-code and AI tools.
PRODUCT PHOTOGRAPHY will be drastically more efficient in 2025.
I wanted to share some examples of work I made in minutes, not days, to give you a basic sense of what's possible.
Let's get a few things out of the way:
- Yes, this is an AI-focused piece.
- No, I'm not suggesting you shouldn't hire people.
- Yes, I'm a nerd and I think this stuff is so cool.
In this issue, we'll walk through some examples of what we can do on three "levels" to produce product photography assets today. There will be pictures.
For those who just want to know what tools I used: all assets shown were made using the Flux models by Black Forest Labs, and I used Replicate for both generation and fine-tuning.
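If you're curious what a generation call actually looks like, here's a minimal sketch using Replicate's Python client and their hosted flux-schnell model. The prompt is a hypothetical one of my own (not one of the exact prompts shown below), and you'll need a REPLICATE_API_TOKEN set in your environment.

```python
# Minimal sketch: generate one image with a Flux model via Replicate's Python client.
# Assumes the `replicate` package is installed and REPLICATE_API_TOKEN is set.
import replicate

# Hypothetical prompt, in the spirit of the experiments below.
prompt = (
    "A rustic wooden table on a sunlit patio, a bottle of red wine, "
    "two glasses, and a small bowl of bruschetta, warm afternoon light"
)

output = replicate.run(
    "black-forest-labs/flux-schnell",  # fast Flux variant hosted on Replicate
    input={"prompt": prompt},
)

# Recent versions of the client return file-like objects; older ones return URLs.
with open("scene.webp", "wb") as f:
    f.write(output[0].read())
```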
Let's get started.
Level 1: Out of the Box
I think tasting notes are boring. There, I said it.
It's not that I don't want to get a sense of what a wine might be like, it's just that the typical format is stuffy and most sound the same. At a certain point, I don't know what forest floor means, nor do I want it in my wine.
So, as an experiment, I took the tasting notes of four different wines, and I turned them into prompts (text descriptions of what you want AI to produce).
Below are the results. They are "out of the box", meaning unedited, and I think they're pretty cool.
These could be used in product listings, social posts, ads, etc. and I was personally surprised by how good they were right out of the gate. It had me wondering, could we do the same with actual products?
So, I tried a really basic prompt:
Here's the result:
That might be a bowl of chicken fingers? Fish sticks? Who knows.
The point is, that took 8.9 seconds to run and this was the FIRST result. Pretty incredible.
I've never run a product photoshoot, but I imagine that it's a bit more involved, albeit with much more control.
Level 2: Adding More Control
Speaking of which, the first way to add more control is by adjusting the prompt, i.e. making your instructions better.
There's a bit of a dance required here, and it's too early for true "best practices", but suffice it to say, your "better" prompt will land somewhere between a fully scoped creative brief and a robotic series of descriptors that sounds like a really weird deli order.
Building on the basic prompt we used earlier, I wanted to make it a bit more focused on one product.
Here's the result:
That's pretty good! I don't think I'd put random branches on my couch, but that's just me.
But what if we wanted to iterate on this scene? Each generation starts from a random number called a "seed", which you can reuse in subsequent generations. You can think of this as operating from the same base image, instead of a random new one.
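As a rough sketch of what that looks like with Replicate's Python client (assuming, as is typical for the hosted Flux models, that they accept a seed input), you keep the seed fixed and change only the prompt. The prompts here are hypothetical stand-ins for the ones I actually used.

```python
import replicate

MODEL = "black-forest-labs/flux-dev"  # assumption: the dev Flux variant on Replicate

base_input = {
    "prompt": "A living room coffee table with a bottle of red wine, "
              "two glasses, and a plate of bruschetta",
    "seed": 42,  # fixing the seed keeps later runs anchored to the same base
}

first = replicate.run(MODEL, input=base_input)

# Swap only the food; the shared seed keeps the overall scene similar.
asado = {
    **base_input,
    "prompt": base_input["prompt"].replace(
        "a plate of bruschetta", "grilled meats from an asado"
    ),
}
second = replicate.run(MODEL, input=asado)
```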
Doing this, let's change the food.
How about grilled meats from an asado?
Or maybe one with generic grilled foods and different glasses?
Are these perfectly identical or flawless? Of course not. However, adjusting your prompts and your use of seeds lets you iterate through concepts incredibly fast.
You could absolutely pause here, take any decent images, and just Photoshop your product(s) into them. That's already a decent time saver.
Note: various editing tools are progressing so fast that by the time you read this, you may already be able to skip Photoshop and seamlessly drop in your products. In my testing, the in-painting tools I tried produced limited results.
Level 3: Customize the Models
The frontier that I'm most excited about is known as "fine-tuning", where you essentially train an existing model on your own data (in this case, images).
This lets our generations go beyond what we could do with Photoshop: we're no longer bound by manual (or even programmatic) editing, because we can have a specific model tuned for each of our products, locations, brands, etc.
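For a sense of the mechanics, here's a rough sketch of kicking off a fine-tune through Replicate's Python client. Everything in it is illustrative: the trainer (ostris/flux-dev-lora-trainer is a popular Flux LoRA trainer on Replicate), the input names, the image zip URL, and the destination are assumptions you'd swap for values from the trainer's own page.

```python
import replicate

# Sketch only: check the trainer's page on Replicate for the current version hash
# and the exact input names; the values below are placeholders.
training = replicate.trainings.create(
    version="ostris/flux-dev-lora-trainer:<version-id>",  # <version-id> = hash from the model page
    input={
        "input_images": "https://example.com/my-product-photos.zip",  # ~20 zipped training images
        "trigger_word": "MYPRODUCT",  # a made-up token the tuned model associates with your subject
        "steps": 1000,
    },
    destination="your-username/my-product-flux",  # the Replicate model the weights get pushed to
)
print(training.status)  # training runs remotely; poll or check the dashboard for progress
```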
Let's build on the bruschetta example from earlier.
I trained a model on 20 publicly available images from the internet. It took 30 minutes, the images weren't high quality, and I didn't do a very good job. Additionally, the best models aren't widely available for fine-tuning yet, so what you see below is actually of lower image quality, but you'll get the idea.
Don't squint too hard, they're not that good.
However, this is still pretty incredible.
You could immediately make these better by simply:
- curating the right training sets (more and better images)
- running longer training cycles
- optimizing the generation settings (there are more knobs you can turn; see the sketch after this list)
- editing the images
- adding overlays for social
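As an illustration of those extra knobs, here's a hedged sketch of a generation call with a few of the settings the hosted Flux models on Replicate typically expose. Names like aspect_ratio, num_inference_steps, and guidance are assumptions based on the flux-dev listing and may differ by model or change over time.

```python
import replicate

output = replicate.run(
    "black-forest-labs/flux-dev",
    input={
        "prompt": "A marble kitchen counter with a bottle of rosé and a bowl of bruschetta",
        "seed": 42,                 # reuse a seed to keep iterating on the same base
        "aspect_ratio": "4:5",      # portrait crop for social feeds
        "num_inference_steps": 40,  # more steps = slower, usually cleaner detail
        "guidance": 3.5,            # how strictly the model follows the prompt
        "output_format": "png",
    },
)
```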
This is where human expertise can really shine: getting all the finer details dialed in, so the images are "baked" with the right ingredients.
In Summary
There's so much opportunity to leverage AI today to truly level up your visual assets (and all the content they go into).
Starting with a simple prompt, you can quickly get “good enough” scenes and evolve them through improved instructions, consistent “seed” images, and eventually model fine-tuning.
It’s already possible to skip the heavy lifting of a traditional photoshoot, get a decent base image in seconds, and then refine it—via Photoshop or direct model training—into a final visual that’s tailored to your brand and products.
If you’re intrigued and want to discuss these concepts more deeply—exploring specific prompts, tools, or training approaches—drop a comment.
Let’s dive into the details together.
We'll skip next week's edition for the holidays.
Happy holidays,
Stephen