WEBVTT

00:00:00.000 --> 00:00:02.779
Have you ever tried to make an AI film where

00:00:02.779 --> 00:00:06.139
your main character, the one you spent ages designing,

00:00:06.269 --> 00:00:09.830
just changes faces in every other scene? Oh,

00:00:09.849 --> 00:00:13.810
that issue of character consistency. It is probably

00:00:13.810 --> 00:00:16.410
the biggest and most frustrating hurdle in the

00:00:16.410 --> 00:00:18.329
field right now. It just kills the immersion.

00:00:18.370 --> 00:00:20.190
It completely takes you out of it. But we have

00:00:20.190 --> 00:00:22.329
the golden key today, a trick called the three

00:00:22.329 --> 00:00:24.769
by three grid that locks the look right from

00:00:24.769 --> 00:00:27.170
the first picture. OK, let's unpack this. We're

00:00:27.170 --> 00:00:29.469
diving deep into a guide that's designed to fix

00:00:29.469 --> 00:00:32.530
that problem for good. That's right. Our listener,

00:00:32.810 --> 00:00:35.149
you, shared these really comprehensive notes

00:00:35.149 --> 00:00:39.020
on mastering character consistency in AI filmmaking.

00:00:39.539 --> 00:00:41.619
Our goal here is to move past those, you know,

00:00:41.899 --> 00:00:43.799
blurry, changing versions and produce something

00:00:43.799 --> 00:00:45.880
that looks truly professional. We'll analyze

00:00:45.880 --> 00:00:48.280
the tools. We're going to break down this complex

00:00:48.280 --> 00:00:51.479
prompt formula into five clear parts and walk

00:00:51.479 --> 00:00:53.240
through the workflow. We're aiming for films

00:00:53.240 --> 00:00:55.240
that look polished, not like a collection of

00:00:55.240 --> 00:00:58.500
random photos. Exactly. You know, character consistency

00:00:58.500 --> 00:01:01.079
isn't just a technical thing. It's the key to

00:01:01.079 --> 00:01:03.380
storytelling. If your detective girl's yellow

00:01:03.380 --> 00:01:06.439
coat suddenly turns orange or her face changes,

00:01:06.519 --> 00:01:08.560
the whole film just looks unprofessional. So

00:01:08.560 --> 00:01:10.500
the three by three grid is basically a blueprint

00:01:10.500 --> 00:01:13.420
for that. Instead of one photo, we're creating

00:01:13.420 --> 00:01:16.450
a kind of reference sheet with nine different

00:01:16.450 --> 00:01:18.689
camera angles all at once. And that's how we

00:01:18.689 --> 00:01:21.930
lock the look. So what tools are powering this?

00:01:22.430 --> 00:01:25.170
The source suggests three crucial pieces of software.

00:01:25.310 --> 00:01:29.150
Right. First up is dzine.ai. Think of it as

00:01:29.150 --> 00:01:31.650
the creative hub built for filmmakers. It's what

00:01:31.650 --> 00:01:34.569
handles the character control. OK. Second is

00:01:34.569 --> 00:01:36.790
Topaz Gigapixel. This tool does something called

00:01:36.790 --> 00:01:39.409
upscaling, which just means making your images

00:01:39.409 --> 00:01:42.609
bigger, like 2x or 4x, but without losing quality.

00:01:42.810 --> 00:01:45.370
Exactly. It adds back details, like strands of

00:01:45.370 --> 00:01:48.150
hair. And then finally, for movement, you turn

00:01:48.150 --> 00:01:50.829
those high quality photos into smooth video using

00:01:50.829 --> 00:01:53.870
tools like Luma Dream Machine or Runway Gen-2.

00:01:53.989 --> 00:01:56.349
So why is generating a nine-shot reference sheet

00:01:56.349 --> 00:01:58.890
better than just, you know, generating nine separate

00:01:58.890 --> 00:02:01.790
images of the same character? Because the AI

00:02:01.790 --> 00:02:05.930
uses its internal memory, its short-term state,

00:02:06.469 --> 00:02:09.750
to maintain uniformity across that entire single

00:02:09.750 --> 00:02:13.349
page, it keeps it all consistent. Okay, this

00:02:13.349 --> 00:02:15.969
is the real heart of the process. A short, messy

00:02:15.969 --> 00:02:19.050
prompt means the AI is just going to fail. We

00:02:19.050 --> 00:02:21.389
need what the source calls an unbeatable formula,

00:02:21.629 --> 00:02:24.129
split into five clear parts. And before you even

00:02:24.129 --> 00:02:25.930
write anything, you have to set up the workspace

00:02:25.930 --> 00:02:29.699
in dzine.ai. Specifically, choosing the 1:1,

00:02:29.719 --> 00:02:32.340
the square aspect ratio. Right. Why is that?

00:02:32.539 --> 00:02:34.560
A square shape just keeps those nine frames neat.

00:02:34.759 --> 00:02:36.759
It prevents the AI from cutting off the character's

00:02:36.759 --> 00:02:39.840
head or feet. Ah, that makes sense. Okay, so

00:02:39.840 --> 00:02:42.819
part one of the prompt. It's the command: "A cinematic

00:02:42.819 --> 00:02:45.819
3x3 grid presenting multiple camera angles." That

00:02:45.819 --> 00:02:47.840
tells the AI to divide the image into nine

00:02:47.840 --> 00:02:50.199
sections. Simple. Part two is what you call the

00:02:50.199 --> 00:02:53.000
detail bomb. Subject and location. Yeah, don't

00:02:53.000 --> 00:02:55.419
just say "a girl." Describe her hair, her eye color,

00:02:55.539 --> 00:02:58.139
the material of her clothes, any scars. The more

00:02:58.139 --> 00:02:59.979
detail you give it, the better the AI remembers

00:02:59.979 --> 00:03:02.080
her. And part three is where the magic happens.

00:03:02.360 --> 00:03:05.139
The nine specific camera angles. This makes sure

00:03:05.139 --> 00:03:07.379
your reference sheet is really versatile. You

00:03:07.379 --> 00:03:10.219
absolutely need these. You need an extreme close-up

00:03:10.219 --> 00:03:13.039
of the eyes, a medium shot from the waist

00:03:13.039 --> 00:03:15.699
up, an over-the-shoulder shot, a wide shot, a high angle

00:03:15.699 --> 00:03:17.800
top down, and a low angle, which makes the character

00:03:17.800 --> 00:03:20.000
look powerful. Right. Right. And then you need

00:03:20.000 --> 00:03:23.020
the side views, too. The profile, a three-quarter

00:03:23.020 --> 00:03:25.819
rear view, and a full back view. If you skip

00:03:25.819 --> 00:03:28.620
those, the AI just improvises when your character

00:03:28.620 --> 00:03:31.379
turns around. And it's usually not good. OK,

00:03:31.379 --> 00:03:33.520
here's where it gets really interesting. Part

00:03:33.520 --> 00:03:37.300
four is cinema style and lighting. Things like

00:03:37.300 --> 00:03:40.539
volumetric fog, cinematic lighting, moody blue

00:03:40.539 --> 00:03:43.400
and orange. This is what Hollywood-izes the

00:03:43.400 --> 00:03:45.659
result. And it locks in the color palette. It's

00:03:45.659 --> 00:03:48.610
the glue. And then the final piece, part five,

00:03:48.669 --> 00:03:51.229
the cleanup command: "images stacked together

00:03:51.229 --> 00:03:54.849
with no grid lines and no borders, no text." That

00:03:54.849 --> 00:03:57.849
"no text" part is so vital; it saves you so much

00:03:57.849 --> 00:04:00.169
time later. The source gives this great example

00:04:00.169 --> 00:04:02.710
of a cyberpunk warrior prompt that uses all five

00:04:02.710 --> 00:04:05.590
parts. It just works. It works. So what happens

00:04:05.590 --> 00:04:07.870
if I skip the lighting keywords? What's the immediate

00:04:07.870 --> 00:04:10.270
consequence for my character? The lack of those

00:04:10.270 --> 00:04:12.349
lighting details will cause skin and clothes

00:04:12.349 --> 00:04:15.189
colors to change between scenes. Totally breaking

00:04:15.189 --> 00:04:20.240
the flow. OK, so we're moving from theory to

00:04:20.240 --> 00:04:22.779
application. We've made our 3x3 reference sheet.
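
NOTE
A minimal sketch, not from the source notes: one way the five-part formula discussed above (command, subject and location, nine camera angles, style and lighting, cleanup) could be assembled into a single prompt string. The function name build_grid_prompt, the hair and eye details, and the alley location are made up for illustration; only the quoted command, the angle list, the lighting keywords, and the cleanup phrase follow the conversation.
```python
# Hypothetical helper: joins the five parts the hosts describe into one prompt string.
# The exact wording of each part is illustrative, not the source's own prompt.
ANGLES = [
    "extreme close-up of the eyes",
    "medium shot from the waist up",
    "over-the-shoulder shot",
    "wide shot",
    "high-angle top-down shot",
    "low-angle shot",
    "side profile",
    "three-quarter rear view",
    "full back view",
]
#
def build_grid_prompt(subject, location, style):
    """Return one prompt covering command, subject, angles, style, and cleanup."""
    command = "A cinematic 3x3 grid presenting multiple camera angles"
    detail = f"of {subject} in {location}"
    angles = "Angles: " + ", ".join(ANGLES)
    cleanup = "Images stacked together with no grid lines, no borders, no text"
    return ". ".join([f"{command} {detail}", angles, style, cleanup]) + "."
#
# Example values are assumptions, loosely based on the detective girl mentioned earlier.
prompt = build_grid_prompt(
    subject="a detective girl with short black hair, green eyes, and a yellow wool coat",
    location="a rainy neon-lit alley",
    style="Volumetric fog, cinematic lighting, moody blue and orange palette",
)
print(prompt)
```
The printed string is plain text meant to be pasted into a prompt box; nothing here calls dzine.ai or any other tool's API.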

00:04:23.360 --> 00:04:25.720
The source says you do a quality check, making

00:04:25.720 --> 00:04:28.620
sure details like eye color are about 80% to

00:04:28.620 --> 00:04:31.540
90% similar across the nine boxes. And once

00:04:31.540 --> 00:04:33.519
you've got that, the true secret is the character

00:04:33.519 --> 00:04:37.060
reference feature inside dzine.ai. OK. You save

00:04:37.060 --> 00:04:39.779
that successful 3x3 grid image, then you upload

00:04:39.779 --> 00:04:42.660
it using the little person icon button. And once

00:04:42.660 --> 00:04:46.319
it's uploaded, the AI now remembers the character.

00:04:46.439 --> 00:04:48.850
Yes. It has that file to check against. So now

00:04:48.850 --> 00:04:51.329
you can write short new prompts like, "the girl

00:04:51.329 --> 00:04:53.949
is walking in the rain," and the AI automatically

00:04:53.949 --> 00:04:56.250
applies the face from your reference grid to

00:04:56.250 --> 00:04:58.290
that new scene. This whole process is so much

00:04:58.290 --> 00:05:00.370
faster than traditional filmmaking. Oh, it's

00:05:00.370 --> 00:05:02.569
not even close. After you master the workflow,

00:05:02.850 --> 00:05:05.370
the source suggests a full one-minute AI film

00:05:05.370 --> 00:05:07.569
takes only about two to three hours. Two to three

00:05:07.569 --> 00:05:10.290
hours. It's incredible. And after creating those

00:05:10.290 --> 00:05:12.370
new scenes, we need to upgrade them. That's where

00:05:12.370 --> 00:05:14.550
Topaz Gigapixel comes in. Right, for the upscaling.

00:05:14.750 --> 00:05:17.769
You drag the photos in, you choose 4x enlargement,

00:05:18.370 --> 00:05:21.069
and, critically, you select face recovery. That

00:05:21.069 --> 00:05:23.209
face recovery setting is what makes the eyes

00:05:23.209 --> 00:05:26.230
and skin look really, really real. It turns a

00:05:26.230 --> 00:05:29.149
good image into a high quality professional film

00:05:29.149 --> 00:05:32.850
asset. So what if I generate my grid, but one

00:05:32.850 --> 00:05:34.790
hand looks totally strange in the bottom right

00:05:34.790 --> 00:05:37.509
box. Do we have to restart everything? No, you

00:05:37.509 --> 00:05:39.730
don't. You can use the local edit tool to just

00:05:39.730 --> 00:05:42.069
brush over that area and fix that single box.

00:05:42.629 --> 00:05:44.370
All right. Let's move on to the pitfalls, the

00:05:44.370 --> 00:05:47.730
traps. The source warns against the "prompt is

00:05:47.730 --> 00:05:50.129
too short" trap. Right. This is for the scenes

00:05:50.129 --> 00:05:53.129
after you've made the grid. Exactly. If you don't

00:05:53.129 --> 00:05:56.269
detail the dirty green uniform or the old map

00:05:56.269 --> 00:05:58.769
in the character's hands, the AI will just randomly

00:05:58.769 --> 00:06:00.769
change the look. You have to keep giving it that

00:06:00.769 --> 00:06:03.230
context. Yeah, I still wrestle with prompt drift

00:06:03.230 --> 00:06:06.189
myself, especially when I'm shifting environments.

00:06:06.709 --> 00:06:09.290
You have to treat that initial prompt like a

00:06:09.290 --> 00:06:11.490
detailed map for the AI to follow every single

00:06:11.490 --> 00:06:13.750
time. That's a great way to put it. And another

00:06:13.750 --> 00:06:16.209
huge mistake is changing the lighting. Lighting

00:06:16.209 --> 00:06:19.050
is the glue that creates continuity. So if you

00:06:19.050 --> 00:06:21.980
start with soft fog, and then switch to harsh

00:06:21.980 --> 00:06:25.540
midday sun, the colors will break. The film will

00:06:25.540 --> 00:06:28.579
feel disjointed. Completely. And the simple technical

00:06:28.579 --> 00:06:32.339
error: forgetting "no text" or "no borders." We mentioned

00:06:32.339 --> 00:06:35.160
this, but it's so important. The AI is trained

00:06:35.160 --> 00:06:37.480
on labeled data, so it tries to sneak in random

00:06:37.480 --> 00:06:39.839
words. And if your initial grid has any text

00:06:39.839 --> 00:06:42.300
on it, the source says to just scrap it. Generate

00:06:42.300 --> 00:06:44.790
a new, clean one. So what does this all mean,

00:06:45.029 --> 00:06:47.050
really? Beyond just the creative side, there

00:06:47.050 --> 00:06:50.329
are real monetary applications here. Absolutely.

00:06:50.910 --> 00:06:53.009
When you achieve character consistency, this

00:06:53.009 --> 00:06:55.670
becomes a viable skill. You can create ads for

00:06:55.670 --> 00:06:58.310
businesses. You could start YouTube story channels.

00:06:58.430 --> 00:07:00.889
Or sell the character concept art to game developers.

00:07:00.970 --> 00:07:03.110
Right. They need this stuff. Does this workflow

00:07:03.110 --> 00:07:06.290
require, like, a specialized gaming or rendering

00:07:06.290 --> 00:07:09.339
computer to even get started? Not usually. Most

00:07:09.339 --> 00:07:11.660
of the complex work happens online, though a

00:07:11.660 --> 00:07:14.899
good GPU definitely speeds up that Topaz upscaling

00:07:14.899 --> 00:07:17.660
part. So the big takeaway here is control. I

00:07:17.660 --> 00:07:20.079
mean, AI filmmaking can look like magic, but

00:07:20.079 --> 00:07:23.329
consistency is all about structure. The 3x3 grid

00:07:23.329 --> 00:07:26.410
shifts the whole process from just random generation

00:07:26.410 --> 00:07:29.230
to predictable, repeatable character design.

00:07:29.490 --> 00:07:32.730
We found the specific steps. Use that 1:1 ratio,

00:07:33.269 --> 00:07:35.990
apply the five-part prompt formula, and then

00:07:35.990 --> 00:07:38.269
use the character reference feature to teach

00:07:38.269 --> 00:07:40.769
the AI what your character looks like. And that's

00:07:40.769 --> 00:07:42.829
the whole game. What's fascinating to me is just

00:07:42.829 --> 00:07:45.689
how quickly this field is evolving. We're talking

00:07:45.689 --> 00:07:47.610
about automating one of the hardest parts of

00:07:47.610 --> 00:07:51.699
visual art. Whoa. Imagine scaling this core workflow

00:07:51.699 --> 00:07:54.860
to a billion queries. It's staggering to think

00:07:54.860 --> 00:07:57.000
about. Yeah. And the source material ends by

00:07:57.000 --> 00:07:58.620
asking, you know, what's next? Should it be adding

00:07:58.620 --> 00:08:01.279
voices and music? Which raises a really important

00:08:01.279 --> 00:08:04.040
question. If the visual consistency is locked

00:08:04.040 --> 00:08:07.120
down, how do we ensure the audio consistency?

00:08:07.319 --> 00:08:09.259
That's a whole other deep dive. Yeah. For now,

00:08:09.339 --> 00:08:11.300
we hope this look at character consistency helps

00:08:11.300 --> 00:08:13.680
you realize your imagination can come to life

00:08:13.680 --> 00:08:16.879
faster than ever before. Go out there, try it,

00:08:17.000 --> 00:08:19.129
make mistakes, and try again. Thanks for sharing

00:08:19.129 --> 00:08:20.810
your sources with us for this deep dive.
