WEBVTT

00:00:00.000 --> 00:00:02.459
Imagine you walk into a car dealership. You're

00:00:02.459 --> 00:00:04.759
there to buy the most affordable, sensible car

00:00:04.759 --> 00:00:07.179
on the lot, a basic commuter model, you know,

00:00:07.200 --> 00:00:09.820
something practical. Then someone hands you the

00:00:09.820 --> 00:00:13.419
track results. And this everyday sedan just clocked

00:00:13.419 --> 00:00:15.980
faster times than the limited edition carbon

00:00:15.980 --> 00:00:18.500
fiber flagship sports car, the one that costs

00:00:18.500 --> 00:00:21.280
four times as much. That seemingly impossible

00:00:21.280 --> 00:00:24.239
scenario where budget just flat out beats flagship.

00:00:25.100 --> 00:00:27.140
That is exactly what just happened in the world

00:00:27.140 --> 00:00:30.320
of AI coding. A fast, incredibly cheap model

00:00:30.320 --> 00:00:32.700
just officially upset its own higher-priced

00:00:32.700 --> 00:00:35.240
sibling at solving real-world messy engineering

00:00:35.240 --> 00:00:37.700
problems. Welcome to the Deep Dive. And this

00:00:37.700 --> 00:00:39.899
isn't just an upgrade. It's more like a category

00:00:39.899 --> 00:00:41.939
collapse. We're diving deep into what this all

00:00:41.939 --> 00:00:45.159
means. The topic today is Gemini 3 Flash. It

00:00:45.159 --> 00:00:47.939
was marketed as the fast, cheap option, you know,

00:00:47.939 --> 00:00:50.140
for quick, low-cost tasks. Right, the simple

00:00:50.140 --> 00:00:52.600
stuff. Exactly. But for developers and builders,

00:00:52.799 --> 00:00:55.259
Flash has rapidly become... the new default.

00:00:55.420 --> 00:00:57.780
It's forcing everyone from solo founders to big

00:00:57.780 --> 00:01:00.119
tech teams to completely rethink what budget

00:01:00.119 --> 00:01:02.579
AI actually means and where to spend those precious

00:01:02.579 --> 00:01:05.459
token budgets. Our mission today is to cut through

00:01:05.459 --> 00:01:08.000
all that marketing noise. We're going to analyze

00:01:08.000 --> 00:01:10.599
the performance scores, which are pretty shocking.

00:01:11.069 --> 00:01:13.909
We'll reveal its secret weapon called dynamic

00:01:13.909 --> 00:01:17.170
thinking and run through four crucial real-world

00:01:17.170 --> 00:01:19.250
code tests. And then we'll get into the playbook,

00:01:19.349 --> 00:01:21.329
how to actually use it, and maybe more importantly,

00:01:21.530 --> 00:01:24.870
what to avoid. Okay, let's unpack this. So DeepMind

00:01:24.870 --> 00:01:27.930
releases Gemini 3 Flash, and they position it

00:01:27.930 --> 00:01:31.290
very clearly beneath their flagship Pro model.

00:01:31.430 --> 00:01:34.129
And historically, that kind of naming implies

00:01:34.129 --> 00:01:39.230
a predictable trade-off. Flash means fast, but

00:01:39.230 --> 00:01:42.920
maybe a little dumb. Pro means smart, but slow

00:01:42.920 --> 00:01:46.079
and expensive. That was the rule. And Flash just

00:01:46.079 --> 00:01:48.719
fundamentally broke it. And in doing so, it really

00:01:48.719 --> 00:01:51.159
disrupted the entire pricing structure of high-

00:01:51.159 --> 00:01:53.140
level AI. And we have the concrete evidence

00:01:53.140 --> 00:01:55.620
for this. To really quantify how big a deal this

00:01:55.620 --> 00:01:57.659
is, we have to look at the benchmark that coders

00:01:57.659 --> 00:02:00.620
actually care about: SWE-bench Verified. This

00:02:00.620 --> 00:02:02.920
isn't just some academic test, right? Not at

00:02:02.920 --> 00:02:05.340
all. It's not multiple choice. It forces the

00:02:05.340 --> 00:02:07.980
AI to solve real world GitHub issues. We're talking

00:02:07.980 --> 00:02:11.319
actual open source bugs, feature requests, messy

00:02:11.319 --> 00:02:15.000
code, stuff pulled from real projects. It's basically

00:02:15.000 --> 00:02:17.229
an honest job interview for the model. And the

00:02:17.229 --> 00:02:19.289
scoreboard from that interview is what shocked

00:02:19.289 --> 00:02:22.250
everyone. It really is. So Gemini 3 Flash, using

00:02:22.250 --> 00:02:26.810
its new thinking configuration, scored a 78.0%.

00:02:26.810 --> 00:02:29.930
Wow. Yeah, that's a huge figure for real-world

00:02:29.930 --> 00:02:32.990
problem solving. Now get this. Its older, more

00:02:32.990 --> 00:02:36.449
expensive sibling, Gemini 3 Pro, scored 76.2%.

00:02:36.449 --> 00:02:39.189
So it's actually better. It is measurably better.

00:02:39.330 --> 00:02:43.289
And Claude Sonnet 4.5 scored 77.2%. Now, technically,

00:02:43.389 --> 00:02:46.330
GPT-5.2 did nudge ahead at 80%, but there's

00:02:46.330 --> 00:02:48.370
some critical context here that makes Flash the,

00:02:48.389 --> 00:02:50.800
uh... best practical model. Hold on a second,

00:02:50.819 --> 00:02:52.379
though. The difference between Flash and Pro,

00:02:52.539 --> 00:02:54.759
that's less than two points. Shouldn't we assume

00:02:54.759 --> 00:02:56.740
that small gap represents the really tricky edge

00:02:56.740 --> 00:02:58.800
cases making Pro still necessary for critical

00:02:58.800 --> 00:03:01.240
systems? That's a really fair point. And yes,

00:03:01.360 --> 00:03:03.240
that small percentage probably does represent

00:03:03.240 --> 00:03:06.280
problems only Pro's depth can handle. But that's

00:03:06.280 --> 00:03:08.419
where the cost comes in. That skepticism just

00:03:08.419 --> 00:03:10.280
evaporates when you look at the price tag. Right.

00:03:10.379 --> 00:03:13.520
We are seeing an economic collapse for high-level

00:03:13.520 --> 00:03:15.360
coding. Right. I mean, the cost difference is

00:03:15.360 --> 00:03:18.780
massive. Pro costs $2 per one million input

00:03:18.780 --> 00:03:22.560
tokens. Flash costs just 50 cents. So the model

00:03:22.560 --> 00:03:24.960
that is measurably smarter on this key benchmark

00:03:24.960 --> 00:03:28.360
is four times cheaper to run. That's astonishing.

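NOTE
A quick sanity check on that arithmetic, using only the input-token prices
quoted above (a sketch in Python; output-token pricing isn't part of this
comparison):

  pro_price = 2.00    # USD per 1M input tokens for Pro, as quoted
  flash_price = 0.50  # USD per 1M input tokens for Flash, as quoted
  print(pro_price / flash_price)  # 4.0 -> four Flash runs per single Pro run
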
00:03:28.919 --> 00:03:31.340
Precisely. You can run four complete coding iterations,

00:03:31.520 --> 00:03:33.740
four full attempts at solving a problem with

00:03:33.740 --> 00:03:36.280
Flash for the exact same price as a single run

00:03:36.280 --> 00:03:39.180
with Pro. The cost per useful output just dropped

00:03:39.180 --> 00:03:42.110
off a cliff. How does the shift in cost versus

00:03:42.110 --> 00:03:44.289
capability fundamentally change the starting

00:03:44.289 --> 00:03:46.310
point for startups and solo builders who are

00:03:46.310 --> 00:03:48.849
constantly battling their cloud bill? High-level

00:03:48.849 --> 00:03:51.349
AI coding is now affordable for everyone, allowing

00:03:51.349 --> 00:03:53.889
faster, cheaper iteration and experimentation.

00:03:54.370 --> 00:03:57.210
So a nearly flagship model that's four times

00:03:57.210 --> 00:03:59.729
cheaper. How do they do that without killing

00:03:59.729 --> 00:04:02.610
the speed? The sources all point to this new

00:04:02.610 --> 00:04:06.039
feature, dynamic thinking. Which, let's be honest,

00:04:06.139 --> 00:04:08.560
sounds a bit like marketing jargon. It absolutely

00:04:08.560 --> 00:04:11.219
is marketing. But it also happens to describe

00:04:11.219 --> 00:04:14.340
a real architectural change. Oh. Dynamic thinking

00:04:14.340 --> 00:04:20.660
forces the model to pause, plan, and internally

00:04:20.660 --> 00:04:23.920
reason about the task before it generates a single

00:04:23.920 --> 00:04:26.639
line of code. An internal monologue. It's a mandatory

00:04:26.639 --> 00:04:28.959
internal monologue that happens in a single API

00:04:28.959 --> 00:04:31.699
call. And you, the developer, can actually influence

00:04:31.699 --> 00:04:33.759
how much thought it puts in. And that's critical.

00:04:33.899 --> 00:04:35.680
The old way was so frustrating. You'd ask for

00:04:35.680 --> 00:04:37.860
something complex, like a snake game, and the

00:04:37.860 --> 00:04:40.019
AI would just immediately spit out code. Right.

00:04:40.160 --> 00:04:41.819
And it would almost always have that one fundamental

00:04:41.819 --> 00:04:44.459
bug, like the menu doesn't disappear or the collision

00:04:44.459 --> 00:04:46.639
logic is just slightly off. The old model was

00:04:46.639 --> 00:04:49.279
just guessing the next word. The new way with

00:04:49.279 --> 00:04:51.779
Flash is totally different. You ask for the snake

00:04:51.779 --> 00:04:53.720
game, and internally, before writing anything,

00:04:53.839 --> 00:04:56.279
it's planning. Okay, I need a game loop, a grid,

00:04:56.459 --> 00:04:59.860
handle input, detect wall collisions, use Pygame,

00:05:00.100 --> 00:05:02.540
and then it generates the structured and usually

00:05:02.540 --> 00:05:04.959
bug-free code.

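NOTE
To make that internal plan concrete, here is a minimal sketch of the
structure it describes (game loop, grid, input handling, wall and self
collision) in Pygame. Grid size, speed, and colors are illustrative choices,
not details from the episode:

  import random
  import pygame
  CELL, GRID_W, GRID_H = 20, 30, 20  # illustrative grid dimensions
  pygame.init()
  screen = pygame.display.set_mode((GRID_W * CELL, GRID_H * CELL))
  clock = pygame.time.Clock()
  snake = [(GRID_W // 2, GRID_H // 2)]  # list of grid cells, head first
  direction = (1, 0)
  food = (random.randrange(GRID_W), random.randrange(GRID_H))
  running = True
  while running:
      for event in pygame.event.get():  # handle input
          if event.type == pygame.QUIT:
              running = False
          elif event.type == pygame.KEYDOWN:
              keys = {pygame.K_UP: (0, -1), pygame.K_DOWN: (0, 1),
                      pygame.K_LEFT: (-1, 0), pygame.K_RIGHT: (1, 0)}
              direction = keys.get(event.key, direction)
      head = (snake[0][0] + direction[0], snake[0][1] + direction[1])
      if head in snake or not (0 <= head[0] < GRID_W and 0 <= head[1] < GRID_H):
          running = False  # wall or self collision ends the game
      else:
          snake.insert(0, head)
          if head == food:
              food = (random.randrange(GRID_W), random.randrange(GRID_H))
          else:
              snake.pop()  # move without growing
      screen.fill((0, 0, 0))
      for x, y in snake:
          pygame.draw.rect(screen, (0, 200, 0), (x * CELL, y * CELL, CELL, CELL))
      pygame.draw.rect(screen, (200, 0, 0), (food[0] * CELL, food[1] * CELL, CELL, CELL))
      pygame.display.flip()
      clock.tick(10)  # fixed-speed game loop
  pygame.quit()
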
00:05:04.959 --> 00:05:08.040
And we're seeing that planning ability shine in really serious real-world scenarios,

00:05:08.360 --> 00:05:10.819
not just little functions. I mean, refactoring

00:05:10.819 --> 00:05:14.360
a messy 800-line legacy file into clean modules

00:05:14.360 --> 00:05:17.439
or debugging a failing auth flow where the problem

00:05:17.439 --> 00:05:20.339
isn't obvious. You know, honestly, I still wrestle

00:05:20.339 --> 00:05:22.439
with prompt drift myself sometimes where you

00:05:22.439 --> 00:05:25.399
prompt the AI and by the third turn, it's completely

00:05:25.399 --> 00:05:27.639
forgotten the original constraints. Yeah, that's

00:05:27.639 --> 00:05:29.759
a common struggle. So seeing the model handle

00:05:29.759 --> 00:05:31.779
that internal planning, especially with constraints

00:05:31.779 --> 00:05:34.839
like build rate limiting for 10K requests per

00:05:34.839 --> 00:05:37.560
second and you have to use Redis, it's a massive

00:05:37.560 --> 00:05:39.720
relief.

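NOTE
For reference, a constraint like that can resolve into something quite
small. A fixed-window sketch using the redis-py client; the function name,
key scheme, and one-second window are illustrative assumptions:

  import time
  import redis  # assumes the redis-py package and a reachable Redis server
  r = redis.Redis()
  def allow_request(client_id: str, limit: int = 10_000) -> bool:
      """Allow at most `limit` requests per client per one-second window."""
      key = f"rate:{client_id}:{int(time.time())}"  # one counter per second
      count = r.incr(key)   # atomic increment
      if count == 1:
          r.expire(key, 2)  # old windows expire on their own
      return count <= limit
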
00:05:39.720 --> 00:05:42.459
If the model is handling this internal structured planning, what is the single biggest

00:05:42.459 --> 00:05:45.120
benefit for a developer's daily workflow? The

00:05:45.120 --> 00:05:47.439
model stops being a simple snippet machine and

00:05:47.439 --> 00:05:49.379
starts acting like an efficient, guided technical

00:05:49.379 --> 00:05:52.000
assistant. Okay, benchmarks are great for headlines,

00:05:52.300 --> 00:05:55.560
but what about production? We need to know if

00:05:55.560 --> 00:05:57.779
Flash is actually stable and useful when the

00:05:57.779 --> 00:06:00.579
pressure is on. So let's review these four gauntlet

00:06:00.579 --> 00:06:03.399
tests. Test number one was all about speed: latency

00:06:03.399 --> 00:06:06.040
under pressure. And the prompt was surprisingly

00:06:06.040 --> 00:06:10.399
complex. Create a single-file HTML three.js scene

00:06:10.399 --> 00:06:14.660
of a cozy, softly lit living room with an animated

00:06:14.660 --> 00:06:18.689
Tom and Jerry SVG loop playing on a 3D TV. Wow.

00:06:18.829 --> 00:06:21.069
So that's testing graphics, animation, library

00:06:21.069 --> 00:06:23.550
knowledge all at once. Yeah. And Flash was faster

00:06:23.550 --> 00:06:26.350
than the last generation, Gemini 2.5 Pro: under

00:06:26.350 --> 00:06:29.930
30 seconds versus 47 seconds. And that low latency

00:06:29.930 --> 00:06:32.029
is so important. It's the difference between

00:06:32.029 --> 00:06:33.949
an assistant that keeps you in the flow and one

00:06:33.949 --> 00:06:35.370
that just, you know, makes you wait and gets

00:06:35.370 --> 00:06:36.790
annoying. Right. You lose your train of thought.

00:06:36.930 --> 00:06:39.810
And here's the surprise. Flash's result was actually

00:06:39.810 --> 00:06:43.529
better than 2.5 Pro and even 3 Pro. Though,

00:06:43.629 --> 00:06:45.910
I have to point out, the testers did note one

00:06:45.910 --> 00:06:48.720
clear flaw. What was that? There was no TV stand.

00:06:48.879 --> 00:06:50.779
The television was just levitating in the middle

00:06:50.779 --> 00:06:53.699
of the room. Ah, the classic floating television

00:06:53.699 --> 00:06:56.519
problem. That tells you everything, doesn't it?

00:06:56.579 --> 00:06:59.220
It nailed the complex rendering and animation,

00:06:59.420 --> 00:07:02.300
but forgot basic physics. Still needs a human

00:07:02.300 --> 00:07:05.019
editor to remember gravity. Exactly. Then there

00:07:05.019 --> 00:07:08.259
was the stress test, combining complex math with

00:07:08.259 --> 00:07:11.000
animation. Right. This one was a 3D visualization

00:07:11.000 --> 00:07:14.819
of relative scale from a subatomic particle all

00:07:14.819 --> 00:07:18.050
the way up to a galaxy. It demands math, physics,

00:07:18.329 --> 00:07:21.889
animation, JavaScript, all working together perfectly.

00:07:22.110 --> 00:07:23.990
And Flash delivered the result in the shortest

00:07:23.990 --> 00:07:26.750
time. It was great. Though the highest -end model,

00:07:26.949 --> 00:07:29.449
3 Pro, did have a bit more polish on the final

00:07:29.449 --> 00:07:31.430
result. Flash just prioritized being correct

00:07:31.430 --> 00:07:33.709
and fast. And that brings us to test number four,

00:07:33.790 --> 00:07:35.889
which for me is the most impressive one, the

00:07:35.889 --> 00:07:38.889
one-shot voxel art test, the Eagle test. This

00:07:38.889 --> 00:07:41.110
prompt asks the model to write voxel art code

00:07:41.110 --> 00:07:43.290
for an eagle sitting on a branch in a single

00:07:43.290 --> 00:07:45.879
HTML file. And voxel art is really hard because

00:07:45.879 --> 00:07:48.319
you have to manually define the 3D space, coordinate

00:07:48.319 --> 00:07:51.379
by coordinate. Whoa. Just imagine, a model

00:07:51.379 --> 00:07:54.199
that can handle that level of creativity, abstract

00:07:54.199 --> 00:07:56.779
spatial reasoning, and obscure library knowledge

00:07:56.779 --> 00:08:00.060
in one shot. Correctly defining the relationship

00:08:00.060 --> 00:08:02.360
between the eagle and the branch in 3D space.

00:08:02.660 --> 00:08:06.120
That's a fundamentally new capability for a so-

00:08:06.120 --> 00:08:08.480
called budget model. So the summary from all

00:08:08.480 --> 00:08:10.879
these tests is pretty clear. Flash prioritizes

00:08:10.879 --> 00:08:13.949
speed, correctness, and just pure practicality.

00:08:13.949 --> 00:08:16.730
Pro is still there for depth, creative polish,

00:08:16.730 --> 00:08:19.550
and complex framing, but Flash hits that sweet

00:08:19.550 --> 00:08:22.290
spot for just getting things built. Beyond the

00:08:22.290 --> 00:08:25.009
raw scores and the cool animations, what single

00:08:25.009 --> 00:08:27.850
practical outcome makes Flash feel truly production-

00:08:27.850 --> 00:08:30.050
ready right now? It's fast enough to integrate

00:08:30.050 --> 00:08:32.250
directly into your flow while being smart enough

00:08:32.250 --> 00:08:34.990
to confidently handle surprisingly complex multi-

00:08:34.990 --> 00:08:37.970
step tasks. Okay, knowing the model

00:08:37.970 --> 00:08:40.129
is one thing. Deploying it without setting your

00:08:40.129 --> 00:08:42.149
credit card on fire requires a real strategy.

00:08:42.409 --> 00:08:44.690
So let's talk playbook. First thing, the API

00:08:44.690 --> 00:08:47.509
parameter. It's vital for cost control. Gemini

00:08:47.509 --> 00:08:50.029
3 models use thinking levels instead of the old

00:08:50.029 --> 00:08:51.889
difficult thinking budget where you had to guess

00:08:51.889 --> 00:08:54.049
how many tokens it needed. Right. Thinking levels

00:08:54.049 --> 00:08:56.529
make it way simpler. For simple chat, you'd

00:08:56.529 --> 00:08:59.750
use minimal. But for coding, the key is to always

00:08:59.750 --> 00:09:02.730
use thinking level high. This forces it to do

00:09:02.730 --> 00:09:04.889
that internal planning. And this is where we

00:09:04.889 --> 00:09:07.289
have to issue a strong warning. This is the minimal

00:09:07.289 --> 00:09:10.149
trap. You can't actually completely disable the

00:09:10.149 --> 00:09:13.149
thinking. Even on minimal, if the model thinks

00:09:13.149 --> 00:09:16.179
a prompt is tricky, it might still generate those

00:09:16.179 --> 00:09:18.620
reasoning tokens, the thinking trace. So your

00:09:18.620 --> 00:09:20.440
application logic has to be ready for that. If

00:09:20.440 --> 00:09:23.100
your code expects just raw text and it gets the

00:09:23.100 --> 00:09:25.519
model's internal monologue mixed in, your app

00:09:25.519 --> 00:09:27.440
could crash. You have to be ready to parse that.

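NOTE
A sketch of both points in the google-genai Python SDK: requesting the high
thinking level for coding, then filtering any thought parts out of the
response. The model id is illustrative, and the thinking_level field is
assumed from the episode's description; check the current SDK docs:

  from google import genai
  from google.genai import types
  client = genai.Client()  # assumes GEMINI_API_KEY is set in the environment
  response = client.models.generate_content(
      model="gemini-flash-latest",  # illustrative model id
      contents="Refactor this function into pure helpers: ...",
      config=types.GenerateContentConfig(
          # Forces the internal planning pass described above.
          thinking_config=types.ThinkingConfig(thinking_level="high"),
      ),
  )
  # The "minimal trap": never assume the output is only raw text.
  code_only = "".join(
      part.text
      for part in response.candidates[0].content.parts
      if part.text and not getattr(part, "thought", False)
  )
  print(code_only)
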
00:09:27.679 --> 00:09:30.580
The real secret, though, for both cost and efficiency

00:09:30.580 --> 00:09:33.639
is the golden architecture, the manager-worker

00:09:33.639 --> 00:09:36.299
pattern. Right. So you use Pro as the manager,

00:09:36.460 --> 00:09:38.740
the architect, for maybe 10% of your tasks,

00:09:38.960 --> 00:09:42.120
high-level planning, complex reasoning, the

00:09:42.120 --> 00:09:44.279
big-picture stuff. Then use Flash as the worker,

00:09:44.700 --> 00:09:47.940
the executor, for the other 90%. It executes the

00:09:47.940 --> 00:09:50.679
plan, runs the code, processes the data. You

00:09:50.679 --> 00:09:52.340
just don't use the expensive model to change

00:09:52.340 --> 00:09:54.159
a button color when Flash can do it perfectly

00:09:54.159 --> 00:09:56.679
in a fraction of the time. What is the ratio

00:09:56.679 --> 00:09:59.059
we should remember when architecting a new application

00:09:59.059 --> 00:10:02.539
using this pattern? Follow the rule of 90% Flash

00:10:02.539 --> 00:10:06.519
for execution and 10% Pro for high-level, complex

00:10:06.519 --> 00:10:09.629
planning and architectural strategy.

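NOTE
One way that 90/10 split can look in code. A sketch with illustrative model
ids and prompts; the plan/execute helpers are hypothetical, not an official
SDK pattern:

  from google import genai
  client = genai.Client()  # assumes GEMINI_API_KEY is set in the environment
  MANAGER = "gemini-3-pro"    # ~10% of calls: planning, architecture
  WORKER = "gemini-3-flash"   # ~90% of calls: execution
  def plan(task: str) -> str:
      # The expensive manager writes the high-level plan once.
      prompt = f"Write a short, numbered implementation plan for: {task}"
      return client.models.generate_content(model=MANAGER, contents=prompt).text
  def execute(step: str, context: str) -> str:
      # The cheap worker executes each step of the plan.
      prompt = f"Context:\n{context}\n\nImplement this step, code only:\n{step}"
      return client.models.generate_content(model=WORKER, contents=prompt).text
  task = "Add per-user rate limiting to the API gateway"
  steps = [s for s in plan(task).splitlines() if s.strip()]
  results = [execute(step, context=task) for step in steps]
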
00:10:09.649 --> 00:10:11.710
You know, Flash isn't just another update. It's a really

00:10:11.710 --> 00:10:14.350
strong signal about where AI development is headed.

00:10:14.590 --> 00:10:16.529
If you connect the dots, there are like four

00:10:16.529 --> 00:10:18.730
major shifts happening right now because of this.

00:10:18.870 --> 00:10:21.570
Okay, so signal number one. Pro-level intelligence

00:10:21.570 --> 00:10:25.090
is becoming incredibly cheap. That old line between

00:10:25.090 --> 00:10:28.610
fast and basic and expensive and powerful, it's

00:10:28.610 --> 00:10:31.149
just collapsed. And that means low-cost models

00:10:31.149 --> 00:10:33.190
will soon handle things that were just too expensive

00:10:33.190 --> 00:10:36.090
before. Imagine an AI reviewing every single

00:10:36.090 --> 00:10:39.289
pull request or live continuous refactoring happening

00:10:39.289 --> 00:10:41.549
right inside your editor without you even thinking

00:10:41.549 --> 00:10:43.950
about the bill. Signal two, models are actually

00:10:43.950 --> 00:10:46.039
learning how to think. This dynamic thinking

00:10:46.039 --> 00:10:48.240
is part of a trend where models move beyond just

00:10:48.240 --> 00:10:50.659
guessing the next word. We should expect models

00:10:50.659 --> 00:10:53.039
that debug their own code or agents that can

00:10:53.039 --> 00:10:56.080
plan multi-day development tasks. And that completely

00:10:56.080 --> 00:10:58.740
changes your role as a builder. You spend less

00:10:58.740 --> 00:11:01.200
time on boilerplate and more time deciding what

00:11:01.200 --> 00:11:03.639
should be built and why. You move up the stack

00:11:03.639 --> 00:11:07.580
to strategy. Signal three. Speed is the new battleground.

00:11:08.200 --> 00:11:10.799
Accuracy used to be everything, but now the focus

00:11:10.799 --> 00:11:14.220
is on latency. A slow suggestion breaks your

00:11:14.220 --> 00:11:17.450
flow. The future is near instant responses that

00:11:17.450 --> 00:11:19.730
feel like a native part of your editor. And signal

00:11:19.730 --> 00:11:21.929
four is that development is opening up to everyone.

00:11:22.129 --> 00:11:25.309
When strong coding AI costs almost nothing, the

00:11:25.309 --> 00:11:27.230
friction to build something is basically zero.

00:11:27.809 --> 00:11:30.750
Solo builders can ship complex products. Non-

00:11:30.750 --> 00:11:32.649
technical founders can prototype real ideas.

00:11:32.970 --> 00:11:36.399
Junior devs get senior-level support. If the

00:11:36.399 --> 00:11:38.419
quality and cost of the tools are no longer the

00:11:38.419 --> 00:11:40.440
problem, what is the core question that remains

00:11:40.440 --> 00:11:42.679
for developers? The only remaining question is

00:11:42.679 --> 00:11:45.240
what you, the builder, choose to build with this

00:11:45.240 --> 00:11:48.320
new democratized power. Okay, so Flash is incredibly

00:11:48.320 --> 00:11:50.539
powerful, but we have to be really clear. It

00:11:50.539 --> 00:11:52.960
is not a magic bullet. And running into these

00:11:52.960 --> 00:11:54.960
limits will cause some painful, expensive mistakes

00:11:54.960 --> 00:11:57.519
if you treat it as flawless. Limitation number

00:11:57.519 --> 00:12:02.000
one. Very long contexts. Flash still struggles

00:12:02.000 --> 00:12:04.419
when you push it into millions of tokens. It

00:12:04.419 --> 00:12:07.539
works best in that 2,000 to 8,000 token range.

00:12:07.720 --> 00:12:10.320
Beyond that, it starts missing details. The fix

00:12:10.320 --> 00:12:12.820
is to break up your big codebases into chunks

00:12:12.820 --> 00:12:15.500
and always include a short architecture summary

00:12:15.500 --> 00:12:17.600
in every prompt to keep it on track.

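NOTE
A sketch of that fix: cap each prompt's size and re-anchor every chunk with
the same short architecture summary. The summary text, the chunk size
(characters as a rough proxy for tokens), and the helper name are all
illustrative:

  ARCH_SUMMARY = (
      "Architecture: FastAPI service; routes in routes/, business logic in "
      "services/, persistence via SQLAlchemy in models/. (Placeholder.)"
  )
  def chunked_prompts(source: str, instruction: str, max_chars: int = 8_000):
      """Yield one prompt per chunk, each prefixed with the summary."""
      for start in range(0, len(source), max_chars):
          chunk = source[start:start + max_chars]
          yield f"{ARCH_SUMMARY}\nTask: {instruction}\n\nCode:\n{chunk}"
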
00:12:17.600 --> 00:12:21.230
Limitation two, new or bleeding-edge frameworks. Flash only

00:12:21.230 --> 00:12:23.789
knows stuff up to its training cutoff. If you're

00:12:23.789 --> 00:12:25.389
using a brand new framework, it might suggest

00:12:25.389 --> 00:12:27.990
outdated or just wrong patterns. So you have

00:12:27.990 --> 00:12:30.549
to bring the framework to Flash, paste the relevant

00:12:30.549 --> 00:12:32.929
docs right into the context window and be explicit.

00:12:33.110 --> 00:12:36.490
Tell it, use this exact Astro 5.0 API, not the

00:12:36.490 --> 00:12:39.330
old ones. Limitation three is a classic LLM problem.

00:12:39.789 --> 00:12:41.730
Confident answers to vague prompts. If you're

00:12:41.730 --> 00:12:43.690
unclear, it'll just make an assumption and give

00:12:43.690 --> 00:12:45.909
you a confident but potentially incorrect answer.

00:12:46.070 --> 00:12:47.789
The fix is to treat your prompt like a strict

00:12:47.789 --> 00:12:49.909
contract. Define your preconditions and post-

00:12:49.909 --> 00:12:51.970
conditions. Don't just say, handle the date.

00:12:52.090 --> 00:12:55.330
Say, if date parsing fails, log a warning and

00:12:55.330 --> 00:12:58.610
return None. Clear rules, safer output.

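NOTE
That contract, written out as the code it should produce. The function name
and the date format are illustrative:

  import logging
  from datetime import datetime
  def parse_date(raw: str) -> datetime | None:
      """Contract: on parse failure, log a warning and return None."""
      try:
          return datetime.strptime(raw, "%Y-%m-%d")  # illustrative format
      except ValueError:
          logging.warning("date parsing failed for %r", raw)
          return None
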
00:12:58.610 --> 00:13:02.230
And finally, number four, security and privacy. Flash

00:13:02.230 --> 00:13:05.120
runs on external infrastructure. Never, ever

00:13:05.120 --> 00:13:08.440
send secrets, API keys, credentials, sensitive

00:13:08.440 --> 00:13:11.419
business logic in clear text. The safe ways are

00:13:11.419 --> 00:13:13.759
to anonymize your code, use Google's enterprise

00:13:13.759 --> 00:13:16.299
tier, or run it through something like Vertex

00:13:16.299 --> 00:13:18.740
AI for more control. This is not your private

00:13:18.740 --> 00:13:22.100
notebook. Which limitation, if ignored, poses

00:13:22.100 --> 00:13:24.340
the greatest immediate risk to a real-world

00:13:24.340 --> 00:13:26.879
deployed application? Ignoring security and privacy

00:13:26.879 --> 00:13:29.340
rules or treating external infrastructure like

00:13:29.340 --> 00:13:31.779
a private notebook is without question the biggest

00:13:31.779 --> 00:13:34.779
risk. So Gemini 3 Flash is a rare, genuinely

00:13:34.779 --> 00:13:37.539
paradigm-shifting upgrade. It's faster. It's

00:13:37.539 --> 00:13:39.720
demonstrably smarter than its sibling. And critically,

00:13:39.919 --> 00:13:42.179
it's four times cheaper. It breaks that budget

00:13:42.179 --> 00:13:43.960
model tradeoff we've all been living with. You

00:13:43.960 --> 00:13:46.059
get high-level intelligence with high speed and

00:13:46.059 --> 00:13:48.159
low cost. And this changes developer behavior

00:13:48.159 --> 00:13:51.019
instantly. Complex tasks like refactoring or

00:13:51.019 --> 00:13:53.320
debugging now feel safe to hand off to an AI

00:13:53.320 --> 00:13:56.110
because the cost is so low you can just try again. The budget

00:13:56.110 --> 00:13:58.889
model really has become the king of code. And

00:13:58.889 --> 00:14:00.750
this leads us to our final provocative thought

00:14:00.750 --> 00:14:03.970
for you. The sources suggest trying

00:14:04.389 --> 00:14:07.509
Flash out in Google AI Studio for the best control.

00:14:08.090 --> 00:14:11.929
So here's a thought. If AI can now handle complex

00:14:11.929 --> 00:14:14.490
spatial reasoning and code architecture for pennies,

00:14:14.490 --> 00:14:17.129
how long until the cost of computational creativity

00:14:17.129 --> 00:14:20.679
becomes virtually zero? And what is the next

00:14:20.679 --> 00:14:23.200
high -value, uniquely human skill that developers

00:14:23.200 --> 00:14:25.440
will need to cultivate to stay relevant? That

00:14:25.440 --> 00:14:27.539
is the essential question to build on. And if

00:14:27.539 --> 00:14:29.259
you want to experience this dynamic thinking

00:14:29.259 --> 00:14:31.340
firsthand, go try running one of those gauntlet

00:14:31.340 --> 00:14:33.539
tests we talked about, the voxel art eagle or the

00:14:33.539 --> 00:14:36.259
complex scale animation. That'll really show

00:14:36.259 --> 00:14:38.399
you what this new architecture can do. Thank

00:14:38.399 --> 00:14:40.179
you for sharing your sources and letting us take

00:14:40.179 --> 00:14:41.899
this deep dive with you. We'll catch you next

00:14:41.899 --> 00:14:42.200
time.
