WEBVTT

00:00:00.000 --> 00:00:04.580
Imagine you train this incredibly smart AI, right?

00:00:04.660 --> 00:00:07.139
It kind of runs so much of our world now. But you train it

00:00:07.139 --> 00:00:11.599
on, well, just viral tweets and clickbait. It

00:00:11.599 --> 00:00:14.419
turns out the intelligence of these

00:00:14.419 --> 00:00:17.719
advanced AIs is surprisingly... pretty fragile.

00:00:17.940 --> 00:00:20.300
Its core reasoning can actually degrade just

00:00:20.300 --> 00:00:22.899
from being fed digital junk. Some are calling

00:00:22.899 --> 00:00:26.120
it AI brain rot. Welcome to the deep dive. Yeah,

00:00:26.179 --> 00:00:28.100
today we're unpacking a really interesting set

00:00:28.100 --> 00:00:30.399
of sources. They show the stark contrast, you

00:00:30.399 --> 00:00:33.039
know, AI is moving unbelievably fast, but at

00:00:33.039 --> 00:00:36.219
the same time, its foundation seems, well, kind

00:00:36.219 --> 00:00:39.310
of brittle. Maybe dangerously so. Exactly. That

00:00:39.310 --> 00:00:41.310
fragility is what we want to focus on, sort of

00:00:41.310 --> 00:00:43.369
cleaning up the information highway, trying to

00:00:43.369 --> 00:00:45.689
understand where these algorithms are, essentially

00:00:45.689 --> 00:00:47.770
doom scrolling, and what kind of real measurable

00:00:47.770 --> 00:00:50.490
damage that does. And then we'll switch gears,

00:00:50.670 --> 00:00:53.310
look at the sheer speed of innovation, new abilities,

00:00:53.409 --> 00:00:55.329
new ways to control it, the whole infrastructure

00:00:55.329 --> 00:00:58.030
race. And finally, and this is really critical,

00:00:58.090 --> 00:01:00.289
I think, we'll get into the hidden reality of

00:01:00.289 --> 00:01:02.469
bias baked right into the training data, and

00:01:02.469 --> 00:01:04.549
crucially, why most of us just don't see it.

00:01:04.670 --> 00:01:06.489
Okay, let's start with that core problem then.

00:01:07.340 --> 00:01:12.159
AI brain rot. It sounds, I don't know, a bit

00:01:12.159 --> 00:01:14.840
dramatic, but the research seems to show it's

00:01:14.840 --> 00:01:17.180
a real quantifiable thing. Oh, it really is.

00:01:17.219 --> 00:01:19.359
Think of it like human internet brain rot, but,

00:01:19.359 --> 00:01:21.939
you know, for algorithms. When a large language

00:01:21.939 --> 00:01:24.799
model just consumes endless low-quality, trivial

00:01:24.799 --> 00:01:28.120
stuff, its ability to actually reason and focus

00:01:28.120 --> 00:01:31.659
just tanks fast. The researchers looked at this

00:01:31.659 --> 00:01:34.200
specifically using two kinds of junk content.

00:01:34.420 --> 00:01:37.439
Right. There was the M1 type. That's the... super

00:01:37.439 --> 00:01:40.519
viral, click-baity short stuff everywhere. And

00:01:40.519 --> 00:01:42.599
then M2, which they called, what was it, low

00:01:42.599 --> 00:01:45.799
semantic buzzword soup. I mean, I still wrestle

00:01:45.799 --> 00:01:48.159
with prompt drift myself sometimes, getting generic

00:01:48.159 --> 00:01:50.500
answers. So this idea really hits home if you're

00:01:50.500 --> 00:01:52.159
trying to actually use these tools reliably.

00:01:52.459 --> 00:01:55.099
And it was that M1, the viral stuff, that did

00:01:55.099 --> 00:01:57.120
the most damage. The model started doing something

00:01:57.120 --> 00:01:59.140
the researchers called thought skipping. Thought

00:01:59.140 --> 00:02:01.140
skipping. Yeah, basically just stopping processing

00:02:01.140 --> 00:02:03.780
complex ideas properly. The model sort of jumps

00:02:03.780 --> 00:02:05.500
to a conclusion instead of, you know, following

00:02:05.500 --> 00:02:08.560
the steps. Wow. And the numbers they reported

00:02:08.560 --> 00:02:12.300
on this degradation are pretty shocking, actually.

00:02:12.419 --> 00:02:15.580
For that M1 viral content, reasoning accuracy

00:02:15.580 --> 00:02:20.039
just plummeted. It went from like 74.9% down

00:02:20.039 --> 00:02:24.199
to 57.2%. That's a serious hit to basic logic.

00:02:24.460 --> 00:02:26.919
It is. And look at long context comprehension,

00:02:27.340 --> 00:02:29.719
its ability to follow a long conversation or

00:02:29.719 --> 00:02:32.580
document. That fell even harder, dropping from

00:02:32.580 --> 00:02:37.759
84.4% to just 52.3%. Imagine like a professional

00:02:37.759 --> 00:02:40.180
losing that much ability just from reading gossip

00:02:40.180 --> 00:02:42.539
sites for a month. That's the kind of fragility

00:02:42.539 --> 00:02:44.159
we're talking about. And it wasn't just their

00:02:44.159 --> 00:02:46.259
reasoning. The sources said the model's ethical

00:02:46.259 --> 00:02:48.439
alignment slipped, too, and they even saw personality

00:02:48.439 --> 00:02:50.800
drift. It sounds like a clear dose-response thing.

00:02:50.900 --> 00:02:52.639
The more junk they got, the worse they performed.

00:02:52.759 --> 00:02:54.860
Dumber and, frankly, less reliable. But what's

00:02:54.860 --> 00:02:56.939
really fascinating and kind of worrying is that

00:02:56.939 --> 00:02:58.699
the effects stuck around. They did try to fix

00:02:58.699 --> 00:03:00.620
it. Yeah, they tried to clean up the data afterwards,

00:03:00.860 --> 00:03:03.599
retrain the models. Exactly. But the key finding

00:03:03.599 --> 00:03:06.240
was that even retraining on clean, high-quality data

00:03:06.319 --> 00:03:10.120
only partly fixed the damage. The rot lingered.

00:03:10.520 --> 00:03:12.159
Seems like the junk data might have actually

00:03:12.159 --> 00:03:14.340
corrupted the model's core structure, its fundamental

00:03:14.340 --> 00:03:17.840
wiring. So if the junk data messes with the model's

00:03:17.840 --> 00:03:21.139
deepest level, how fundamental is that initial

00:03:21.139 --> 00:03:24.240
training data then? It seems like it sets a baseline

00:03:24.240 --> 00:03:27.039
that's really hard, maybe even impossible, to

00:03:27.039 --> 00:03:30.419
fully reset later on. Okay, switching gears now.

00:03:30.620 --> 00:03:33.360
This fragility exists right alongside incredible

00:03:33.360 --> 00:03:36.719
speed. I mean, the pace of new capabilities launching

00:03:36.719 --> 00:03:39.770
right now is just... It really is. We're seeing

00:03:39.770 --> 00:03:42.689
tools pop up that kind of redefine what AI agents

00:03:42.689 --> 00:03:45.789
can even do. OpenAI just launched something called

00:03:45.789 --> 00:03:48.550
ChatGPT Atlas. It's basically their first full

00:03:48.550 --> 00:03:51.750
AI web browser, not just chat. Right. It's more

00:03:51.750 --> 00:03:53.909
like an autonomous agent. It's designed to handle

00:03:53.909 --> 00:03:56.110
complex stuff online for you, like booking flights,

00:03:56.389 --> 00:03:59.250
comparing documents, basically starting and finishing

00:03:59.250 --> 00:04:01.229
detailed tasks without you needing to step in.

00:04:01.310 --> 00:04:03.069
And then on the efficiency side, you've got things

00:04:03.069 --> 00:04:05.710
like DeepSeek OCR. This tool apparently makes

00:04:05.710 --> 00:04:08.430
image-based documents like PDFs 10 times smaller,

00:04:08.569 --> 00:04:12.090
file size-wise, while keeping like 97% of the

00:04:12.090 --> 00:04:14.210
original info. That's huge for businesses dealing

00:04:14.210 --> 00:04:16.819
with lots of data storage costs. Yeah, absolutely.

00:04:17.139 --> 00:04:19.079
And you see these shifts reflected in the market

00:04:19.079 --> 00:04:20.779
too. There was this recent trading benchmark

00:04:20.779 --> 00:04:24.160
where different AIs got $10,000 each to manage.

00:04:24.480 --> 00:04:27.600
Early days. But the results showed DeepSeek 3.1

00:04:27.600 --> 00:04:31.759
was up $4,000, while Gemini 2.5 Pro was

00:04:31.759 --> 00:04:34.360
actually down $3,000. These models are literally

00:04:34.360 --> 00:04:36.660
competing in finance now. And remember that story

00:04:36.660 --> 00:04:39.139
from the UK. Channel 4 aired this documentary.

00:04:39.379 --> 00:04:41.879
And at the end, the host just casually reveals

00:04:41.879 --> 00:04:44.759
they were an AI anchor the whole time. I mean,

00:04:44.779 --> 00:04:47.120
the line between real and synthetic is getting

00:04:47.120 --> 00:04:50.079
seriously blurry. That kind of speed needs powerful

00:04:50.079 --> 00:04:52.439
infrastructure, obviously. On the hardware front,

00:04:52.540 --> 00:04:54.740
it's super competitive. NVIDIA is still the giant.

00:04:54.800 --> 00:04:57.660
But China's claiming they have a new analog AI

00:04:57.660 --> 00:05:00.500
chip that's supposedly 1,000 times faster. Whoa.

00:05:01.230 --> 00:05:04.350
Yeah, a thousand times faster. Imagine scaling

00:05:04.350 --> 00:05:07.410
to like a billion queries with that kind of speed.

00:05:07.509 --> 00:05:09.310
That completely changes the game, doesn't it?

00:05:09.350 --> 00:05:11.649
That's infrastructure really shaping what's possible.

00:05:11.850 --> 00:05:13.930
And of course, with these huge capabilities comes

00:05:13.930 --> 00:05:16.490
the need for much better control. You see OpenAI

00:05:16.490 --> 00:05:19.089
tightening the guardrails on Sora, their video

00:05:19.089 --> 00:05:21.069
generator, especially after Hollywood raised

00:05:21.069 --> 00:05:24.029
concerns about realistic deepfakes. Makes sense.

00:05:24.379 --> 00:05:27.079
And YouTube just rolled out a new tool specifically

00:05:27.079 --> 00:05:29.879
for detecting likenesses, designed to combat

00:05:29.879 --> 00:05:32.819
AI deepfakes and flag synthetic media automatically.

00:05:33.259 --> 00:05:35.899
The control tools are racing to keep up with

00:05:35.899 --> 00:05:38.000
what the AI can do. At the same time, you've

00:05:38.000 --> 00:05:40.319
got this big investment in people. Companies

00:05:40.319 --> 00:05:43.689
like Microsoft, OpenAI, Anthropic. They're putting

00:05:43.689 --> 00:05:46.829
$23 million together just to train, what, 400,000

00:05:46.829 --> 00:05:49.990
U.S. teachers on how to use AI tools effectively

00:05:49.990 --> 00:05:52.689
and responsibly? So, OK, if you zoom out, look

00:05:52.689 --> 00:05:54.449
at all these things, the Atlas agent, the super

00:05:54.449 --> 00:05:57.129
fast chips, the new guardrails, the teacher training.

00:05:58.139 --> 00:06:00.540
What's the common thread here? What connects

00:06:00.540 --> 00:06:03.000
these different moves? I'd say it's about better

00:06:03.000 --> 00:06:05.220
control and smarter integration. That seems to

00:06:05.220 --> 00:06:08.279
be the focus for the next wave of AI tools. Right.

00:06:08.319 --> 00:06:10.519
That smarter integration idea brings us beyond

00:06:10.519 --> 00:06:12.560
just basic prompting. We've all had that experience,

00:06:12.759 --> 00:06:14.819
right? You ask an AI for something complex, something

00:06:14.819 --> 00:06:17.360
specific, and you just get back generic mush.

00:06:17.680 --> 00:06:20.199
So what happens when simple one -off prompts

00:06:20.199 --> 00:06:22.420
just aren't enough anymore? Yeah, that's where

00:06:22.420 --> 00:06:24.779
simple prompting falls down. It fails because

00:06:24.779 --> 00:06:27.500
the AI doesn't really have memory between chats.

00:06:27.759 --> 00:06:30.240
It starts fresh every single time, forgets the

00:06:30.240 --> 00:06:32.740
context, forgets your company style guide, whatever.

00:06:32.939 --> 00:06:35.240
That's where context engineering comes into play.

00:06:35.439 --> 00:06:37.519
Okay, so break down context engineering for us,

00:06:37.519 --> 00:06:40.120
simple terms. It's basically giving the AI a

00:06:40.120 --> 00:06:44.339
kind of permanent organizational memory for specific

00:06:44.339 --> 00:06:46.519
tasks. And how does that actually work in practice?

00:06:47.000 --> 00:06:49.100
Well, it often uses techniques like retrieval

00:06:49.100 --> 00:06:52.000
augmented generation, or RAG. You're essentially

00:06:52.000 --> 00:06:54.819
pointing the AI towards a defined, limited set

00:06:54.819 --> 00:06:57.339
of your own data, like internal manuals or style

00:06:57.339 --> 00:06:59.759
guides, maybe through system prompts. This stops

00:06:59.759 --> 00:07:02.259
it from just defaulting to generic web knowledge

00:07:02.259 --> 00:07:04.740
and gives it consistent, relevant context for

00:07:04.740 --> 00:07:07.199
your needs. So why is just relying on simple

00:07:07.199 --> 00:07:09.980
prompting not good enough for complex or business

00:07:09.980 --> 00:07:13.000
-related tasks? Because the AI needs that history,

00:07:13.139 --> 00:07:15.319
that memory, to give results that are actually

00:07:15.319 --> 00:07:18.540
relevant and not just generic filler. Now, this

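NOTE
Editor's note: a minimal, self-contained sketch of the retrieval-augmented
generation (RAG) idea described above. Everything here is a hypothetical
illustration, not any specific vendor's API: score your own documents against
the question, then prepend the best matches to the prompt so the model
answers from your context instead of defaulting to generic web knowledge.
The word-overlap scoring is a toy stand-in for real embedding similarity.
    # rag_sketch.py -- toy retrieval-augmented generation, pure Python
    def tokenize(text: str) -> set[str]:
        return set(text.lower().split())
    def score(query: str, doc: str) -> float:
        # Hypothetical relevance score: fraction of query words found in doc.
        q, d = tokenize(query), tokenize(doc)
        return len(q & d) / (len(q) or 1)
    def build_prompt(question: str, docs: list[str], k: int = 2) -> str:
        # Retrieve the k most relevant documents, then ground the prompt in them.
        top = sorted(docs, key=lambda d: score(question, d), reverse=True)[:k]
        context = "\n".join(f"- {d}" for d in top)
        return (f"Answer using only this internal context:\n{context}\n\n"
                f"Question: {question}")
    # Hypothetical internal documents standing in for manuals and style guides.
    docs = ["Style guide: headlines use sentence case.",
            "Refund policy: refunds allowed within 30 days.",
            "Release notes: v2 adds CSV export."]
    print(build_prompt("What is our refund window?", docs))
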
00:07:18.540 --> 00:07:20.720
whole search for better context for objective

00:07:20.720 --> 00:07:24.019
information runs smack into another huge problem.

00:07:24.720 --> 00:07:27.220
Invisible bias. We tend to trust these systems

00:07:27.220 --> 00:07:30.160
maybe too much to be neutral. But what if the

00:07:30.160 --> 00:07:32.980
bias isn't obvious? What if it's hidden deep

00:07:32.980 --> 00:07:35.779
inside the data they were trained on? Yeah, this

00:07:35.779 --> 00:07:37.439
is where that study from Penn State and Oregon

00:07:37.439 --> 00:07:40.180
State University comes in. They ran a really...

00:07:40.589 --> 00:07:43.050
quite chilling test looking specifically at whether

00:07:43.050 --> 00:07:45.290
users can even spot bias when it's baked right

00:07:45.290 --> 00:07:47.730
into a system that otherwise seems to work fine.

00:07:47.870 --> 00:07:50.069
They trained a facial emotion recognition system,

00:07:50.170 --> 00:07:53.310
but they deliberately skewed the data. They basically

00:07:53.310 --> 00:07:56.329
taught it that happy equals white faces and sad

00:07:56.329 --> 00:07:59.410
equals black faces. So a clear systemic racial

00:07:59.410 --> 00:08:01.689
bias was built in from the ground up. And the

00:08:01.689 --> 00:08:03.649
findings on whether people notice this, it's

00:08:03.649 --> 00:08:05.829
really unsettling. When they tested this biased

00:08:05.829 --> 00:08:08.839
system, most everyday users just didn't spot

00:08:08.839 --> 00:08:11.800
the racial bias at all. Right. Awareness turned

00:08:11.800 --> 00:08:14.399
out to be highly conditional. It was mainly the

00:08:14.399 --> 00:08:16.319
black participants who started noticing something

00:08:16.319 --> 00:08:18.680
was wrong when the AI kept misclassifying

00:08:18.680 --> 00:08:21.439
emotions on black faces. The really stark part:

00:08:21.439 --> 00:08:25.360
white participants mostly remained unaware. Which,

00:08:25.420 --> 00:08:28.199
depressingly, means the engineered bias worked

00:08:28.199 --> 00:08:30.959
exactly as intended: it was invisible to those

00:08:30.959 --> 00:08:33.419
it didn't negatively affect. That points to a

00:08:33.419 --> 00:08:35.480
really deep danger in how much we trust these

00:08:35.480 --> 00:08:38.259
things. People generally assume the AI was neutral,

00:08:38.440 --> 00:08:40.860
objective, even when its output showed clear

00:08:40.860 --> 00:08:43.700
failures or bias against one group. It really

00:08:43.700 --> 00:08:46.360
shows how easily systemic bias can hide inside

00:08:46.360 --> 00:08:49.679
systems we rely on and trust. So what's the fundamental

00:08:49.679 --> 00:08:53.159
danger of this invisible yet trusted bias then?

00:08:53.340 --> 00:08:55.340
I guess it's that if we can't even see the bias,

00:08:55.399 --> 00:08:57.279
we end up just automatically handing over our

00:08:57.279 --> 00:08:59.120
judgment to systems that are fundamentally flawed.

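NOTE
Editor's note: a minimal sketch of how this kind of hidden skew gets surfaced
in practice: disaggregate the evaluation by group instead of reporting one
aggregate accuracy. The records below are made-up placeholders, not the Penn
State / Oregon State data; the point is simply that an overall score can look
fine while one subgroup's error rate is far worse.
    # bias_audit_sketch.py -- per-group accuracy, pure Python
    from collections import defaultdict
    def per_group_accuracy(records):
        """records: (group, true_label, predicted_label) tuples."""
        hits, totals = defaultdict(int), defaultdict(int)
        for group, truth, pred in records:
            totals[group] += 1
            hits[group] += (truth == pred)  # bool counts as 0 or 1
        return {g: hits[g] / totals[g] for g in totals}
    # Hypothetical predictions from an emotion classifier.
    records = [
        ("group_a", "happy", "happy"), ("group_a", "sad", "sad"),
        ("group_a", "happy", "happy"), ("group_a", "sad", "sad"),
        ("group_b", "happy", "sad"),   ("group_b", "sad", "sad"),
        ("group_b", "happy", "sad"),   ("group_b", "sad", "sad"),
    ]
    # Aggregate accuracy is 75%, but group_b gets every "happy" face wrong.
    print(per_group_accuracy(records))  # {'group_a': 1.0, 'group_b': 0.5}
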
00:08:59.460 --> 00:09:01.840
Yeah. So looking back across all these sources,

00:09:02.000 --> 00:09:04.399
the picture of AI is just full of these contrasts,

00:09:04.399 --> 00:09:06.139
isn't it? It's incredibly powerful. You've got

00:09:06.139 --> 00:09:08.879
Atlas agents, potentially thousand times faster

00:09:08.879 --> 00:09:11.919
chips coming. But it's also disturbingly fragile,

00:09:12.259 --> 00:09:15.679
prone to this brain rot. And as that emotion

00:09:15.679 --> 00:09:19.190
study showed, it's deeply prone to just reflecting

00:09:19.190 --> 00:09:22.750
back our own worst biases, often invisibly. It

00:09:22.750 --> 00:09:24.549
really feels like it all comes down to the quality

00:09:24.549 --> 00:09:27.029
of what goes in and how objective that reflection

00:09:27.029 --> 00:09:30.450
really is. I mean, if AI starts thinking like

00:09:30.450 --> 00:09:32.649
clickbait because its core programming gets corrupted

00:09:32.649 --> 00:09:35.370
by junk, and then on top of that, we can't even

00:09:35.370 --> 00:09:37.610
spot the deep biases in the data we feed it.

00:09:38.029 --> 00:09:40.409
Well, who's actually in control of objective

00:09:40.409 --> 00:09:42.490
truth then? That's definitely something to chew

00:09:42.490 --> 00:09:45.600
on this week. An excellent thought to end with.

00:09:45.720 --> 00:09:47.559
Thanks everyone for joining us for this deep

00:09:47.559 --> 00:09:49.419
dive into the source material. Yeah, thanks for

00:09:49.419 --> 00:09:52.120
listening. Use this knowledge wisely. Until next

00:09:52.120 --> 00:09:52.360
time.
