WEBVTT

00:00:00.000 --> 00:00:04.940
It feels like every single week, there's a new

00:00:04.940 --> 00:00:09.300
viral headline about artificial intelligence

00:00:09.300 --> 00:00:11.599
just making something up. Oh, absolutely. It's

00:00:11.599 --> 00:00:13.800
literally everywhere right now. Right. Like a

00:00:13.800 --> 00:00:15.859
chatbot invents a massive corporate scandal

00:00:15.859 --> 00:00:19.760
out of nowhere, or it confidently gives you detailed

00:00:19.760 --> 00:00:21.640
driving directions to a restaurant that actually

00:00:21.640 --> 00:00:23.780
burned down 10 years ago. Yeah, and it almost

00:00:23.780 --> 00:00:26.359
always comes packaged with this very specific,

00:00:26.839 --> 00:00:30.780
almost medical-sounding label. The media, developers,

00:00:30.920 --> 00:00:32.799
pretty much everyone says the AI is hallucinating.

00:00:32.859 --> 00:00:34.619
Okay, let's unpack this. Yeah. Because we are

00:00:34.619 --> 00:00:36.259
going straight to the source today to figure

00:00:36.259 --> 00:00:38.020
out what that actually means for you. I really

00:00:38.020 --> 00:00:40.159
need to, yeah. We've pulled together a comprehensive

00:00:40.159 --> 00:00:43.340
Wikipedia article simply titled, Hallucination

00:00:43.340 --> 00:00:46.420
in Artificial Intelligence. And our mission for

00:00:46.420 --> 00:00:48.979
this deep dive is to demystify the

00:00:48.979 --> 00:00:51.039
actual mechanics of what is happening inside

00:00:51.039 --> 00:00:54.159
these machines when they lie to us. Right, the

00:00:54.159 --> 00:00:56.420
actual gears turning in the background. Exactly.

00:00:56.520 --> 00:01:00.259
We are going to explore the hilarious and sometimes

00:01:00.259 --> 00:01:02.359
incredibly high-stakes consequences of these

00:01:02.359 --> 00:01:04.659
glitches in the real world. And perhaps most

00:01:04.659 --> 00:01:07.340
surprisingly, we're going to look at why this

00:01:07.340 --> 00:01:10.480
exact same glitch, the very thing causing so

00:01:10.480 --> 00:01:12.840
much panic in the tech world, might actually

00:01:12.840 --> 00:01:15.959
be the key to Nobel Prize-winning science. Which

00:01:15.959 --> 00:01:20.780
is just wild to think about. But when I hear

00:01:20.780 --> 00:01:24.359
the word hallucination, I instantly picture like

00:01:24.359 --> 00:01:27.219
a psychedelic drug trip. Sure. Yeah. Or a severe

00:01:27.219 --> 00:01:29.739
fever dream. Right. And framing it that way just

00:01:29.739 --> 00:01:32.159
feels completely wrong for a computer program

00:01:32.159 --> 00:01:35.129
running on silicon chips. It doesn't have a brain.

00:01:35.489 --> 00:01:37.489
It really doesn't. If I had to use an analogy,

00:01:38.189 --> 00:01:41.090
AI hallucination feels a lot less like a psychological

00:01:41.090 --> 00:01:44.049
break from reality and a lot more like that one

00:01:44.049 --> 00:01:45.969
overly confident friend you have at trivia night.

00:01:46.430 --> 00:01:48.489
Oh, everybody has that friend. Do you know the

00:01:48.489 --> 00:01:50.879
one? The friend who, rather than just admitting

00:01:50.879 --> 00:01:53.159
they don't know the capital of some obscure country,

00:01:53.659 --> 00:01:56.079
will just invent an entire historically plausible

00:01:56.079 --> 00:01:57.719
war just to sound like they know what they're

00:01:57.719 --> 00:01:59.700
talking about. That is actually a remarkably

00:01:59.700 --> 00:02:02.340
accurate way to think about the behavior. Yeah.

00:02:02.359 --> 00:02:04.379
And what's fascinating here is that the word

00:02:04.379 --> 00:02:06.659
hallucination wasn't always used to describe

00:02:06.659 --> 00:02:09.800
a machine making a mistake. Wait, really? It

00:02:09.800 --> 00:02:12.620
wasn't a negative thing? No, not at all. To really

00:02:12.620 --> 00:02:15.000
understand why AI does this, we have to look

00:02:15.000 --> 00:02:16.960
at the evolution of the term itself because it

00:02:16.960 --> 00:02:19.620
used to mean the exact opposite. How so? Well,

00:02:19.620 --> 00:02:22.000
if you go back to the 1980s in the field of computer

00:02:22.000 --> 00:02:25.280
vision, hallucination actually had a highly positive

00:02:25.280 --> 00:02:27.659
connotation. I mean, I would imagine hallucinating

00:02:27.659 --> 00:02:30.740
in computer vision means the machine is seeing

00:02:30.740 --> 00:02:33.340
things that aren't there, like obstacles in the

00:02:33.340 --> 00:02:36.520
road. How could that be positive? So think about

00:02:36.520 --> 00:02:39.919
a really blurry, heavily pixelated, low-resolution

00:02:39.919 --> 00:02:43.039
photo of a human face. Maybe it's from like an

00:02:43.039 --> 00:02:44.800
old security camera. OK, yeah, super grainy.

00:02:44.960 --> 00:02:47.560
Exactly. Researchers developed algorithms to

00:02:47.560 --> 00:02:49.780
automatically add high-resolution details to

00:02:49.780 --> 00:02:52.219
that image. It was filling in the gaps to make

00:02:52.219 --> 00:02:54.400
it look sharp and realistic for human analysts.

00:02:54.639 --> 00:02:57.500
Oh, wow. So, like, enhancing it. Right. And they

00:02:57.500 --> 00:03:00.439
literally called this process face hallucination.

00:03:01.120 --> 00:03:03.159
The first documented use of the term in this

00:03:03.159 --> 00:03:05.800
way was in a PhD thesis by Eric Mjolsness back

00:03:05.800 --> 00:03:09.379
in 1986. 1986. That is way earlier than I expected.

00:03:09.599 --> 00:03:13.219
Yeah, it goes way back. The AI was mathematically

00:03:13.219 --> 00:03:16.659
synthesizing details, pixels really, that weren't

00:03:16.659 --> 00:03:19.680
strictly in the original data, but it was doing

00:03:19.680 --> 00:03:22.259
it in a helpful, desired way to reconstruct an

00:03:22.259 --> 00:03:24.520
image. So it was essentially an enhancement tool.

00:03:24.659 --> 00:03:27.419
It was generating helpful filler, not deceiving

00:03:27.419 --> 00:03:30.199
the user. Exactly. It was a feature, not a bug.

00:03:30.740 --> 00:03:33.379
OK, so when did the tech industry start using

00:03:33.379 --> 00:03:37.180
it to describe this modern negative failure mode

00:03:37.180 --> 00:03:40.000
that we see today? The semantic shift really

00:03:40.000 --> 00:03:43.080
began in the 2010s. Researchers who were working

00:03:43.080 --> 00:03:45.719
on statistical machine translation, you know,

00:03:45.919 --> 00:03:47.900
getting computers to translate languages, started

00:03:47.900 --> 00:03:50.560
noticing a weird phenomenon. What kind of phenomenon?

00:03:50.740 --> 00:03:52.900
Well, sometimes the models would just spit out

00:03:52.900 --> 00:03:55.419
factually incorrect or totally misleading outputs

00:03:55.419 --> 00:03:57.340
that had absolutely nothing to do with the source

00:03:57.340 --> 00:03:59.280
text. Just completely making up translations.

00:03:59.439 --> 00:04:02.360
Right. The math was prioritizing linguistic fluency

00:04:02.360 --> 00:04:05.259
over factual accuracy. It wanted to sound smooth

00:04:05.259 --> 00:04:07.240
more than it wanted to be right. That sounds

00:04:07.240 --> 00:04:10.139
exactly like the trivia friend. Exactly. And

00:04:10.139 --> 00:04:14.699
in 2015, computer scientist Andrej Karpathy explicitly

00:04:14.699 --> 00:04:17.339
used the term hallucinated to describe his language

00:04:17.339 --> 00:04:20.360
model generating a completely fake, yet perfectly

00:04:20.360 --> 00:04:23.980
formatted citation link. A fake link. Wow. Yeah.

00:04:24.399 --> 00:04:27.220
And then the term just absolutely exploded in

00:04:27.220 --> 00:04:29.699
popularity during the recent AI boom, especially

00:04:29.699 --> 00:04:33.560
after OpenAI released ChatGPT in late 2022. Right,

00:04:33.560 --> 00:04:35.600
because suddenly millions of people were experiencing

00:04:35.600 --> 00:04:38.439
these incredibly confident falsehoods firsthand.

00:04:38.680 --> 00:04:40.279
It wasn't just researchers anymore. Exactly.

00:04:40.279 --> 00:04:42.240
It was right there on your phone or your laptop.

00:04:42.480 --> 00:04:44.439
Reading through the source material though, it's

00:04:44.439 --> 00:04:46.300
pretty obvious that not everyone is happy with

00:04:46.300 --> 00:04:48.620
the tech industry adopting that word. There is

00:04:48.620 --> 00:04:50.720
a massive amount of pushback from the scientific

00:04:50.720 --> 00:04:52.779
community. Oh, there is a lot of pushback. Researchers

00:04:52.779 --> 00:04:55.180
like Mary Shaw and statistician Gary N. Smith

00:04:55.180 --> 00:04:57.740
argue that using the word hallucination is deeply

00:04:57.740 --> 00:05:01.439
dangerous. Dangerous? Because it heavily anthropomorphizes

00:05:01.439 --> 00:05:04.040
the software. It forces us to treat a machine

00:05:04.040 --> 00:05:07.029
like it has a biological human mind. Which it

00:05:07.029 --> 00:05:09.350
definitely does not. Right. Smith points out

00:05:09.350 --> 00:05:12.529
that these large language models, or LLMs, do

00:05:12.529 --> 00:05:15.470
not actually understand what words mean. They

00:05:15.470 --> 00:05:17.509
don't have sensory perception. They're just math.

00:05:17.689 --> 00:05:20.389
Just math. So they physically cannot experience

00:05:20.389 --> 00:05:23.209
a hallucination in the perceptual, psychological

00:05:23.209 --> 00:05:25.550
sense that a human does. Honestly, calling it

00:05:25.550 --> 00:05:28.810
a hallucination seems like it gives the AI entirely

00:05:28.810 --> 00:05:31.850
too much credit. It almost feels like a PR spin.

00:05:32.139 --> 00:05:35.079
That is exactly the concern Mary Shaw raised.

00:05:35.339 --> 00:05:37.779
Because it lets developers off the hook for bad

00:05:37.779 --> 00:05:40.980
programming by framing a critical software error

00:05:40.980 --> 00:05:45.050
as this quirky psychological human flaw. I mean,

00:05:45.129 --> 00:05:47.730
it sounds a lot better to say my AI is hallucinating

00:05:47.730 --> 00:05:50.250
rather than my software is broken and producing

00:05:50.250 --> 00:05:53.009
garbage. It absolutely softens the blow by spinning

00:05:53.009 --> 00:05:56.430
objective errors into idiosyncratic quirks. And

00:05:56.430 --> 00:05:58.370
because of this danger, researchers have proposed

00:05:58.370 --> 00:06:00.610
several alternative terms in the scientific literature.

00:06:00.649 --> 00:06:02.550
Like what? What should we be calling it? Well,

00:06:02.589 --> 00:06:04.509
you'll see words like confabulation, delusion,

00:06:04.629 --> 00:06:06.889
or fabrication. But my personal favorite comes

00:06:06.889 --> 00:06:09.069
from a paper published in the journal Ethics

00:06:09.069 --> 00:06:11.850
and Information Technology by Hicks, Humphries,

00:06:11.990 --> 00:06:13.949
and Slater. Oh wait, I remember this from the

00:06:13.949 --> 00:06:16.759
reading. They used philosopher Harry Frankfurt's

00:06:16.759 --> 00:06:18.879
definition of bullshit, didn't they? They did.

00:06:18.939 --> 00:06:22.000
They literally defined AI outputs as bullshit.

00:06:22.459 --> 00:06:24.779
And they aren't just swearing for shock value,

00:06:24.819 --> 00:06:28.160
either. They are applying Frankfurt's rigorous

00:06:28.160 --> 00:06:31.459
philosophical framework. OK, I love this. Explain

00:06:31.459 --> 00:06:34.019
the framework for us. So Frankfurt argued that

00:06:34.019 --> 00:06:36.779
a liar is someone who knows the truth and actively

00:06:36.779 --> 00:06:38.860
tries to hide it. Right. There's an intent to

00:06:38.860 --> 00:06:41.959
deceive. Exactly. But a bullshitter is someone

00:06:41.959 --> 00:06:44.610
who is completely indifferent to the truth.

00:06:44.629 --> 00:06:47.350
Oh, that makes so much sense. Right. The AI model

00:06:47.350 --> 00:06:50.329
doesn't care what is real and what isn't. It

00:06:50.329 --> 00:06:53.269
is just mathematically generating text. If a

00:06:53.269 --> 00:06:55.329
statement happens to be true, it is accidentally

00:06:55.329 --> 00:06:58.069
true. If it is false, it is accidentally false.

00:06:58.089 --> 00:07:00.209
So it's just the ultimate bullshit engine. That's

00:07:00.209 --> 00:07:02.430
exactly what it is. So if it's not a psychological

00:07:02.430 --> 00:07:05.310
hallucination and it's just indifferent mathematical

00:07:05.310 --> 00:07:07.889
text generation, we really need to look at the

00:07:07.889 --> 00:07:10.149
gears turning in the background. We do. If you

00:07:10.149 --> 00:07:14.100
take an LLM like GPT, at its core it's essentially

00:07:14.100 --> 00:07:17.019
just an incredibly complex next-word predictor,

00:07:17.019 --> 00:07:20.779
right? That is essentially it. These models ingest

00:07:20.779 --> 00:07:23.120
a massive amount of text during their training.

00:07:23.720 --> 00:07:25.819
When you ask them a question, they aren't looking

00:07:25.819 --> 00:07:28.319
up facts in a database like a traditional search

00:07:28.319 --> 00:07:30.459
engine. They're not Googling the answer. No,

00:07:30.480 --> 00:07:33.160
not at all. They are calculating the statistical

00:07:33.160 --> 00:07:35.779
probability of what the next word should be in

00:07:35.779 --> 00:07:38.699
a sequence over and over again. It sounds a lot

00:07:38.699 --> 00:07:40.819
like the predictive text feature on a smartphone

00:07:40.819 --> 00:07:43.500
keyboard. Yes, that's a great comparison. Like

00:07:43.500 --> 00:07:46.019
when you start typing a text message, your phone

00:07:46.019 --> 00:07:48.779
suggests the next word to keep the sentence going.

00:07:48.649 --> 00:07:51.709
based on your previous habits. An LLM is doing

00:07:51.709 --> 00:07:54.509
that, but on a massive, highly articulate scale.

00:07:54.850 --> 00:07:56.889
Very articulate. It just wants to keep the conversation

00:07:56.889 --> 00:07:59.730
going and it is heavily incentivized to give

00:07:59.730 --> 00:08:02.470
a guess even when it totally lacks the underlying

00:08:02.470 --> 00:08:05.040
factual information. And that incentive creates

00:08:05.040 --> 00:08:07.800
a fascinating tension inside the model between

00:08:07.800 --> 00:08:10.959
two competing goals, novelty and usefulness.

00:08:11.180 --> 00:08:13.199
Novelty and usefulness. OK, break that down.

00:08:13.480 --> 00:08:15.620
Researchers actually draw a parallel to human

00:08:15.620 --> 00:08:18.720
creativity here. If an AI system is focused entirely

00:08:18.720 --> 00:08:22.199
on usefulness, it will just regurgitate and memorize

00:08:22.199 --> 00:08:24.600
safe, boring content. So it would just sound

00:08:24.600 --> 00:08:27.639
like a sterile robotic textbook. Exactly. Super

00:08:27.639 --> 00:08:32.220
boring. But if the system prioritizes novelty,

00:08:33.100 --> 00:08:36.600
you know, trying to generate diverse, unique,

00:08:36.860 --> 00:08:40.100
natural-sounding text, it drastically increases

00:08:40.100 --> 00:08:42.580
the chances of generating original but entirely

00:08:42.580 --> 00:08:44.679
inaccurate responses. It makes things up to

00:08:44.679 --> 00:08:46.500
sound interesting. Yeah. But how does that actually

00:08:46.500 --> 00:08:48.700
work in practice? Like, how do developers force

00:08:48.700 --> 00:08:51.559
a math equation to be novel? They use decoding

00:08:51.559 --> 00:08:53.639
strategies. And the most common one is something

00:08:53.639 --> 00:08:55.960
called top-k sampling. Top-k sampling.

00:08:56.000 --> 00:08:58.799
Yeah. Normally, an AI would just pick the single

00:08:58.799 --> 00:09:02.120
most mathematically probable next word. Always.

00:09:02.259 --> 00:09:04.600
Which would be the boring textbook output. Right.

00:09:04.679 --> 00:09:06.779
So to make it sound more conversational and human,

00:09:07.379 --> 00:09:09.600
developers force the AI to randomly select from

00:09:09.600 --> 00:09:12.159
a pool of the top k most likely words. Oh, I

00:09:12.159 --> 00:09:13.960
see. So instead of the number one word, it picks

00:09:13.960 --> 00:09:16.279
from a group. Exactly. Say the top 40 words.

00:09:16.879 --> 00:09:19.120
By injecting that literal dice roll into the

00:09:19.120 --> 00:09:21.639
selection process, the AI sounds vastly more

00:09:21.639 --> 00:09:23.799
dynamic. But I'm guessing that dice roll is where

00:09:23.799 --> 00:09:26.610
the problem starts. You guessed it. The data

00:09:26.610 --> 00:09:29.570
shows that this exact injection of randomness

00:09:29.570 --> 00:09:32.970
is positively correlated with increased hallucinations.

00:09:33.629 --> 00:09:35.970
It branches off into fiction. Because it's trying

00:09:35.970 --> 00:09:38.549
so hard to sound interesting and conversational

00:09:38.549 --> 00:09:41.269
that it physically overrides its own accuracy.

00:09:41.490 --> 00:09:44.110
Exactly. But do we actually know what is happening

00:09:44.110 --> 00:09:46.950
on a microscopic level inside the neural network

00:09:46.950 --> 00:09:49.289
when it decides to make something up? Like, can

00:09:49.289 --> 00:09:51.149
we see it happening? We are actually getting

00:09:51.149 --> 00:09:53.070
much closer to understanding that microscopic

00:09:53.070 --> 00:09:56.980
level. In 2025, the AI company Anthropic published

00:09:56.980 --> 00:09:59.580
some groundbreaking interpretability research

00:09:59.580 --> 00:10:02.820
on their LLM Claude. Interpretability research,

00:10:02.840 --> 00:10:04.940
meaning they are trying to interpret the black

00:10:04.940 --> 00:10:08.200
box. Yes. They essentially took a flashlight

00:10:08.200 --> 00:10:10.419
under the hood of the neural network. They weren't

00:10:10.419 --> 00:10:13.179
just guessing about probabilities anymore. They

00:10:13.179 --> 00:10:16.019
mapped millions of artificial neurons to find

00:10:16.019 --> 00:10:18.940
specific internal circuits. Wow. And what did

00:10:18.940 --> 00:10:21.629
they find? They found a specific circuit designed

00:10:21.629 --> 00:10:24.669
to inhibit the AI, to physically stop it from

00:10:24.669 --> 00:10:26.629
answering a question if it doesn't have the factual

00:10:26.629 --> 00:10:28.669
data. So there's an actual "I don't know" circuit.

00:10:29.070 --> 00:10:31.710
Basically, yeah. By default, the circuit is active,

00:10:31.870 --> 00:10:35.289
meaning the AI stays quiet. But when the AI determines

00:10:35.289 --> 00:10:38.269
it has enough information, that circuit is inhibited,

00:10:38.470 --> 00:10:41.110
and it speaks. Okay, so a hallucination must

00:10:41.110 --> 00:10:43.730
be a misfire of that exact inhibition circuit,

00:10:43.730 --> 00:10:46.889
right? Precisely. They found that if Claude recognizes

00:10:46.889 --> 00:10:50.330
a specific token, like say the name of a somewhat

00:10:50.330 --> 00:10:52.809
famous person, but it doesn't actually have sufficient

00:10:52.809 --> 00:10:55.570
data about that person's life, the circuit incorrectly

00:10:55.570 --> 00:10:58.669
inhibits itself anyway. Oh no, so it gets overconfident.

00:10:58.769 --> 00:11:01.809
Yeah. The AI essentially says, oh, I recognize

00:11:01.809 --> 00:11:04.309
that name. I can definitely answer this. It completely

00:11:04.309 --> 00:11:07.470
fails to check if the connected factual tokens

00:11:07.470 --> 00:11:10.549
are actually present and just proceeds to generate

00:11:10.549 --> 00:11:13.730
a highly plausible, completely untrue biography.

00:11:13.909 --> 00:11:16.210
Just totally wings it. Yeah. Knowing those mechanics

00:11:16.210 --> 00:11:18.669
makes the bizarre real-world outputs we see make

00:11:18.669 --> 00:11:20.950
so much more sense. It really does. If the AI

00:11:20.950 --> 00:11:23.470
is constantly playing this tug of war between

00:11:23.470 --> 00:11:25.889
being mathematically safe and randomly selecting

00:11:25.889 --> 00:11:27.929
words to sound creative, you are going to get

00:11:27.929 --> 00:11:30.759
some wild swings. It truly is the Wild West of

00:11:30.759 --> 00:11:33.240
AI outputs right now. That tension produces some

00:11:33.240 --> 00:11:36.159
incredible artifacts. Some of the examples in

00:11:36.159 --> 00:11:38.879
the source material are purely hilarious. There

00:11:38.879 --> 00:11:42.240
was this data scientist named Teresa Kubacka who

00:11:42.240 --> 00:11:45.679
literally made up a gibberish phrase. Oh, the

00:11:45.679 --> 00:11:49.220
cycloidal inverted electromagnet. Yes, the cycloidal

00:11:49.220 --> 00:11:52.240
inverted electromagnet. She asked ChatGPT to

00:11:52.240 --> 00:11:54.559
explain it. Yeah. And instead of the inhibition

00:11:54.559 --> 00:11:56.840
circuit firing and the AI saying, hey, that's

00:11:56.840 --> 00:12:00.620
not real, the AI invented a deeply scientific,

00:12:01.039 --> 00:12:03.860
plausible-sounding answer. Complete with fake

00:12:03.860 --> 00:12:07.250
academic citations, right? Yes. It sounded so

00:12:07.250 --> 00:12:09.269
authentic, she actually had to double check to

00:12:09.269 --> 00:12:11.110
make sure she hadn't accidentally typed the name

00:12:11.110 --> 00:12:13.629
of a real physics phenomenon. It's a perfect

00:12:13.629 --> 00:12:16.370
illustration of the model prioritizing conversational

00:12:16.370 --> 00:12:18.509
flow and pattern completion over fact-checking.

00:12:19.029 --> 00:12:20.909
It saw a prompt that looked like a science question,

00:12:20.929 --> 00:12:22.950
so it generated a response that looked like a

00:12:22.950 --> 00:12:25.529
science answer. Exactly. There was also the time

00:12:25.529 --> 00:12:27.789
someone asked for proof that dinosaurs built

00:12:27.789 --> 00:12:30.450
a civilization. I remember that one. The AI just

00:12:30.450 --> 00:12:32.409
rolled with it. It claimed there were fossil

00:12:32.409 --> 00:12:35.210
remains of dinosaur tools and that they created

00:12:35.370 --> 00:12:39.649
engravings on stones. Which is amazing. But my

00:12:39.649 --> 00:12:41.850
absolute favorite example from the sources, and

00:12:41.850 --> 00:12:45.389
you're gonna love this, a prompt convinced ChatGPT

00:12:45.389 --> 00:12:49.309
that churros, like the fried dough pastries,

00:12:49.789 --> 00:12:53.309
are ideal tools for home surgery. Home surgery

00:12:53.309 --> 00:12:56.889
with a churro. Yes, the AI boldly claimed there

00:12:56.889 --> 00:12:59.049
was a study published in the prestigious journal

00:12:59.049 --> 00:13:01.649
Science proving that churro dough is pliable

00:13:01.649 --> 00:13:03.970
enough to form surgical instruments. Oh my god.

00:13:04.590 --> 00:13:06.389
It even added that the cinnamon sugar flavor

00:13:06.389 --> 00:13:08.870
has a calming effect on patients. That is just

00:13:08.870 --> 00:13:11.250
incredible. The churro surgery is funny, sure,

00:13:11.450 --> 00:13:13.889
but it perfectly illustrates a really dangerous

00:13:13.889 --> 00:13:17.350
mechanical flaw called a cascaded error. A cascaded

00:13:17.350 --> 00:13:19.309
error. Let's dig into that because it generates

00:13:19.309 --> 00:13:21.629
text one word at a time, right? Right. So once

00:13:21.629 --> 00:13:23.889
it predicts that first false word about the churro

00:13:23.889 --> 00:13:26.110
being a medical tool, it essentially poisons

00:13:26.110 --> 00:13:28.929
its own well. Yeah. It has to use its own generated

00:13:28.929 --> 00:13:31.309
falsehood as the context for the next word. Yes,

00:13:31.309 --> 00:13:33.210
exactly. It locks itself into the narrative.

00:13:33.799 --> 00:13:36.360
Once it affirms the absurd premise of the prompt,

00:13:36.720 --> 00:13:39.120
it is mathematically forced to double down on

00:13:39.120 --> 00:13:42.740
the lie to maintain consistency. Ah, I see. That

00:13:42.740 --> 00:13:45.820
is exactly how it ends up inventing fake science

00:13:45.820 --> 00:13:48.799
journal articles. It needs evidence to support

00:13:48.799 --> 00:13:51.360
the initial random hallucination it just made.

00:13:51.539 --> 00:13:53.539
But here's why this actually matters to you,

00:13:53.539 --> 00:13:57.120
the listener. We can laugh about dinosaur civilizations

00:13:57.120 --> 00:13:59.820
and pastry surgery all day. Right. It's entertaining.

00:13:59.980 --> 00:14:02.980
But if these models are fundamentally guessing…

00:14:03.100 --> 00:14:05.779
And we know they are prone to these cascaded

00:14:05.779 --> 00:14:08.860
errors where they trap themselves in lies. Why

00:14:08.860 --> 00:14:11.740
are highly paid professionals blindly trusting

00:14:11.740 --> 00:14:14.960
them with critical high stakes tasks? That is

00:14:14.960 --> 00:14:17.370
the multimillion-dollar question. And the

00:14:17.370 --> 00:14:19.309
real-world consequences have already been severe.

00:14:19.389 --> 00:14:21.970
We are seeing major professional disasters across

00:14:21.970 --> 00:14:24.330
multiple industries. Yeah, the sources had some

00:14:24.330 --> 00:14:26.649
wild examples. Take the airline industry, for

00:14:26.649 --> 00:14:30.190
example. In February 2024, Air Canada was taken

00:14:30.190 --> 00:14:32.769
to a civil resolution tribunal. Because of their

00:14:32.769 --> 00:14:35.289
chatbot, right? Exactly. Their customer support

00:14:35.289 --> 00:14:38.009
chatbot had hallucinated a bereavement fare policy.

00:14:38.549 --> 00:14:41.029
It told a grieving customer they could retroactively

00:14:41.029 --> 00:14:43.490
request a bereavement discount within 90 days

00:14:43.490 --> 00:14:45.649
of ticketing. Which was just entirely false.

00:14:46.269 --> 00:14:49.009
And Air Canada actually tried to argue in court

00:14:49.009 --> 00:14:52.309
that the chatbot was a separate legal entity

00:14:52.309 --> 00:14:55.230
responsible for its own actions, which is just

00:14:55.230 --> 00:14:58.649
a wild legal defense. It really is. And the tribunal

00:14:58.649 --> 00:15:01.129
completely rejected that defense, ordering the

00:15:01.129 --> 00:15:03.629
airline to pay damages and honor the hallucinated

00:15:03.629 --> 00:15:07.169
policy. The precedent is set. The company is

00:15:07.169 --> 00:15:09.289
responsible for the text its tools generate.

00:15:10.090 --> 00:15:12.950
Then there's the legal world, the 2023 case of

00:15:12.950 --> 00:15:16.240
Mata v. Avianca. Oh, that was a huge deal. A

00:15:16.240 --> 00:15:18.659
lawyer named Steven Schwartz used ChatGPT to

00:15:18.659 --> 00:15:21.299
write a legal brief, and he submitted six fake

00:15:21.299 --> 00:15:23.779
case precedents that the AI completely fabricated.

00:15:23.960 --> 00:15:26.039
Six of them. Yeah. Schwartz claimed he didn't

00:15:26.039 --> 00:15:28.860
know ChatGPT could make things up. He even noted

00:15:28.860 --> 00:15:31.159
that when he asked the AI if the cases were real,

00:15:31.580 --> 00:15:33.870
the AI assured him they were. Which is another

00:15:33.870 --> 00:15:36.049
textbook example of the cascaded error. You ask

00:15:36.049 --> 00:15:38.149
the bullshitter if it is lying, and it mathematically

00:15:38.149 --> 00:15:41.289
doubles down on the lie. Exactly. Judge P. Kevin

00:15:41.289 --> 00:15:44.309
Castel ultimately dismissed the case, fined Schwartz

00:15:44.309 --> 00:15:47.629
$5,000, and described the AI's legal opinions

00:15:47.629 --> 00:15:50.539
as gibberish that bordered on nonsensical. We

00:15:50.539 --> 00:15:53.019
have also seen massive financial waste in the

00:15:53.019 --> 00:15:55.840
consulting world. In late 2025, the Australian

00:15:55.840 --> 00:16:00.080
government received a $440,000 report from Deloitte

00:16:00.080 --> 00:16:02.259
that contained hallucinated academic sources

00:16:02.259 --> 00:16:05.860
and fake court quotes. Wow, $440,000 for fake

00:16:05.860 --> 00:16:09.000
quotes. Yeah. And just a month later, a $1.6

00:16:09.000 --> 00:16:11.539
million Canadian health care report, also by

00:16:11.539 --> 00:16:14.039
Deloitte, was found to contain at least four

00:16:14.039 --> 00:16:16.659
false citations to nonexistent research papers.

00:16:16.899 --> 00:16:18.759
The sources also highlight the friction these

00:16:18.759 --> 00:16:21.190
outputs cause across other domains. There was

00:16:21.190 --> 00:16:23.690
a 2023 defamation lawsuit where a gun rights

00:16:23.690 --> 00:16:26.669
activist, Mark Walters, sued OpenAI after ChatGPT

00:16:26.669 --> 00:16:28.929
falsely claimed he was accused of embezzlement.

00:16:29.250 --> 00:16:31.610
Right. And for context, in 2025, a judge ruled

00:16:31.610 --> 00:16:34.330
in favor of OpenAI in that specific case. Yes.

00:16:34.710 --> 00:16:36.710
There was also an instance where Google's Gemini

00:16:36.710 --> 00:16:39.409
generated images of racially diverse 1940s German

00:16:39.409 --> 00:16:42.350
soldiers, which led Google to temporarily pause

00:16:42.350 --> 00:16:44.429
the tool's image generation of people entirely.

00:16:44.629 --> 00:16:46.970
Yeah. That made a lot of headlines. And we mentioned

00:16:46.970 --> 00:16:49.070
these strictly to show the broad spectrum of

00:16:48.840 --> 00:16:52.820
areas, from corporate consulting to law to historical

00:16:52.820 --> 00:16:56.679
image generation, where unpredictable probabilistic

00:16:56.679 --> 00:16:59.779
outputs clash with our fundamental need for factual

00:16:59.779 --> 00:17:02.399
accuracy. It's a clash that is happening everywhere.

00:17:02.600 --> 00:17:05.359
We have clearly seen the damage this confabulation

00:17:05.359 --> 00:17:08.259
causes when we need hard facts. But if we pivot

00:17:08.259 --> 00:17:11.359
and look at the broader landscape... what happens

00:17:11.359 --> 00:17:13.640
when we want the AI to make things up? Right,

00:17:13.660 --> 00:17:16.019
if we actually want it to invent. Exactly. If

00:17:16.019 --> 00:17:18.299
we connect this to the bigger picture, the line

00:17:18.299 --> 00:17:21.500
between a hallucination and a scientific discovery

00:17:21.500 --> 00:17:24.079
can actually be incredibly thin. It really can.

00:17:24.400 --> 00:17:27.039
This part blew my mind. The sources detail the

00:17:27.039 --> 00:17:30.319
2024 Nobel Prize in Chemistry. David Baker's

00:17:30.319 --> 00:17:32.759
lab at the University of Washington intentionally

00:17:32.759 --> 00:17:36.339
used AI hallucinations to design 10 million brand

00:17:36.339 --> 00:17:39.140
new proteins that do not exist anywhere in nature.

00:17:39.380 --> 00:17:41.920
10 million? That is just staggering. Nature only

00:17:41.920 --> 00:17:44.259
created a fraction of that over millions of years.

00:17:44.619 --> 00:17:46.839
This led to roughly 100 patents and the founding

00:17:46.839 --> 00:17:49.779
of over 20 biotech companies. And the Nobel committee

00:17:49.779 --> 00:17:52.380
specifically rewarded this work, though notably

00:17:52.380 --> 00:17:54.720
they avoided using the word hallucination. They

00:17:54.720 --> 00:17:57.420
did. They opted to call it imaginative protein

00:17:57.420 --> 00:18:00.279
creation. Which is brilliant framing by the

00:18:00.279 --> 00:18:02.440
committee, honestly. And it is not just proteins

00:18:02.440 --> 00:18:05.019
either. At Caltech, researchers deliberately

00:18:05.019 --> 00:18:07.880
used AI hallucinations to design a novel catheter

00:18:07.880 --> 00:18:10.559
geometry. A catheter geometry. Okay, what did

00:18:10.559 --> 00:18:13.579
the AI come up with? The AI hallucinated a design

00:18:13.579 --> 00:18:16.180
featuring sawtooth-like spikes on the inner

00:18:16.180 --> 00:18:18.359
walls. And it turns out when they physically

00:18:18.359 --> 00:18:21.019
tested it, those microscopic spikes actually

00:18:21.019 --> 00:18:24.000
prevent bacteria from gaining traction. Oh, wow.

00:18:24.000 --> 00:18:26.940
So it stops infections. Yes. That hallucination

00:18:26.940 --> 00:18:29.440
has the potential to fight urinary tract infections

00:18:29.440 --> 00:18:32.519
globally. We see similar methods in meteorology,

00:18:32.740 --> 00:18:35.700
too, where scientists use AI to generate thousands

00:18:35.700 --> 00:18:38.660
of subtle forecast variations to discover unexpected

00:18:38.660 --> 00:18:40.799
weather factors. So it essentially boils down

00:18:40.799 --> 00:18:43.680
to the difference between brainstorming mode and fact -checking

00:18:43.680 --> 00:18:45.819
mode. Exactly. If I want to write a legal brief,

00:18:46.119 --> 00:18:48.859
an AI hallucination is a disaster. It's literally

00:18:48.859 --> 00:18:52.000
perjury. But if I am in a chemistry lab trying

00:18:52.000 --> 00:18:54.680
to invent a new medicine, an AI hallucination

00:18:54.680 --> 00:18:58.160
is just a really creative, highly advanced hypothesis.

00:18:58.559 --> 00:19:01.299
That is a crucial distinction. The scientific

00:19:01.299 --> 00:19:03.700
applications of hallucinations are fundamentally

00:19:03.700 --> 00:19:06.579
different from chatbot hallucinations. How so?

00:19:06.799 --> 00:19:10.160
Because of the data? Yes. Chatbots operate on

00:19:10.160 --> 00:19:12.819
messy, ambiguous human language pulled from the

00:19:12.819 --> 00:19:16.180
internet. But the AI models used by these scientists

00:19:16.180 --> 00:19:19.039
are, as Caltech professor Anima Anandkumar points

00:19:19.039 --> 00:19:22.859
out, taught physics. Ah, okay. So they have real

00:19:22.859 --> 00:19:25.160
-world constraints. Right. Their hallucinations

00:19:25.160 --> 00:19:27.660
are strictly grounded in physical reality, and

00:19:27.660 --> 00:19:30.880
they are always validated through rigorous, real

00:19:30.880 --> 00:19:33.359
-world laboratory testing. Which brings us right

00:19:33.359 --> 00:19:36.349
back to the everyday user. Since hallucinations

00:19:36.349 --> 00:19:39.490
can be both revolutionary in a lab and incredibly

00:19:39.490 --> 00:19:41.670
dangerous in a courtroom or a business report,

00:19:42.049 --> 00:19:43.930
how do we rein them in for the rest of us? That

00:19:43.930 --> 00:19:46.069
is the big challenge right now. Especially in

00:19:46.069 --> 00:19:48.470
places like schools, where we need the AI to

00:19:48.470 --> 00:19:51.210
be a tool, not a cheat code, without breaking

00:19:51.210 --> 00:19:53.579
the system entirely. The classroom crisis is

00:19:53.579 --> 00:19:56.759
very, very real. A 2024 study at the University

00:19:56.759 --> 00:20:00.319
of Mississippi found that 47% of AI-generated

00:20:00.319 --> 00:20:03.079
citations submitted by students were completely

00:20:03.079 --> 00:20:05.859
or partially fabricated. It's 47%. That's almost

00:20:05.859 --> 00:20:08.559
half. Yeah. Featuring incorrect titles, wrong

00:20:08.559 --> 00:20:12.019
dates, or totally fake authors. It is causing

00:20:12.019 --> 00:20:14.940
massive academic integrity concerns globally.

00:20:15.079 --> 00:20:17.160
And we can't just rely on AI detectors to catch

00:20:17.160 --> 00:20:20.569
the fabricated text, right? Definitely not. Because

00:20:20.569 --> 00:20:23.529
the Wikipedia article notes that tools like Turnitin

00:20:23.529 --> 00:20:26.609
frequently flag innocent papers written entirely

00:20:26.609 --> 00:20:30.089
by human students. The detection technology is

00:20:30.089 --> 00:20:33.630
so fundamentally flawed that OpenAI completely

00:20:33.630 --> 00:20:36.609
shut down its own AI detection software due to

00:20:36.609 --> 00:20:39.130
a lack of accuracy. They did. It just wasn't

00:20:39.130 --> 00:20:41.890
reliable enough. So how are developers actually

00:20:41.890 --> 00:20:44.829
trying to mitigate the hallucinations at the

00:20:44.829 --> 00:20:47.130
source? If we can't detect it, how do we stop

00:20:47.130 --> 00:20:49.609
it? There are several highly technical methods

00:20:49.609 --> 00:20:51.769
being tested right now. Data cleaning is the

00:20:51.769 --> 00:20:53.730
most obvious one. Just giving it better data.

00:20:53.890 --> 00:20:55.730
Right, ensuring the model isn't learning from

00:20:55.730 --> 00:20:57.609
bad information in the first place. But there

00:20:57.609 --> 00:21:00.250
are also some really fascinating structural approaches.

00:21:00.650 --> 00:21:02.509
Researchers are setting up systems where different

00:21:02.509 --> 00:21:04.529
chatbots debate one another. Wait, they have

00:21:04.529 --> 00:21:07.930
them argue? Yeah. Model A generates an answer.

00:21:08.140 --> 00:21:10.720
Model B critiques the logic and looks for flaws,

00:21:11.099 --> 00:21:13.500
and then Model A revises its answer based on

00:21:13.500 --> 00:21:16.039
the critique, repeating until they reach a consensus.

00:21:16.279 --> 00:21:18.880
That's actually brilliant. It is. Another method

00:21:18.880 --> 00:21:21.920
forces the AI to actively validate its own low

00:21:21.920 --> 00:21:24.519
-confidence answers by querying the Bing search

00:21:24.519 --> 00:21:27.819
API, essentially making the AI Google its own

00:21:27.819 --> 00:21:30.099
thoughts before responding to the user. But there

00:21:30.099 --> 00:21:32.599
has to be a massive computational catch to all

00:21:32.599 --> 00:21:34.680
those fixes. Oh, there is. Every time you ask

00:21:34.680 --> 00:21:37.539
a question, if the AI has to generate three different

00:21:37.420 --> 00:21:40.380
drafts, have two sub -models argue about the

00:21:40.380 --> 00:21:43.980
logic, and then do a live web search, you have

00:21:43.980 --> 00:21:46.519
just tripled or quadrupled the electricity and

00:21:46.519 --> 00:21:48.500
processing power required for a simple prompt.

00:21:48.640 --> 00:21:50.259
That is the ultimate trade -off in the industry

00:21:50.259 --> 00:21:52.720
right now. Evaluating multiple possible replies

00:21:52.720 --> 00:21:55.019
or checking search engines drastically increases

00:21:55.019 --> 00:21:57.579
the computational cost and the financial cost

00:21:57.579 --> 00:21:59.819
of every single query. And is that worth it?

00:22:00.099 --> 00:22:02.299
In high -stakes fields like medical diagnostics

00:22:02.299 --> 00:22:04.480
or chip design, yes, that extra time and money

00:22:04.480 --> 00:22:07.279
is definitely worth it. But for a general consumer

00:22:07.279 --> 00:22:10.259
chatbot, the data shows an uncomfortable truth.

00:22:10.599 --> 00:22:13.920
What's the truth? Customers actually prefer rapid,

00:22:14.279 --> 00:22:18.440
overconfident answers over slow, cautious, uncertainty

00:22:18.440 --> 00:22:21.420
-aware ones. Oh, wow. So are the tech companies

00:22:21.420 --> 00:22:24.839
failing to fix the problem or are we, the users,

00:22:25.019 --> 00:22:27.779
the actual problem? Because our behavior rewards

00:22:27.779 --> 00:22:31.319
the AI for being a fast, confident liar rather

00:22:31.319 --> 00:22:34.369
than a slow, cautious truth -teller. This raises

00:22:34.369 --> 00:22:36.509
an important question about the nature of the

00:22:36.509 --> 00:22:39.349
technology itself. Many top researchers propose

00:22:39.349 --> 00:22:41.910
that hallucinations aren't just a bug we can

00:22:41.910 --> 00:22:44.269
eventually patch with a software update. They

00:22:44.269 --> 00:22:46.809
think it's permanent. They believe hallucination

00:22:46.809 --> 00:22:49.630
is an innate, unavoidable limitation of large

00:22:49.630 --> 00:22:52.130
language models. Because of how they are built

00:22:52.130 --> 00:22:54.849
as probabilistic prediction engines, they will

00:22:54.849 --> 00:22:57.029
always have the mathematical capacity to make

00:22:57.029 --> 00:22:59.529
things up. So as you continue to use these generative

00:22:59.529 --> 00:23:01.529
tools in your own work, your studies, or just

00:23:01.529 --> 00:23:04.359
your daily life, the best approach is to completely

00:23:04.359 --> 00:23:07.700
stop treating the AI like an omniscient, all -knowing

00:23:07.700 --> 00:23:09.799
oracle. Absolutely. It is not a search engine.

00:23:10.240 --> 00:23:13.359
Instead, treat it like a brilliant, incredibly

00:23:13.359 --> 00:23:17.700
fast, but over -eager intern. An intern who really

00:23:17.700 --> 00:23:20.279
wants to please you, who can brainstorm amazing

00:23:20.279 --> 00:23:23.559
ideas, but who requires heavy, constant supervision

00:23:23.559 --> 00:23:26.140
and rigorous fact checking before you publish

00:23:26.140 --> 00:23:28.190
their work. I think that is the perfect mindset

00:23:28.190 --> 00:23:30.769
to have and it leaves us with one final deeply

00:23:30.769 --> 00:23:33.329
philosophical thought to mull over. Let's hear

00:23:33.329 --> 00:23:37.460
it. If AI hallucination is, at its core, just

00:23:37.460 --> 00:23:40.339
the mathematical equivalent of imagination, is

00:23:40.339 --> 00:23:43.319
it even possible to cure AI of hallucinating

00:23:43.319 --> 00:23:45.400
without completely destroying its ability to

00:23:45.400 --> 00:23:47.619
be creative? Oh, that's a wild thought. Think

00:23:47.619 --> 00:23:49.660
back to your overly confident trivia friend.

00:23:49.980 --> 00:23:52.119
If we somehow build a machine that is physically

00:23:52.119 --> 00:23:55.200
incapable of ever telling a lie, do we inadvertently

00:23:55.200 --> 00:23:57.759
build a machine that is incapable of ever inventing

00:23:57.759 --> 00:23:59.920
anything new? Definitely something to think about

00:23:59.920 --> 00:24:01.700
the next time your chatbot tries to convince

00:24:01.700 --> 00:24:04.099
you to do surgery with a churro. Thanks for joining

00:24:04.099 --> 00:24:05.940
us on this deep dive. We'll see you next time.