WEBVTT

00:00:00.000 --> 00:00:02.620
I uploaded a contract yesterday, just standard

00:00:02.620 --> 00:00:04.719
stuff, really, a service agreement. And I needed

00:00:04.719 --> 00:00:07.820
to check one thing, a specific liability cap.

00:00:08.039 --> 00:00:10.900
So I asked the AI and it came back instantly.

00:00:11.039 --> 00:00:13.539
It cited the clause perfectly. Paragraph four,

00:00:13.660 --> 00:00:16.539
Section B, said liability is capped at two times

00:00:16.539 --> 00:00:20.100
the total fees paid. It looked professional,

00:00:20.480 --> 00:00:23.410
totally sound. Let me guess. There was no Section

00:00:23.410 --> 00:00:25.010
B. There was no Section B. The clause didn't

00:00:25.010 --> 00:00:27.609
exist at all. But here's the kicker. It wasn't

00:00:27.609 --> 00:00:33.109
a glitch. It wasn't broken. The AI, it was trying

00:00:33.109 --> 00:00:35.590
to be helpful. And if I hadn't gone back to the

00:00:35.590 --> 00:00:37.310
PDF to double check, which, you know, I really

00:00:37.310 --> 00:00:39.929
didn't want to do, that helpful little hallucination

00:00:39.929 --> 00:00:42.149
could have cost us thousands. And that right

00:00:42.149 --> 00:00:44.530
there is the entire paradox we're living in now.

00:00:44.850 --> 00:00:46.909
Welcome back to the Deep Dive. It is Monday,

00:00:47.070 --> 00:00:49.920
February 9th, 2026. Usually around here we're

00:00:49.920 --> 00:00:51.740
talking about speed, you know, how fast the new

00:00:51.740 --> 00:00:53.880
models are. But today we're hitting the brakes.

00:00:53.979 --> 00:00:55.600
We're talking about trust. We're exploring the

00:00:55.600 --> 00:00:57.920
nature of truth when you're dealing with a machine

00:00:57.920 --> 00:01:00.200
that is literally programmed to please you. It's

00:01:00.200 --> 00:01:02.039
a fascinating problem, isn't it? Because the

00:01:02.039 --> 00:01:04.299
models are so much more powerful now than they

00:01:04.299 --> 00:01:07.680
were back in, you know, 2023. But that confident

00:01:07.680 --> 00:01:10.420
guessing issue, it's still here. And it's almost

00:01:10.420 --> 00:01:13.340
worse because the hallucinations sound so...

00:01:13.980 --> 00:01:16.659
Convincing now. The grammar is perfect. He wears

00:01:16.659 --> 00:01:19.879
a suit and tie now. Exactly. So to navigate this,

00:01:19.959 --> 00:01:21.920
we're digging into a technical guide by Max Am.

00:01:22.079 --> 00:01:25.560
It's called Why ChatGPT, Gemini, Claude are Still

00:01:25.560 --> 00:01:29.200
Hallucinating and How to Fix It. And the core

00:01:29.200 --> 00:01:31.799
idea is that we need to change how we work with

00:01:31.799 --> 00:01:34.420
these tools: to move from just a casual chat

00:01:34.420 --> 00:01:37.859
to something Am calls auditability. Auditability,

00:01:37.859 --> 00:01:41.420
that is the skill for 2026. Seriously, it's not

00:01:41.420 --> 00:01:43.180
about finding an answer anymore; the machine does

00:01:43.180 --> 00:01:45.819
that. It's about proving the answer is real. Okay,

00:01:45.819 --> 00:01:47.969
so let's start with the why. Why does it lie?

00:01:48.129 --> 00:01:49.790
I think a lot of us still assume that if it sounds

00:01:49.790 --> 00:01:51.849
smart, it must be right. Right. But that's the

00:01:51.849 --> 00:01:53.810
wrong frame. It isn't lying out of malice. It

00:01:53.810 --> 00:01:55.750
isn't even broken. The irony is it's lying out

00:01:55.750 --> 00:01:58.349
of kindness or what it thinks is kindness. These

00:01:58.349 --> 00:02:00.129
models at their core, they're just prediction

00:02:00.129 --> 00:02:02.329
engines. They're trained to give you the most

00:02:02.329 --> 00:02:04.829
likely next word that will satisfy you. So its

00:02:04.829 --> 00:02:07.709
goal isn't be truthful. It's complete the sentence

00:02:07.709 --> 00:02:10.270
in a pleasing way. Don't disappoint the user.

00:02:10.509 --> 00:02:12.610
That's the core directive. So when it scans a

00:02:12.610 --> 00:02:14.409
file you upload and it can't find the answer.

00:02:15.259 --> 00:02:17.759
It has a choice. It can say, I don't know, which

00:02:17.759 --> 00:02:20.039
feels like a failure to its reward system. Or

00:02:20.039 --> 00:02:23.159
it can switch gears. It switches from retrieval

00:02:23.159 --> 00:02:25.560
mode to generative mode. It just starts guessing

00:02:25.560 --> 00:02:28.139
to fill the gap. The guide has this terrifying

00:02:28.139 --> 00:02:31.319
example with Apple's Q2 revenue. Walk us through

00:02:31.319 --> 00:02:33.000
that because it's not just a wrong number. It's

00:02:33.000 --> 00:02:35.639
like a whole work of fiction. It's the classic

00:02:36.250 --> 00:02:39.490
polished lie. So you upload a dense financial

00:02:39.490 --> 00:02:42.110
report. You ask, what was Apple's Q2 revenue?

00:02:42.389 --> 00:02:44.669
But let's say the document doesn't use that exact

00:02:44.669 --> 00:02:47.150
phrase. Maybe it says second quarter fiscal results.

00:02:47.490 --> 00:02:50.009
So the simple text search fails. It fails. But

00:02:50.009 --> 00:02:51.729
instead of saying, I can't find that, the training

00:02:51.729 --> 00:02:53.810
kicks in. It starts thinking, well, Apple's a

00:02:53.810 --> 00:02:56.550
big company. Q2 is usually around March. And

00:02:56.550 --> 00:02:59.169
it just constructs an answer. In the example,

00:02:59.189 --> 00:03:03.169
it says $95.4 billion for the fiscal 2025 second

00:03:03.169 --> 00:03:06.689
quarter, ending March 29, 2025. That is so specific.

00:03:06.729 --> 00:03:09.810
It has a dollar amount, a decimal, a date. And

00:03:09.810 --> 00:03:13.090
every single piece of it is hallucinated. It's

00:03:13.090 --> 00:03:15.069
just a statistical guess of what an earnings

00:03:15.069 --> 00:03:16.629
report should look like. You can immediately

00:03:16.629 --> 00:03:20.030
see the danger zones in that list, like

00:03:20.030 --> 00:03:23.050
invoicing. Oh, that's a huge risk. You have automated

00:03:23.050 --> 00:03:27.689
systems processing invoices and the AI just hallucinates

00:03:27.689 --> 00:03:30.330
a line item for a shipping fee because it thinks

00:03:30.330 --> 00:03:32.930
there should be one. And you just overpay. By

00:03:32.930 --> 00:03:35.729
thousands. And no human would ever catch it because

00:03:35.729 --> 00:03:38.270
it looks normal. Insurance is another one. That's

00:03:38.270 --> 00:03:40.909
the nightmare. You ask, am I covered for flood

00:03:40.909 --> 00:03:43.069
damage? It sees a water damage clause and says,

00:03:43.169 --> 00:03:45.349
yep, you're covered. Then the flood hits and

00:03:45.349 --> 00:03:47.750
you find out your policy specifically excludes

00:03:47.750 --> 00:03:50.650
rising water. The AI just gave you a standard

00:03:50.650 --> 00:03:53.129
answer that wasn't true for you. And contracts,

00:03:53.409 --> 00:03:56.669
like my story. Missed compliance, wrong dates. It's

00:03:56.669 --> 00:03:59.129
the halo effect of good presentation. We see

00:03:59.129 --> 00:04:01.830
polish and our brain automatically assumes it's

00:04:01.830 --> 00:04:04.310
true. I have to admit, I still struggle with

00:04:04.310 --> 00:04:06.900
that. The blind trust. When the output looks

00:04:06.900 --> 00:04:09.659
that professional, the formatting is clean, the

00:04:09.659 --> 00:04:12.340
grammar is perfect. Yeah. It is just so hard

00:04:12.340 --> 00:04:14.120
for my brain to stop and say, wait a minute,

00:04:14.199 --> 00:04:17.680
is this real? It's a cognitive bias. We're wired

00:04:17.680 --> 00:04:20.000
to believe that competence in form means competence

00:04:20.000 --> 00:04:23.660
in fact. But with AI, those two things are totally

00:04:23.660 --> 00:04:26.300
decoupled. So this is the big question then.

00:04:26.399 --> 00:04:30.259
If the machine prioritizes being helpful over

00:04:30.259 --> 00:04:33.500
being truthful. how do we force it to care about

00:04:33.500 --> 00:04:35.360
the truth? We have to change the instructions.

00:04:35.439 --> 00:04:37.040
We have to make "I don't know" an acceptable,

00:04:37.180 --> 00:04:40.319
even a rewarded answer. And the guide says there

00:04:40.319 --> 00:04:43.100
are three layers to this. The hardware, the software,

00:04:43.300 --> 00:04:45.779
and the verification. Okay, let's start with

00:04:45.779 --> 00:04:48.959
layer one, the hardware. Model selection. This

00:04:48.959 --> 00:04:50.740
seems basic, but the guide says most people mess

00:04:50.740 --> 00:04:53.230
this up right at the start. Yep. They fail before

00:04:53.230 --> 00:04:55.730
they even type a word. They just open ChatGPT

00:04:55.730 --> 00:04:58.329
or whatever and use the default model that loads.

00:04:58.569 --> 00:05:00.310
I'm guilty of that. I just open the window and

00:05:00.310 --> 00:05:02.949
go, I assume it's the smartest one. And for serious

00:05:02.949 --> 00:05:05.730
document work in 2026, that's a critical mistake.

00:05:06.009 --> 00:05:08.870
The default models, the Turbo or Flash versions,

00:05:09.029 --> 00:05:11.670
they're built for speed. For conversation. Right.

00:05:11.750 --> 00:05:14.129
They are not built for deep forensic analysis.

00:05:14.290 --> 00:05:16.730
You have to manually toggle on the reasoning

00:05:16.730 --> 00:05:18.750
capabilities. So we're talking about specific

00:05:18.750 --> 00:05:22.220
settings, not just using GPT-5. Exactly. For

00:05:22.220 --> 00:05:30.139
ChatGPT, the... For Claude, it's Opus 4.6 with

00:05:30.139 --> 00:05:33.160
extended reasoning on. And for Google, Gemini

00:05:33.160 --> 00:05:36.060
3 Pro. What's the actual difference? Is it just

00:05:36.060 --> 00:05:38.439
slower? It is slower, but that's the whole point.

00:05:38.560 --> 00:05:41.699
The default model is like an improv comic. Fast,

00:05:41.860 --> 00:05:44.060
creative, good at making connections. But you

00:05:44.060 --> 00:05:45.920
don't want an improv comic doing your accounting.

00:05:46.199 --> 00:05:48.319
You want the deep thinker. The part of the brain

00:05:48.319 --> 00:05:50.319
that checks its work. That's what high reasoning

00:05:50.319 --> 00:05:53.079
does. It forces the model to spend more compute

00:05:53.079 --> 00:05:55.360
cycles on a chain of thought before it gives

00:05:55.360 --> 00:05:57.220
you an answer. It has an internal monologue.

00:05:57.379 --> 00:05:59.300
It's checking its own work before it speaks.

00:05:59.800 --> 00:06:01.879
Precisely. So does picking the right model solve

00:06:01.879 --> 00:06:04.379
the whole problem? If I just switch to Opus

00:06:04.379 --> 00:06:07.360
4.6, am I good to go? Nope, absolutely not. That's

00:06:07.360 --> 00:06:09.319
just the baseline. It gets you a smart auditor,

00:06:09.500 --> 00:06:11.819
but even a smart auditor needs clear instructions.

00:06:12.199 --> 00:06:15.500
And that brings us to layer two: the software,

00:06:15.500 --> 00:06:17.860
the prompts. This is where the guide introduces

00:06:17.860 --> 00:06:20.420
the grounding complete template. Right. Which, it

00:06:20.420 --> 00:06:22.879
sounds very official. It is, and it works. This

00:06:22.879 --> 00:06:24.779
whole section is about giving the AI permission

00:06:24.779 --> 00:06:27.980
to fail. Permission to fail, I love that. Okay, so

00:06:27.980 --> 00:06:31.279
what's the first prompt? Prompt one is the grounding

00:06:31.279 --> 00:06:34.980
rule. You say: base your answer only on the uploaded

00:06:34.980 --> 00:06:38.410
documents, nothing else. Nothing else. That's

00:06:38.410 --> 00:06:40.790
the key part. That little phrase does so much

00:06:40.790 --> 00:06:43.209
work. It stops the model from using its vast

00:06:43.209 --> 00:06:45.930
internet training data. It shrinks its universe

00:06:45.930 --> 00:06:48.790
down to just the PDF you gave it. Okay, prompt

00:06:48.790 --> 00:06:51.790
two. This one tackles the people pleaser problem.

00:06:52.069 --> 00:06:54.129
Right. This is where you say, if information

00:06:54.129 --> 00:06:57.029
isn't found, say, not found in the documents,

00:06:57.250 --> 00:06:59.910
don't guess. It seems so simple. You just have

00:06:59.910 --> 00:07:01.470
to tell it not to guess. You have to be that

00:07:01.470 --> 00:07:03.430
direct. You're overwriting its core training.

00:07:03.649 --> 00:07:05.589
And what's wild is that this isn't some clever

00:07:05.589 --> 00:07:09.029
hack. Anthropic's own API docs recommend this.

00:07:09.209 --> 00:07:11.529
Oh, really? Yeah. They basically say if you want

00:07:11.529 --> 00:07:13.490
accuracy, you have to tell the model it's okay

00:07:13.490 --> 00:07:15.509
to be silent. So it's a manufacturer-approved

00:07:15.509 --> 00:07:18.509
fix. Okay. And prompt three. Demand citations.

00:07:18.689 --> 00:07:22.149
For each claim, cite the specific location, document

00:07:22.149 --> 00:07:24.990
name, page/section, and relevant quotes. Show

00:07:24.990 --> 00:07:27.829
me the receipts. Always get the receipts. This

00:07:27.829 --> 00:07:30.889
forces it to be a data auditor, not a creative

00:07:30.889 --> 00:07:33.050
writer. If it can't point to the line on the

00:07:33.050 --> 00:07:35.250
page, it can't make the claim. The guide also

00:07:35.250 --> 00:07:37.350
has a couple of bonus prompts. One is the middle

00:07:37.350 --> 00:07:39.430
ground. Yeah, that's where you ask it to mark

00:07:39.430 --> 00:07:42.910
things it's unsure about as unverified. It's

00:07:42.910 --> 00:07:44.930
great for summaries of long reports where you

00:07:44.930 --> 00:07:46.910
just need to know where the shaky ground is.

00:07:47.209 --> 00:07:50.149
But then there's the nuclear option for high

00:07:50.149 --> 00:07:52.870
stakes stuff. The high-stakes mode. The prompt

00:07:52.870 --> 00:07:57.670
is: only respond if you're 100% confident. Whoa.

00:07:58.350 --> 00:08:01.750
That feels intense. It is. And your answers will

00:08:01.750 --> 00:08:03.410
be much shorter. You'll get "I don't know" a

00:08:03.410 --> 00:08:05.810
lot more. But the answers you do get will be

00:08:05.810 --> 00:08:09.009
rock solid. Imagine the discipline required to

00:08:09.009 --> 00:08:12.230
stay silent unless you are 100% sure. That's

00:08:12.230 --> 00:08:14.550
a standard most humans can't even meet. It is,

00:08:14.569 --> 00:08:17.250
but that's the choice. Do you want a chatty assistant

00:08:17.250 --> 00:08:20.209
or a rigorous auditor? So we have the right brain,

00:08:20.310 --> 00:08:23.829
the right instructions. Do we trust it now? Are

00:08:23.829 --> 00:08:26.050
we finally done? Not yet. Even with all that,

00:08:26.110 --> 00:08:28.089
you can't blindly trust it. You need an independent

00:08:28.089 --> 00:08:31.370
audit. Okay, we're back. We've picked a reasoning

00:08:31.370 --> 00:08:33.970
model. We've used the grounding prompts. But

00:08:33.970 --> 00:08:36.570
the guide says we're still not done. We have

00:08:36.570 --> 00:08:39.649
to verify. Trust, but verify. But the cool thing

00:08:39.649 --> 00:08:42.710
about doing this in 2026 is that verify doesn't

00:08:42.710 --> 00:08:44.830
mean you have to reread the whole document yourself.

00:08:45.129 --> 00:08:46.730
Right, because that would defeat the whole purpose.

00:08:46.929 --> 00:08:51.580
Exactly. The new concept is AI checking AI. I

00:08:51.580 --> 00:08:54.679
like that. A digital second set of eyes. So what's

00:08:54.679 --> 00:08:57.159
method one, the low-intensity version? The

00:08:57.159 --> 00:08:59.679
self-check. This is the easiest one. You just stay

00:08:59.679 --> 00:09:02.659
in the same chat window and ask: rescan the document.

00:09:02.740 --> 00:09:05.059
If you can't find the quote, take the claim back.

00:09:05.580 --> 00:09:07.639
Does the word rescan actually do something

00:09:07.639 --> 00:09:10.559
special? It does. It's a critical word. It forces

00:09:10.559 --> 00:09:13.559
the model to perform a fresh, methodical review

00:09:13.559 --> 00:09:15.679
instead of just, you know, confirming its own

00:09:15.679 --> 00:09:17.820
previous answer. It avoids confirmation bias.

00:09:18.139 --> 00:09:20.139
Okay, so that's the quick check. Method two is

00:09:20.139 --> 00:09:22.159
the multi-model check. This is your medium-intensity

00:09:22.159 --> 00:09:25.320
option. You take the output from, say, ChatGPT,

00:09:25.399 --> 00:09:27.940
and you feed it, plus the original file, into

00:09:27.940 --> 00:09:30.919
a different model, like Claude Opus 4.6. And

00:09:30.919 --> 00:09:34.779
you ask Claude to grade ChatGPT's homework. Basically,

00:09:34.879 --> 00:09:37.710
yeah. And the analogy in the guide is perfect.

00:09:38.090 --> 00:09:40.450
It's like getting a second opinion from a doctor

00:09:40.450 --> 00:09:42.669
who went to a different medical school. They

00:09:42.669 --> 00:09:45.070
have different training, different biases. The

00:09:45.070 --> 00:09:47.269
chance that they will both make the exact same

00:09:47.269 --> 00:09:51.009
weird mistake is incredibly low. That makes a

00:09:51.009 --> 00:09:53.490
lot of sense. Okay, then we have the heavy hitter,

00:09:53.509 --> 00:09:57.610
method three, NotebookLM. This is the gold standard

00:09:57.610 --> 00:09:59.590
for this kind of work today. Google's

00:09:59.590 --> 00:10:02.769
NotebookLM running on Gemini 3. I've used it for research.

00:10:02.929 --> 00:10:05.190
Why is it considered the highest intensity check?

00:10:05.389 --> 00:10:07.809
Because it was purpose built for this task. It's

00:10:07.809 --> 00:10:09.970
not a chat bot trying to be an auditor. It is

00:10:09.970 --> 00:10:12.549
an auditor. You upload your document and the

00:10:12.549 --> 00:10:15.210
AI's analysis and you ask: which claims in this

00:10:15.210 --> 00:10:17.769
analysis are not supported by the source document?

00:10:18.110 --> 00:10:21.139
And here's the killer feature. It links every

00:10:21.139 --> 00:10:24.320
single claim it verifies directly to the source

00:10:24.320 --> 00:10:26.940
text. You can click a little citation number

00:10:26.940 --> 00:10:29.220
and it takes you to the exact paragraph in the

00:10:29.220 --> 00:10:31.039
PDF. So it's not just telling you it's true.

00:10:31.100 --> 00:10:33.460
It's showing you the proof. It creates a verifiable

00:10:33.460 --> 00:10:35.860
paper trail. That's the definition of auditability.

00:10:36.200 --> 00:10:38.279
This whole system sounds incredibly thorough.

00:10:38.919 --> 00:10:41.840
Hardware, software, verification. But I have

00:10:41.840 --> 00:10:44.559
to ask, is there anything this system cannot

00:10:44.559 --> 00:10:47.519
catch? Is there still a gap where you need a

00:10:47.519 --> 00:10:51.639
human? Yes, a big one. It can't fix a bad source

00:10:51.639 --> 00:10:54.379
file. And more importantly, it can't understand

00:10:54.379 --> 00:10:57.179
human risk. So let's talk about that, the reality

00:10:57.179 --> 00:10:59.799
check section. What is this grounded mode not

00:10:59.799 --> 00:11:02.679
good for? Well, first, it's not magic. If your

00:11:02.679 --> 00:11:05.259
PDF has missing pages or the scan quality is

00:11:05.259 --> 00:11:08.019
terrible, the AI can't fix that. The guide has

00:11:08.019 --> 00:11:10.620
a great rule of thumb for this. If a junior analyst

00:11:10.620 --> 00:11:13.519
couldn't answer it, neither should the AI. Exactly.

00:11:13.740 --> 00:11:16.139
Garbage in, garbage out. But the much bigger

00:11:16.139 --> 00:11:19.190
limitation is risk assessment. The AI can extract

00:11:19.190 --> 00:11:22.190
a liability clause perfectly. It can tell you

00:11:22.190 --> 00:11:25.450
verbatim: liability is capped at $5,000. But it

00:11:25.450 --> 00:11:27.870
has no idea if signing a contract with that clause

00:11:27.870 --> 00:11:30.110
is a terrible idea for your business. Right.

00:11:30.190 --> 00:11:32.269
It can read the map perfectly, but it can't tell

00:11:32.269 --> 00:11:33.870
you there's a cliff at the end of the road. That

00:11:33.870 --> 00:11:36.169
takes a lawyer. That takes context about your

00:11:36.169 --> 00:11:38.389
business, the market, your negotiating power.

00:11:38.570 --> 00:11:41.669
The AI is an extractor, not a strategist. The

00:11:41.669 --> 00:11:44.820
guide also mentions math. I feel like we keep

00:11:44.820 --> 00:11:46.799
hearing AI is getting better at math, but it's

00:11:46.799 --> 00:11:49.379
still a weak spot. With dense tables, yeah. It's

00:11:49.379 --> 00:11:52.100
the column confusion problem. It still confuses

00:11:52.100 --> 00:11:55.419
totals and subtotals. It skips footnotes. It

00:11:55.419 --> 00:11:58.340
basically treats numbers like words. So what's

00:11:58.340 --> 00:12:02.019
the fix? Don't use it for math? You slow it down.

00:12:02.220 --> 00:12:05.320
You ask it to first reproduce the table exactly

00:12:05.320 --> 00:12:08.179
as it sees it in the chat. Once you confirm it

00:12:08.220 --> 00:12:10.120
read the numbers right, then you ask it to do

00:12:10.120 --> 00:12:12.200
the math. Show your work. It's like we're back

00:12:12.200 --> 00:12:14.139
in fourth grade math class. Always make it show

00:12:14.139 --> 00:12:17.000
its work. And the final limitation is creativity

00:12:17.000 --> 00:12:20.820
versus precision. Models just love to paraphrase.

00:12:20.820 --> 00:12:23.480
It's in their nature. But for legal or compliance

00:12:23.480 --> 00:12:26.299
work, similar is not good enough. You need the

00:12:26.299 --> 00:12:29.340
exact words. So you have to explicitly command

00:12:29.340 --> 00:12:32.480
it. Quote verbatim. Do not paraphrase. You have

00:12:32.480 --> 00:12:34.940
to be that blunt. So if we zoom out on all this,

00:12:35.000 --> 00:12:36.659
it really sounds like the goal isn't to replace

00:12:36.659 --> 00:12:39.340
the human. It's just changing the human's job

00:12:39.340 --> 00:12:43.259
description. 100%. The human is no longer the

00:12:43.259 --> 00:12:45.399
researcher digging in the file cabinet. The human

00:12:45.399 --> 00:12:48.360
is the auditor, the one checking the citations,

00:12:48.419 --> 00:12:50.919
assessing the risk, and making the final judgment

00:12:50.919 --> 00:12:54.240
call. It's a real shift from finding answers

00:12:54.240 --> 00:12:57.360
to verifying truth. And that shift is everything.

00:12:57.639 --> 00:13:00.059
It's what separates people who get value from

00:13:00.059 --> 00:13:03.299
AI from those who just get noise. Okay, let's

00:13:03.299 --> 00:13:05.830
recap the big idea. If someone listening has

00:13:05.830 --> 00:13:07.850
a stack of documents on their desk for tomorrow

00:13:07.850 --> 00:13:10.690
morning, what is the simple playbook? Three steps.

00:13:10.950 --> 00:13:13.690
One, pick a reasoning model. Don't use the default.

00:13:13.769 --> 00:13:17.009
Use GPT-5.3 or Opus 4.6 with the reasoning

00:13:17.009 --> 00:13:20.490
modes on. Two, ground it. Paste in that template.

00:13:20.590 --> 00:13:22.629
Tell it to only use the file and give it permission

00:13:22.629 --> 00:13:26.070
to say, I don't know. And three, verify. Use

00:13:26.070 --> 00:13:29.490
a second AI or, ideally, NotebookLM to cross-check

00:13:29.490 --> 00:13:31.450
the key facts. It sounds like a little bit of

00:13:31.450 --> 00:13:33.909
extra work upfront. It's maybe 30 seconds more

00:13:33.909 --> 00:13:36.289
per task. But it saves you hours of panic and

00:13:36.289 --> 00:13:39.149
cleanup later. Trust but verify. An old saying,

00:13:39.350 --> 00:13:41.769
but it feels more important now than ever before.

00:13:41.909 --> 00:13:44.049
It's the only way to operate in an age of infinite

00:13:44.049 --> 00:13:46.789
confident information. So here's the challenge

00:13:46.789 --> 00:13:48.690
for you listening. The next time you upload a

00:13:48.690 --> 00:13:51.570
file, don't just ask it to summarize. Try that

00:13:51.570 --> 00:13:54.110
grounding template. And just watch how different

00:13:54.110 --> 00:13:56.789
the output looks when the AI stops trying to

00:13:56.789 --> 00:13:59.129
guess and starts citing its sources. It gets

00:13:59.129 --> 00:14:02.029
remarkably quiet. And that quiet, that's the

00:14:02.029 --> 00:14:04.700
sound of accuracy. I love that. Thank you for

00:14:04.700 --> 00:14:06.299
deep diving with us today. We'll catch you on

00:14:06.299 --> 00:14:06.679
the next one.
