WEBVTT

00:00:00.000 --> 00:00:02.140
You've likely experienced this. You spend hours,

00:00:02.240 --> 00:00:06.559
maybe days, carefully prepping documents, chunking

00:00:06.559 --> 00:00:08.240
them just right, loading them into your vector

00:00:08.240 --> 00:00:11.259
database. Then you ask your shiny new AI agent

00:00:11.259 --> 00:00:14.880
a pretty simple, direct question. And you watch,

00:00:14.980 --> 00:00:17.559
maybe with that sinking feeling, as it confidently

00:00:17.559 --> 00:00:21.120
spits back an answer that's, well, completely

00:00:21.120 --> 00:00:24.129
irrelevant. Or maybe just plain wrong. That specific

00:00:24.129 --> 00:00:27.370
moment of dismay when your agent just completely

00:00:27.370 --> 00:00:30.510
misses the point. Yes, that feeling. We've absolutely

00:00:30.510 --> 00:00:32.689
all been there. It's so frustrating. Welcome

00:00:32.689 --> 00:00:34.969
to the Deep Dive. Today, we're really going to

00:00:34.969 --> 00:00:37.030
dig into that exact challenge. It's pretty widespread

00:00:37.030 --> 00:00:39.390
in AI development, actually. How do we turn those

00:00:39.390 --> 00:00:42.689
frustrating, sometimes scattershot information

00:00:42.689 --> 00:00:46.030
retrievers into real precision knowledge machines?

00:00:46.549 --> 00:00:48.429
Precisely. Yeah, it's all about fixing arguably

00:00:48.429 --> 00:00:50.549
the number one flaw in retrieval augmented generation

00:00:50.549 --> 00:00:52.909
agents, RAG agents. We're going to unpack

00:00:53.039 --> 00:00:55.880
two really powerful techniques, re-ranking and

00:00:55.880 --> 00:00:58.500
metadata enrichment. Think of them like superpowers

00:00:58.500 --> 00:01:00.899
for your AI. We'll explain why your RAG agents

00:01:00.899 --> 00:01:03.020
might be failing you, and then, importantly,

00:01:03.280 --> 00:01:06.060
give you a practical blueprint to fix them. We'll

00:01:06.060 --> 00:01:08.980
show how tools like, say, n8n and Supabase make

00:01:08.980 --> 00:01:11.319
this doable. We'll even walk through a specific

00:01:11.319 --> 00:01:15.640
example using, of all things, golf rules. So

00:01:15.640 --> 00:01:17.620
what does this all really mean for building AI

00:01:17.620 --> 00:01:20.870
you can genuinely rely on? OK, let's get into

00:01:20.870 --> 00:01:22.890
it. Let's start maybe with the basic flow, the

00:01:22.890 --> 00:01:24.989
fundamental steps in most standard RAG setups.

00:01:25.189 --> 00:01:27.849
You have your source document, right? A PDF,

00:01:28.010 --> 00:01:29.969
maybe something else. It gets broken down into

00:01:29.969 --> 00:01:32.909
smaller bits, chunks, as they're called. Right.

00:01:32.989 --> 00:01:34.829
And each of those little chunks gets turned into

00:01:34.829 --> 00:01:37.950
a, well, a numerical vector by an AI model. It's

00:01:37.950 --> 00:01:39.549
kind of like a mathematical fingerprint of what

00:01:39.549 --> 00:01:42.090
that chunk means. These fingerprints, these vectors,

00:01:42.189 --> 00:01:43.909
they get stored in a special kind of database,

00:01:44.090 --> 00:01:46.950
a vector database. Then when a user asks a question.

00:01:47.370 --> 00:01:49.430
The question itself gets turned into a vector

00:01:49.430 --> 00:01:52.569
too. The system hunts through the database for

00:01:52.569 --> 00:01:54.109
the vectors that are mathematically closest.

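That hunt for the mathematically closest vectors can be made concrete with a toy sketch. The three-dimensional embeddings and chunk texts below are made up for illustration; real embeddings come from a model like text-embedding-3-small and have hundreds or thousands of dimensions, and real systems use an index (e.g. pgvector in Supabase) rather than a brute-force scan:

```javascript
// Toy nearest-neighbor search: the core retrieval step of basic RAG.
// Cosine similarity measures how close two embedding "fingerprints" are.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Each stored chunk pairs its text with its embedding vector.
const chunks = [
  { content: "Payment is due within 30 days.", embedding: [0.9, 0.1, 0.0] },
  { content: "Either party may terminate...",  embedding: [0.1, 0.9, 0.1] },
  { content: "Invoices are issued monthly.",   embedding: [0.8, 0.2, 0.1] },
];

// Return the k chunks whose embeddings are closest to the query embedding.
function nearestNeighbors(queryEmbedding, k) {
  return chunks
    .map(c => ({ ...c, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const top = nearestNeighbors([0.85, 0.15, 0.05], 2);
console.log(top.map(c => c.content)); // most similar chunks first
```

Note how nothing here looks at meaning, only at geometric closeness — which is exactly the flaw discussed next.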
00:01:54.349 --> 00:01:57.230
That's the nearest neighbor search part. The

00:01:57.230 --> 00:01:59.689
text chunks linked to those closest vectors get

00:01:59.689 --> 00:02:02.150
pulled out and fed to a large language model,

00:02:02.310 --> 00:02:05.969
an LLM, which then writes the final answer. It

00:02:05.969 --> 00:02:07.870
sounds pretty logical, doesn't it? It does sound

00:02:07.870 --> 00:02:10.090
logical on paper, but here's the real kicker,

00:02:10.090 --> 00:02:12.129
the critical point where it often breaks down.

00:02:12.370 --> 00:02:14.669
It's the assumption that mathematically similar

00:02:14.669 --> 00:02:17.349
always means contextually relevant. That's the

00:02:17.349 --> 00:02:20.590
flaw. It's like, imagine asking about... payment

00:02:20.590 --> 00:02:23.270
terms for a service, and the system brings back

00:02:23.270 --> 00:02:25.849
stuff about terminating a contract. The words

00:02:25.849 --> 00:02:28.370
might overlap a bit, terms, terminating, but

00:02:28.370 --> 00:02:31.349
the actual meaning, the context, completely off.

00:02:31.569 --> 00:02:33.389
And that's why a lot of promising AI projects

00:02:33.389 --> 00:02:35.650
just kind of fizzle out. They don't deliver reliable

00:02:35.650 --> 00:02:37.750
answers. Okay, so what's that core assumption

00:02:37.750 --> 00:02:40.430
basic RAG systems make that often leads straight

00:02:40.430 --> 00:02:43.349
to irrelevant answers? Basically, it assumes

00:02:43.349 --> 00:02:45.990
mathematical similarity equals true relevance.

00:02:46.310 --> 00:02:49.020
And that's just not always true. Right. And this

00:02:49.020 --> 00:02:51.159
is precisely where re-ranking comes into play.

00:02:51.280 --> 00:02:53.879
It feels like a game changer. How exactly does

00:02:53.879 --> 00:02:56.159
it tackle that core problem you just laid out?

00:02:56.460 --> 00:02:59.120
Okay, think of re-ranking like adding a super

00:02:59.120 --> 00:03:02.939
smart quality control manager to your information

00:03:02.939 --> 00:03:05.159
assembly line instead of just grabbing the first

00:03:05.159 --> 00:03:07.400
few things off the belt that look roughly right.

00:03:07.560 --> 00:03:09.759
It takes a much bigger batch, like a whole pile

00:03:09.759 --> 00:03:12.740
of potential candidates, and then it carefully

00:03:12.740 --> 00:03:15.439
inspects each one for how well it actually fits

00:03:15.439 --> 00:03:18.259
the original request. Okay, so you cast a wider

00:03:18.259 --> 00:03:20.240
net initially, but then you apply a much more

00:03:20.240 --> 00:03:22.860
discerning filter. Can you maybe contrast the

00:03:22.860 --> 00:03:25.539
old way versus this new re-ranked flow? Absolutely.

00:03:25.800 --> 00:03:29.319
So the traditional RAG flow: user asks a question.

00:03:29.479 --> 00:03:31.800
The system grabs maybe the top three or four

00:03:31.800 --> 00:03:34.240
nearest vectors based purely on math, sends those

00:03:34.240 --> 00:03:36.699
straight to the AI. It's often a bit of a hope

00:03:36.699 --> 00:03:38.599
and pray method, honestly. You just hope those

00:03:38.599 --> 00:03:40.759
top few are good enough. But with the re-ranked

00:03:40.759 --> 00:03:42.580
flow, the vector search still happens first,

00:03:42.659 --> 00:03:44.479
but you tell it, hey, bring me back more options,

00:03:44.620 --> 00:03:47.280
maybe 10, 20, even more candidate vectors, a

00:03:47.280 --> 00:03:50.099
wider net. Then this bigger pool of candidate

00:03:50.099 --> 00:03:52.780
chunks gets passed to a specialized re-ranker

00:03:52.780 --> 00:03:54.500
model. And what's really interesting here is

00:03:54.500 --> 00:03:56.759
that the re-ranker's only job is to look at

00:03:56.759 --> 00:03:59.319
that bigger set of chunks and compare each one

00:03:59.319 --> 00:04:02.039
directly against the original user query. It's

00:04:02.039 --> 00:04:04.360
not just doing math similarity again. It's looking

00:04:04.360 --> 00:04:07.789
for genuine contextual relevance. It assigns

00:04:07.789 --> 00:04:11.289
a score, like 0.0 to 1.0, saying, how relevant

00:04:11.289 --> 00:04:14.110
is this really? And then crucially, it throws

00:04:14.110 --> 00:04:16.970
away everything except the very top scores, maybe

00:04:16.970 --> 00:04:19.069
the best three or four. Those are what finally

00:04:19.069 --> 00:04:21.470
go to the LLM. It massively boosts the quality

00:04:21.470 --> 00:04:23.870
of the input for the AI. Yeah, it ensures the

00:04:23.870 --> 00:04:25.949
AI gets only the most relevant stuff, makes the

00:04:25.949 --> 00:04:28.949
final answer way, way more accurate. And, you

00:04:28.949 --> 00:04:31.009
know, this isn't just theory. It's actually surprisingly

00:04:31.009 --> 00:04:32.689
straightforward to put into practice, especially

00:04:32.689 --> 00:04:35.529
using tools like n8n that handle a lot of the

00:04:35.529 --> 00:04:37.949
complexity. Right. The main thing you need first

00:04:37.949 --> 00:04:40.470
is just a basic RAG workflow already set up.

00:04:40.730 --> 00:04:43.110
Maybe you're already using something like Supabase

00:04:43.110 --> 00:04:45.310
as your vector database. So let's walk through

00:04:45.310 --> 00:04:48.410
the actual steps to add Cohere re-ranking. First

00:04:48.410 --> 00:04:51.370
up, you need an API key from Cohere. You go to

00:04:51.370 --> 00:04:53.329
cohere.com, sign up. They have a pretty generous

00:04:53.329 --> 00:04:55.529
free tier, which is great for getting started.

00:04:55.730 --> 00:04:58.709
Grab an API key. Maybe name it something clear

00:04:58.709 --> 00:05:01.649
like n8n reranker key. Okay. Step one, get the

00:05:01.649 --> 00:05:04.579
key. Then what? Then you need your vector database

00:05:04.579 --> 00:05:06.879
set up. Let's stick with Supabase. You'll create

00:05:06.879 --> 00:05:09.220
a table, let's call it documents, to hold your

00:05:09.220 --> 00:05:12.199
chunks. It needs a few columns, like an id, maybe

00:05:12.199 --> 00:05:15.759
an integer, the actual content as text, a metadata

00:05:15.759 --> 00:05:18.959
column, probably JSONB format for flexibility,

00:05:19.339 --> 00:05:21.420
and of course the embedding column itself, which

00:05:21.420 --> 00:05:23.500
holds the vector data type. Got it, table structure.

00:05:23.740 --> 00:05:26.550
Then connect it in n8n. Exactly. You go to your

00:05:26.550 --> 00:05:28.750
Supabase vector store node in your n8n workflow.

00:05:29.170 --> 00:05:31.509
Now, this bit's important. You need to find the

00:05:31.509 --> 00:05:33.930
limit setting. By default, it might be set low,

00:05:34.050 --> 00:05:35.730
like four or five. You have to crank that up,

00:05:35.769 --> 00:05:38.170
set it to maybe 20. That's casting the wider

00:05:38.170 --> 00:05:40.750
net we talked about. Ah, okay. Increase the limit

00:05:40.750 --> 00:05:43.069
first. Yes. And then you just flick the switch.

00:05:43.110 --> 00:05:45.189
There's usually a toggle labeled re-rank results.

00:05:45.990 --> 00:05:48.129
Enabling that makes new fields appear for the

00:05:48.129 --> 00:05:50.170
re-ranker setup. Makes sense. And the final

00:05:50.170 --> 00:05:52.800
step. Just configure those new fields. You'd select

00:05:52.800 --> 00:05:55.399
Cohere as the provider, paste in that API key

00:05:55.399 --> 00:05:58.720
you got earlier, and pick a re-ranker model.

00:05:58.720 --> 00:06:01.939
Rerank v3.5 is a good one. Save it, and you've

00:06:01.939 --> 00:06:04.180
basically integrated re-ranking. Okay, that does

00:06:04.180 --> 00:06:06.560
sound pretty doable, but it brings up an important

00:06:06.560 --> 00:06:10.220
question. Why is it so critical to increase that

00:06:10.220 --> 00:06:12.759
initial limit in the vector store node before

00:06:12.759 --> 00:06:15.740
the re-ranking happens? Why not just re-rank

00:06:15.740 --> 00:06:18.459
the default four or five? Because the re-ranker

00:06:18.459 --> 00:06:20.660
needs a decent pool of candidates to show its

00:06:20.660 --> 00:06:23.230
value. If you only give it four options, maybe

00:06:23.230 --> 00:06:25.470
none of them are actually that relevant. By giving

00:06:25.470 --> 00:06:27.750
it 20, you significantly increase the odds that

00:06:27.750 --> 00:06:30.069
some truly relevant chunks are in that initial

00:06:30.069 --> 00:06:32.910
pool for it to find and promote. It needs options

00:06:32.910 --> 00:06:35.810
to work its magic. Okay, so before a RAG agent

00:06:35.810 --> 00:06:37.930
can even start answering questions with re-ranking,

00:06:37.930 --> 00:06:40.220
we need to actually feed it knowledge, build

00:06:42.500 --> 00:06:45.259
its brain, so to speak. That happens in a data

00:06:45.259 --> 00:06:47.819
preparation workflow, right? This is the back-end

00:06:47.819 --> 00:06:51.699
pipeline. It's what takes your source document,

00:06:51.699 --> 00:06:54.420
like that PDF of golf rules, processes it smartly,

00:06:54.420 --> 00:06:57.139
and loads it into Supabase, ready for the agent

00:06:57.139 --> 00:06:59.360
to use. This is how you build the actual knowledge

00:06:59.360 --> 00:07:02.379
base. So the process might look something like

00:07:02.379 --> 00:07:05.759
this. First, you get your source PDF, maybe download

00:07:05.759 --> 00:07:08.519
it. Then you extract the raw text out of it. And

00:07:08.519 --> 00:07:08.519
here's a crucial part: structuring that raw text.

00:07:08.939 --> 00:07:11.279
You could use, for instance, a JavaScript code

00:07:11.279 --> 00:07:13.879
node in n8n. You might write a little script

00:07:13.879 --> 00:07:16.339
or even get AI to help write it. Oh, interesting.

00:07:16.579 --> 00:07:18.879
A script that looks for patterns, like rule number

00:07:18.879 --> 00:07:22.459
X in the text, and uses those patterns to split

00:07:22.459 --> 00:07:25.509
the document into meaningful JSON objects. Each

00:07:25.509 --> 00:07:27.949
object might have, say, a rule number, a rule

00:07:27.949 --> 00:07:30.850
title, and the full text for that specific rule.

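A script of the kind described might look like this. The "Rule N:" heading pattern is an assumption about how the PDF's extracted text is laid out; you'd adjust the regex to your own document:

```javascript
// Split raw document text into structured items wherever "Rule <number>:"
// appears — the kind of script you'd drop into an n8n Code node.
function splitIntoRules(rawText) {
  const rules = [];
  // Match "Rule 3: Title" headings; capture the number and the title.
  const pattern = /Rule\s+(\d+):\s*([^\n]*)/g;
  const matches = [...rawText.matchAll(pattern)];
  matches.forEach((m, i) => {
    const start = m.index;
    // Each rule's text runs until the next heading (or end of document).
    const end = i + 1 < matches.length ? matches[i + 1].index : rawText.length;
    rules.push({
      ruleNumber: Number(m[1]),
      ruleTitle: m[2].trim(),
      text: rawText.slice(start, end).trim(),
    });
  });
  return rules;
}

const sample = "Rule 1: The Game\nPlay the ball as it lies.\nRule 2: The Course\nKnow the boundaries.";
const items = splitIntoRules(sample);
console.log(items.length); // 2
```

Each returned object already carries the rule number and title, ready to be stored as metadata alongside the chunk.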
00:07:30.990 --> 00:07:33.089
Ah, so you're breaking it down logically, not

00:07:33.089 --> 00:07:36.029
just by random character counts, and adding structure

00:07:36.029 --> 00:07:39.170
right away. Then, presumably, you load that structured

00:07:39.170 --> 00:07:41.350
data. Exactly. You load that structured data,

00:07:41.470 --> 00:07:43.610
and importantly, you add metadata during this

00:07:43.610 --> 00:07:45.829
step, like that rule number you just extracted,

00:07:45.930 --> 00:07:47.910
maybe a document type like official rules, maybe

00:07:47.910 --> 00:07:50.029
a date created. Then you generate the vector

00:07:50.029 --> 00:07:53.009
embeddings for these nice, structured, metadata

00:07:53.009 --> 00:07:56.079
-tagged chunks. You could use an OpenAI model

00:07:56.079 --> 00:07:58.800
like text-embedding-3-small with your OpenAI

00:07:58.800 --> 00:08:01.540
key. And finally, you upload the whole package,

00:08:01.819 --> 00:08:04.360
the structured text, the metadata, the embeddings

00:08:04.360 --> 00:08:07.660
into your Supabase table. This careful prep work

00:08:07.660 --> 00:08:10.399
is really foundational. Absolutely. If you think

00:08:10.399 --> 00:08:12.839
about the big picture, the quality and thoughtfulness

00:08:12.839 --> 00:08:15.720
you put into this data preparation stage, that

00:08:15.720 --> 00:08:18.660
directly dictates how reliable and accurate your

00:08:18.660 --> 00:08:21.100
entire RAG system will be later on. It's the

00:08:21.100 --> 00:08:23.610
bedrock. Okay, so why is strategic data preparation

00:08:23.610 --> 00:08:26.949
considered the foundational step for a good RAG

00:08:26.949 --> 00:08:28.829
agent? Because good data prep ensures you have

00:08:28.829 --> 00:08:31.329
a structured, accurate knowledge base. That's

00:08:31.329 --> 00:08:33.190
absolutely critical for the AI's performance

00:08:33.190 --> 00:08:35.450
down the line. Let's make this concrete. Let's

00:08:35.450 --> 00:08:36.929
talk about that golf rules agent again. It was

00:08:36.929 --> 00:08:39.769
built from a 22-page PDF covering 28 distinct

00:08:39.769 --> 00:08:42.690
golf rules. Right. So the test query we used

00:08:42.690 --> 00:08:45.830
was, how is the order of play determined in golf?

00:08:46.450 --> 00:08:49.360
Simple enough question. Now, without re-ranking,

00:08:49.360 --> 00:08:52.440
a basic RAG system, just looking at word similarity,

00:08:52.919 --> 00:08:55.120
it might easily grab chunks talking about penalties

00:08:55.120 --> 00:08:57.879
or maybe equipment rules. Why? Because the word

00:08:57.879 --> 00:09:00.120
play shows up in those contexts, too. That leads

00:09:00.120 --> 00:09:02.019
to irrelevant bits getting mixed in, and you

00:09:02.019 --> 00:09:04.679
end up with a confusing, maybe incomplete answer.

00:09:05.039 --> 00:09:07.740
Okay. But then we switched on re-ranking. The

00:09:07.740 --> 00:09:10.419
Supabase node first did its job, fetching 20

00:09:10.419 --> 00:09:12.419
chunks that were mathematically similar to the

00:09:12.419 --> 00:09:15.259
query, casting that wide net. Then the Cohere

00:09:15.259 --> 00:09:17.480
re-ranker stepped in and evaluated each of those

00:09:17.480 --> 00:09:19.759
20 chunks for true relevance to the question

00:09:19.759 --> 00:09:22.240
about order of play. And the results were really

00:09:22.240 --> 00:09:24.820
clear. Chunk 1, which talked about match play

00:09:24.820 --> 00:09:28.610
order, got a high score: 0.877, super relevant.

00:09:28.610 --> 00:09:32.149
Chunk two, explaining stroke play order, also scored

00:09:32.149 --> 00:09:35.230
well: 0.642, still very relevant. Chunk three was

00:09:35.230 --> 00:09:37.429
about provisional balls, less directly relevant

00:09:37.429 --> 00:09:40.610
but somewhat related, scored 0.57. But crucially,

00:09:40.610 --> 00:09:43.690
the other 17 chunks? They had much lower relevance

00:09:43.690 --> 00:09:46.429
scores. The re-ranker basically said, nope, these

00:09:46.429 --> 00:09:48.429
aren't really about the order of play, and discarded

00:09:48.429 --> 00:09:50.720
them. Only the top ones went forward. And the

00:09:50.720 --> 00:09:53.120
AI's final answer, built only from those top

00:09:53.120 --> 00:09:55.840
-scoring, truly relevant chunks. It was perfect.

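That wide-net-then-filter step can be sketched in a few lines. The scores below are hard-coded stand-ins for what a re-ranker model such as Cohere's would return for each query-chunk pair; in the actual n8n workflow, the node and your Cohere credential handle the scoring call for you:

```javascript
// The re-ranking pattern: take a wide pool of candidates, score each one
// against the original query, keep only the top few for the LLM.
// Scores here are illustrative stand-ins for a cross-encoder re-ranker.
const candidates = [
  { text: "Match play: order of play...",    score: 0.877 },
  { text: "Stroke play: order of play...",   score: 0.642 },
  { text: "Provisional balls...",            score: 0.570 },
  { text: "Equipment standards for clubs...", score: 0.120 },
  { text: "Penalty areas and relief...",     score: 0.095 },
  // ...a real run would have ~20 candidates from the vector search
];

// Sort by relevance score and keep the top N; the rest never reach the LLM.
function rerankFilter(pool, topN) {
  return [...pool].sort((a, b) => b.score - a.score).slice(0, topN);
}

const forLLM = rerankFilter(candidates, 3);
console.log(forLLM.map(c => c.score)); // only the highest scores survive
```

The design point: the quality gate sits between retrieval and generation, so the LLM never even sees the noise.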
00:09:56.000 --> 00:09:58.639
It was comprehensive, accurate, detailing both

00:09:58.639 --> 00:10:01.440
match play and stroke play order. It even offered

00:10:01.440 --> 00:10:03.960
more details if needed. The re-ranker acted

00:10:03.960 --> 00:10:06.320
like this incredibly precise filter, cutting

00:10:06.320 --> 00:10:08.139
out all the noise. Yeah, and what's really cool,

00:10:08.200 --> 00:10:10.179
especially in a tool like n8n, is the transparency.

00:10:10.480 --> 00:10:12.039
You can actually look at the logs and see it

00:10:12.039 --> 00:10:14.480
happen. You see the query. You see the initial

00:10:14.480 --> 00:10:16.639
20 results pulled. You see the re-ranker scores

00:10:16.639 --> 00:10:18.799
for each one. And you see exactly... which top

00:10:18.799 --> 00:10:21.220
chunks got selected to build the answer makes

00:10:21.220 --> 00:10:23.340
troubleshooting and optimizing so much clearer.

00:10:23.539 --> 00:10:25.639
It's really powerful to see. Makes you think,

00:10:25.779 --> 00:10:28.700
whoa, imagine applying that same level of precision,

00:10:28.879 --> 00:10:30.980
that same transparent filtering to something

00:10:30.980 --> 00:10:34.200
massive like a billion complex scientific papers.

00:10:34.259 --> 00:10:37.240
The potential is just huge. So what was the key

00:10:37.240 --> 00:10:39.620
difference the re-ranker made in the golf agent's

00:10:39.620 --> 00:10:42.419
answer? It precisely filtered for true relevance.

00:10:42.960 --> 00:10:45.340
which directly resulted in a comprehensive and

00:10:45.340 --> 00:10:47.500
accurate answer. Okay, so re-ranking boosts

00:10:47.500 --> 00:10:49.820
relevance significantly. But you mentioned metadata

00:10:49.820 --> 00:10:51.940
earlier, saying it brings surgical precision.

00:10:52.279 --> 00:10:54.379
What exactly does that mean in this context?

00:10:54.679 --> 00:10:57.620
Right, surgical precision. It tackles a different

00:10:57.620 --> 00:11:00.559
but related problem, what we call the chunk-based

00:11:00.559 --> 00:11:04.629
retrieval problem. Think about it. A single logical

00:11:04.629 --> 00:11:07.429
piece of information, like golf rule three on

00:11:07.429 --> 00:11:09.730
stroke play, might actually get split across

00:11:09.730 --> 00:11:11.710
several different text chunks when you do the

00:11:11.710 --> 00:11:14.110
initial document processing. It just happens

00:11:14.110 --> 00:11:16.830
sometimes based on length or structure. Now without

00:11:16.830 --> 00:11:19.149
metadata, if you ask the agent, tell me everything

00:11:19.149 --> 00:11:21.830
about rule three, the system just does its usual

00:11:21.830 --> 00:11:23.850
vector search based on the words in your question.

00:11:24.360 --> 00:11:26.759
It might find some chunks related to rule three,

00:11:26.820 --> 00:11:28.820
but it could easily miss other crucial parts

00:11:28.820 --> 00:11:30.480
of that same rule. If they happen to land in

00:11:30.480 --> 00:11:32.840
chunks that don't seem mathematically similar

00:11:32.840 --> 00:11:34.820
enough to your question, you get fragmented information.

00:11:35.360 --> 00:11:37.820
Ah, I see. The information is in the database,

00:11:38.000 --> 00:11:40.279
just scattered across chunks that the basic similarity

00:11:40.279 --> 00:11:43.100
search might miss. So how does tagging every

00:11:43.100 --> 00:11:45.440
chunk with structured info, like a simple rule

00:11:45.440 --> 00:11:48.659
number, three tag, fix that fragmentation? It's

00:11:48.659 --> 00:11:51.820
simple, but powerful. By adding that rule number,

00:11:52.269 --> 00:11:55.350
three, tag to every chunk that contains any part

00:11:55.350 --> 00:11:57.549
of rule three, you give the system another way

00:11:57.549 --> 00:12:00.490
to find information. Now, instead of just relying

00:12:00.490 --> 00:12:03.149
on semantic similarity to the question, the system

00:12:03.149 --> 00:12:05.529
can be told, hey, retrieve all chunks that have

00:12:05.529 --> 00:12:08.549
the metadata tag rule number three. Boom. It

00:12:08.549 --> 00:12:10.669
instantly pulls together the complete context

00:12:10.669 --> 00:12:12.950
for rule three, no matter how the text was chunked

00:12:12.950 --> 00:12:15.149
or how similar the individual chunk content is

00:12:15.149 --> 00:12:17.389
to the query itself. It ensures you get the whole

00:12:17.389 --> 00:12:19.450
picture. It's like giving every piece of info

00:12:19.450 --> 00:12:22.100
a precise address. Okay. That makes a lot of

00:12:22.100 --> 00:12:24.019
sense. So how do we actually add and then use

00:12:24.019 --> 00:12:26.639
this metadata effectively, say, with n8n? You

00:12:26.639 --> 00:12:28.860
said it's sort of a two-step thing? Yeah, two main

00:12:28.860 --> 00:12:31.159
parts. Step one is adding the metadata during

00:12:31.159 --> 00:12:33.200
that data preparation phase we talked about earlier.

00:12:33.620 --> 00:12:35.720
When you're processing your documents and getting

00:12:35.720 --> 00:12:37.779
them ready for the vector database, in your data

00:12:37.779 --> 00:12:40.659
loader node, you explicitly add these key value

00:12:40.659 --> 00:12:43.120
pairs. So along with the text content, you'd

00:12:43.120 --> 00:12:45.820
add fields like rule number, one, document type,

00:12:45.919 --> 00:12:50.379
official rules, date created, 2024-01-15, whatever

00:12:50.379 --> 00:12:52.799
makes sense. for your data. That metadata gets

00:12:52.799 --> 00:12:54.919
stored right alongside the text chunk and its

00:12:54.919 --> 00:12:57.500
embedding. Got it. Added during prep. And step

00:12:57.500 --> 00:13:00.039
two, you mentioned a smart way to extract it.

00:13:00.269 --> 00:13:03.029
Right. The smart way avoids manual labor. Instead

00:13:03.029 --> 00:13:05.049
of manually figuring out which rule is which,

00:13:05.289 --> 00:13:08.110
you can leverage AI itself. Take the raw text

00:13:08.110 --> 00:13:10.889
from your golf PDF, for example. You could go

00:13:10.889 --> 00:13:13.409
to an AI chat tool like ChatGPT or Claude and

00:13:13.409 --> 00:13:15.789
give it a prompt like, Hey, help me write some

00:13:15.789 --> 00:13:18.730
JavaScript code for an n8n code node. This code

00:13:18.730 --> 00:13:21.009
needs to take this big block of text as input

00:13:21.009 --> 00:13:23.529
and split it into separate items every time it

00:13:23.529 --> 00:13:25.549
finds the pattern rule, followed by a number

00:13:25.549 --> 00:13:28.169
and a colon, like rule X. For each item, extract

00:13:28.169 --> 00:13:31.200
the rule number and the rule title. The AI can

00:13:31.200 --> 00:13:33.159
often generate perfectly usable code for you.

00:13:33.200 --> 00:13:36.220
You paste that code into an n8n code node, and

00:13:36.220 --> 00:13:38.440
boom, it automatically parses your document into

00:13:38.440 --> 00:13:40.500
structured items, each already tagged with its

00:13:40.500 --> 00:13:42.960
rule number and title. Super efficient. Wow,

00:13:43.039 --> 00:13:45.379
using AI to help structure the data for the AI.

00:13:45.759 --> 00:13:48.559
Clever. Exactly. And then you can get even more

00:13:48.559 --> 00:13:51.360
advanced with dynamic metadata filtering. Imagine

00:13:51.360 --> 00:13:54.139
this. A user asks, tell me about rule eight.

00:13:54.539 --> 00:13:57.340
You could have a first AI agent whose only job

00:13:57.340 --> 00:14:00.059
is to analyze that query. It recognizes the user

00:14:00.059 --> 00:14:02.980
wants a specific rule and outputs just the metadata

00:14:02.980 --> 00:14:07.059
filter needed: rule number 8. Then your main RAG

00:14:07.059 --> 00:14:09.940
agent uses that specific filter in its Supabase

00:14:09.940 --> 00:14:12.500
query, maybe alongside the vector search,

00:14:12.500 --> 00:14:14.899
or maybe even instead of it for such a direct

00:14:14.899 --> 00:14:17.840
query. That's surgical precision, targeting exactly

00:14:17.840 --> 00:14:20.120
the data you need based on the query's intent.

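That two-agent pattern can be sketched as follows. In production, the query analysis would be an LLM call; here a simple word-to-number lookup stands in for it, and the filter is applied over an in-memory array rather than a real Supabase query:

```javascript
// Dynamic metadata filtering: derive a filter from the user's query,
// then select chunks by tag instead of relying on similarity alone.
// The word map is a stand-in for an LLM-based query analyzer.
const WORDS = { one: 1, two: 2, three: 3, four: 4, five: 5,
                six: 6, seven: 7, eight: 8, nine: 9, ten: 10 };

function extractRuleFilter(query) {
  const m = query.toLowerCase().match(/rule\s+(\d+|[a-z]+)/);
  if (!m) return null; // no rule mentioned: fall back to plain vector search
  const n = /^\d+$/.test(m[1]) ? Number(m[1]) : WORDS[m[1]];
  return n ? { rule_number: n } : null;
}

// Apply the filter against chunks tagged during data preparation.
function filterByMetadata(chunks, filter) {
  return chunks.filter(c => c.metadata.rule_number === filter.rule_number);
}

const chunks = [
  { content: "Part 1 of rule 8...", metadata: { rule_number: 8 } },
  { content: "Part 2 of rule 8...", metadata: { rule_number: 8 } },
  { content: "Penalty areas...",    metadata: { rule_number: 17 } },
];

const filter = extractRuleFilter("tell me about rule eight");
console.log(filterByMetadata(chunks, filter).length); // 2
```

Notice that both fragments of rule 8 come back, regardless of how similar their wording is to the question — that's the fix for fragmentation.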
00:14:20.120 --> 00:14:23.580
Okay, so the smart way to add metadata involves

00:14:23.580 --> 00:14:25.899
using AI to generate code that automatically

00:14:25.899 --> 00:14:28.740
extracts and tags that structured data from your

00:14:28.740 --> 00:14:31.159
documents. And the potential uses for this kind

00:14:31.159 --> 00:14:33.779
of metadata filtering, they go way, way beyond

00:14:33.779 --> 00:14:35.559
just golf rules. Honestly, it's like a superpower

00:14:35.559 --> 00:14:37.539
for almost any business data you can think of.

00:14:37.659 --> 00:14:39.159
Yeah, I could see that. Give us some examples.

00:14:39.240 --> 00:14:41.200
How would this apply in, say, a business context?

00:14:41.639 --> 00:14:44.299
Okay, imagine you record and transcribe all your

00:14:44.299 --> 00:14:47.639
team meetings. You could add metadata like date:

00:14:47.759 --> 00:14:51.360
2024-06-20, participants: John, Sarah, project:

00:14:51.360 --> 00:14:54.070
website redesign. Then you can ask your AI

00:14:54.070 --> 00:14:56.409
agent, what did we decide about the homepage

00:14:56.409 --> 00:14:58.750
navigation during the website redesign meeting

00:14:58.750 --> 00:15:01.529
Sarah attended in June? The metadata makes that

00:15:01.529 --> 00:15:03.830
possible. Or for legal documents. Yep. Client

00:15:03.830 --> 00:15:06.470
contracts could have metadata like client name,

00:15:06.669 --> 00:15:09.830
Acme Corp, document type, MSA, status, active,

00:15:10.029 --> 00:15:15.059
renewal date, 2025-03-01. Imagine querying:

00:15:15.159 --> 00:15:17.480
Show me all active master service agreements

00:15:17.480 --> 00:15:19.500
for Acme Corp that are up for renewal in the

00:15:19.500 --> 00:15:21.720
next six months. Super targeted. And maybe one

00:15:21.720 --> 00:15:24.259
more like for technical docs. Sure. A technical

00:15:24.259 --> 00:15:26.659
knowledge base could use tags like topic, API

00:15:26.659 --> 00:15:28.840
authentication, language, Python, library version

00:15:28.840 --> 00:15:31.419
2.1, difficulty, advanced. Then a developer could

00:15:31.419 --> 00:15:34.340
ask, find advanced Python examples for API authentication

00:15:34.340 --> 00:15:38.000
using library version 2.1 or later. Wow. Okay.

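Queries like these combine several metadata conditions at once. As an in-memory sketch (field names and values are illustrative; against Supabase you'd express the same conditions as filters on the JSONB metadata column):

```javascript
// Multi-dimensional metadata query: the kind of filtering that lets you
// query a pile of unstructured chunks almost like a structured database.
const docs = [
  { content: "MSA terms...", metadata: { client: "Acme Corp", type: "MSA", status: "active",  renewal: "2025-03-01" } },
  { content: "Old NDA...",   metadata: { client: "Acme Corp", type: "NDA", status: "expired", renewal: "2023-01-01" } },
  { content: "Other MSA...", metadata: { client: "Globex",    type: "MSA", status: "active",  renewal: "2026-09-01" } },
];

// "Active MSAs for a given client renewing before a cutoff date."
function query(pool, { client, type, status, renewsBefore }) {
  return pool.filter(d =>
    d.metadata.client === client &&
    d.metadata.type === type &&
    d.metadata.status === status &&
    d.metadata.renewal < renewsBefore // ISO dates compare correctly as strings
  );
}

const hits = query(docs, { client: "Acme Corp", type: "MSA",
                           status: "active", renewsBefore: "2025-07-01" });
console.log(hits.length); // 1
```

Each condition narrows a different dimension — client, document type, status, date — which plain vector similarity simply can't express.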
00:15:38.350 --> 00:15:40.129
When you step back and look at it like that,

00:15:40.250 --> 00:15:42.450
metadata really does fundamentally change how

00:15:42.450 --> 00:15:44.730
you interact with huge piles of unstructured

00:15:44.730 --> 00:15:46.970
text. It lets you query it almost like a structured

00:15:46.970 --> 00:15:49.909
database. It's a total game changer. It unlocks

00:15:49.909 --> 00:15:54.549
that database-like precision for messy, unstructured

00:15:54.549 --> 00:15:57.889
data. So what kind of really new search capabilities

00:15:57.889 --> 00:16:00.610
does metadata filtering unlock for businesses?

00:16:01.259 --> 00:16:04.379
It enables these highly specific, almost multi-

00:16:04.379 --> 00:16:07.480
dimensional queries across large, diverse data

00:16:07.480 --> 00:16:10.080
sets that were previously just impossible. Okay,

00:16:10.080 --> 00:16:11.960
so if we think about building one of these really

00:16:11.960 --> 00:16:15.179
robust RAG systems, not just a basic one but one

00:16:15.179 --> 00:16:18.120
with re-ranking and metadata, thinking about

00:16:18.120 --> 00:16:20.419
it as a whole project, we'd probably break it down

00:16:20.419 --> 00:16:23.470
into maybe three distinct phases. Yeah, I think

00:16:23.470 --> 00:16:25.809
that makes sense. Phase one has got to be data

00:16:25.809 --> 00:16:28.230
preparation. And we can't emphasize this enough.

00:16:28.330 --> 00:16:30.509
This phase is critical. It covers everything

00:16:30.509 --> 00:16:33.789
from getting the documents in ingestion to chunking

00:16:33.789 --> 00:16:36.250
them strategically, ideally along logical breaks

00:16:36.250 --> 00:16:38.269
in the content, not just random character limits.

00:16:38.529 --> 00:16:41.029
Then the AI-assisted metadata extraction we

00:16:41.029 --> 00:16:42.990
talked about, storing it all neatly structured

00:16:42.990 --> 00:16:46.389
in your vector DB. And vitally, quality validation.

00:16:46.590 --> 00:16:48.190
You have to check that the chunks and metadata

00:16:48.190 --> 00:16:50.230
are accurate before you build on top of them.

00:16:50.289 --> 00:16:52.389
Garbage in, garbage out, right? Garbage in, garbage

00:16:52.389 --> 00:16:55.710
out. Okay. Phase one, solid data prep. What's

00:16:55.710 --> 00:16:58.309
phase two? Phase two is what I call intelligent

00:16:58.309 --> 00:17:01.149
retrieval. This is the main workflow the user

00:17:01.149 --> 00:17:03.649
actually interacts with. It starts with analyzing

00:17:03.649 --> 00:17:06.769
the user's query, then potentially applying those

00:17:06.769 --> 00:17:09.210
dynamic metadata filters if the query calls for

00:17:09.210 --> 00:17:12.490
it. Next, doing the vector search, but retrieving

00:17:12.490 --> 00:17:14.549
that larger pool of candidates, remember, like

00:17:14.549 --> 00:17:17.680
20. Followed immediately by re-ranking to score

00:17:17.680 --> 00:17:20.220
those candidates for true relevance. And finally,

00:17:20.319 --> 00:17:22.480
feeding only the best, highest scoring chunks

00:17:22.480 --> 00:17:25.279
to the LLM to generate the actual response. Okay,

00:17:25.319 --> 00:17:27.420
prep, then intelligent retrieval. What's the

00:17:27.420 --> 00:17:30.180
third phase? Is it ever really done? Huh, good

00:17:30.180 --> 00:17:32.740
question. No, it's never truly done. So phase

00:17:32.740 --> 00:17:36.299
three is continuous improvement. A good RAG system

00:17:36.299 --> 00:17:39.259
needs ongoing care and feeding. This means monitoring

00:17:39.259 --> 00:17:41.440
things like the relevance scores the re-ranker

00:17:41.440 --> 00:17:44.480
is producing. Are they consistently high? Or

00:17:44.480 --> 00:17:47.599
are there queries where it struggles? Analyzing

00:17:47.599 --> 00:17:49.460
the kinds of questions users are asking, maybe

00:17:49.460 --> 00:17:51.680
that gives you ideas for new, useful metadata

00:17:51.680 --> 00:17:54.579
tags to add. And definitely integrating user

00:17:54.579 --> 00:17:56.920
feedback. Even a simple thumbs up, thumbs down,

00:17:57.039 --> 00:17:59.819
"was this answer helpful?" on the responses, can

00:17:59.819 --> 00:18:01.779
help you pinpoint areas that need refinement.

00:18:02.329 --> 00:18:04.829
I still wrestle with prompt drift myself sometimes,

00:18:04.930 --> 00:18:07.690
getting the LLM prompts just right. So this continuous

00:18:07.690 --> 00:18:09.609
improvement loop, it really is vital for making

00:18:09.609 --> 00:18:12.029
sure the system stays effective long term. That

00:18:12.029 --> 00:18:14.049
makes perfect sense. So thinking back to phase

00:18:14.049 --> 00:18:16.109
one, data preparation, what would you say is

00:18:16.109 --> 00:18:18.190
the single most critical aspect there for the

00:18:18.190 --> 00:18:20.789
whole system's success? Quality validation, hands

00:18:20.789 --> 00:18:23.230
down. Ensuring the accuracy of both the text

00:18:23.230 --> 00:18:25.549
chunks and the metadata you extract is absolutely

00:18:25.549 --> 00:18:28.069
paramount. Everything else relies on that foundation

00:18:28.069 --> 00:18:31.009
being solid. As people start building these more

00:18:31.009 --> 00:18:33.869
advanced RAG systems using re-ranking and

00:18:33.869 --> 00:18:36.750
metadata, are there common mistakes or pitfalls

00:18:36.750 --> 00:18:39.130
they should watch out for? Oh, definitely. There

00:18:39.130 --> 00:18:41.990
are a few classic traps people fall into. Knowing

00:18:41.990 --> 00:18:44.769
them up front can save a lot of pain later. Okay,

00:18:44.829 --> 00:18:47.630
let's hear them. First big one, over-engineering

00:18:47.630 --> 00:18:50.150
metadata. You get excited about tags and create

00:18:50.150 --> 00:18:52.650
like... 50 different metadata fields right at

00:18:52.650 --> 00:18:54.490
the start, but most of them never actually get

00:18:54.490 --> 00:18:58.829
used in queries. The fix? Start simple. Identify

00:18:58.829 --> 00:19:01.069
maybe just two or three really key fields based

00:19:01.069 --> 00:19:03.730
on how you think people will query. Add more

00:19:03.730 --> 00:19:06.309
only when you see a clear need based on actual

00:19:06.309 --> 00:19:08.730
usage patterns. Don't boil the ocean initially.

00:19:09.069 --> 00:19:11.450
Okay, start minimal with metadata. What else?

00:19:11.940 --> 00:19:14.140
Insufficient re-ranking candidates. This is

00:19:14.140 --> 00:19:15.960
running the re-ranker, but only giving it the

00:19:15.960 --> 00:19:18.240
default four or five results from the initial

00:19:18.240 --> 00:19:20.700
vector search. Remember, the re-ranker needs

00:19:20.700 --> 00:19:23.019
options. You've got to increase that initial

00:19:23.019 --> 00:19:25.500
retrieval limit in your vector store node. Get

00:19:25.500 --> 00:19:27.700
it up to at least 15 or 20. Give it a good pool

00:19:27.700 --> 00:19:29.619
to choose from. Right. Feed the re-ranker properly.

00:19:30.039 --> 00:19:33.279
Any others? Yeah, related to that. Ignoring relevance

00:19:33.279 --> 00:19:36.000
scores. Just because the re-ranker put something

00:19:36.000 --> 00:19:38.099
at the top doesn't mean it's actually good enough

00:19:38.099 --> 00:19:41.559
if its score is really low. Don't just blindly

00:19:41.559 --> 00:19:44.700
pass the top three results to the LLM. Implement

00:19:44.700 --> 00:19:47.440
a score threshold. Say, only use chunks with

00:19:47.440 --> 00:19:50.700
a re-ranker score of 0.5 or higher. Filter

00:19:50.700 --> 00:19:52.980
out the low confidence results, even if they're

00:19:52.980 --> 00:19:55.779
ranked highest. That's a smart refinement. Setting
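The score threshold described above is a one-liner in practice. A sketch, using the 0.5 floor from the example (tune it to your re-ranker's score distribution):

```python
# Apply a relevance floor to re-ranked results: even the top-ranked
# chunk is dropped if its score is too low.
def filter_by_score(ranked, threshold: float = 0.5, top_k: int = 3):
    """ranked: list of (chunk, score) pairs, sorted best-first."""
    kept = [(chunk, score) for chunk, score in ranked if score >= threshold]
    return kept[:top_k]
```

If nothing clears the floor, the list comes back empty, which is a useful signal: the agent can say it found nothing relevant instead of answering from weak context.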

00:19:55.779 --> 00:19:58.920
a quality bar. And the last one. Static metadata

00:19:58.920 --> 00:20:01.839
schemas. This is defining your metadata structure

00:20:01.839 --> 00:20:04.339
once at the beginning and then never revisiting

00:20:04.339 --> 00:20:06.740
it. Your data sources might change. The kinds

00:20:06.740 --> 00:20:08.960
of questions users ask will definitely evolve.

00:20:09.319 --> 00:20:11.380
Your metadata schema needs to be treated like

00:20:11.380 --> 00:20:14.039
a living document. Review it periodically, adapt

00:20:14.039 --> 00:20:16.920
it, add new fields, maybe retire old ones. Keep

00:20:16.920 --> 00:20:18.740
it relevant to how the system is actually being

00:20:18.740 --> 00:20:21.039
used. Okay, those are great points. Why is it

00:20:21.039 --> 00:20:23.940
so crucial, do you think, to actively avoid these

00:20:23.940 --> 00:20:26.259
specific pitfalls when you're trying to build

00:20:26.259 --> 00:20:28.259
a RAG system that people can actually trust?

00:20:28.759 --> 00:20:31.019
It really comes down to engineering user trust.

00:20:31.259 --> 00:20:34.480
Every time the AI gives an irrelevant or nonsensical

00:20:34.480 --> 00:20:36.839
answer because of one of these issues, that trust

00:20:36.839 --> 00:20:39.640
erodes. Avoiding these pitfalls is about building

00:20:39.640 --> 00:20:42.140
a system that is consistently reliable and useful,

00:20:42.339 --> 00:20:44.539
which is the only way users will keep coming

00:20:44.539 --> 00:20:47.019
back to it. Makes sense. So looking at those

00:20:47.019 --> 00:20:48.920
pitfalls, which one do you think is maybe the

00:20:48.920 --> 00:20:52.079
most damaging to a RAG system's ability to adapt

00:20:52.079 --> 00:20:54.599
and stay useful over the long term? Ooh, good

00:20:54.599 --> 00:20:58.000
question. I'd probably say static metadata schemas,

00:20:58.019 --> 00:21:00.920
because your data and user needs will change.

00:21:01.099 --> 00:21:03.500
If your metadata structure can't adapt, the system's

00:21:03.500 --> 00:21:05.920
ability to precisely retrieve relevant information

00:21:05.920 --> 00:21:08.559
will degrade over time, no matter how good the

00:21:08.559 --> 00:21:11.140
other components are. Adaptability is key. So

00:21:11.140 --> 00:21:12.960
let's bring it all together. What does this really

00:21:12.960 --> 00:21:15.619
mean for you, the learner, trying to build better

00:21:15.619 --> 00:21:18.859
AI? We know basic RAG agents, while exciting,

00:21:19.099 --> 00:21:21.609
can often feel... a bit clumsy, a bit hit or

00:21:21.609 --> 00:21:23.650
miss. But what we've seen today is that with

00:21:23.650 --> 00:21:26.029
intelligent design choices, they can transform

00:21:26.029 --> 00:21:28.509
into something much more powerful. Exactly. It's

00:21:28.509 --> 00:21:30.750
that combination, casting a wider net initially,

00:21:30.910 --> 00:21:33.690
retrieving maybe 10 or 20 candidates, then applying

00:21:33.690 --> 00:21:36.289
that smart, intelligent re-ranking to score

00:21:36.289 --> 00:21:39.049
them for true contextual relevance, and then

00:21:39.049 --> 00:21:42.130
layering on strategic metadata, filtering for

00:21:42.130 --> 00:21:44.490
that surgical precision when needed, that synergy.

00:21:44.750 --> 00:21:47.450
That's what creates a RAG system users can actually

00:21:47.450 --> 00:21:50.269
trust, rely on, and get consistently accurate

00:21:50.269 --> 00:21:52.819
answers from. It feels like a shift from just

00:21:52.819 --> 00:21:55.299
sort of hoping the AI gets it right to actively

00:21:55.299 --> 00:21:57.460
engineering this system so that it's much more

00:21:57.460 --> 00:22:00.240
likely to get it right. These techniques, re-ranking and

00:22:00.240 --> 00:22:02.440
smart metadata, they offer immediate, tangible

00:22:02.440 --> 00:22:05.039
ways to improve your AI agent's accuracy and

00:22:05.039 --> 00:22:07.859
overall usefulness. Yeah. Your RAG agents basically

00:22:07.859 --> 00:22:10.900
just got superpowers. Seriously. It's time to

00:22:10.900 --> 00:22:13.019
move past the frustration phase and start empowering

00:22:13.019 --> 00:22:16.039
your AI to really perform at its best. And for

00:22:16.039 --> 00:22:17.940
you, the listener, this provides a clearer path

00:22:17.940 --> 00:22:20.450
forward. A path towards AI interactions that

00:22:20.450 --> 00:22:23.130
are deeply informed, highly accurate, and capable

00:22:23.130 --> 00:22:25.150
of cutting through the sheer volume of information

00:22:25.150 --> 00:22:28.190
we all face. So here's a thought to leave you

00:22:28.190 --> 00:22:30.869
with: think about the data you interact with every

00:22:30.869 --> 00:22:33.650
single day. Your emails, your documents, transcripts,

00:22:33.650 --> 00:22:36.309
articles. What structured insights are currently

00:22:36.309 --> 00:22:39.349
just hidden inside all that unstructured text

00:22:39.349 --> 00:22:42.960
waiting for intelligent metadata to unlock

00:22:42.960 --> 00:22:45.500
their real power, their true utility for you?

00:22:45.579 --> 00:22:48.220
We really hope this deep dive has given you not

00:22:48.220 --> 00:22:49.859
just a clearer understanding of these concepts,

00:22:49.920 --> 00:22:52.380
but also a practical blueprint you can use to

00:22:52.380 --> 00:22:54.400
start building more intelligent, more reliable

00:22:54.400 --> 00:22:57.880
AI systems yourself. Until next time, go to your

00:22:57.880 --> 00:22:58.240
own music.
