WEBVTT

00:00:00.000 --> 00:00:03.040
Imagine an AI chatbot, you know, brilliant at

00:00:03.040 --> 00:00:05.419
general conversation. But then you ask it about

00:00:05.419 --> 00:00:07.599
your company's specific policies or maybe the

00:00:07.599 --> 00:00:10.060
nuances of your industry. And it either guesses

00:00:10.060 --> 00:00:12.519
or maybe worse, just kind of makes things up.

00:00:12.960 --> 00:00:15.919
What if you could give that AI a custom brain,

00:00:16.179 --> 00:00:20.420
one filled with your verified knowledge? Welcome

00:00:20.420 --> 00:00:23.140
to the deep dive. Today, we're diving deep into

00:00:23.140 --> 00:00:25.699
exactly that idea. Retrieval Augmented Generation,

00:00:26.019 --> 00:00:28.899
or RAG for short. And honestly, it's simpler than

00:00:28.899 --> 00:00:30.780
it sounds, but it fundamentally changes how we

00:00:30.780 --> 00:00:33.439
can trust and actually use AI. We're going to

00:00:33.439 --> 00:00:36.320
discover what RAG is, how these powerful Mind

00:00:36.320 --> 00:00:38.700
Palace databases make it possible. We'll break

00:00:38.700 --> 00:00:40.859
down the two crucial parts of any RAG system.

00:00:41.100 --> 00:00:42.700
Then yeah, we'll walk you through building one,

00:00:42.759 --> 00:00:45.200
step by step, no code needed. You'll even learn

00:00:45.200 --> 00:00:46.979
how to give your AI a memory, make it remember

00:00:46.979 --> 00:00:49.619
stuff, and power its new brain efficiently. It's

00:00:49.619 --> 00:00:51.530
really about turning a smart... chatbot into

00:00:51.530 --> 00:00:54.030
a true expert on your stuff. Okay, let's unpack

00:00:54.030 --> 00:00:56.130
this. So let's start right at the heart of it.

00:00:56.229 --> 00:00:59.850
What is RAG? Retrieval Augmented Generation.

00:01:00.829 --> 00:01:03.770
In plain English, it just means your AI does

00:01:03.770 --> 00:01:05.549
its own research before it speaks. Think of it

00:01:05.549 --> 00:01:07.230
like an open book test. When you ask a question,

00:01:07.290 --> 00:01:11.250
maybe how many feet are in a mile? A RAG system

00:01:11.250 --> 00:01:13.269
doesn't just pull from its huge general training

00:01:13.269 --> 00:01:16.349
data. No, it first retrieves specific accurate

00:01:16.349 --> 00:01:20.109
info like 5,280 feet from a source you've given

00:01:20.109 --> 00:01:22.900
it. Only then does it use that verified fact

00:01:22.900 --> 00:01:26.219
to augment or, well, significantly improve its

00:01:26.219 --> 00:01:28.879
answer. This is what transforms a generic AI

00:01:28.879 --> 00:01:31.340
into one that genuinely knows your company docs,

00:01:31.560 --> 00:01:33.819
your support tickets, maybe your internal policies.

00:01:34.060 --> 00:01:36.459
Yeah, it changes the AI from just a generalist

00:01:36.459 --> 00:01:39.359
into a trusted specialist. OK, so the core idea

00:01:39.359 --> 00:01:41.739
is that the AI isn't just generating from like

00:01:41.739 --> 00:01:44.200
its initial programming. It's actively seeking

00:01:44.200 --> 00:01:46.640
out and pulling in external facts first. Exactly.

00:01:46.739 --> 00:01:48.859
It finds the facts, then generates with confidence.
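
The retrieve-then-augment loop just described can be sketched in a few lines of Python. This is a toy illustration only — the mini knowledge base, the keyword matcher, and the prompt template are all invented for this example; a real system would swap in a vector database and a language model:

```python
# Toy RAG flow: retrieve a verified fact first, then build an
# augmented prompt for the language model to answer from.
KNOWLEDGE_BASE = {
    "feet in a mile": "There are 5,280 feet in one mile.",
    "holes in a round": "A standard round of golf has 18 holes.",
}

def retrieve(question: str) -> str:
    """Return the stored fact whose key shares the most words with the question."""
    def overlap(key: str) -> int:
        return len(set(key.split()) & set(question.lower().split()))
    return KNOWLEDGE_BASE[max(KNOWLEDGE_BASE, key=overlap)]

def augment(question: str) -> str:
    """Build the augmented prompt the generator would receive."""
    fact = retrieve(question)
    return f"Context: {fact}\nQuestion: {question}\nAnswer using only the context."

print(augment("How many feet are in a mile?"))
```

The generator then answers from the supplied context instead of guessing from its training data — that is the whole trick.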

00:01:49.390 --> 00:01:52.609
But where does this AI store all the specific

00:01:52.609 --> 00:01:55.430
knowledge so it can find it so fast? That's where

00:01:55.430 --> 00:01:57.250
the vector database comes in. It's truly the

00:01:57.250 --> 00:02:01.010
brain of a RAG agent. Imagine this vast, multi

00:02:01.010 --> 00:02:03.269
-dimensional space, kind of like a galaxy, right,

00:02:03.269 --> 00:02:05.829
where every star is a piece of information from

00:02:05.829 --> 00:02:08.650
your documents. Each chunk of text becomes a single

00:02:08.650 --> 00:02:10.310
point of light in this galaxy. That's what we

00:02:10.310 --> 00:02:12.750
call a vector. And the really profound insight

00:02:12.750 --> 00:02:16.310
here is semantic similarity. This database doesn't

00:02:16.310 --> 00:02:18.349
organize things alphabetically, not at all. It

00:02:18.349 --> 00:02:20.830
places data points based on their meaning. So

00:02:20.830 --> 00:02:23.830
all the points about, say, fruits cluster together

00:02:23.830 --> 00:02:26.289
over here, and animals form a totally different

00:02:26.289 --> 00:02:28.810
constellation over there. When you ask, what is

00:02:28.810 --> 00:02:31.550
a kitten, your question itself gets turned into

00:02:31.550 --> 00:02:33.810
a new point of light. And it appears naturally

00:02:33.810 --> 00:02:36.710
right within that animal constellation. The database

00:02:36.710 --> 00:02:39.110
then super quickly finds the closest points,

00:02:39.189 --> 00:02:41.750
maybe cat, puppy, even wolf, and pulls the text

00:02:41.750 --> 00:02:44.189
link to them. That gives you a perfect, hyper

00:02:44.189 --> 00:02:45.909
-relevant answer. It's all about asking, what

00:02:45.909 --> 00:02:48.449
does this mean? And getting an intelligent, context

00:02:48.449 --> 00:02:50.969
-aware match. For a deep dive today, we'll be

00:02:50.969 --> 00:02:53.669
using Supabase as our vector database example,

00:02:53.889 --> 00:02:56.669
which is built right on top of PostgreSQL. So

00:02:56.669 --> 00:02:59.169
this database, it really understands concepts,

00:02:59.280 --> 00:03:02.219
then, not just keywords. Yes, precisely. It's

00:03:02.219 --> 00:03:04.680
all about meaning and context. Okay. And it's

00:03:04.680 --> 00:03:07.020
this clever organization that lets the two crucial

00:03:07.020 --> 00:03:10.159
halves of a RAG system work so well together.
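
That "closest stars in the galaxy" search is nearest-neighbor ranking by cosine similarity. Here's a tiny sketch with hand-made 3-dimensional vectors — real embeddings have hundreds or thousands of dimensions, and these toy coordinates are invented purely for illustration:

```python
import math

def cosine_similarity(a, b):
    """Angle-based closeness of two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

# Hand-made "meaning coordinates": animal words cluster, fruit words don't.
points = {
    "cat":    (0.9, 0.1, 0.0),
    "puppy":  (0.8, 0.2, 0.1),
    "banana": (0.0, 0.9, 0.3),
}

query = (0.85, 0.15, 0.05)  # where an embedding for "kitten" might land
ranked = sorted(points, key=lambda w: cosine_similarity(points[w], query),
                reverse=True)
print(ranked)  # the animal words outrank the fruit
```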

00:03:10.400 --> 00:03:13.139
If we kind of connect this to the bigger picture,

00:03:13.319 --> 00:03:16.050
building a RAG system, it's like... assembling

00:03:16.050 --> 00:03:18.969
this, I don't know, magical, intelligent library,

00:03:19.110 --> 00:03:21.370
it's really a two -phase project, requires two

00:03:21.370 --> 00:03:23.889
distinct but totally complementary halves of

00:03:23.889 --> 00:03:25.650
a single brain, you could say. First, you've

00:03:25.650 --> 00:03:27.430
got what we call the AI librarian. That's your

00:03:27.430 --> 00:03:29.169
RAG pipeline. This is the behind-the-scenes

00:03:29.169 --> 00:03:32.090
worker. Its job is to meticulously read every

00:03:32.090 --> 00:03:34.150
single document you give it, understand the content,

00:03:34.370 --> 00:03:36.409
and then place each piece of information on the

00:03:36.409 --> 00:03:39.449
correct shelf within your, well, infinitely large

00:03:39.800 --> 00:03:42.639
perfectly organized vector database. This isn't

00:03:42.639 --> 00:03:44.819
an ongoing job, mind you. It's a one -time process

00:03:44.819 --> 00:03:47.240
you run to initially stock the library. Right.

00:03:47.539 --> 00:03:50.319
And then you have the AI scholar. That's your

00:03:50.319 --> 00:03:52.860
RAG agent. This is the public-facing expert. The

00:03:52.860 --> 00:03:54.780
scholar kind of sits at the front desk, ready

00:03:54.780 --> 00:03:57.439
to help visitors. When you ask a difficult question,

00:03:57.620 --> 00:04:00.180
the scholar instantly zips through the library,

00:04:00.479 --> 00:04:02.599
pulls the exact pages from the correct books,

00:04:02.719 --> 00:04:05.080
synthesizes the information, and gives you this

00:04:05.080 --> 00:04:08.199
brilliant custom -written answer. But these two

00:04:08.199 --> 00:04:10.879
parts, they're completely interdependent. A scholar

00:04:10.879 --> 00:04:13.699
without a library is, well, just a generic chatbot.

00:04:13.800 --> 00:04:16.259
No specialized knowledge. And a library without

00:04:16.259 --> 00:04:18.680
a scholar, that's just a silent... inaccessible

00:04:18.680 --> 00:04:21.560
database. So one half is organizing all the knowledge

00:04:21.560 --> 00:04:23.439
and the other half leverages that organization

00:04:23.439 --> 00:04:26.420
to actually provide the answers. Precisely. A

00:04:26.420 --> 00:04:28.759
powerful two -phase system, yeah. So let's dig

00:04:28.759 --> 00:04:31.019
into the librarian's job then, that RAG pipeline.

00:04:31.240 --> 00:04:33.139
It's really all about transforming raw documents

00:04:33.139 --> 00:04:35.540
into a searchable, structured knowledge base.

00:04:35.759 --> 00:04:38.279
And this process has four key steps. First up,

00:04:38.379 --> 00:04:40.720
document input. This is simple. It's just acquiring

00:04:40.720 --> 00:04:42.819
the book. It's your source document. It would

00:04:42.819 --> 00:04:45.839
be a PDF, a text file, or even data you pull

00:04:45.839 --> 00:04:47.649
from other apps, maybe like a CRM or something.

00:04:48.329 --> 00:04:50.490
Then we chunk the document. Think about it. You

00:04:50.490 --> 00:04:52.589
wouldn't file an entire 500 -page book under

00:04:52.589 --> 00:04:55.009
just one topic, right? You'd separate it into

00:04:55.009 --> 00:04:57.930
chapters or sections. Similarly, we split large

00:04:57.930 --> 00:05:01.149
documents into smaller, more focused pieces of

00:05:01.149 --> 00:05:03.050
text. This is super critical because it creates

00:05:03.050 --> 00:05:05.810
these context -rich pages for more accurate searching

00:05:05.810 --> 00:05:09.170
later on. A pro tip here, aim for chunks of maybe

00:05:09.170 --> 00:05:11.970
about 1,000 characters and use a 200-character

00:05:11.970 --> 00:05:13.730
overlap between chunks. That just helps ensure

00:05:13.730 --> 00:05:15.629
concepts aren't awkwardly cut off right in the

00:05:15.629 --> 00:05:17.509
middle. Okay, and here's where it gets, well...

00:05:17.610 --> 00:05:21.110
really interesting. After chunking, you embed.
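
The chunking rule of thumb from a moment ago — pieces of roughly 1,000 characters with a 200-character overlap — can be sketched like this. It's a simplified character-based splitter; real data loaders typically also try to break on sentence or paragraph boundaries:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200):
    """Split text into chunks of `chunk_size` characters, each sharing
    `overlap` characters with the previous chunk so ideas aren't cut off."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

document = "x" * 2500  # stand-in for a long PDF's extracted text
chunks = chunk_text(document)
print([len(c) for c in chunks])
```

Each chunk repeats the tail of the one before it, so a concept straddling a boundary still appears whole in at least one chunk.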

00:05:21.250 --> 00:05:23.750
This is I think the most magical part. An embeddings

00:05:23.750 --> 00:05:27.470
model. It's a sophisticated AI, acts like a universal

00:05:27.470 --> 00:05:30.430
translator. It converts those text chunks into

00:05:30.430 --> 00:05:32.550
vectors, which are basically just long strings

00:05:32.550 --> 00:05:35.269
of numbers. Think of it like giving each chunk

00:05:35.269 --> 00:05:38.230
a precise coordinate in that huge galaxy of meaning

00:05:38.230 --> 00:05:40.629
we talked about, that 1536 number you sometimes

00:05:40.629 --> 00:05:43.189
see. That just refers to the number of dimensions

00:05:43.189 --> 00:05:45.850
or kind of like traits an AI uses to describe

00:05:45.850 --> 00:05:47.990
each piece of information. It's like a super

00:05:47.990 --> 00:05:50.029
detailed profile for every single text chunk.

00:05:50.189 --> 00:05:52.910
And for your AI to understand your data consistently,

00:05:53.089 --> 00:05:56.180
this profile format, it has to match across all

00:05:56.180 --> 00:05:59.139
parts of your system. Finally, you vectorize

00:05:59.139 --> 00:06:01.839
those chunks. This just means storing these numerical

00:06:01.839 --> 00:06:04.639
vectors in your Supabase vector database. They

00:06:04.639 --> 00:06:07.060
get intelligently placed by meaning, right? So

00:06:07.060 --> 00:06:08.620
all the golf rules about putting are grouped

00:06:08.620 --> 00:06:10.980
together, far away from rules about driving.
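
That "file it by meaning" step boils down to storing (vector, text) pairs and answering queries with the nearest vectors. Here's a minimal in-memory stand-in for what the documents table plus its matching function provide — the class name, toy vectors, and golf snippets are all invented for this sketch, and a real store would index the vectors for speed:

```python
import math

class MiniVectorStore:
    """In-memory sketch of a vector table: store (embedding, text) rows,
    search by cosine similarity."""
    def __init__(self):
        self.rows = []  # list of (vector, text) pairs

    def add(self, vector, text):
        self.rows.append((vector, text))

    def match(self, query, k=2):
        def sim(v):
            dot = sum(a * b for a, b in zip(v, query))
            norms = (math.sqrt(sum(a * a for a in v))
                     * math.sqrt(sum(b * b for b in query)))
            return dot / norms
        ranked = sorted(self.rows, key=lambda row: sim(row[0]), reverse=True)
        return [text for _, text in ranked[:k]]

store = MiniVectorStore()
store.add((1.0, 0.0), "Rules about putting on the green")
store.add((0.9, 0.1), "Rules about marking your ball on the green")
store.add((0.0, 1.0), "Rules about driving from the tee")

print(store.match((0.95, 0.05), k=2))  # both "green" rules, not the tee rule
```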

00:06:11.439 --> 00:06:14.199
In a tool like n8n, this whole process is actually

00:06:14.199 --> 00:06:16.660
simplified into like a five -node assembly line.

00:06:16.800 --> 00:06:18.699
You've got a trigger to start it, maybe a Google

00:06:18.699 --> 00:06:20.939
Drive node to get the document, a data loader

00:06:20.939 --> 00:06:23.500
for the chunking part, an embeddings model, like

00:06:23.500 --> 00:06:26.300
OpenAI's text-embedding-3-small, to do the translating,

00:06:26.439 --> 00:06:29.620
and the Supabase vector store node to file it

00:06:29.620 --> 00:06:31.519
all away neatly. You just run this workflow

00:06:31.519 --> 00:06:33.939
once for each document you want to add. Gotcha.

00:06:34.000 --> 00:06:36.139
So it's about breaking down the info, translating

00:06:36.139 --> 00:06:39.300
it into this conceptual number language, and

00:06:39.300 --> 00:06:41.459
then filing it away smartly. Precisely. Creating

00:06:41.459 --> 00:06:43.860
that structured, searchable knowledge base. Yeah.

00:06:43.939 --> 00:06:46.019
Okay, to make this all super concrete, let's

00:06:46.019 --> 00:06:48.600
talk about our mission for today. We're building

00:06:48.600 --> 00:06:52.699
an AI golf caddy using a 22 -page PDF. It's called

00:06:52.699 --> 00:06:55.480
The Rules of Golf Simplified. That's our single

00:06:55.480 --> 00:06:58.060
source of truth. Why golf rules? Well, because

00:06:58.060 --> 00:07:00.879
they are incredibly dense, super specific, and

00:07:00.879 --> 00:07:04.399
totally self-contained. A generic AI might

00:07:04.399 --> 00:07:06.540
just guess or even hallucinate, you know, make

00:07:06.540 --> 00:07:09.139
stuff up if you ask a tricky golf question. But

00:07:09.139 --> 00:07:11.560
our agent, trained exclusively on this official

00:07:11.560 --> 00:07:14.860
rulebook, will be a true, reliable expert. Our

00:07:14.860 --> 00:07:17.339
goal is an AI caddy that can instantly and accurately

00:07:17.339 --> 00:07:19.720
answer very specific questions like, what am

00:07:19.720 --> 00:07:21.639
I allowed to do for practice? Or maybe, can I

00:07:21.639 --> 00:07:23.360
hit a practice shot between playing two holes?

00:07:23.600 --> 00:07:25.560
Yeah, and this isn't really about golf, is it?

00:07:25.740 --> 00:07:28.100
What's fascinating here is that this golf PDF,

00:07:28.420 --> 00:07:31.180
it's just a placeholder. This whole process is

00:07:31.180 --> 00:07:33.839
actually a universal blueprint for any specialized

00:07:33.839 --> 00:07:36.339
knowledge you want to give your AI. Your data

00:07:36.339 --> 00:07:39.839
source could be anything. Your last 5,000 HubSpot

00:07:39.839 --> 00:07:42.240
support tickets, maybe an Airtable base full

00:07:42.240 --> 00:07:44.519
of project data or even recent client emails.

00:07:45.279 --> 00:07:48.420
Imagine asking your AI, what are the top three

00:07:48.420 --> 00:07:50.660
common product issues we're seeing in the UK?

00:07:50.800 --> 00:07:53.660
Or, which projects are running over budget this

00:07:53.660 --> 00:07:56.259
year? And your agent's trigger, how you start

00:07:56.259 --> 00:07:58.379
the conversation. That can also be anything beyond

00:07:58.379 --> 00:08:00.680
just a chat window. It could be an email sent

00:08:00.680 --> 00:08:03.480
to ask@yourcompany.com or a submission on your

00:08:03.480 --> 00:08:06.000
website form, or even a scheduled task that,

00:08:06.019 --> 00:08:08.120
say, summarizes the week's support tickets automatically.

00:08:08.600 --> 00:08:11.180
The skills you learn building this simple AI

00:08:11.180 --> 00:08:14.079
golf caddy, they're absolutely foundational for

00:08:14.079 --> 00:08:16.240
building really powerful business systems. So

00:08:16.240 --> 00:08:18.800
this simple golf example, it really unlocks vast

00:08:18.800 --> 00:08:21.230
possibilities for building custom AI then. Absolutely.

00:08:21.350 --> 00:08:23.329
It's a template, a template for pretty much any

00:08:23.329 --> 00:08:25.470
custom AI you can think of. All right, let's

00:08:25.470 --> 00:08:27.589
get our hands dirty then. Step one, building

00:08:27.589 --> 00:08:30.529
the library itself using Supabase. This is going

00:08:30.529 --> 00:08:32.870
to be the home for our AI agent. It's built on

00:08:32.870 --> 00:08:35.309
PostgreSQL, which is incredibly solid. Plus,

00:08:35.389 --> 00:08:37.909
it's pretty user -friendly and has a very generous

00:08:37.909 --> 00:08:40.070
free tier, which is great for getting started.

00:08:40.269 --> 00:08:43.529
Think of this step as laying the foundation,

00:08:43.830 --> 00:08:46.730
putting up the walls, and installing that magical

00:08:46.730 --> 00:08:48.889
shelving system for all your AI's knowledge.

00:08:49.419 --> 00:08:51.279
Okay, first up, you'll head over to supabase

00:08:51.279 --> 00:08:54.059
.com and create a new project. You give it a

00:08:54.059 --> 00:08:56.419
name, obviously, and critically, create a strong

00:08:56.419 --> 00:08:59.580
database password. Critical tip. Please, save

00:08:59.580 --> 00:09:01.659
this password securely right away, put it in

00:09:01.659 --> 00:09:03.419
a password manager, you will definitely need

00:09:03.419 --> 00:09:06.279
it later. Once your project is spinning up, might

00:09:06.279 --> 00:09:08.799
take a minute or two, you'll configure the vector

00:09:08.799 --> 00:09:11.590
database part. Now, this sounds a bit intimidating

00:09:11.590 --> 00:09:13.769
because it involves a snippet of code, but trust

00:09:13.769 --> 00:09:16.669
me, consider it a one -time magic spell. Exactly.

00:09:16.789 --> 00:09:18.710
A magic spell is a great way to put it. You just

00:09:18.710 --> 00:09:21.509
navigate to the SQL editor inside Supabase.

00:09:22.009 --> 00:09:24.470
You copy and paste a provided code block. We'll

00:09:24.470 --> 00:09:26.470
make sure you have access to that. And you click

00:09:26.470 --> 00:09:29.309
Run. That code is simply doing three things.

00:09:29.389 --> 00:09:31.950
First, create extension vector. That's telling

00:09:31.950 --> 00:09:34.509
your database, hey, install the add -on to work

00:09:34.509 --> 00:09:36.370
with these AI vectors, these numbers representing

00:09:36.370 --> 00:09:40.409
meaning. Second, create table documents. This

00:09:40.409 --> 00:09:42.429
sets up the main storage shelf for your document

00:09:42.429 --> 00:09:44.909
chunks, including that special embedding vector

00:09:44.909 --> 00:09:47.950
(1536) column we talked about earlier, the profile.

00:09:48.279 --> 00:09:51.360
Finally, create function match_documents. This

00:09:51.360 --> 00:09:53.559
creates the magic search tool that finds similar

00:09:53.559 --> 00:09:56.200
documents by comparing their vectors, their meaning.

00:09:56.320 --> 00:09:59.000
That little bit that says 1 - (documents.embedding

00:09:59.019 --> 00:10:01.539
<=> query_embedding). That's just telling the database

00:10:01.539 --> 00:10:03.299
how to measure the conceptual distance between

00:10:03.299 --> 00:10:05.440
your question and every piece of info it has.
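
That "reverse distance meter" is exactly how the expression reads in pgvector, where the `<=>` operator gives cosine distance: subtracting the distance from 1 turns it into a similarity score. A quick check of the arithmetic in plain Python, no database required:

```python
import math

def cosine_distance(a, b):
    """Like pgvector's <=> operator: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms

def similarity(a, b):
    # Mirrors `1 - (documents.embedding <=> query_embedding)` in the SQL.
    return 1.0 - cosine_distance(a, b)

close = similarity((1.0, 0.0), (0.99, 0.01))  # nearly identical meaning
far = similarity((1.0, 0.0), (0.0, 1.0))      # unrelated meaning
print(round(close, 3), round(far, 3))
```

Closer concepts score higher, so sorting by this value brings back the most relevant chunks first.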

00:10:05.840 --> 00:10:08.120
Think of it like a reverse distance meter. The

00:10:08.120 --> 00:10:10.240
closer the concepts match, the higher the similarity

00:10:10.240 --> 00:10:12.580
score it gives. And then it brings back the highest

00:10:12.580 --> 00:10:14.779
scoring, most relevant information. So by running

00:10:14.779 --> 00:10:17.500
this one script, you instantly upgrade your plain

00:10:17.580 --> 00:10:20.879
database into an AI-powered knowledge base. Pretty

00:10:20.879 --> 00:10:23.480
cool, right? And the final part of this setup step

00:10:23.480 --> 00:10:26.159
is just getting the keys to the library, so to

00:10:26.159 --> 00:10:28.740
speak. In your Supabase project settings, under

00:10:28.740 --> 00:10:31.320
the API section, you'll find your project URL

00:10:31.320 --> 00:10:33.980
and the service role secret key. Security warning:

00:10:33.980 --> 00:10:37.059
this service role key is the master key to

00:10:37.059 --> 00:10:39.519
your entire database. Treat it like a password,

00:10:39.519 --> 00:10:43.190
like gold. Never, ever share it publicly. You'll

00:10:43.190 --> 00:10:45.370
need both the URL and that key for connecting

00:10:45.370 --> 00:10:48.789
it in n8n later on. OK, so we're essentially building

00:10:48.789 --> 00:10:51.509
the AI's personal research library here, getting

00:10:51.509 --> 00:10:53.330
it all ready for use. That's it. Building the

00:10:53.330 --> 00:10:55.769
database and its intelligence shelves ready to

00:10:55.769 --> 00:10:59.149
be filled. Right. So with our library built and

00:10:59.149 --> 00:11:01.950
ready to be stocked or maybe already stocked,

00:11:01.950 --> 00:11:03.509
if we ran the pipeline, it's time to hire the

00:11:03.509 --> 00:11:06.690
scholar, our RAG agent. This is a new, separate

00:11:07.159 --> 00:11:09.879
n8n workflow. This one's for the interactive

00:11:09.879 --> 00:11:11.899
conversations. Think about the anatomy of this

00:11:11.899 --> 00:11:14.700
AI scholar. It's got the ears. That's a chat

00:11:14.700 --> 00:11:16.799
trigger node. It just listens for user questions.

00:11:17.139 --> 00:11:19.419
Then there's the central nervous system, an AI

00:11:19.419 --> 00:11:21.279
agent node that coordinates everything, the thinking

00:11:21.279 --> 00:11:23.759
and responding. The brain itself is an OpenAI

00:11:23.759 --> 00:11:26.940
chat model, maybe like GPT-4o mini, which does

00:11:26.940 --> 00:11:29.320
the actual thinking and generates the responses.

00:11:30.120 --> 00:11:32.120
Crucially, we give it the library card. That's

00:11:32.120 --> 00:11:34.879
a Supabase vector store tool. You add this as

00:11:34.879 --> 00:11:37.200
a tool specifically for the AI agent. You set

00:11:37.200 --> 00:11:39.740
its operation to retrieve documents. And you

00:11:39.740 --> 00:11:41.100
give it a really clear description, something

00:11:41.100 --> 00:11:43.059
like, use this tool to look up the official rules

00:11:43.059 --> 00:11:45.899
of golf. The AI needs that clarity. And finally,

00:11:45.980 --> 00:11:48.389
the universal translator. That's the embeddings

00:11:48.389 --> 00:11:50.509
model again. This must be the exact same model,

00:11:50.570 --> 00:11:53.309
say, text-embedding-3-small, that you used

00:11:53.309 --> 00:11:55.350
when you built the pipeline. That ensures consistent

00:11:55.350 --> 00:11:57.070
understanding for accurate searching. It's got

00:11:57.070 --> 00:11:59.129
to speak the same language. Now, a brilliant

00:11:59.129 --> 00:12:01.029
scholar who can't remember what you just said

00:12:01.029 --> 00:12:03.470
five seconds ago, that's incredibly frustrating,

00:12:03.690 --> 00:12:05.649
right? It's like the Dory from Finding Nemo problem.

00:12:06.110 --> 00:12:09.029
By default, your agent has zero short -term memory,

00:12:09.169 --> 00:12:11.389
so we need to give it a notepad. This is step

00:12:11.389 --> 00:12:14.750
four, adding conversational memory. And you can

00:12:14.750 --> 00:12:17.080
actually do this surprisingly easily. Right inside

00:12:17.080 --> 00:12:19.559
your AI agent node, you add the PostgreSQL chat

00:12:19.559 --> 00:12:22.379
memory feature. You'll create a new PostgreSQL

00:12:22.379 --> 00:12:24.779
credential using your Supabase database settings.

00:12:25.120 --> 00:12:27.960
Quick tip here, use the transaction pooler settings

00:12:27.960 --> 00:12:30.299
that Supabase provides. It helps with efficiency,

00:12:30.580 --> 00:12:32.620
especially if you plan to scale this up later.

00:12:33.529 --> 00:12:35.350
The password you need here is the one you set

00:12:35.350 --> 00:12:37.429
up for your Supabase project way back at the

00:12:37.429 --> 00:12:40.389
start. See? Told you you'd need it. You can also

00:12:40.389 --> 00:12:42.710
set the context window, how many past messages

00:12:42.710 --> 00:12:45.570
it remembers. The default is five recent interactions,

00:12:45.710 --> 00:12:47.830
which is usually a good starting point. And this

00:12:47.830 --> 00:12:50.110
whole memory system works using session IDs.

00:12:50.389 --> 00:12:52.529
Think of it like a unique library card for each

00:12:52.529 --> 00:12:54.769
user. That ensures everyone gets their own separate

00:12:54.769 --> 00:12:57.009
continuous conversation. Your chat doesn't get

00:12:57.009 --> 00:13:00.370
mixed up with someone else's. I still wrestle

00:13:00.370 --> 00:13:03.389
with remembering to configure these memory pieces

00:13:03.389 --> 00:13:05.269
myself sometimes. It's just, it's easy to miss.

00:13:05.269 --> 00:13:07.870
Yeah, I can see that. And testing this memory is

00:13:07.870 --> 00:13:10.090
pretty simple, right? You just start a new chat,

00:13:10.090 --> 00:13:13.070
say, hello, my name is, whatever your name is, and

00:13:13.070 --> 00:13:16.409
then in the next message ask, what's my name? If

00:13:16.409 --> 00:13:20.029
it answers correctly, bingo, its notepad is working.

00:13:20.029 --> 00:13:22.149
You can even double-check by looking at the chat

00:13:22.149 --> 00:13:24.769
histories table directly in Supabase if you

00:13:24.769 --> 00:13:27.389
want proof. So this creates that conversational

00:13:27.389 --> 00:13:30.049
expert, one that can actually remember the previous

00:13:30.049 --> 00:13:33.289
turns in the conversation. Yes, exactly. It enables

00:13:33.289 --> 00:13:35.629
those interactive, context -aware conversations

00:13:35.629 --> 00:13:39.009
that feel much more natural. Okay, makes sense.
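
The session-keyed memory just described — one rolling notepad per user, trimmed to a window of recent interactions — can be sketched as an in-memory stand-in for the Postgres chat-memory table. The class and method names are invented for this sketch; the window of 5 matches the default mentioned above:

```python
from collections import defaultdict

class ChatMemory:
    """Per-session message history with a bounded context window."""
    def __init__(self, window: int = 5):
        self.window = window              # recent interactions to remember
        self.sessions = defaultdict(list)

    def add(self, session_id: str, role: str, text: str):
        self.sessions[session_id].append((role, text))

    def context(self, session_id: str):
        # Keep only the most recent `window` user/assistant exchanges.
        return self.sessions[session_id][-self.window * 2:]

memory = ChatMemory(window=5)
memory.add("alice", "user", "Hello, my name is Alice.")
memory.add("alice", "assistant", "Nice to meet you, Alice!")
memory.add("bob", "user", "What's my name?")

# Each session ID is its own separate, continuous conversation.
print(len(memory.context("alice")), len(memory.context("bob")))
```

Keying everything on the session ID is what keeps one user's chat from bleeding into another's.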

00:13:39.190 --> 00:13:41.350
Now let's talk about

00:13:41.350 --> 00:13:43.549
the fuel, the stuff that makes our AI engine

00:13:43.549 --> 00:13:47.919
actually run, the OpenAI API key. This is a step

00:13:47.919 --> 00:13:50.460
that can sometimes trip up beginners. It's really

00:13:50.460 --> 00:13:51.879
crucial to understand the difference between

00:13:51.879 --> 00:13:54.019
your ChatGPT Plus subscription, if you have one,

00:13:54.139 --> 00:13:57.480
and the OpenAI API. Think of ChatGPT Plus like

00:13:57.480 --> 00:13:59.039
an all -you -can -eat buffet. You pay a flat

00:13:59.039 --> 00:14:02.179
fee, like $20 a month, for direct human -to -AI

00:14:02.179 --> 00:14:05.080
chatting through their website. The OpenAI API,

00:14:05.299 --> 00:14:07.679
though, that's different. It's a la carte. You

00:14:07.679 --> 00:14:10.080
pay for exactly what you use, usually charged

00:14:10.080 --> 00:14:12.000
per token, which is kind of like parts of words.
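
The a-la-carte billing is simple arithmetic: tokens in and tokens out, each multiplied by a per-token price. The prices below are made-up placeholders, not current OpenAI rates — check the live pricing page before budgeting:

```python
# Illustrative per-million-token prices (placeholders, not real rates).
PRICE_PER_MILLION = {
    "small-model": {"input": 0.15, "output": 0.60},
    "large-model": {"input": 5.00, "output": 15.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one API call under the assumed prices."""
    p = PRICE_PER_MILLION[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A RAG query: ~2,000 tokens of retrieved context in, ~500 tokens out.
cheap = estimate_cost("small-model", 2000, 500)
pricey = estimate_cost("large-model", 2000, 500)
print(f"${cheap:.6f} vs ${pricey:.6f} per query")
```

Even a rough calculator like this makes it obvious why testing on a small model first, with a hard spending limit set, is the sane default.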

00:14:12.360 --> 00:14:15.120
This API is specifically designed for programmatic

00:14:15.120 --> 00:14:17.500
use, you know, machine to AI interaction. And

00:14:17.500 --> 00:14:19.620
that's exactly what you need for tools like n8n

00:14:19.620 --> 00:14:22.399
to work. Right. And getting your API key is pretty

00:14:22.399 --> 00:14:25.100
straightforward. You navigate to platform.openai

00:14:25.100 --> 00:14:27.779
.com. Notice the platform part. That's the developer

00:14:27.779 --> 00:14:30.799
side, not the regular chat GPT site. You'll create

00:14:30.799 --> 00:14:32.460
an account if you don't have one and add a payment

00:14:32.460 --> 00:14:34.700
method. Think of it like opening a tab at a restaurant.

00:14:34.759 --> 00:14:36.740
You only pay for what you order. When you go

00:14:36.740 --> 00:14:40.019
to the API keys section, click create new secret

00:14:40.019 --> 00:14:42.120
key. It'll generate a key that starts with sk-.

00:14:42.259 --> 00:14:45.340
Security 101. Again, this API key is basically

00:14:45.340 --> 00:14:47.320
like a credit card number for your AI usage.

00:14:47.759 --> 00:14:50.179
Guard it fiercely. Copy it immediately. Save

00:14:50.179 --> 00:14:52.480
it in your secure password manager. Never, ever

00:14:52.480 --> 00:14:54.679
share it publicly. And once you close that little

00:14:54.679 --> 00:14:56.480
window where it shows you the key, you'll never

00:14:56.480 --> 00:14:58.580
see the full key again. If you lose it, you just

00:14:58.580 --> 00:15:01.000
have to generate a new one. Okay. And the golden

00:15:01.000 --> 00:15:03.919
rule of cost control here is really start small

00:15:03.919 --> 00:15:08.200
and set limits. Tip number one, use cheaper models

00:15:08.200 --> 00:15:10.860
when you're testing and building. GPT-4o mini

00:15:10.860 --> 00:15:13.220
is incredibly capable, especially now, and it's

00:15:13.220 --> 00:15:14.940
significantly cheaper than the bigger models

00:15:14.940 --> 00:15:17.500
like GPT-4 Turbo. You can always upgrade later

00:15:17.500 --> 00:15:19.360
once you know it works and you need more power.

00:15:19.940 --> 00:15:21.539
Tip number two, and this is probably the most

00:15:21.539 --> 00:15:24.820
important one, set hard usage limits. In your

00:15:24.820 --> 00:15:27.039
OpenAI billing settings, you can actually set

00:15:27.039 --> 00:15:29.639
a hard spending limit, maybe just $10 a month

00:15:29.639 --> 00:15:31.879
to start. If your usage ever hits that limit,

00:15:32.039 --> 00:15:34.139
the API will simply stop working until the next

00:15:34.139 --> 00:15:36.100
billing cycle. This is your ultimate safety net.

00:15:36.200 --> 00:15:38.000
It lets you experiment with total confidence,

00:15:38.179 --> 00:15:40.720
knowing you won't get a surprise bill. Got it.

00:15:40.840 --> 00:15:44.279
So the API key is the gateway to the AI's brainpower,

00:15:44.500 --> 00:15:47.000
and cost control is absolutely essential. Right.

00:15:47.179 --> 00:15:49.519
Access the brain, but keep that spending firmly

00:15:49.519 --> 00:15:52.220
in check. All right. The moment of truth, then.

00:15:52.360 --> 00:15:55.100
Let's test our AI golf caddy. You execute the

00:15:55.100 --> 00:15:57.379
agent workflow in n8n. You open the chat interface

00:15:57.379 --> 00:15:59.419
it provides, and you ask a question. Let's try

00:15:59.419 --> 00:16:01.710
that one. What am I allowed to do for practice

00:16:01.710 --> 00:16:04.529
before a round? Now, behind the scenes, here's

00:16:04.529 --> 00:16:07.049
what happens. The agent gets your query. It identifies

00:16:07.049 --> 00:16:09.789
its look up the rules of golf tool as the right

00:16:09.789 --> 00:16:12.350
one to use. It sends your question to the embeddings

00:16:12.350 --> 00:16:14.669
model to get vectorized, turned into numbers.

00:16:14.850 --> 00:16:17.350
It searches the Supabase database using

00:16:17.350 --> 00:16:19.590
those vector numbers. It finds the most relevant

00:16:19.590 --> 00:16:21.909
chunks of text about practice from that PDF we

00:16:21.909 --> 00:16:24.230
loaded. It feeds those relevant chunks to its

00:16:24.230 --> 00:16:26.929
GPT-4o mini brain along with your original

00:16:26.929 --> 00:16:30.120
question. And then, it generates a perfect, context

00:16:30.120 --> 00:16:32.460
-aware answer based only on the provided rules.
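
Those behind-the-scenes steps chain together like so. Everything here is a toy stand-in — the keyword-counting "embedder", the two hard-coded chunks, and the answer template replace the real embeddings model, database, and chat model — but the shape of the loop is the same:

```python
# Toy stand-ins for each stage of the agent's retrieval loop.
def embed(text: str):
    """Fake embeddings model: counts a couple of keyword features."""
    words = text.lower().split()
    return (words.count("practice"), words.count("penalty"))

chunks = [
    "Practice on the course is allowed before match play rounds.",
    "A penalty of two strokes applies for playing a wrong ball.",
]
index = [(embed(c), c) for c in chunks]  # the vectorized knowledge base

def answer(question: str) -> str:
    qv = embed(question)
    # Nearest chunk by dot product stands in for the vector search.
    best = max(index, key=lambda row: sum(a * b for a, b in zip(row[0], qv)))
    return f"Based on the rules: {best[1]}"

print(answer("What am I allowed to do for practice before a round?"))
```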

00:16:32.759 --> 00:16:34.580
Yeah, and the response you'd likely get, pulled

00:16:34.580 --> 00:16:36.320
straight from the source document, would be pretty

00:16:36.320 --> 00:16:38.960
detailed. Something like, before a round, you

00:16:38.960 --> 00:16:40.820
can practice on the course on the day of a match

00:16:40.820 --> 00:16:43.000
play event, but not before a stroke play tournament

00:16:43.000 --> 00:16:46.080
or playoff, or between rounds, unless the committee

00:16:46.080 --> 00:16:48.980
allows it. During the round itself, no practice

00:16:48.980 --> 00:16:51.360
shots are allowed when playing a hole or between

00:16:51.360 --> 00:16:54.100
holes. Except for some chipping or putting on

00:16:54.100 --> 00:16:56.559
or near the last green you finished, a practice

00:16:56.559 --> 00:16:58.970
green, or the next tee box, provided it doesn't

00:16:58.970 --> 00:17:02.360
hold up play. Let's see. Super specific, right

00:17:02.360 --> 00:17:04.680
from the rules. And you can even check the agent's

00:17:04.680 --> 00:17:06.980
execution logs in n8n to see this whole thought

00:17:06.980 --> 00:17:09.460
process play out, which tool it used, what info

00:17:09.460 --> 00:17:12.079
it retrieved. It's great for debugging, but really

00:17:12.079 --> 00:17:14.380
this is just the hello world of RAG, you know.

00:17:14.539 --> 00:17:16.980
Beyond these basics, you can set up dynamic document

00:17:16.980 --> 00:17:19.740
updates. Imagine a workflow where any new file

00:17:19.740 --> 00:17:21.680
dropped in a Google Drive folder automatically

00:17:21.680 --> 00:17:23.960
triggers an update to the AI's knowledge base.

00:17:24.240 --> 00:17:26.700
For multi -user systems, you could use unique

00:17:26.700 --> 00:17:29.079
session IDs, like maybe a user's email address

00:17:29.079 --> 00:17:31.730
or phone number. to maintain potentially thousands

00:17:31.730 --> 00:17:33.750
of separate private conversations simultaneously.

00:17:34.849 --> 00:17:38.019
I mean, imagine scaling that. Giving

00:17:38.019 --> 00:17:41.160
a custom private AI brain to potentially millions

00:17:41.160 --> 00:17:43.400
of individual users, all running off one core

00:17:43.400 --> 00:17:46.799
system. That's truly remarkable when you think

00:17:46.799 --> 00:17:48.839
about it. And of course, there's always performance

00:17:48.839 --> 00:17:50.940
optimization you can do later. Things like fine

00:17:50.940 --> 00:17:53.220
-tuning the vector search parameters, maybe caching

00:17:53.220 --> 00:17:55.539
common queries to reduce API calls and costs.

00:17:55.799 --> 00:17:58.059
Lots you can do. So this example, it's really

00:17:58.059 --> 00:18:00.000
just scratching the surface of what's possible

00:18:00.000 --> 00:18:02.819
with this RAG approach. Absolutely. It's a powerful

00:18:02.819 --> 00:18:05.880
foundation. You can build almost infinite applications

00:18:05.880 --> 00:18:08.710
on top of this base. So let's recap. You've

00:18:08.710 --> 00:18:11.369
just gone from a general chat bot, one that knows

00:18:11.369 --> 00:18:13.410
a lot about everything but nothing specific about

00:18:13.410 --> 00:18:16.670
your stuff, to an AI with a custom specialized

00:18:16.670 --> 00:18:20.910
brain. RAG systems, powered by these clever

00:18:20.910 --> 00:18:24.289
vector databases, allow AI to do its own context

00:18:24.289 --> 00:18:26.829
-specific research within your data. You got

00:18:26.829 --> 00:18:29.779
the librarian. building the knowledge base, meticulously

00:18:29.779 --> 00:18:32.119
organizing the information, and the scholar using

00:18:32.119 --> 00:18:34.380
that knowledge base, retrieving precise facts

00:18:34.380 --> 00:18:36.779
to answer your questions accurately. And maybe

00:18:36.779 --> 00:18:38.960
the best part, you can actually build this all

00:18:38.960 --> 00:18:40.980
without writing a single line of traditional

00:18:40.980 --> 00:18:43.720
code, creating powerful, intelligent systems

00:18:43.720 --> 00:18:47.319
for anything from, well, golf rules to complex

00:18:47.319 --> 00:18:49.670
business data. Yeah, this isn't just some cool

00:18:49.670 --> 00:18:52.009
tech demo. It's really about building the future

00:18:52.009 --> 00:18:54.390
of how AI interacts with your unique world, your

00:18:54.390 --> 00:18:56.849
specific information. You've now got the foundational

00:18:56.849 --> 00:18:59.450
skills, the understanding to customize AI and

00:18:59.450 --> 00:19:01.549
build truly intelligent agents tailored to your

00:19:01.549 --> 00:19:04.109
needs. This deep dive has hopefully equipped

00:19:04.109 --> 00:19:06.329
you with a pretty profound understanding of how

00:19:06.329 --> 00:19:09.230
to give AI both a memory and a personal library.

00:19:09.849 --> 00:19:12.970
So the question for you is, what specific dense

00:19:12.970 --> 00:19:15.930
body of knowledge in your life or your work could

00:19:15.930 --> 00:19:18.529
you give your AI next? Imagine the expertise

00:19:18.529 --> 00:19:20.809
you could unlock. Thank you for diving deep with

00:19:20.809 --> 00:19:22.910
us today. Keep exploring, keep building, and

00:19:22.910 --> 00:19:23.869
yes, stay curious.
