WEBVTT

00:00:00.000 --> 00:00:01.120
You know, you build these things, right? You

00:00:01.120 --> 00:00:04.540
spend time putting together an AI agent in n8n,

00:00:04.719 --> 00:00:06.599
hoping it's going to, you know, automate something

00:00:06.599 --> 00:00:10.480
cool for you. Yeah. And then it just goes completely

00:00:10.480 --> 00:00:13.800
off the rails. It like makes stuff up or maybe

00:00:13.800 --> 00:00:16.000
it totally ignores the tools you gave it and

00:00:16.000 --> 00:00:20.359
your whole workflow just breaks. Yes. That feeling.

00:00:20.420 --> 00:00:22.359
It's incredibly frustrating. Oh, absolutely.

00:00:22.660 --> 00:00:25.399
Anyone who's tried to move AI from a cool demo

00:00:25.399 --> 00:00:30.000
to actual automation has hit that wall. And I

00:00:30.000 --> 00:00:33.560
think a big part of it is just how a lot of resources

00:00:33.560 --> 00:00:35.759
still teach you to build these things. Right.

00:00:35.799 --> 00:00:38.219
It feels like the standard amateur way you see

00:00:38.219 --> 00:00:41.359
online is, what, grab a basic model, write a

00:00:41.359 --> 00:00:44.259
super simple prompt, cross your fingers, and

00:00:44.259 --> 00:00:46.520
just hope for the best. Pretty much. And then

00:00:46.520 --> 00:00:48.719
you're surprised when it's not reliable. Exactly.

00:00:48.719 --> 00:00:50.280
Exactly. And the source material we've got for

00:00:50.280 --> 00:00:52.039
this deep dive, it really cuts through all that.

00:00:52.100 --> 00:00:53.939
It's pretty blunt about it, saying if that's

00:00:53.939 --> 00:00:55.759
what you're doing, you're kind of building a

00:00:55.759 --> 00:01:00.820
toy, not a production system. Ooh. Okay. So this

00:01:00.820 --> 00:01:03.520
deep dive then, this is like the antidote. It's

00:01:03.520 --> 00:01:07.760
about moving past the toy phase, actually building

00:01:07.760 --> 00:01:09.519
something that's robust, something you can depend

00:01:09.519 --> 00:01:11.680
on. That's the mission. It's about pulling back

00:01:11.680 --> 00:01:14.299
the curtain on what the professionals really

00:01:14.299 --> 00:01:17.000
focus on. It's less about just prompt engineering,

00:01:17.260 --> 00:01:18.920
although that's still important, and way more

00:01:18.920 --> 00:01:21.400
about the configuration. The configuration. Yeah.

00:01:21.420 --> 00:01:24.140
The source highlights seven key settings that,

00:01:24.180 --> 00:01:27.219
honestly, most people just seem to miss or maybe

00:01:27.219 --> 00:01:30.780
ignore. Seven. Wow. Okay. So we're talking like...

00:01:31.420 --> 00:01:33.579
architectural design here, right? Building predictable,

00:01:33.799 --> 00:01:36.659
reliable systems, not just, you know, getting

00:01:36.659 --> 00:01:39.340
lucky with some clever wording in a prompt. Precisely.

00:01:39.340 --> 00:01:42.219
This deep dive is all about understanding the

00:01:42.219 --> 00:01:44.980
engine control unit for your AI automation, not

00:01:44.980 --> 00:01:47.319
just the driver trying to steer. Okay. I love

00:01:47.319 --> 00:01:49.480
that. The engine control unit. So, all right,

00:01:49.480 --> 00:01:52.120
let's unpack this. Where do we even start with

00:01:52.120 --> 00:01:54.719
this configuration mindset? Like, why does it

00:01:54.719 --> 00:01:57.159
matter so much? Well, you know, the source makes

00:01:57.159 --> 00:01:59.459
it really clear from the jump. Configuration

00:01:59.459 --> 00:02:01.780
is the foundation. It's the difference between

00:02:01.780 --> 00:02:04.120
something that's maybe a cheap internal tool

00:02:04.120 --> 00:02:07.680
that saves you an hour here and there and a mission

00:02:07.680 --> 00:02:09.460
-critical business system that you actually depend

00:02:09.460 --> 00:02:12.560
on. Missing these settings means, as they put

00:02:12.560 --> 00:02:14.340
it, your brilliant prompt is like a brilliant

00:02:14.340 --> 00:02:16.860
driver in a car that's just fundamentally not

00:02:16.860 --> 00:02:18.560
configured right. It's never going to perform

00:02:18.560 --> 00:02:21.840
reliably. Okay, so it's not like a minor tweak.

00:02:22.099 --> 00:02:24.319
It's the core of building something that's not

00:02:24.319 --> 00:02:26.599
just going to... I don't know, randomly decide

00:02:26.599 --> 00:02:29.340
to potato halfway through your process. Right.

00:02:29.400 --> 00:02:32.740
It's about predictability and scale. And the

00:02:32.740 --> 00:02:35.360
source dives into these settings one by one,

00:02:35.379 --> 00:02:38.080
which is super helpful. OK, let's do it. First

00:02:38.080 --> 00:02:40.979
up, they talk about choosing the right AI brain.

00:02:41.060 --> 00:02:44.159
Yeah. And I got to confess, defaulting to like

00:02:44.159 --> 00:02:47.419
the GPT-4 or the latest famous model for everything,

00:02:47.419 --> 00:02:49.419
that feels relatable. I think I've done that.

00:02:49.520 --> 00:02:51.080
Yeah. And the source uses this great analogy.

00:02:51.560 --> 00:02:54.879
Using GPT-4 for every single task is like using

00:02:54.879 --> 00:02:57.639
a Formula One race car to deliver a pizza. I

00:02:57.639 --> 00:02:59.759
mean, it's incredibly powerful, sure, but it's

00:02:59.759 --> 00:03:03.120
also wildly expensive and frankly, totally overkill

00:03:03.120 --> 00:03:06.080
for, you know, a simple delivery job. It's the

00:03:06.080 --> 00:03:08.400
wrong tool. OK, yeah, that makes so much sense.

00:03:08.539 --> 00:03:10.960
So what's the what's the professional framework

00:03:10.960 --> 00:03:14.560
for picking the right brain for the right job?

00:03:14.800 --> 00:03:16.840
The source lays it out really clearly based on

00:03:16.840 --> 00:03:19.219
the task you're trying to accomplish. If you

00:03:19.219 --> 00:03:21.400
need what they call Einstein-level reasoning,

00:03:21.520 --> 00:03:23.780
complex problem solving, strategic thinking,

00:03:23.979 --> 00:03:26.520
deep synthesis, you're looking at models like

00:03:26.520 --> 00:03:32.080
maybe DeepSeek V2 or some of OpenAI's newer O1

00:03:32.080 --> 00:03:35.340
series models. A good use case there would be

00:03:35.340 --> 00:03:37.960
like analyzing complex market trends to develop

00:03:37.960 --> 00:03:41.060
a strategic plan. But heads up, these really

00:03:41.060 --> 00:03:42.960
heavy thinkers, some of them might not support

00:03:42.960 --> 00:03:45.259
tools well. Check the docs on that. Okay. So

00:03:45.259 --> 00:03:47.599
for the really big, chewy problems, what about

00:03:47.599 --> 00:03:51.379
speed, like real-time interaction? Ah, speed

00:03:51.379 --> 00:03:53.319
is totally different. For light and fast responses,

00:03:53.680 --> 00:03:55.979
real-time chat, interactive apps, you need models

00:03:55.979 --> 00:03:58.379
optimized for pure speed. Yeah. Think Groq or

00:03:58.379 --> 00:04:00.639
maybe Gemini 2.5 Flash. Gemini Flash, yeah.

00:04:00.719 --> 00:04:02.879
That's your customer support chatbot on a website.

00:04:03.120 --> 00:04:05.199
It needs to feel instantaneous, right? Got it.

00:04:05.580 --> 00:04:08.439
Speed demons for speed tasks. And what about

00:04:08.439 --> 00:04:11.340
bigger companies with, you know, strict security

00:04:11.340 --> 00:04:14.240
needs, compliance and all that? That's where

00:04:14.240 --> 00:04:16.839
you lean into enterprise-grade security and

00:04:16.839 --> 00:04:19.379
governance, models available via services like

00:04:19.379 --> 00:04:22.500
Azure OpenAI or AWS Bedrock. Right, the cloud

00:04:22.500 --> 00:04:25.180
provider wrappers. Exactly. If you're dealing

00:04:25.180 --> 00:04:29.339
with, say, HIPAA or GDPR compliance, or you just

00:04:29.339 --> 00:04:31.319
need to integrate deeply into a secure cloud

00:04:31.319 --> 00:04:34.079
environment, using these services provides the

00:04:34.079 --> 00:04:38.310
power of... models like GPT-4, but within that

00:04:38.310 --> 00:04:41.129
necessary secure wrapper. Like for HR data or

00:04:41.129 --> 00:04:43.449
something? Yeah. An internal HR assistant dealing

00:04:43.449 --> 00:04:45.930
with sensitive employee data is a classic example.

00:04:46.230 --> 00:04:49.279
Okay. What if privacy is paramount? Like the

00:04:49.279 --> 00:04:51.839
data absolutely cannot leave my servers. Then

00:04:51.839 --> 00:04:53.779
you're going privacy first and self-hosting.

00:04:53.899 --> 00:04:56.300
Models run locally using something like Ollama,

00:04:56.420 --> 00:04:59.399
maybe running Llama 3 or Mistral locally. Or using

00:04:59.399 --> 00:05:01.680
Mistral models directly gives you absolute data

00:05:01.680 --> 00:05:03.800
control, which is huge for some businesses. Right.

00:05:03.819 --> 00:05:05.660
If you're dealing with super sensitive IP or

00:05:05.660 --> 00:05:08.660
financials. Exactly. Processing, you know, highly

00:05:08.660 --> 00:05:10.980
proprietary financial data on a local server,

00:05:11.120 --> 00:05:14.160
for instance. What about seeing images or like

00:05:14.160 --> 00:05:16.800
reading charts, multimodal stuff? Yep. For that

00:05:16.800 --> 00:05:19.089
multimodal magic, you need... specific models.

00:05:19.089 --> 00:05:22.850
Google Gemini 2.5 Pro is known for strong vision

00:05:22.850 --> 00:05:26.410
capabilities, or maybe GPT-4.5 if you're in that ecosystem.

00:05:26.410 --> 00:05:29.149
Okay. Analyzing images of product damage from

00:05:29.149 --> 00:05:32.129
support tickets, seeing the photo and understanding

00:05:32.129 --> 00:05:34.490
the text description. That's a perfect use case

00:05:34.490 --> 00:05:36.930
there. Okay, so we've got power, speed, security,

00:05:36.930 --> 00:05:41.209
privacy, vision. That's a whole lineup. But what

00:05:41.209 --> 00:05:44.980
about cost? Because honestly, those fancy models

00:05:44.980 --> 00:05:47.259
can get really expensive, right? Massively. And

00:05:47.259 --> 00:05:49.279
that's where the cost optimized production models

00:05:49.279 --> 00:05:52.480
become essential. Think GPT-4o mini or other

00:05:52.480 --> 00:05:54.959
lightweight models. When the task is relatively

00:05:54.959 --> 00:05:57.839
simple, like basic classification, simple data

00:05:57.839 --> 00:06:00.560
extraction or Q&A, and you need to process thousands

00:06:00.560 --> 00:06:03.279
or millions of requests affordably. These models

00:06:03.279 --> 00:06:06.620
are often, you know, 90% as capable as their

00:06:06.620 --> 00:06:08.899
bigger siblings, but at a tiny fraction of the

00:06:08.899 --> 00:06:11.220
cost. Okay, wow. So the real professional approach

00:06:11.220 --> 00:06:13.500
isn't just picking one favorite model, but like

00:06:13.500 --> 00:06:15.639
having a whole arsenal and choosing the absolute

00:06:15.639 --> 00:06:18.550
best one for each query on the fly. Exactly.

00:06:18.689 --> 00:06:20.709
And that's the dynamic model selection they talk

00:06:20.709 --> 00:06:23.230
about, often using a model router. A model router.

00:06:23.430 --> 00:06:26.689
Yeah. Instead of hard coding, say, GPT-4 for

00:06:26.689 --> 00:06:29.629
every single thing, you use a much cheaper, much

00:06:29.629 --> 00:06:32.629
faster AI agent, that's your router, whose only

00:06:32.629 --> 00:06:35.410
job is to read the user's query or the workflow's

00:06:35.410 --> 00:06:38.170
need and decide which of those other more specialized

00:06:38.170 --> 00:06:40.649
models is the most appropriate and cost effective.

00:06:40.930 --> 00:06:44.170
Wait, so you use one AI to pick which AI to use.

00:06:44.209 --> 00:06:47.439
That's kind of meta. It is, but it's incredibly

00:06:47.439 --> 00:06:50.279
efficient at scale. The router agent uses a simple

00:06:50.279 --> 00:06:52.860
system prompt with decision rules based on the

00:06:52.860 --> 00:06:55.620
query type. Is it complex reasoning? Is it asking

00:06:55.620 --> 00:06:58.000
for code? Is it general chat? Is it something

00:06:58.000 --> 00:07:00.899
simple? Is it multimodal? Then the real worker

00:07:00.899 --> 00:07:03.959
agent uses an expression in n8n to dynamically

00:07:03.959 --> 00:07:06.319
call the specific model the router recommended.

00:07:06.720 --> 00:07:08.779
You might even use a service like OpenRouter

00:07:08.779 --> 00:07:10.360
that connects to a whole bunch of different models.
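
To make that concrete, here's a minimal TypeScript sketch of the routing idea. The source describes the router as a cheap, fast LLM with decision rules in its system prompt; the keyword-and-length heuristic below is just a stand-in for that call, and the model slugs are illustrative OpenRouter-style names, not prescribed by the source.

```typescript
// Sketch of a model router (illustrative; the source's router is itself a
// cheap LLM with decision rules -- this heuristic is a simple stand-in).
type QueryKind = "reasoning" | "code" | "chat" | "simple" | "multimodal";

function classifyQuery(query: string, hasImage: boolean): QueryKind {
  if (hasImage) return "multimodal";
  if (/\b(code|function|regex|sql)\b/i.test(query)) return "code";
  if (query.length > 400) return "reasoning"; // long, chewy problems
  if (query.length < 80) return "simple";     // high-volume, cheap tasks
  return "chat";
}

// Map each query kind to the specialized model the worker agent then
// calls dynamically (swap in whatever providers you actually use).
const MODEL_FOR: Record<QueryKind, string> = {
  reasoning:  "openai/o1",               // heavy thinker: slow, expensive
  code:       "openai/gpt-4",            // strong general-purpose model
  chat:       "google/gemini-2.5-flash", // optimized for pure speed
  simple:     "openai/gpt-4o-mini",      // cost-optimized production model
  multimodal: "google/gemini-2.5-pro",   // strong vision capabilities
};

const query = "Summarize yesterday's support tickets.";
console.log(MODEL_FOR[classifyQuery(query, false)]); // -> openai/gpt-4o-mini
```

In n8n, the worker agent's model field would then be an expression reading the router's output, so each request gets the cheapest model that can actually handle it.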

00:07:10.459 --> 00:07:12.439
Okay, whoa. So you actually get the best model

00:07:12.439 --> 00:07:15.279
for the specific job and you keep costs way down.

00:07:15.399 --> 00:07:18.060
because you're not using that Ferrari to deliver

00:07:18.060 --> 00:07:20.660
every single email. That really makes so much

00:07:20.660 --> 00:07:22.860
sense for a production system. It's a hallmark

00:07:22.860 --> 00:07:25.439
of a robust production system, yeah. Finding

00:07:25.439 --> 00:07:27.579
that perfect balance of capability and cost.

00:07:27.860 --> 00:07:30.240
All right, so picking the right brain or dynamically

00:07:30.240 --> 00:07:33.480
routing to the right brain is clearly key. What's

00:07:33.480 --> 00:07:36.579
next on the list of settings people often overlook?

00:07:37.069 --> 00:07:40.089
Controlling creativity and reliability. This

00:07:40.089 --> 00:07:43.389
is where temperature and top P come in. And honestly,

00:07:43.629 --> 00:07:45.569
leaving these at their defaults is just like

00:07:45.569 --> 00:07:48.290
asking for unpredictable results. Temperature.

00:07:48.730 --> 00:07:50.670
I've seen that slider in interfaces. What does

00:07:50.670 --> 00:07:53.589
it actually do? Think of temperature as the creativity

00:07:53.589 --> 00:07:56.230
and risk dial, usually from 0.0 up to like

00:07:56.230 --> 00:07:59.910
1.0 or more. A low temperature, like 0.1 to

00:07:59.910 --> 00:08:02.389
0.3, means the AI is going to stick to the most

00:08:02.389 --> 00:08:05.410
common, safest, most probable word choices. Okay.

00:08:05.509 --> 00:08:08.269
It's super predictable, very factual, very consistent.

00:08:08.889 --> 00:08:11.610
That's what you want for, say, a support chatbot

00:08:11.610 --> 00:08:13.990
that needs to give the exact same correct answer

00:08:13.990 --> 00:08:16.129
every single time someone asks about your return

00:08:16.129 --> 00:08:18.379
policy. Okay. So predictable and boring. Boring,

00:08:18.459 --> 00:08:21.160
almost. What if you want something more engaging,

00:08:21.459 --> 00:08:24.980
more creative? Then you go medium, maybe 0.6

00:08:24.980 --> 00:08:28.939
to 0.9. The AI takes more risks, uses less common

00:08:28.939 --> 00:08:30.839
words, feels much more creative and human-like.

00:08:30.939 --> 00:08:32.820
That's perfect for marketing copy, generating

00:08:32.820 --> 00:08:35.940
social media posts, brainstorming ideas. And

00:08:35.940 --> 00:08:38.940
high. Yeah. Like 1.0 and above. What happens

00:08:38.940 --> 00:08:41.759
then? That's the danger zone. It becomes extremely

00:08:41.759 --> 00:08:45.059
unpredictable, often random, sometimes nonsensical

00:08:45.059 --> 00:08:48.240
or, you know, wildly hallucinatory. It's almost

00:08:48.240 --> 00:08:50.360
never used in production unless you're like trying

00:08:50.360 --> 00:08:53.159
to brainstorm abstract art concepts with zero

00:08:53.159 --> 00:08:55.940
need for accuracy whatsoever. Got it. So temperature

00:08:55.940 --> 00:08:58.419
is kind of like how adventurous the AI is with

00:08:58.419 --> 00:09:00.779
its word choices. What about top P? How does

00:09:00.779 --> 00:09:02.580
that work? Does it relate? It does. Yeah. Top

00:09:02.580 --> 00:09:05.080
P works with temperature. Think of it as the

00:09:05.080 --> 00:09:07.940
word choice filter or the size of the pool of

00:09:07.940 --> 00:09:10.450
words the AI considers for the next word. A

00:09:10.450 --> 00:09:13.669
low top P, say 0.3, means it only looks at the

00:09:13.669 --> 00:09:17.330
top 30% most likely words. Very safe, very conservative

00:09:17.330 --> 00:09:20.409
choices. And a high top P, like 0.9. A much

00:09:20.409 --> 00:09:22.789
wider range of words. Like 0.9 means it considers

00:09:22.789 --> 00:09:25.889
the top 90% most likely words. That allows for

00:09:25.889 --> 00:09:28.809
a lot more variety and creativity. A top P of

00:09:28.809 --> 00:09:32.690
1.0 considers all possible words, which can

00:09:32.690 --> 00:09:35.190
lead to some pretty bizarre outputs. So low temp

00:09:35.190 --> 00:09:37.289
and low top P is super conservative. High temp

00:09:37.289 --> 00:09:40.980
and high top P is pretty much chaos. Pretty much.

00:09:40.980 --> 00:09:43.259
The source gives this really handy production

00:09:43.259 --> 00:09:45.720
settings formula table as a starting point for

00:09:45.720 --> 00:09:48.200
different use cases. For reliable business automation,

00:09:48.200 --> 00:09:51.779
they suggest temperature around 0.2, top P maybe

00:09:51.779 --> 00:09:55.580
0.7. Okay, low temp, medium-ish top P. Yeah, and

00:09:55.580 --> 00:09:58.019
often adding a frequency penalty of 0.5 to

00:09:58.019 --> 00:10:00.799
1.0 to prevent the AI from repeating itself. That's

00:10:00.799 --> 00:10:02.879
low creativity, high reliability, low repetition.

00:10:02.879 --> 00:10:05.330
And for, like, generating creative content? Marketing

00:10:05.330 --> 00:10:08.289
stuff. Temperature 0.7, top P 1.0, frequency penalty maybe

00:10:08.289 --> 00:10:11.009
0.5. High creativity, wide word choice, keep

00:10:11.009 --> 00:10:13.730
it fresh. And, like, super structured output, like

00:10:13.730 --> 00:10:16.330
JSON or code, where format is critical? You need

00:10:16.330 --> 00:10:19.029
extremely low creativity there to ensure it follows

00:10:19.029 --> 00:10:22.309
the format precisely. Temperature 0.1, top P lower at 0.5,

00:10:22.450 --> 00:10:24.309
maybe frequency penalty 0.2. You're forcing

00:10:24.309 --> 00:10:26.809
it into a narrow, precise path.
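
As a reference, that production settings formula from the source, expressed as a small TypeScript lookup table. The numbers are the suggested starting points just discussed; the preset names and the 0.75 midpoint for the business-automation penalty (the source gives a 0.5 to 1.0 range) are my framing.

```typescript
// The source's suggested starting points per use case; tune from here.
interface SamplingPreset {
  temperature: number;      // creativity and risk dial
  topP: number;             // word-choice filter
  frequencyPenalty: number; // discourages repeating itself
}

const PRESETS: Record<string, SamplingPreset> = {
  // Low creativity, high reliability, low repetition.
  // (Frequency penalty anywhere in the 0.5-1.0 range; midpoint shown.)
  businessAutomation: { temperature: 0.2, topP: 0.7, frequencyPenalty: 0.75 },
  // High creativity, wide word choice, keep it fresh.
  creativeContent: { temperature: 0.7, topP: 1.0, frequencyPenalty: 0.5 },
  // Extremely low creativity so JSON/code follows the format precisely.
  structuredOutput: { temperature: 0.1, topP: 0.5, frequencyPenalty: 0.2 },
};

console.log(PRESETS.businessAutomation);
```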

00:10:26.809 --> 00:10:29.169
Okay, so you really have to dial these in for each specific

00:10:29.169 --> 00:10:31.250
task. You can't just leave them at default and

00:10:31.250 --> 00:10:34.200
hope it works. Crucial. And the source also quickly

00:10:34.200 --> 00:10:36.259
mentions other critical model settings you find

00:10:36.259 --> 00:10:38.940
in that node. Max tokens, which prevents rambling

00:10:38.940 --> 00:10:41.580
and controls costs. Right, stop it talking forever.

00:10:41.899 --> 00:10:44.519
Timeout, for how long the workflow waits for

00:10:44.519 --> 00:10:47.700
a response. And max retries, which is important

00:10:47.700 --> 00:10:50.200
for production to handle transient API failures.
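
A quick sketch of those options as a config object; the field names are illustrative of what the chat model node exposes, not exact n8n parameter names.

```typescript
// Sketch of the other model-node options (field names illustrative).
const modelOptions = {
  maxTokens: 1024,   // caps response length: prevents rambling, controls cost
  timeoutMs: 30_000, // how long the workflow waits for a response
  maxRetries: 2,     // re-attempt transient API failures in production
};
console.log(modelOptions);
```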

00:10:50.379 --> 00:10:54.000
Set that to like two or three. Good tip. Okay,

00:10:54.279 --> 00:10:56.200
so we've covered picking the brain and controlling

00:10:56.200 --> 00:10:59.100
its personality, I guess, with temperature and

00:10:59.100 --> 00:11:02.240
top P. What about memory? AI agents need to remember

00:11:02.240 --> 00:11:04.399
past turns in a conversation, right? Yes. And

00:11:04.399 --> 00:11:06.299
this is another place where the amateur move

00:11:06.299 --> 00:11:08.919
can absolutely kill you in production. Using

00:11:08.919 --> 00:11:11.879
the standard n8n simple memory or window buffer

00:11:11.879 --> 00:11:14.759
memory nodes. Why? They seem easy to use when

00:11:14.759 --> 00:11:17.460
you're testing. They are easy for simple tests

00:11:17.460 --> 00:11:20.740
or internal only tools with maybe one user, but

00:11:20.740 --> 00:11:23.320
they use your server's RAM. In production, with,

00:11:23.419 --> 00:11:26.320
you know, potentially dozens or hundreds of users

00:11:26.320 --> 00:11:28.840
all having separate conversations, that memory

00:11:28.840 --> 00:11:31.700
usage just grows and grows and grows. Eventually,

00:11:31.879 --> 00:11:34.600
it'll exhaust your server's RAM, slowing everything

00:11:34.600 --> 00:11:37.399
down, and ultimately crashing your entire n8n

00:11:37.399 --> 00:11:40.399
instance. Oh, okay. It's fundamentally not scalable

00:11:40.399 --> 00:11:42.820
for multi -user scenarios. Okay, so that's like

00:11:42.820 --> 00:11:45.480
a hidden scaling killer. What's the professional

00:11:45.480 --> 00:11:48.759
alternative then? Using a dedicated, external,

00:11:48.980 --> 00:11:52.080
scalable database for chat memory. The source

00:11:52.080 --> 00:11:55.240
specifically recommends PostgreSQL, and they

00:11:55.240 --> 00:11:58.000
point to Supabase as a really good option because

00:11:58.000 --> 00:12:00.200
it's easy to get started with and has a free

00:12:00.200 --> 00:12:03.440
tier. Supabase, okay. I've heard of them. How

00:12:03.440 --> 00:12:06.440
does that setup work? Is it complicated? According

00:12:06.440 --> 00:12:08.340
to the source, it's pretty straightforward. You

00:12:08.340 --> 00:12:10.419
create a free Supabase account, and then in your

00:12:10.419 --> 00:12:12.200
project settings, you get a connection string.

00:12:12.360 --> 00:12:14.179
They mention the transaction pooler connection

00:12:14.179 --> 00:12:15.840
string is important for managing connections

00:12:15.840 --> 00:12:18.000
efficiently. The pooler. Got it. You just copy those

00:12:18.000 --> 00:12:21.970
details: host name, database name, user, password

00:12:21.970 --> 00:12:25.409
into the n8n Postgres chat memory credential

00:12:25.409 --> 00:12:28.389
setup. So n8n talks directly to that database

00:12:28.389 --> 00:12:31.169
to store and retrieve the chat history. Exactly.

00:12:31.529 --> 00:12:35.409
And crucially, in the n8n AI agent node itself,

00:12:35.470 --> 00:12:38.590
when you configure the memory, you must set a

00:12:38.590 --> 00:12:42.509
unique session ID for each user. Ah, so conversations

00:12:42.509 --> 00:12:45.350
don't get mixed up. Precisely. This is how the

00:12:45.350 --> 00:12:47.549
database knows which conversation history belongs

00:12:47.549 --> 00:12:50.889
to whom. You can use their username, email, or

00:12:50.889 --> 00:12:53.950
a unique ID from your system. And you give the

00:12:53.950 --> 00:12:56.629
table in the database a name, like n8n_chat_histories

00:12:56.629 --> 00:12:58.889
or whatever makes sense. Okay, that makes total

00:12:58.889 --> 00:13:01.450
sense. Separate histories, stored scalably in

00:13:01.450 --> 00:13:04.110
a database, not blowing up my server RAM. And

00:13:04.110 --> 00:13:06.250
you also configure the context window length

00:13:06.250 --> 00:13:08.289
here, right? How many past messages it remembers.

00:13:08.509 --> 00:13:10.750
Yes, that's configured in the node as well. That

00:13:10.750 --> 00:13:13.370
controls how many of the most recent messages

00:13:13.370 --> 00:13:15.950
the agent remembers and sends back to the AI

00:13:15.950 --> 00:13:18.250
model with each turn in the conversation. Right.

00:13:18.330 --> 00:13:20.029
How do you decide how long that should be? For

00:13:20.029 --> 00:13:22.450
short, simple chats like basic customer support

00:13:22.450 --> 00:13:25.429
FAQs, maybe five or ten messages is enough. And

00:13:25.429 --> 00:13:28.029
for longer, more complex stuff like planning

00:13:28.029 --> 00:13:30.750
or research. Yeah. For conversations that build

00:13:30.750 --> 00:13:32.970
over time, like project planning or analytical

00:13:32.970 --> 00:13:36.210
tasks, you might need 20 or more messages for

00:13:36.210 --> 00:13:38.850
the AI to maintain good continuity and context.

00:13:39.269 --> 00:13:42.029
So the rule of thumb is a larger window means

00:13:42.029 --> 00:13:44.870
better memory, but also higher cost and potentially

00:13:44.870 --> 00:13:47.970
slower response. Precisely. Sending more text

00:13:47.970 --> 00:13:50.370
with every single request means a higher API

00:13:50.370 --> 00:13:52.929
cost and more processing time for the model.

00:13:53.049 --> 00:13:55.570
You absolutely have to tune this setting based

00:13:55.570 --> 00:13:58.029
on what your specific use case actually needs.
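
Putting the memory pieces together, here's a sketch of the setup described above. Every value is a placeholder; the pooler host, port, and table name are examples of what Supabase and n8n typically give you, so check your own project settings rather than copying these.

```typescript
// Sketch of the Postgres chat-memory setup (all values are placeholders).
// Connection details come from Supabase's transaction pooler string.
const postgresCredential = {
  host: "aws-0-eu-central-1.pooler.supabase.com", // example pooler host
  database: "postgres",
  user: "postgres.yourproject",
  password: "<from your Supabase project settings>",
  port: 6543, // Supabase's transaction pooler port (check your dashboard)
};

const chatMemoryConfig = {
  tableName: "n8n_chat_histories", // where per-user histories are stored
  // One ID per user so conversations never mix. In n8n this would be an
  // expression pulling the user's email or ID from the trigger data.
  sessionId: "user-42@example.com",
  contextWindowLength: 10, // 5-10 for FAQ bots; 20+ for planning/research
};

console.log(chatMemoryConfig);
```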

00:13:58.110 --> 00:14:01.090
Don't just crank it up to max. Okay. Memory problem

00:14:01.090 --> 00:14:04.940
solved with the database. Next up, tools. Giving

00:14:04.940 --> 00:14:07.259
agents the ability to use tools seems pretty

00:14:07.259 --> 00:14:09.399
fundamental. What's the configuration secret

00:14:09.399 --> 00:14:11.799
there? The golden rule, as the source puts it.

00:14:11.879 --> 00:14:14.320
The more fields you leave blank for the AI to

00:14:14.320 --> 00:14:16.779
dynamically define when it calls a tool, the

00:14:16.779 --> 00:14:19.019
higher the chance of hallucination, malformed

00:14:19.019 --> 00:14:22.000
data, and the tool call failing. Oof, that resonates.

00:14:22.320 --> 00:14:23.860
Can you give an example? Like, how does that

00:14:23.860 --> 00:14:25.879
play out? Sure. They use the classic send email

00:14:25.879 --> 00:14:28.620
tool the wrong way. You leave the to email address,

00:14:28.820 --> 00:14:31.179
the subject, and the message body entirely up

00:14:31.179 --> 00:14:32.840
to the AI to figure out from the conversation.

00:14:33.200 --> 00:14:35.919
Right. Just tell it, send an email about X to

00:14:35.919 --> 00:14:38.600
Bob. And the AI might invent an email address

00:14:38.600 --> 00:14:40.799
for Bob that doesn't exist, forget the subject

00:14:40.799 --> 00:14:43.019
line entirely, or mix up the recipient and the

00:14:43.019 --> 00:14:45.759
content. Disaster. Yeah, I've definitely seen

00:14:45.759 --> 00:14:49.200
variations of that happen. The right way. Predefine

00:14:49.200 --> 00:14:51.379
everything you possibly can in the tool node

00:14:51.379 --> 00:14:54.240
itself. Only let the AI control the specific

00:14:54.240 --> 00:14:57.240
parts where its creativity or understanding of

00:14:57.240 --> 00:14:59.840
the conversation is actually needed. Oh, okay.

00:14:59.960 --> 00:15:02.159
So for the email tool, you'd use an expression

00:15:02.159 --> 00:15:05.279
to pull the to email address from, say, a previous

00:15:05.279 --> 00:15:08.200
workflow step or a database lookup. You'd only

00:15:08.200 --> 00:15:10.820
let the AI define the subject and the message

00:15:10.820 --> 00:15:13.700
dynamically based on the user's request. Okay,

00:15:13.779 --> 00:15:16.440
so you're essentially constraining the AI's creativity

00:15:16.440 --> 00:15:18.779
to only the parts of the tool call where its

00:15:18.779 --> 00:15:21.139
linguistic intelligence is valuable, not where

00:15:21.139 --> 00:15:23.580
it can just invent data and break the tool. Exactly.
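
Here's a sketch of that wrong-way/right-way contrast for a send email tool. The $fromAI-style placeholders mark the fields the model is allowed to fill; treat the exact expression syntax and the "Lookup Customer" node name as illustrative assumptions rather than copy-paste n8n config.

```typescript
// The "predefine everything you can" rule for a send-email tool.

// Wrong way: every field left for the AI to invent from the conversation.
const riskyEmailTool = {
  to: "{{ $fromAI('to') }}",           // may hallucinate an address for "Bob"
  subject: "{{ $fromAI('subject') }}", // may be forgotten or mixed up
  message: "{{ $fromAI('message') }}",
};

// Right way: hard-wire what a previous step already knows; let the AI
// control only the parts that need its read of the conversation.
const safeEmailTool = {
  to: "{{ $('Lookup Customer').item.json.email }}", // from a prior node
  subject: "{{ $fromAI('subject', 'Short, specific subject line') }}",
  message: "{{ $fromAI('message', 'Friendly reply to the request') }}",
};

console.log({ riskyEmailTool, safeEmailTool });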

00:15:23.639 --> 00:15:26.159
It dramatically increases the reliability of

00:15:26.159 --> 00:15:28.519
your tool calls, and the source lists the available

00:15:28.519 --> 00:15:32.259
tool types in n8n, the hundreds of pre-built

00:15:32.259 --> 00:15:34.620
nodes for common services like Gmail, Slack,

00:15:34.919 --> 00:15:36.960
Airtable, Notion, databases. The usual

00:15:36.960 --> 00:15:40.860
suspects. Yeah. Then the universal HTTP request

00:15:40.860 --> 00:15:44.860
node for interacting with literally any API and

00:15:44.860 --> 00:15:47.980
even custom tools, which let you wrap another

00:15:47.980 --> 00:15:50.659
n8n workflow as a tool the AI can call,

00:15:50.799 --> 00:15:53.080
which is pretty powerful. Pro tip: probably

00:15:53.080 --> 00:15:55.399
always start with pre-built. Absolutely. They're

00:15:55.399 --> 00:15:58.159
the most robust and easiest to configure. Only

00:15:58.159 --> 00:16:01.559
resort to custom or HTTP if there's truly no

00:16:01.559 --> 00:16:03.860
pre-built node that does what you need. All

00:16:03.860 --> 00:16:05.899
right. We've picked the right brain. We've controlled

00:16:05.899 --> 00:16:08.159
its randomness. We've given it scalable memory

00:16:08.159 --> 00:16:11.659
and configured its tools smartly by predefining

00:16:11.659 --> 00:16:14.100
fields. What about the prompt itself? You said

00:16:14.100 --> 00:16:15.860
it's not just prompting, but it's still a key

00:16:15.860 --> 00:16:17.899
piece, right? It is, but it's about writing a

00:16:17.899 --> 00:16:20.500
professional system prompt, not just, you know,

00:16:20.500 --> 00:16:22.679
be a helpful assistant. Like that vague stuff.

00:16:22.820 --> 00:16:24.399
Yeah, that's like hiring someone and just saying,

00:16:24.460 --> 00:16:26.700
hey, be helpful without any other direction.

00:16:27.279 --> 00:16:29.159
The source provides a really solid structured

00:16:29.159 --> 00:16:31.580
template, which is key. Okay, walk me through

00:16:31.580 --> 00:16:33.500
that professional prompt template. What are the

00:16:33.500 --> 00:16:35.379
essential sections? It breaks it down logically.

00:16:35.539 --> 00:16:37.720
First, you have the role persona. What is this

00:16:37.720 --> 00:16:40.659
AI? E.g., you are a helpful customer service assistant

00:16:40.659 --> 00:16:43.980
for ACME Corp. Clear identity. Then the primary

00:16:43.980 --> 00:16:47.580
goal. What is its main task? E.g., your main task

00:16:47.580 --> 00:16:50.379
is to provide prompt, accurate, and friendly

00:16:50.379 --> 00:16:53.240
responses to customer inquiries based on the

00:16:53.240 --> 00:16:55.799
provided tools and knowledge. Okay, clear job

00:16:55.799 --> 00:16:59.019
title and objective. Right. Then, domain knowledge

00:16:59.019 --> 00:17:01.960
context. This is where you inject specific facts

00:17:01.960 --> 00:17:04.339
about your business, products, policies, etc.

00:17:04.740 --> 00:17:07.180
This is crucial so it doesn't have to guess or

00:17:07.180 --> 00:17:09.440
hallucinate information about your company. Ah,

00:17:09.680 --> 00:17:11.480
okay, so it actually knows what it's talking

00:17:11.480 --> 00:17:13.940
about regarding my business. That's huge. Exactly.

00:17:14.220 --> 00:17:17.640
Then, tools. Explicitly list the tools it has

00:17:17.640 --> 00:17:20.119
access to, when and why it should use them, and

00:17:20.119 --> 00:17:23.400
maybe a rule like, prioritize using your tools

00:17:23.400 --> 00:17:25.880
over guessing if information might exist outside

00:17:25.880 --> 00:17:28.420
your core knowledge. Tell it what tools it has

00:17:28.420 --> 00:17:31.119
and how to use them. Yeah. And if you have a

00:17:31.119 --> 00:17:33.380
knowledge-base search tool, like for searching

00:17:33.380 --> 00:17:35.880
documentation, clearly lay out the rules for

00:17:35.880 --> 00:17:39.089
using it. Use this tool for all factual questions,

00:17:39.349 --> 00:17:41.849
base your responses strictly on the search results,

00:17:42.029 --> 00:17:44.970
and if the search tool finds no relevant information,

00:17:45.450 --> 00:17:47.769
state clearly that you could not find the answer

00:17:47.769 --> 00:17:50.269
in your knowledge base. Don't let it guess after

00:17:50.269 --> 00:17:52.769
a failed search. Okay, wow, so you're explicitly

00:17:52.769 --> 00:17:55.230
telling it what it knows, what it can do, and

00:17:55.230 --> 00:17:57.589
how it should use those capabilities. Very specific

00:17:57.589 --> 00:18:01.750
instructions. Yes. Then, format rules. What format

00:18:01.750 --> 00:18:05.289
should the final output be in? Plain text, markdown,

00:18:05.549 --> 00:18:08.329
JSON, maybe even a maximum word count. Tell it

00:18:08.329 --> 00:18:10.450
how to structure the answer. Style tone. How

00:18:10.450 --> 00:18:13.250
should it sound? E.g., maintain a friendly and

00:18:13.250 --> 00:18:15.890
concise tone. Use the user's name if available.

00:18:16.089 --> 00:18:18.170
And safety and accuracy. That feels really important.

00:18:18.509 --> 00:18:21.069
Super critical section. You explicitly state

00:18:21.069 --> 00:18:23.430
rules like, if you are uncertain about an answer,

00:18:23.509 --> 00:18:25.049
state your uncertainty rather than guessing.

00:18:25.309 --> 00:18:27.809
Never disclose internal tool names or proprietary

00:18:27.809 --> 00:18:31.019
internal information. Always adhere to company

00:18:31.019 --> 00:18:33.680
policies and legal requirements. It's like giving

00:18:33.680 --> 00:18:36.220
the AI its full onboarding packet and employee

00:18:36.220 --> 00:18:40.119
handbook. All the guardrails. Pretty much. Finally,

00:18:40.220 --> 00:18:43.059
an optional but often useful one, reasoning.

00:18:43.579 --> 00:18:46.400
You can instruct it to think step-by-step internally

00:18:46.400 --> 00:18:49.759
before providing your final response. This encourages

00:18:49.759 --> 00:18:52.140
a more deliberate process, though you don't necessarily

00:18:52.140 --> 00:18:54.500
show these internal steps to the user. Okay,

00:18:54.539 --> 00:18:56.039
yeah, that makes so much more sense than just

00:18:56.039 --> 00:18:58.720
a couple of vague sentences. It moves from a...

00:18:58.970 --> 00:19:02.269
a vague command to a real structured professional

00:19:02.269 --> 00:19:05.130
delegation. It just dramatically improves reliability

00:19:05.130 --> 00:19:08.349
because the AI understands its boundaries, its

00:19:08.349 --> 00:19:11.309
resources, and the expected behavior. Less guessing,

00:19:11.450 --> 00:19:13.410
more structure.
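
Pulling those sections together, here's a sketch of what that structured system prompt might look like. The ACME Corp assistant and the rule wordings come from the source's walkthrough; the bracketed details and word limits are illustrative filler.

```typescript
// Skeleton of the structured system prompt described above.
const systemPrompt = `
## Role / Persona
You are a helpful customer service assistant for ACME Corp.

## Primary Goal
Provide prompt, accurate, and friendly responses to customer inquiries,
based on the provided tools and knowledge.

## Domain Knowledge / Context
- [Specific facts about your business, products, and policies go here.]

## Tools
- knowledge_base_search: use for ALL factual questions. Base responses
  strictly on the search results. If no relevant information is found,
  state clearly that you could not find the answer in your knowledge
  base. Do not guess after a failed search.
- Prioritize using your tools over guessing.

## Format Rules
Respond in plain text, maximum 150 words.

## Style / Tone
Maintain a friendly and concise tone. Use the user's name if available.

## Safety & Accuracy
- If you are uncertain, state your uncertainty rather than guessing.
- Never disclose internal tool names or proprietary information.
- Always adhere to company policies and legal requirements.

## Reasoning
Think step-by-step internally before providing your final response; do
not show these internal steps to the user.
`;

console.log(systemPrompt);
```

We've covered a lot of ground.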

00:19:13.769 --> 00:19:15.950
Choosing the brain, controlling its output style,

00:19:16.190 --> 00:19:18.809
scalable memory, smart tools, professional prompts.

00:19:19.170 --> 00:19:21.549
What's next after the AI node itself? There's

00:19:21.549 --> 00:19:23.950
more. Output parsers. This solves that incredibly

00:19:23.950 --> 00:19:26.029
common problem where the AI gives you the correct

00:19:26.029 --> 00:19:27.970
answer, but it puts it in the wrong format for

00:19:27.970 --> 00:19:30.109
the next step in your n8n workflow. Oh,

00:19:30.109 --> 00:19:32.980
yeah. You need JSON. And it gives you a bulleted

00:19:32.980 --> 00:19:35.680
list. Or it messes up the JSON structure slightly.

00:19:35.859 --> 00:19:38.019
So frustrating. Happens all the time. Exactly.

00:19:38.259 --> 00:19:40.539
Output parsers are like the strict formatting

00:19:40.539 --> 00:19:43.500
guards right after the AI agent node in your

00:19:43.500 --> 00:19:46.460
workflow. The source lists three main types that

00:19:46.460 --> 00:19:48.740
are really useful. The structured output parser

00:19:48.740 --> 00:19:50.900
is for when you need a fixed JSON structure.

00:19:51.180 --> 00:19:54.039
You define the schema you expect. And this parser

00:19:54.039 --> 00:19:56.640
forces the AI's output into that exact shape,

00:19:56.720 --> 00:19:59.079
failing if it can't. Okay, so that guarantees

00:19:59.079 --> 00:20:02.500
my JSON format, mostly. What about lists of things?

00:20:02.660 --> 00:20:06.079
Like if I ask for five ideas. Item list output

00:20:06.079 --> 00:20:08.920
parser. This is for when the AI is supposed to

00:20:08.920 --> 00:20:11.200
return a variable list of items. You specify

00:20:11.200 --> 00:20:12.819
the separator character. Maybe it's a comma between

00:20:12.819 --> 00:20:15.559
items or just a new line. And this parser takes

00:20:15.559 --> 00:20:18.440
that list -like text output and formats it into

00:20:18.440 --> 00:20:22.819
a clean usable array in n8n. Nice. Okay. And

00:20:22.819 --> 00:20:24.460
the third type, the one they call the professional

00:20:24.460 --> 00:20:27.329
safety net. What's that about? That's the auto-fixing

00:20:27.329 --> 00:20:29.049
output parser. This is for mission-critical

00:20:29.049 --> 00:20:31.210
workflows where a formatting error is just unacceptable.

00:20:31.430 --> 00:20:34.730
If your primary AI gives you broken JSON or some

00:20:34.730 --> 00:20:36.809
other malformed output... Which happens. Right.

00:20:36.930 --> 00:20:40.009
This parser makes a second... usually very lightweight

00:20:40.009 --> 00:20:43.069
and cheap AI call. It sends the broken output

00:20:43.069 --> 00:20:45.369
to the secondary model with a simple instruction

00:20:45.369 --> 00:20:48.730
like, please fix this broken JSON. And it returns

00:20:48.730 --> 00:20:52.029
the corrected valid output. Wow. So if the main

00:20:52.029 --> 00:20:54.670
brain messes up the format, a backup brain just

00:20:54.670 --> 00:20:57.289
swoops in and fixes just that specific formatting

00:20:57.289 --> 00:21:00.480
problem. Exactly. It adds a tiny bit of cost

00:21:00.480 --> 00:21:03.200
and latency because it's a second API call, but

00:21:03.200 --> 00:21:05.319
it provides incredible resilience for systems

00:21:05.319 --> 00:21:07.799
that just cannot afford a formatting failure

00:21:07.799 --> 00:21:10.460
downstream. That's smart. The production strategy,

00:21:10.640 --> 00:21:12.660
they suggest, is often to use the structured

00:21:12.660 --> 00:21:15.799
parser for fixed formats and, for critical workflows,

00:21:16.099 --> 00:21:18.779
wrap that inside an auto-fixing parser node.

00:21:19.160 --> 00:21:21.519
Double protection. That is genuinely smart. It's

00:21:21.519 --> 00:21:24.039
like a little formatting quality control step

00:21:24.039 --> 00:21:26.259
that happens automatically. Built-in error correction.
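
To see the double-protection idea in code, a minimal sketch: a strict parse of the exact JSON shape you expect, with one cheap "fix this" retry standing in for the auto-fixing parser's second model call. The fixWithCheapModel helper is hypothetical; a real setup would call a lightweight model and would also validate the parsed object against the schema, not just parse it.

```typescript
// Sketch of the parser strategy: strict shape first, repair pass second.
interface TicketSummary {
  category: "billing" | "bug" | "other";
  priority: 1 | 2 | 3;
  summary: string;
}

// Hypothetical stand-in for the lightweight secondary model; a real
// implementation would send this repair prompt to a cheap LLM.
async function fixWithCheapModel(prompt: string): Promise<string> {
  const broken = prompt.split("\n").slice(1).join("\n");
  return broken.replace(/,\s*}/g, "}"); // toy repair: strip trailing commas
}

async function parseStrict(raw: string): Promise<TicketSummary> {
  try {
    // A real structured parser also validates field names and types
    // against the schema; that check is omitted here for brevity.
    return JSON.parse(raw) as TicketSummary;
  } catch {
    const repaired = await fixWithCheapModel(`Fix this broken JSON:\n${raw}`);
    return JSON.parse(repaired) as TicketSummary; // safety-net pass
  }
}

parseStrict('{"category":"bug","priority":2,"summary":"App crashes",}')
  .then(console.log); // the trailing comma gets repaired, then parsed
```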

00:21:26.480 --> 00:21:28.400
Yeah, it's essential for ensuring the AI's output

00:21:28.400 --> 00:21:30.839
integrates seamlessly into the rest of your automated

00:21:30.839 --> 00:21:33.039
processes. Stops the whole workflow breaking

00:21:33.039 --> 00:21:35.359
because of one misplaced bracket. All right,

00:21:35.400 --> 00:21:37.359
I think we've covered six of the seven. What's

00:21:37.359 --> 00:21:39.000
the last category of settings? Where else do

00:21:39.000 --> 00:21:41.619
we need to look? The agent-level execution settings

00:21:41.619 --> 00:21:45.619
right there within the n8n AI agent node itself.

00:21:45.940 --> 00:21:48.519
The big one here is max iterations. Max iterations.

00:21:48.900 --> 00:21:50.960
What does that control exactly? This limits the

00:21:50.960 --> 00:21:53.619
number of back and forth steps the agent takes.

00:21:54.000 --> 00:21:56.940
You know, the cycle of the model thinking, deciding

00:21:56.940 --> 00:21:59.539
to use a tool, calling the tool, the tool returning

00:21:59.539 --> 00:22:01.720
a result, the model reading the result, thinking

00:22:01.720 --> 00:22:03.940
again, maybe calling another tool and so on.

00:22:04.339 --> 00:22:07.160
Model tool model. Okay, the thinking loop. Right.

00:22:07.240 --> 00:22:10.230
The default is often... 10 iterations. So if

00:22:10.230 --> 00:22:12.210
you have a pretty simple chat where it just needs

00:22:12.210 --> 00:22:14.589
to think and respond, 10 steps is probably fine.

00:22:14.730 --> 00:22:17.849
Probably. But for a complex analysis task that

00:22:17.849 --> 00:22:19.809
might involve multiple tool calls or breaking

00:22:19.809 --> 00:22:22.269
down a problem, you might need to increase that

00:22:22.269 --> 00:22:25.990
to 15 or 20. An autonomous research agent using

00:22:25.990 --> 00:22:28.470
several tools could potentially go up to 30.

00:22:28.650 --> 00:22:30.769
And the warning sign here, what's the danger?

00:22:31.230 --> 00:22:33.849
Setting max iterations too high without other

00:22:33.849 --> 00:22:36.509
strong guardrails in your prompt or tool configuration

00:22:36.509 --> 00:22:38.990
can lead to the agent getting stuck in infinite

00:22:38.990 --> 00:22:41.390
loops, just calling tools over and over. Oh,

00:22:41.490 --> 00:22:45.170
yeah. And racking up incredibly high costs very,

00:22:45.289 --> 00:22:47.730
very fast. It's a critical safety limit. You

00:22:47.730 --> 00:22:49.910
got to be careful with that one. Got it. Max

00:22:49.910 --> 00:22:52.869
iterations is a guardrail against infinite loops

00:22:52.869 --> 00:22:55.869
and uncontrolled spending. Anything else in that

00:22:55.869 --> 00:22:58.069
node we should know about? Return intermediate

00:22:58.069 --> 00:23:01.119
steps. This is super useful during development

00:23:01.119 --> 00:23:03.740
and debugging. How so? It shows you the agent's

00:23:03.740 --> 00:23:06.859
entire thought process. Every time it thinks,

00:23:07.000 --> 00:23:09.160
every tool it considers, every tool it calls,

00:23:09.420 --> 00:23:12.519
the results it got back, how it decided its next

00:23:12.519 --> 00:23:15.920
step. It's invaluable for understanding why an

00:23:15.920 --> 00:23:18.079
agent did what it did and troubleshooting problems.
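
As a sketch, those two settings side by side; the field names are illustrative of what the agent node exposes, not exact n8n parameter names.

```typescript
// Sketch of the agent-level execution settings (names illustrative).
const agentSettings = {
  // Caps the think -> call tool -> read result loop. Default is often 10;
  // 15-20 for complex multi-tool analysis, up to ~30 for autonomous
  // research. Too high risks infinite loops and runaway API costs.
  maxIterations: 10,
  // Exposes the agent's full thought process: every tool considered,
  // every call made, and the results. Invaluable while debugging.
  returnIntermediateSteps: true,
};
console.log(agentSettings);
```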

00:23:18.380 --> 00:23:21.309
But you turn that off in production. For performance,

00:23:21.410 --> 00:23:23.769
maybe. Yes, definitely. It makes the output much

00:23:23.769 --> 00:23:25.650
cleaner for the rest of your workflow once you've

00:23:25.650 --> 00:23:27.910
got the agent working reliably. You don't need

00:23:27.910 --> 00:23:30.349
all that internal monologue clogging up the final

00:23:30.349 --> 00:23:32.349
result. Makes sense. And finally, automatically

00:23:32.349 --> 00:23:35.150
pass through binary images. That sounds specific.

00:23:35.490 --> 00:23:37.450
Yeah, you just want to enable that. If you're

00:23:37.450 --> 00:23:39.269
building multi -model workflows that involve

00:23:39.269 --> 00:23:41.890
images, for example, if you're sending an image

00:23:41.890 --> 00:23:45.809
to, say, GPT-4 for analysis, it just ensures

00:23:45.809 --> 00:23:48.349
the image data is handled correctly by the node

00:23:48.349 --> 00:23:51.559
and passed to the model. Simple toggle, but important

00:23:51.559 --> 00:23:54.279
for vision tasks. Okay, well, we covered all

00:23:54.279 --> 00:23:57.539
seven areas. Choosing the right brain or routing.

00:23:58.019 --> 00:24:00.960
Controlling creativity and reliability, temp and top P.

00:24:01.400 --> 00:24:04.140
Scalable memory with a database. Configuring

00:24:04.140 --> 00:24:07.000
tools intelligently. Predefining. Writing professional

00:24:07.000 --> 00:24:09.900
structured prompts. Using output parsers to guarantee

00:24:09.900 --> 00:24:12.720
format. And setting agent execution limits like

00:24:12.720 --> 00:24:15.740
max iterations. That is... That is so much more

00:24:15.740 --> 00:24:18.000
than just writing a prompt and, you know, hitting

00:24:18.000 --> 00:24:20.220
go. It really is. And the source material really

00:24:20.220 --> 00:24:22.559
nails the impact this has on the results you

00:24:22.559 --> 00:24:24.500
get. It's night and day. Yeah, let's talk about

00:24:24.500 --> 00:24:26.200
that. What's the real world difference between

00:24:26.200 --> 00:24:29.099
building an agent the amateur way versus applying

00:24:29.099 --> 00:24:30.880
these principles? What does it actually look

00:24:30.880 --> 00:24:32.400
like? Well, before applying these professional

00:24:32.400 --> 00:24:34.779
configurations, your agents work sometimes, right?

00:24:34.940 --> 00:24:37.859
Yeah, sometimes. Maybe. The outputs are unpredictable.

00:24:38.319 --> 00:24:40.819
They constantly break downstream workflows. Your

00:24:40.819 --> 00:24:42.720
costs are probably higher than they need to be

00:24:42.720 --> 00:24:44.579
because you're using the wrong models. And you

00:24:44.579 --> 00:24:47.900
spend hours manually debugging failures. It's

00:24:47.900 --> 00:24:51.200
honestly just an unreliable toy. Yeah, that sounds

00:24:51.200 --> 00:24:54.240
painfully familiar to, I think, anyone who's

00:24:54.240 --> 00:24:56.299
tried this at any scale. Yeah. That debugging

00:24:56.299 --> 00:24:59.099
time is killer. Exactly. After applying professional

00:24:59.099 --> 00:25:02.380
configuration, you get consistent... predictable

00:25:02.380 --> 00:25:05.640
behavior, structured outputs that integrate seamlessly

00:25:05.640 --> 00:25:08.539
into the rest of your automation, optimized costs

00:25:08.539 --> 00:25:10.579
because you're dynamically routing to the right

00:25:10.579 --> 00:25:13.680
model, and resilient, self-correcting systems

00:25:13.680 --> 00:25:16.299
you can actually depend on. It transforms from

00:25:16.299 --> 00:25:19.680
a toy into a production asset. So it is not just

00:25:19.680 --> 00:25:21.720
a little bit better quality output. It's about

00:25:21.720 --> 00:25:23.420
building something that's actually scalable,

00:25:23.640 --> 00:25:26.680
reliable, and can be the backbone for real business

00:25:26.680 --> 00:25:29.400
processes, something you can trust. Precisely.

00:25:29.400 --> 00:25:31.480
It's moving beyond the demo phase and building

00:25:31.480 --> 00:25:34.299
automation that can fundamentally change how

00:25:34.299 --> 00:25:35.980
you operate. It's serious business automation.

00:25:36.440 --> 00:25:39.119
This deep dive really shifted how I think about

00:25:39.119 --> 00:25:43.319
building these. It's not magic or just hoping

00:25:43.319 --> 00:25:46.869
the AI is smart enough. It's architecture. It's

00:25:46.869 --> 00:25:49.420
engineering. It's deliberate configuration. That's

00:25:49.420 --> 00:25:51.640
the core message the source wants to hammer home,

00:25:51.799 --> 00:25:54.940
I think. Stop treating n8n AI agents like

00:25:54.940 --> 00:25:57.160
black boxes where you just throw a prompt in

00:25:57.160 --> 00:25:59.740
and hope. Start configuring them like the sophisticated,

00:26:00.079 --> 00:26:03.079
tunable tools they are. The competitive advantage

00:26:03.079 --> 00:26:05.319
isn't having some secret model nobody else has.

00:26:05.440 --> 00:26:07.960
It's how you configure, how you constrain, and

00:26:07.960 --> 00:26:09.900
how you architect them. That's where the value

00:26:09.900 --> 00:26:11.900
is. Yeah, it's like the slightly less glamorous

00:26:11.900 --> 00:26:14.160
stuff, the plumbing and the wiring. It actually

00:26:14.160 --> 00:26:17.140
makes the cool, visible stuff work reliably.

00:26:17.380 --> 00:26:19.180
The behind-the-scenes tuning. Totally. The

00:26:19.180 --> 00:26:21.779
companies and teams who are building truly reliable,

00:26:21.940 --> 00:26:24.519
innovative AI automation, they are absolutely

00:26:24.519 --> 00:26:26.680
using these kinds of professional configurations

00:26:26.680 --> 00:26:29.599
that, frankly, most online tutorials just never

00:26:29.599 --> 00:26:32.680
cover. This knowledge right here is the competitive

00:26:32.680 --> 00:26:35.480
advantage. That's a powerful takeaway. OK, so

00:26:35.480 --> 00:26:38.539
for you listening, maybe stop seeing your n8n

00:26:38.539 --> 00:26:41.380
AI agents as just, you know, toys you're hoping

00:26:41.380 --> 00:26:43.900
will perform well. Start seeing them as systems

00:26:43.900 --> 00:26:46.400
you can design, control and make dependable.

00:26:46.730 --> 00:26:49.109
And maybe think about this. What specific part

00:26:49.109 --> 00:26:51.529
of your current workflow is maybe relying on

00:26:51.529 --> 00:26:54.549
an unpredictable AI toy right now? Where are

00:26:54.549 --> 00:26:57.630
you just crossing your fingers? Hmm. Yeah. I

00:26:57.630 --> 00:27:00.059
can think of a few places in my own setups. How

00:27:00.059 --> 00:27:02.759
would applying even just one or two of these

00:27:02.759 --> 00:27:05.660
principles, maybe better model selection using

00:27:05.660 --> 00:27:08.400
a router, or dialing in temperature and top P

00:27:08.400 --> 00:27:11.000
correctly, or implementing structured output

00:27:11.000 --> 00:27:14.240
parsers, transform it from an unreliable mess

00:27:14.240 --> 00:27:16.859
into a dependable production-grade system? Yeah.

00:27:16.900 --> 00:27:18.740
And how would that fundamental shift, building

00:27:18.740 --> 00:27:21.420
that reliability, change what you can even attempt

00:27:21.420 --> 00:27:23.920
to build next? What new possibilities open up

00:27:23.920 --> 00:27:25.509
when you know it's going to work? That's the

00:27:25.509 --> 00:27:27.750
provocative question, right? What becomes possible

00:27:27.750 --> 00:27:29.650
when the AI tool is truly reliable?
