WEBVTT

00:00:00.000 --> 00:00:01.659
I think we've all seen the pitch lately. It's

00:00:01.659 --> 00:00:04.900
really seductive. An AI that can build these

00:00:04.900 --> 00:00:07.500
incredibly complex business automations for you.

00:00:07.580 --> 00:00:09.759
Oh, yeah. You just describe what you want and,

00:00:09.839 --> 00:00:11.759
you know, magic is supposed to happen. It's like

00:00:11.759 --> 00:00:13.580
a magic wand, right? The idea of turning hours

00:00:13.580 --> 00:00:15.640
of coding into just a few seconds of typing.

00:00:15.960 --> 00:00:19.280
But here's the immediate reality check we found

00:00:19.280 --> 00:00:22.940
from the testing. It is a supremely powerful

00:00:22.940 --> 00:00:26.800
tool. It's incredibly fast. But the first version

00:00:26.800 --> 00:00:30.109
it gives you, it breaks a lot. Welcome back to

00:00:30.109 --> 00:00:33.030
the Deep Dive. Today, we're taking a really close

00:00:33.030 --> 00:00:35.630
look at the real-world performance of text-to

00:00:35.630 --> 00:00:39.229
-automation. We're focusing on n8n's new AI Workflow

00:00:39.229 --> 00:00:42.549
Builder, and it's all based on four pretty rigorous

00:00:42.549 --> 00:00:44.789
stress tests from our source material. Right.

00:00:44.829 --> 00:00:47.350
So just to set the table here, the AI Workflow

00:00:47.350 --> 00:00:49.390
Builder is a feature where you use plain English

00:00:49.390 --> 00:00:52.210
to describe what you want. Okay. And in response,

00:00:52.630 --> 00:00:55.289
the AI generates the entire node structure for

00:00:55.289 --> 00:00:57.409
you. And that's basically the pre-wired blueprint

00:00:57.409 --> 00:01:00.049
of your automation. Okay, so let's get right

00:01:00.049 --> 00:01:03.509
into it. Our mission here is to figure out if

00:01:03.509 --> 00:01:06.170
this tool really replaces needing to know how

00:01:06.170 --> 00:01:09.290
to build automations yourself. Yeah. Or if it's

00:01:09.290 --> 00:01:12.310
more of a, what we're calling a powerful skeleton

00:01:12.310 --> 00:01:15.329
generator. Exactly. We need to know what you,

00:01:15.329 --> 00:01:17.629
listening, still have to bring to the table. Absolutely.

00:01:17.890 --> 00:01:20.689
And these four demos, they really show you exactly

00:01:20.689 --> 00:01:24.510
where that human touch is still completely essential.

00:01:24.750 --> 00:01:26.730
So the promise is pretty thrilling. You're taking

00:01:26.730 --> 00:01:29.310
what could be hours of careful manual workflow

00:01:29.310 --> 00:01:32.959
building and turning it into just a few minutes

00:01:32.959 --> 00:01:35.780
of prompting. That's a huge acceleration. And

00:01:35.780 --> 00:01:37.900
the reality, I mean, it does confirm that the

00:01:37.900 --> 00:01:40.540
AI massively speeds up the process. It creates

00:01:40.540 --> 00:01:43.219
these logical sequential structures so fast.

00:01:43.280 --> 00:01:45.739
But every single time it needs a human to come

00:01:45.739 --> 00:01:48.480
in and fix the configuration details. And especially

00:01:48.480 --> 00:01:51.319
the little quirks from third party APIs. It needs

00:01:51.319 --> 00:01:53.859
guidance to get from the idea to something that

00:01:53.859 --> 00:01:55.560
actually works. You know, I still wrestle with

00:01:55.560 --> 00:01:59.510
prompt drift myself. And that's, that's a vulnerable

00:01:59.510 --> 00:02:01.109
admission for someone who writes these things

00:02:01.109 --> 00:02:03.590
every day. Getting that initial text just right

00:02:03.590 --> 00:02:05.750
is really hard, even when you know exactly what

00:02:05.750 --> 00:02:07.989
you want. It's totally understandable. Prompt

00:02:07.989 --> 00:02:10.770
drift in automation is uniquely tricky because

00:02:10.770 --> 00:02:13.930
it's not just the AI changing the words. Right.

00:02:14.009 --> 00:02:17.030
It's when it subtly changes the core logic of

00:02:17.030 --> 00:02:19.930
how the thing runs or some tiny parameter that

00:02:19.930 --> 00:02:22.840
just makes the whole thing fail. So that translation

00:02:22.840 --> 00:02:25.759
from human language to a working workflow is

00:02:25.759 --> 00:02:28.240
really the hardest part. If the AI builds these

00:02:28.240 --> 00:02:30.639
skeletons so quickly, what's the biggest, most

00:02:30.639 --> 00:02:33.319
common failure that trips people up? The main

00:02:33.319 --> 00:02:35.900
culprits are almost always hidden settings in

00:02:35.900 --> 00:02:39.680
third-party APIs and variable mapping errors

00:02:39.680 --> 00:02:42.099
that the AI itself actually creates. Okay, so

00:02:42.099 --> 00:02:44.300
configuration details and empty variables. They're

00:02:44.300 --> 00:02:46.199
the architects of failure. Got it. Let's get

00:02:46.199 --> 00:02:48.620
into a practical example then. Demo one. This

00:02:48.620 --> 00:02:50.680
was about building a daily newsletter workflow.

00:02:50.979 --> 00:02:53.400
A really common task. The prompt seems straightforward.

00:02:53.860 --> 00:02:56.539
Research tech trends using a tool called Tavily.

00:02:56.719 --> 00:02:59.340
Find an AI tool with perplexity, add a quote,

00:02:59.479 --> 00:03:02.520
and then just email the results. And visually...

00:03:02.909 --> 00:03:05.669
What the AI came back with was flawless. It built

00:03:05.669 --> 00:03:08.870
a perfect five-node structure. You know, a schedule,

00:03:08.870 --> 00:03:11.550
the two research nodes, a code generator, and

00:03:11.550 --> 00:03:13.770
the email node. And it did this in, like, five

00:03:13.770 --> 00:03:15.710
seconds. About five seconds, yeah. Everything

00:03:15.710 --> 00:03:17.650
looked connected. It looked ready to go. But

00:03:17.650 --> 00:03:20.650
when it actually ran, there were no errors, right?

00:03:20.770 --> 00:03:22.509
The workflow said it completed successfully.

00:03:22.830 --> 00:03:25.370
Exactly. It completed successfully. But the email

00:03:25.370 --> 00:03:28.310
that arrived was, well, it was almost empty.

00:03:28.889 --> 00:03:32.270
No tech trends, no AI tool, nothing. Just the

00:03:32.270 --> 00:03:35.629
boilerplate text. It failed silently. And that's

00:03:35.629 --> 00:03:38.250
a classic silent failure. So what was the culprit

00:03:38.250 --> 00:03:41.189
when they dug in? It was a critical but totally

00:03:41.189 --> 00:03:44.030
hidden setting inside the Tavily node. By default,

00:03:44.229 --> 00:03:46.449
Tavily just sends back a summary to be efficient.

00:03:46.629 --> 00:03:49.550
Okay. But to get the actual raw data to pass

00:03:49.550 --> 00:03:52.009
to the next node in the chain, you have to manually

00:03:52.009 --> 00:03:54.669
check a box called include response. So it's

00:03:54.669 --> 00:03:56.650
basically a hidden checkbox that's off by default.

00:03:56.810 --> 00:04:00.150
Exactly. The AI understands the generic n8n node

00:04:00.150 --> 00:04:03.250
perfectly. It knows the node exists. But it has

00:04:03.250 --> 00:04:06.409
no idea about that specific obscure setting for

00:04:06.409 --> 00:04:08.830
that one third-party service. So the variable

00:04:08.830 --> 00:04:11.050
it was supposed to pass on was never even created.

00:04:11.310 --> 00:04:13.770
It was never created. So the next node just got

00:04:13.770 --> 00:04:18.129
empty air. So the AI missed a required hidden

00:04:18.129 --> 00:04:20.610
checkbox. For you listening, what's the one thing

00:04:20.610 --> 00:04:22.920
we always have to check manually in these AI

00:04:22.920 --> 00:04:25.500
-built flows? Always, always examine the optional

00:04:25.500 --> 00:04:28.459
settings and node parameters for any third-party

00:04:28.459 --> 00:04:30.740
services, especially the ones that handle external

00:04:30.740 --> 00:04:33.199
data. Okay, let's switch gears to demo two, because

00:04:33.199 --> 00:04:35.779
this one shows a completely different side of

00:04:35.779 --> 00:04:38.019
the AI. Yeah, a really positive one. This was a

00:04:38.019 --> 00:04:40.860
sales brief generator. The first try failed. It

00:04:40.860 --> 00:04:43.939
was a variable name mismatch the AI created, and

00:04:43.939 --> 00:04:46.220
it also chose the wrong model. And here's where

00:04:46.220 --> 00:04:48.279
it gets really cool. So instead of spending,

00:04:48.279 --> 00:04:50.899
you know, 20 minutes manually hunting for that

00:04:50.899 --> 00:04:52.980
tiny error, which is just agonizing. I've lived

00:04:52.980 --> 00:04:55.540
there. It's the worst. Right. The person building

00:04:55.540 --> 00:04:58.319
this used the AI itself as a troubleshooting

00:04:58.319 --> 00:05:00.759
partner. They just copied the raw error message,

00:05:00.939 --> 00:05:03.500
all that confusing code, and pasted it right

00:05:03.500 --> 00:05:05.980
back into the AI builder. That is a fascinating

00:05:05.980 --> 00:05:09.540
move. Using the AI as its own diagnostic tool.

00:05:09.759 --> 00:05:13.310
What happened? The result was stunning. The AI

00:05:13.310 --> 00:05:16.350
successfully diagnosed and instantly fixed the

00:05:16.350 --> 00:05:19.189
variable name mismatch, an error that it had

00:05:19.189 --> 00:05:21.970
created. It saw the error context and just provided

00:05:21.970 --> 00:05:23.810
the corrected workflow. Wait a minute. So the

00:05:23.810 --> 00:05:26.069
AI is actually better at fixing its own mistakes

00:05:26.069 --> 00:05:28.089
than we are at finding them. I mean, does that

00:05:28.089 --> 00:05:30.970
genuinely save time or is it just a loop? No,

00:05:31.029 --> 00:05:33.769
it genuinely saves a ton of time because it knows

00:05:33.769 --> 00:05:36.610
the internal language of the nodes better than

00:05:36.610 --> 00:05:39.449
a human ever could, and it tracks every single data point. When

00:05:39.449 --> 00:05:42.040
you give it the context of an error, it's excellent

00:05:42.040 --> 00:05:44.100
at finding its own mistakes. It just gets rid

00:05:44.100 --> 00:05:46.600
of that horrible manual bug hunting. Okay, so

00:05:46.600 --> 00:05:48.839
that's where the real power lies. It's in that

00:05:48.839 --> 00:05:51.259
iterative improvement, using its own error messages

00:05:51.259 --> 00:05:54.160
as feedback. Now let's look at the biggest cost,

00:05:54.420 --> 00:05:57.540
the ambiguity trap, demo three. Ah, yes. The

00:05:57.540 --> 00:06:00.120
prompt here was so vague. It was just, build

00:06:00.120 --> 00:06:02.259
a multi-agent setup that can look into a subject,

00:06:02.459 --> 00:06:04.399
confirm what's accurate, and pull the results

00:06:04.399 --> 00:06:08.050
together. That prompt is... it's dangerously ambiguous.

00:06:08.310 --> 00:06:11.990
You're giving the AI zero constraints, no trigger,

00:06:12.149 --> 00:06:15.269
no data source, no format, nothing. And if you

00:06:15.269 --> 00:06:18.170
don't constrain it, the AI will always, always

00:06:18.170 --> 00:06:20.509
aim for the most complex solution it can think

00:06:20.509 --> 00:06:22.889
of. So what did it spit out when it got that

00:06:22.889 --> 00:06:25.730
vague request? It basically hallucinated. It

00:06:25.730 --> 00:06:28.730
created this ridiculously over-engineered, confusing,

00:06:29.009 --> 00:06:31.410
and totally broken workflow. Oh, wow. It had

00:06:31.410 --> 00:06:34.189
an orchestrator agent, multiple sub-agents all

00:06:34.189 --> 00:06:36.670
running in parallel, manual triggers, complex

00:06:36.670 --> 00:06:39.649
branching logic. It was just spaghetti, the kind

00:06:39.649 --> 00:06:41.490
of thing that instantly fails when it tries to

00:06:41.490 --> 00:06:44.769
merge data. So vague input leads directly to

00:06:44.769 --> 00:06:47.050
these over-engineered, broken workflows. And

00:06:47.050 --> 00:06:48.670
that costs time, but there's a real financial

00:06:48.670 --> 00:06:51.560
cost here too, isn't there? Yes. This is so important.

00:06:51.660 --> 00:06:54.240
The n8n cloud plans have monthly usage credits

00:06:54.240 --> 00:06:56.800
for the AI. Right. And generating a massive,

00:06:56.920 --> 00:06:59.660
broken, complex workflow like that. It just burns

00:06:59.660 --> 00:07:02.060
through your credits instantly. Sloppy, ambiguous

00:07:02.060 --> 00:07:04.899
prompts literally cost you money. That really

00:07:04.899 --> 00:07:07.339
drives home the need for detailed prompts from

00:07:07.339 --> 00:07:09.560
the start instead of just wasting credits on

00:07:09.560 --> 00:07:13.040
iteration. So what is the easiest way to stop

00:07:13.040 --> 00:07:15.680
the AI from... over-complicating a build. Just

00:07:15.680 --> 00:07:17.959
force it to build linear, sequential workflows.

00:07:18.600 --> 00:07:21.360
That simple constraint prevents almost all of

00:07:21.360 --> 00:07:23.519
the complex data-merging errors that happen

00:07:23.519 --> 00:07:26.100
when parallel branches try to combine their results.

00:07:26.420 --> 00:07:28.459
Okay, so let's contrast all that failure with

00:07:28.459 --> 00:07:31.060
the one that worked. Demo four. This was another

00:07:31.060 --> 00:07:33.339
daily newsletter, but this time the prompt was

00:07:33.339 --> 00:07:36.850
basically a full project brief. The success was

00:07:36.850 --> 00:07:39.970
100% due to specificity. The prompt laid everything

00:07:39.970 --> 00:07:42.550
out. The schedule was 6 a.m. The data source

00:07:42.550 --> 00:07:45.329
was Tavily. It even specified the exact configuration

00:07:45.329 --> 00:07:47.930
setting. The include response one. Include response,

00:07:48.209 --> 00:07:50.620
yeah, explicitly mentioning that to fix the error

00:07:50.620 --> 00:07:53.779
from demo one. And it even specified the AI model

00:07:53.779 --> 00:07:57.360
to use: Anthropic's Claude 4.5 Sonnet, because it's

00:07:57.360 --> 00:07:59.439
better at handling complex instructions. So you

00:07:59.439 --> 00:08:02.019
basically addressed every single failure point

00:08:02.019 --> 00:08:04.540
from the other tests all in one perfect prompt.

00:08:04.779 --> 00:08:08.259
Almost. The first build it gave us still had one

00:08:08.259 --> 00:08:11.420
tiny issue. Even with all that detail, it tried

00:08:11.420 --> 00:08:13.560
to run the four research searches in parallel,

00:08:14.189 --> 00:08:16.670
which still risks messing up the data merging.

00:08:16.990 --> 00:08:20.149
So how did you fix that structural problem? A

00:08:20.149 --> 00:08:22.569
single line in the chat. Just a command that

00:08:22.569 --> 00:08:25.189
said, force this into a fully linear structure.

00:08:25.470 --> 00:08:28.689
And that was it. The final result was a perfectly

00:08:28.689 --> 00:08:31.529
formatted HTML newsletter that worked on the

00:08:31.529 --> 00:08:35.049
very first try. Whoa. I mean, just imagine creating

00:08:35.049 --> 00:08:37.830
a production-ready multi-step workflow in seconds.

00:08:38.840 --> 00:08:40.879
Just by providing that level of detail, that

00:08:40.879 --> 00:08:43.279
really is the acceleration promise. It is. It

00:08:43.279 --> 00:08:45.419
really shows that the future of this isn't learning

00:08:45.419 --> 00:08:48.279
less. It's about learning how to be an incredibly

00:08:48.279 --> 00:08:51.399
precise project manager for an AI. Which brings

00:08:51.399 --> 00:08:53.620
us to the three core principles from these tests.

00:08:54.179 --> 00:08:56.820
First, be as detailed as possible. Think of it

00:08:56.820 --> 00:08:58.700
like you're briefing a junior developer. Right.

00:08:59.070 --> 00:09:01.389
Specify everything, the tools, the exact settings,

00:09:01.490 --> 00:09:04.070
the output you need. Second, don't expect it

00:09:04.070 --> 00:09:06.350
to be perfect on the first try. Plan on using

00:09:06.350 --> 00:09:08.710
the AI to help you debug the errors it's inevitably

00:09:08.710 --> 00:09:11.970
going to make. And third, prefer linear workflows.

00:09:12.710 --> 00:09:15.399
Just avoid the complex branching. They're easier

00:09:15.399 --> 00:09:18.539
to build, easier to test, and way, way easier

00:09:18.539 --> 00:09:22.179
to debug. So the critical takeaway about prompting

00:09:22.179 --> 00:09:24.740
these AI automation builders? Specificity is

00:09:24.740 --> 00:09:27.299
everything. The more detailed your instructions,

00:09:27.559 --> 00:09:30.100
especially on configuration, the better your

00:09:30.100 --> 00:09:32.620
results will be and the faster you'll get a working

00:09:32.620 --> 00:09:35.259
automation. That brings up a huge question for

00:09:35.259 --> 00:09:37.580
anyone looking at these tools. Is it even worth

00:09:37.580 --> 00:09:40.360
it anymore to learn a platform like n8n manually

00:09:40.360 --> 00:09:43.580
if the AI can just build the skeleton for you?

00:09:43.909 --> 00:09:47.389
The answer is definitively yes, absolutely. Your

00:09:47.389 --> 00:09:50.230
manual knowledge of how processes work is still

00:09:50.230 --> 00:09:52.409
critical. If you can't even articulate the steps

00:09:52.409 --> 00:09:53.990
of what you're trying to do, you can't write

00:09:53.990 --> 00:09:55.710
a good prompt. And you still need to understand

00:09:55.710 --> 00:09:57.649
data transformation, don't you? You have to know

00:09:57.649 --> 00:10:00.549
why a variable is empty or how to reformat data.

00:10:00.710 --> 00:10:03.669
That's a fundamental troubleshooting skill. Precisely.

00:10:03.950 --> 00:10:07.049
The AI is an architecture generator. You provide

00:10:07.049 --> 00:10:09.409
the logic. You provide the muscle and the nervous

00:10:09.409 --> 00:10:11.470
system that makes it all work. Right now, these

00:10:11.470 --> 00:10:14.470
tools really struggle with third-party API details

00:10:14.470 --> 00:10:17.870
and complex variable mapping, which just proves

00:10:17.870 --> 00:10:21.370
you need that expert human eye. So to sum up

00:10:21.370 --> 00:10:24.549
the big idea here. The AI workflow builder isn't

00:10:24.549 --> 00:10:26.370
a replacement for learning the fundamentals of

00:10:26.370 --> 00:10:28.950
automation, but it is an incredible accelerator

00:10:28.950 --> 00:10:31.909
that makes expert builders maybe 10 times faster.

00:10:32.049 --> 00:10:34.269
That's it. But anyone who tries to skip the learning

00:10:34.269 --> 00:10:36.590
part is just going to get stuck in a constant

00:10:36.590 --> 00:10:39.450
and expensive debugging cycle. Learn the platform

00:10:39.450 --> 00:10:42.990
first, then use the AI to go faster. So we'd

00:10:42.990 --> 00:10:45.179
encourage you to try this. Go automate a simple

00:10:45.179 --> 00:10:48.259
linear process by hand first. Then ask the AI

00:10:48.259 --> 00:10:50.279
to generate the skeleton for the same thing and

00:10:50.279 --> 00:10:51.840
just compare them. See what's different and see

00:10:51.840 --> 00:10:54.059
what tiny configuration details the AI missed.

00:10:54.360 --> 00:10:55.960
And here's a final thought to leave you with.

00:10:56.080 --> 00:10:59.259
If the AI can reliably diagnose its own variable

00:10:59.259 --> 00:11:02.620
and configuration errors, maybe even better than

00:11:02.620 --> 00:11:06.000
a human can spot them, how long is it until AI

00:11:06.000 --> 00:11:08.639
just masters the documentation for every third

00:11:08.639 --> 00:11:11.379
-party API out there? I mean, how long until...

00:11:12.399 --> 00:11:15.559
specific manual configuration becomes truly obsolete

00:11:15.559 --> 00:11:17.820
because the AI just knows that hidden checkbox

00:11:17.820 --> 00:11:20.379
needs to be ticked every single time. That's

00:11:20.379 --> 00:11:21.940
something to think about. Until next time. Something

00:11:21.940 --> 00:11:22.259
serious.
