WEBVTT

00:00:00.000 --> 00:00:03.439
Imagine having an entire creative agency and

00:00:03.439 --> 00:00:06.719
a personal assistant, even a full market research

00:00:06.719 --> 00:00:08.880
firm, all working just for you, you know, 24/7

00:00:08.880 --> 00:00:12.160
and costing less than hiring a single freelancer.

00:00:12.699 --> 00:00:15.019
It really does sound like something from, well,

00:00:15.160 --> 00:00:19.339
the future. But this isn't science fiction anymore.

00:00:19.440 --> 00:00:21.500
It's actually here. Yeah. Welcome, everyone,

00:00:21.519 --> 00:00:23.300
to the Deep Dive. This is where we try to unpack

00:00:23.300 --> 00:00:25.640
some pretty complex ideas and, you know, make

00:00:25.640 --> 00:00:28.210
them accessible for you. Today we're diving into

00:00:28.210 --> 00:00:31.030
something genuinely fascinating, a new guide

00:00:31.030 --> 00:00:34.710
on building a specialized AI agent swarm. Basically,

00:00:34.789 --> 00:00:37.109
a system that can automate your entire media

00:00:37.109 --> 00:00:39.009
production. That's right. We'll explore exactly

00:00:39.009 --> 00:00:42.070
how this AI army, as you could call it, functions.

00:00:42.429 --> 00:00:44.890
Everything from creating really stunning visual

00:00:44.890 --> 00:00:47.049
ads to doing deep market research. We're going

00:00:47.049 --> 00:00:49.130
to look at some real-world examples, peek under

00:00:49.130 --> 00:00:51.189
the hood at its architecture, and yeah, even

00:00:51.189 --> 00:00:54.170
break down the costs and the setup. Our goal

00:00:54.170 --> 00:00:56.250
today really is to give you a shortcut, a way

00:00:56.250 --> 00:00:58.210
to understand this powerful, kind of game-changing

00:00:58.210 --> 00:01:01.229
tech without getting lost in the weeds. So let's

00:01:01.229 --> 00:01:03.789
get into it. Okay, so this core concept we're

00:01:03.789 --> 00:01:06.329
digging into today, it feels genuinely groundbreaking.

00:01:06.989 --> 00:01:10.069
It's all about building this specialized AI agent

00:01:10.069 --> 00:01:14.810
swarm, and it uses a tool called n8n. Now, for

00:01:14.810 --> 00:01:18.329
those maybe not familiar, n8n is this really

00:01:18.329 --> 00:01:21.590
powerful open source platform. Think of it like

00:01:21.590 --> 00:01:24.390
a digital conductor, maybe, connecting different

00:01:24.390 --> 00:01:27.069
services to automate complex stuff, often without

00:01:27.069 --> 00:01:29.329
needing much code. Right. And when we say swarm,

00:01:29.469 --> 00:01:32.849
we mean like a team of AI agents all working

00:01:32.849 --> 00:01:34.689
together, coordinated. And like you said, this

00:01:34.689 --> 00:01:37.140
isn't some five-year plan idea. People are building

00:01:37.140 --> 00:01:38.780
and using this stuff right now. So what makes

00:01:38.780 --> 00:01:41.840
this system so special compared to, say, other

00:01:41.840 --> 00:01:44.040
automation tools people might already be using?

00:01:44.260 --> 00:01:46.099
Well, the big thing I think is how it shatters

00:01:46.099 --> 00:01:48.390
this barrier. You know, the one between the analytical

00:01:48.390 --> 00:01:50.709
business stuff and the creative work. Traditionally,

00:01:50.769 --> 00:01:52.670
automation was kind of stuck in silos. You'd

00:01:52.670 --> 00:01:55.310
have one tool for data, another for design. Well,

00:01:55.390 --> 00:01:57.269
they didn't talk much. This system, though, it

00:01:57.269 --> 00:01:59.329
creates this integrated team. It really is like

00:01:59.329 --> 00:02:02.569
having your own well-oiled creative agency all

00:02:02.569 --> 00:02:04.909
connected. That's a great way to put it. A well-oiled

00:02:04.909 --> 00:02:08.050
creative agency. So can you tell us a

00:02:08.050 --> 00:02:10.830
bit more? What does this AI agency actually do?

00:02:10.930 --> 00:02:13.189
What are the capabilities? Oh, absolutely. It

00:02:13.189 --> 00:02:15.090
wears a lot of hats. First off, it's your personal

00:02:15.090 --> 00:02:16.819
assistant. It can manage emails, your

00:02:16.819 --> 00:02:19.259
calendar, even organize files in Google Drive.

00:02:19.800 --> 00:02:22.460
Pretty handy. Then it's this creative powerhouse,

00:02:22.580 --> 00:02:25.259
generating images completely from scratch, doing

00:02:25.259 --> 00:02:27.719
complex edits like you'd see in Photoshop, even

00:02:27.719 --> 00:02:30.719
creating full videos. It can take a boring static

00:02:30.719 --> 00:02:33.560
image and turn it into a dynamic ad. It's also

00:02:33.560 --> 00:02:35.939
a social media manager. It'll schedule and post

00:02:35.939 --> 00:02:38.379
content automatically to X, TikTok, Instagram,

00:02:38.659 --> 00:02:41.319
whatever you need. And beyond just making stuff,

00:02:41.539 --> 00:02:43.960
it's a research analyst. It can scrape platforms

00:02:43.960 --> 00:02:46.560
for trends, pull insights, and compile it all

00:02:46.560 --> 00:02:48.939
into a professional Google Doc report. Wow, that's

00:02:48.939 --> 00:02:53.259
quite... Yeah. And one cool detail. It's also

00:02:53.259 --> 00:02:56.639
a meticulous accountant. It logs every single action,

00:02:56.699 --> 00:02:59.180
every success, every failure, even the exact

00:02:59.180 --> 00:03:02.020
token usage for each task. All in a detailed

00:03:02.020 --> 00:03:05.240
log. But the real magic isn't just the list.

00:03:05.379 --> 00:03:07.780
It's how seamlessly it all works together. You

00:03:07.780 --> 00:03:10.500
can literally upload one image, say, make me

00:03:10.500 --> 00:03:12.919
a VFX ad. That's visual effects, right? Motion

00:03:12.919 --> 00:03:15.599
graphics. And just watch. It creates, edits,

00:03:15.699 --> 00:03:18.000
and then publishes it across your platforms without

00:03:18.000 --> 00:03:19.800
you needing to jump between five different apps.

00:03:19.979 --> 00:03:21.960
That seamless flow, that sounds like the real

00:03:21.960 --> 00:03:24.219
breakthrough here. What's the biggest sort of

00:03:24.219 --> 00:03:27.699
aha moment people have when they see this working?

00:03:27.800 --> 00:03:29.599
Why does it hit so hard? I think the biggest

00:03:29.599 --> 00:03:33.360
aha is seeing that integration in action, creative

00:03:33.360 --> 00:03:35.520
and analytical automation finally working together.

00:03:35.599 --> 00:03:37.800
Every campaign instantly becomes data-informed.

00:03:38.039 --> 00:03:39.659
Okay, let's walk through an example then. You

00:03:39.659 --> 00:03:41.520
mentioned turning a basic product photo into

00:03:41.520 --> 00:03:44.099
an ad campaign. Exactly. Think of like a creative

00:03:44.099 --> 00:03:47.240
assembly line. So we started by sending a simple

00:03:47.240 --> 00:03:50.000
image, just headphones through Telegram. And

00:03:50.000 --> 00:03:52.360
right away, it wasn't just dumb storage. The

00:03:52.360 --> 00:03:54.300
agent put it in the Google Drive folder,

00:03:54.439 --> 00:03:56.900
sure. But then it asked, hey, what should I name

00:03:56.900 --> 00:03:59.370
this file? That's intelligent file management

00:03:59.370 --> 00:04:01.830
from the get-go. Okay, so it's organizing intelligently,

00:04:01.830 --> 00:04:03.969
not just storing. What's the next step after

00:04:03.969 --> 00:04:06.389
it's filed? Precisely. From there, the image

00:04:06.389 --> 00:04:08.770
went to the design studio agent. We gave it a

00:04:08.770 --> 00:04:10.370
pretty simple creative brief, something like,

00:04:10.430 --> 00:04:12.689
make this look like a studio shot. Give it energy.

00:04:12.870 --> 00:04:15.050
Make it colorful. Capture the feeling of listening

00:04:15.050 --> 00:04:17.629
to music. So the creative agent takes that and

00:04:17.629 --> 00:04:19.930
figures out which tools to use, generates several

00:04:19.930 --> 00:04:21.949
different stylistic options, and then shows you

00:04:21.949 --> 00:04:24.930
low-res previews. You get to pick. The result

00:04:24.930 --> 00:04:28.819
in our test? Three totally distinct professional-looking

00:04:28.819 --> 00:04:31.060
headphone ads, different lighting, different

00:04:31.060 --> 00:04:34.019
vibes, all delivered in just minutes. Minutes

00:04:34.019 --> 00:04:36.199
for professional quality variations? That's impressive.

00:04:36.420 --> 00:04:39.540
What about adding motion? Video? Right. The final

00:04:39.540 --> 00:04:42.319
stop was the VFX studio. The command was pretty

00:04:42.319 --> 00:04:44.459
straightforward. Take that first preview image

00:04:44.459 --> 00:04:47.339
and make it a video ad. Add music, make the lights

00:04:47.339 --> 00:04:49.240
sync to the beat, you know, typical ad stuff.

00:04:49.459 --> 00:04:52.199
And the agent did this two ways. First, image

00:04:52.199 --> 00:04:55.160
to video. It took our edited headphone shot and

00:04:55.160 --> 00:04:57.000
brought it to life with dynamic lighting that

00:04:57.000 --> 00:04:59.920
pulsed with the music. Second, text-to-video.

00:05:00.319 --> 00:05:02.899
It generated completely new B-roll footage from

00:05:02.899 --> 00:05:05.620
scratch, just based on the prompt. Both versions

00:05:05.620 --> 00:05:08.279
had professional-grade VFX, lights perfectly

00:05:08.279 --> 00:05:11.519
synced. Whoa! I mean, just imagine the possibilities

00:05:11.519 --> 00:05:14.000
there. Creating ads with synchronized effects

00:05:14.000 --> 00:05:16.480
in minutes, not days or weeks. It's like having

00:05:16.480 --> 00:05:18.240
a Hollywood effects team in your browser. Thinking

00:05:18.240 --> 00:05:19.899
back to when I first started creating content,

00:05:19.939 --> 00:05:22.199
this would have been, well, pure science fiction.

00:05:22.620 --> 00:05:25.040
Hours, maybe days of work, just gone. Automated.

00:05:25.120 --> 00:05:27.600
That is genuinely astonishing. So with all this

00:05:27.600 --> 00:05:29.339
production automated, how does this free up a

00:05:29.339 --> 00:05:31.120
creator? What does it allow them to focus on

00:05:31.120 --> 00:05:33.899
instead? The bigger picture. By automating all

00:05:33.899 --> 00:05:36.079
that production work, creators are basically

00:05:36.079 --> 00:05:38.399
freed up. They can focus on the high-level strategy,

00:05:38.600 --> 00:05:41.540
the vision, not just the execution grind. They

00:05:41.540 --> 00:05:44.019
become architects, not just builders. Makes sense.

00:05:44.540 --> 00:05:46.319
We've talked a lot about the creation side, but

00:05:46.319 --> 00:05:49.339
you mentioned it's also a research tool. Transforming

00:05:49.339 --> 00:05:52.120
a creator into more of a data-driven strategist

00:05:52.120 --> 00:05:54.379
with this market intelligence function. How does

00:05:54.379 --> 00:05:56.920
that part work? Exactly. So we give the system

00:05:56.920 --> 00:05:59.839
a mission. Pretty simple one. Find me two high-performing

00:05:59.839 --> 00:06:02.720
videos about n8n on TikTok, Instagram,

00:06:02.879 --> 00:06:05.920
and YouTube. And what happened next was pretty

00:06:05.920 --> 00:06:08.850
cool. Parallel processing. The social media agent

00:06:08.850 --> 00:06:10.850
didn't just check TikTok, then Instagram, then

00:06:10.850 --> 00:06:14.250
YouTube. No, it deployed Apify scrapers, which

00:06:14.250 --> 00:06:16.990
are like automated browsers that grab data, to

00:06:16.990 --> 00:06:19.329
all three platforms at the same time. It pulled

00:06:19.329 --> 00:06:21.490
view counts, likes, comments, info about the

00:06:21.490 --> 00:06:23.870
creators and key insights. You know, what formats

00:06:23.870 --> 00:06:26.990
work, what hooks grab attention. OK, so it gathers

00:06:26.990 --> 00:06:29.569
all this intel simultaneously. What happens with

00:06:29.569 --> 00:06:31.550
that raw data then? How does it become useful?

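NOTE
Editor's sketch of the parallel scraping step described above, not the workflow itself.
It assumes each platform scrape is an independent async call; `scrape` and `research`
are hypothetical names, and the stub returns placeholder IDs instead of calling any real scraping service.
```python
import asyncio
# Hypothetical stand-in for one scraper run; a real setup would call the scraping service here.
async def scrape(platform: str, query: str, limit: int) -> dict:
    await asyncio.sleep(0)  # placeholder for network I/O
    videos = [f"{platform}-video-{i}" for i in range(1, limit + 1)]
    return {"platform": platform, "query": query, "videos": videos}
async def research(query: str, platforms: list[str], limit: int = 2) -> dict:
    # gather() runs all scraper coroutines concurrently and returns results in input order.
    results = await asyncio.gather(*(scrape(p, query, limit) for p in platforms))
    return {r["platform"]: r["videos"] for r in results}
report = asyncio.run(research("n8n", ["tiktok", "instagram", "youtube"]))
print(report)
```
With real network calls, total wait time would be roughly that of the slowest platform rather than the sum of all three.
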
00:06:32.060 --> 00:06:33.800
Right, so once the scraping was done, the AI

00:06:33.800 --> 00:06:36.519
agent compiled everything, all the findings from

00:06:36.519 --> 00:06:39.839
the different platforms into one single professional-looking

00:06:39.839 --> 00:06:42.600
report in a Google Doc, like an intelligence

00:06:42.600 --> 00:06:45.079
briefing. And this report had really actionable

00:06:45.079 --> 00:06:48.019
stuff. Like on TikTok, it found short, punchy

00:06:48.019 --> 00:06:50.420
tutorials were killing it. On YouTube, it broke

00:06:50.420 --> 00:06:53.379
down why a specific, longer tutorial was so successful.

00:06:53.579 --> 00:06:56.560
For Instagram, it analyzed visual styles and

00:06:56.560 --> 00:06:59.199
caption strategies that got engagement. The most

00:06:59.199 --> 00:07:01.930
impressive part, this whole... Multi -platform

00:07:01.930 --> 00:07:05.410
research task done in minutes. Not the hours

00:07:05.410 --> 00:07:07.350
and hours of manual scrolling and note -taking

00:07:07.350 --> 00:07:09.009
it would normally take. You'd usually need a

00:07:09.009 --> 00:07:11.470
team or at least a dedicated afternoon. That

00:07:11.470 --> 00:07:14.430
speed and depth is a huge advantage. Now, you

00:07:14.430 --> 00:07:16.509
mentioned earlier this isn't just one big AI

00:07:16.509 --> 00:07:19.430
brain. Can you unpack that agent swarm idea a

00:07:19.430 --> 00:07:21.110
bit more? How is this whole thing structured?

00:07:21.290 --> 00:07:23.639
What's the architecture? Yeah, absolutely. It's

00:07:23.639 --> 00:07:26.000
key to understanding why it works so well. It's

00:07:26.000 --> 00:07:29.019
not monolithic. It's structured more like a military

00:07:29.019 --> 00:07:31.319
unit or maybe a well-run company. There's a

00:07:31.319 --> 00:07:33.399
clear hierarchy. At the top, you've got the main

00:07:33.399 --> 00:07:35.819
agent. Think of it as the general or the CEO.

00:07:36.300 --> 00:07:38.620
It's usually powered by something cost-effective

00:07:38.620 --> 00:07:41.879
like GPT-4o mini, maybe through a service like

00:07:41.879 --> 00:07:44.699
OpenRouter to optimize costs and performance.

00:07:44.939 --> 00:07:47.839
And its main job, delegate. It doesn't do the

00:07:47.839 --> 00:07:49.899
work itself. It assigns tasks to the specialists.

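NOTE
Editor's sketch of the delegation pattern described above, under simplifying assumptions.
In the system discussed, a language model picks the specialist; here the caller names it directly.
All handler names and the `MainAgent` class are hypothetical and only illustrate routing plus a short-term record of tasks.
```python
from typing import Callable
# Hypothetical specialist handlers; in the system described, each would be a separate sub-workflow.
def creative_agent(task: str) -> str:
    return f"creative agent handled: {task}"
def posting_agent(task: str) -> str:
    return f"posting agent handled: {task}"
def social_media_agent(task: str) -> str:
    return f"social media agent handled: {task}"
SPECIALISTS: dict[str, Callable[[str], str]] = {
    "creative": creative_agent,
    "posting": posting_agent,
    "social_media": social_media_agent,
}
class MainAgent:
    def __init__(self) -> None:
        self.memory: list[tuple[str, str]] = []  # short-term record of (specialist, task)
    def delegate(self, specialist: str, task: str) -> str:
        # The main agent does no work itself; it only routes the task by name.
        if specialist not in SPECIALISTS:
            raise KeyError(f"no specialist named {specialist!r}")
        self.memory.append((specialist, task))
        return SPECIALISTS[specialist](task)
agent = MainAgent()
print(agent.delegate("creative", "make this look like a studio shot"))
```
Keeping the routing table separate from the handlers is what makes it cheap to add a new specialist later.
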
00:07:50.120 --> 00:07:52.079
It also keeps track of the conversation, the

00:07:52.079 --> 00:07:54.459
short-term memory, so it understands multi-step

00:07:54.459 --> 00:07:57.180
requests. Okay, so the general delegates. Who

00:07:57.180 --> 00:08:00.060
are the troops? The specialists. Exactly. Below

00:08:00.060 --> 00:08:02.399
the general are the special forces units. These

00:08:02.399 --> 00:08:04.620
are specialized AI agents, each an expert in

00:08:04.620 --> 00:08:07.560
one area. You've got the creative division. That

00:08:07.560 --> 00:08:09.639
includes the creative agent, the artist doing

00:08:09.639 --> 00:08:13.420
image creation, editing, VFX, video. And the

00:08:13.420 --> 00:08:15.519
posting agent, the publisher, getting content

00:08:15.519 --> 00:08:17.750
out. Then there's the intelligence and operations

00:08:17.750 --> 00:08:20.389
division, the social media agent, your spy scraping

00:08:20.389 --> 00:08:23.230
platforms, and the web agent, a general scout

00:08:23.230 --> 00:08:26.170
for web searches. And finally, an administrative

00:08:26.170 --> 00:08:29.050
division, the Google Drive agent, sort of the

00:08:29.050 --> 00:08:31.990
digital quartermaster managing files, and a comms

00:08:31.990 --> 00:08:34.370
team handling email, calendar, contacts, that

00:08:34.370 --> 00:08:37.259
sort of thing. That hierarchy sounds, well, complex,

00:08:37.460 --> 00:08:40.039
like building a real org chart. But does that

00:08:40.039 --> 00:08:42.539
specialization actually make the system more

00:08:42.539 --> 00:08:45.019
robust, more reliable? Yes, absolutely. That

00:08:45.019 --> 00:08:46.820
specialization is what makes it robust, reliable,

00:08:47.080 --> 00:08:49.799
and actually easier to expand later. Each agent

00:08:49.799 --> 00:08:51.980
just focuses on what it does best. Okay, let's

00:08:51.980 --> 00:08:53.720
dive deeper into that creative agent then, the

00:08:53.720 --> 00:08:55.879
secret sauce, you called it. Its workshop. How

00:08:55.879 --> 00:08:58.440
exactly does it handle image and video stuff? Right,

00:08:58.480 --> 00:09:00.700
the digital easel handles the static images.

00:09:01.179 --> 00:09:03.419
For creating a new image, you give it a detailed

00:09:03.419 --> 00:09:06.860
prompt, a name, it uses an API like OpenAI's,

00:09:06.940 --> 00:09:10.360
and bam, delivers a preview to Telegram and saves

00:09:10.360 --> 00:09:13.440
the high-res version to Google Drive. The editing

00:09:13.440 --> 00:09:15.299
workflow is even smarter, I think. You give an

00:09:15.299 --> 00:09:17.840
existing image and instructions. Instead of just

00:09:17.840 --> 00:09:20.100
making one final version, it generates multiple

00:09:20.100 --> 00:09:22.799
low-res previews first. You look at them, pick

00:09:22.799 --> 00:09:25.139
the one you like, then it renders the final high-res

00:09:25.139 --> 00:09:27.659
image. Ah, that makes sense. Saves time

00:09:27.659 --> 00:09:29.559
and compute costs if the first try isn't quite

00:09:29.559 --> 00:09:32.419
right. What about video? Yeah, exactly. Then

00:09:32.419 --> 00:09:35.019
there's the editing bay. For video. For text-to-video,

00:09:35.019 --> 00:09:37.200
you give it a text prompt. It uses

00:09:37.200 --> 00:09:39.139
a model, maybe something fast and efficient,

00:09:39.259 --> 00:09:42.259
like Google's Veo 3 Fast, and generates original

00:09:42.259 --> 00:09:45.360
B-roll footage. It uses this smart polling thing

00:09:45.360 --> 00:09:47.279
to know when the video's ready. And for image-to-video,

00:09:47.279 --> 00:09:49.059
like in the demo we talked about,

00:09:49.240 --> 00:09:51.440
it takes your image and adds those cool VFX,

00:09:51.679 --> 00:09:53.639
like the lights pulsing in time with the music,

00:09:53.740 --> 00:09:56.559
adds the audio track too. So it's not just executing

00:09:56.559 --> 00:09:58.620
commands, it's actively trying to improve the

00:09:58.620 --> 00:10:01.450
output. Precisely. The creative agent also has

00:10:01.450 --> 00:10:04.090
this built-in artistic philosophy. Its core

00:10:04.090 --> 00:10:06.909
instructions, its system prompt tells it to be

00:10:06.909 --> 00:10:09.970
an optimizer, not just a blind follower. So when

00:10:09.970 --> 00:10:12.110
you give it a simple idea, its first move is

00:10:12.110 --> 00:10:14.350
often to rewrite that idea into a more detailed,

00:10:14.549 --> 00:10:17.250
more stylized prompt internally. It's trying

00:10:17.250 --> 00:10:19.629
to engineer a better prompt before it even calls

00:10:19.629 --> 00:10:22.389
the image or video model. It acts like a creative

00:10:22.389 --> 00:10:23.990
director. You know, I still wrestle with prompt

00:10:23.990 --> 00:10:26.980
drift myself sometimes. That's when the AI kind

00:10:26.980 --> 00:10:30.299
of wanders from your original intent over iterations.

00:10:30.299 --> 00:10:32.740
So having an AI agent basically refining the

00:10:32.740 --> 00:10:34.419
prompt for you, that's pretty awesome. That is

00:10:34.419 --> 00:10:36.259
a huge help, yeah. Boosting quality automatically.

00:10:36.259 --> 00:10:40.639
Okay, so you've created this amazing content, but

00:10:40.639 --> 00:10:42.879
it's not much use sitting on a hard drive. How

00:10:42.879 --> 00:10:44.759
does the system handle publishing? Getting it

00:10:44.759 --> 00:10:46.179
out there. All right, this is where the posting

00:10:46.179 --> 00:10:48.100
agent comes in. And the whole posting system

00:10:48.100 --> 00:10:51.559
is built on this really elegant four-step modular

00:10:51.559 --> 00:10:53.960
process. Think of it like a standard shipping

00:10:53.960 --> 00:10:56.519
container for your content. Makes things predictable.

00:10:57.100 --> 00:11:00.299
First step, file prep. Making sure the final

00:11:00.299 --> 00:11:02.639
image or video in Google Drive has the right

00:11:02.639 --> 00:11:04.919
public sharing permissions so the posting tool

00:11:04.919 --> 00:11:08.899
can grab it. Second, platform optimization. The

00:11:08.899 --> 00:11:10.879
system automatically tweaks things like caption

00:11:10.879 --> 00:11:12.799
length, hashtags for whatever platform you're

00:11:12.799 --> 00:11:15.080
targeting. X needs different stuff than TikTok,

00:11:15.240 --> 00:11:18.980
right? Third, the delivery. It uses a reliable

00:11:18.980 --> 00:11:21.080
third-party tool, something like Blotato or

00:11:21.080 --> 00:11:23.940
Buffer's API, to actually do the posting. These

00:11:23.940 --> 00:11:26.340
tools are built for robust delivery. And finally,

00:11:27.059 --> 00:11:28.820
Confirmation. It gives you back a submission

00:11:28.820 --> 00:11:30.659
ID so you can track that the post went through

00:11:30.659 --> 00:11:33.519
successfully. That modular approach sounds incredibly

00:11:33.519 --> 00:11:35.500
efficient. Does that really make it easier to

00:11:35.500 --> 00:11:37.980
add new platforms later on, say, if a new social

00:11:37.980 --> 00:11:40.259
network pops up? Absolutely. That's the beauty

00:11:40.259 --> 00:11:43.000
of it. The actual n8n workflows for posting to

00:11:43.000 --> 00:11:45.720
X, TikTok, and Instagram, they're almost identical.

00:11:46.019 --> 00:11:48.000
The only thing that really changes is the final

00:11:48.000 --> 00:11:50.450
destination setting. This makes the whole posting

00:11:50.450 --> 00:11:53.409
system super easy to maintain and expand. Adding

00:11:53.409 --> 00:11:55.730
support for, I don't know, LinkedIn or Threads,

00:11:55.929 --> 00:11:58.370
it's basically duplicating an existing workflow

00:11:58.370 --> 00:12:00.970
and changing one or two nodes. Really straightforward.

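NOTE
Editor's sketch of why near-identical posting workflows are easy to extend, assuming the only
per-platform differences are a few settings. The `Platform` fields, the numeric limits, and
`build_post` are hypothetical illustrations, not official platform values or any posting tool's API.
```python
from dataclasses import dataclass
@dataclass(frozen=True)
class Platform:
    name: str
    max_caption_chars: int  # illustrative limit, not an official platform value
    max_hashtags: int       # illustrative limit as well
PLATFORMS = {
    "x": Platform("x", 280, 2),
    "tiktok": Platform("tiktok", 2200, 5),
    "instagram": Platform("instagram", 2200, 10),
}
def build_post(platform_key: str, caption: str, hashtags: list[str], media_url: str) -> dict:
    p = PLATFORMS[platform_key]
    tags = " ".join(f"#{t}" for t in hashtags[: p.max_hashtags])
    room = p.max_caption_chars - len(tags) - 1  # leave space for a separator and the tags
    text = f"{caption[:room].rstrip()} {tags}".strip()
    return {"platform": p.name, "text": text, "media_url": media_url}
post = build_post("x", "New headphone ad is live", ["music", "audio", "ad"], "https://example.com/ad.mp4")
print(post["text"])
```
Under these assumptions, supporting another network is one more `PLATFORMS` entry, which mirrors the duplicate-and-change-one-node idea from the episode.
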
00:12:01.269 --> 00:12:04.370
Mid-roll sponsor read. Welcome back to the Deep

00:12:04.370 --> 00:12:07.269
Dive. We've just gone through the amazing capabilities

00:12:07.269 --> 00:12:11.370
of this AI media agent swarm. Now, the crucial

00:12:11.370 --> 00:12:14.399
question, how do you tailor it? How do you make

00:12:14.399 --> 00:12:16.460
it work for your specific brand, your workflow?

00:12:16.620 --> 00:12:18.980
Where does customization begin? Yeah, this is

00:12:18.980 --> 00:12:21.120
the calibration bay, essentially. And it's designed

00:12:21.120 --> 00:12:23.559
to be flexible. You can really define your brand's

00:12:23.559 --> 00:12:25.779
unique style by tweaking the system prompts within

00:12:25.779 --> 00:12:28.000
those creative sub-workflows. That means you

00:12:28.000 --> 00:12:30.279
can adjust the default image and video prompts

00:12:30.279 --> 00:12:32.799
to match your visual aesthetic. You want minimalist

00:12:32.799 --> 00:12:35.279
and clean. Tell it that. Bold and vibrant. Tell

00:12:35.279 --> 00:12:37.700
it that. You can also tune quality settings to

00:12:37.700 --> 00:12:40.679
balance how it looks versus the cost. Okay, so

00:12:40.679 --> 00:12:42.559
you can tune the creative output. What about

00:12:42.559 --> 00:12:44.860
the research side or the social media activity?

00:12:45.259 --> 00:12:47.360
Same idea. For the social media agent, you're

00:12:47.360 --> 00:12:49.320
basically giving your spy its mission parameters.

00:12:49.659 --> 00:12:52.039
You configure the scraping settings, focus on

00:12:52.039 --> 00:12:54.179
a specific niche, track certain competitors.

00:12:54.440 --> 00:12:56.960
You can also customize posting schedules. Maybe

00:12:56.960 --> 00:12:59.580
your audience is active at specific times. And

00:12:59.580 --> 00:13:01.580
fine-tune the default captions and hashtags

00:13:01.580 --> 00:13:04.309
to always match your brand voice. And for the

00:13:04.309 --> 00:13:07.529
research agent, you refine its focus: adjust the

00:13:07.529 --> 00:13:09.830
search terms, limit the number of results it pulls

00:13:09.830 --> 00:13:12.149
so you don't get overwhelmed, customize the Google

00:13:12.149 --> 00:13:14.429
Doc templates so reports look exactly how you

00:13:14.429 --> 00:13:17.090
want them. It's all about tuning its brain. Right.

00:13:17.090 --> 00:13:19.990
Now let's talk about the operator's manual, specifically

00:13:19.990 --> 00:13:23.330
costs and transparency. What are the real numbers

00:13:23.330 --> 00:13:26.009
involved in running something like this? What

00:13:26.009 --> 00:13:28.559
should someone expect? Good question. There are

00:13:28.559 --> 00:13:30.759
basically three main cost areas to think about.

00:13:30.840 --> 00:13:33.440
First is token usage. That's your primary variable

00:13:33.440 --> 00:13:35.919
cost, kind of like the fuel for the AI models.

00:13:36.120 --> 00:13:38.360
The system smartly uses cost-effective models

00:13:38.360 --> 00:13:41.220
like GPT-4o mini for most things. But sometimes,

00:13:41.440 --> 00:13:43.960
for trickier tasks, specialized agents might

00:13:43.960 --> 00:13:46.620
strategically use a more powerful, slightly pricier

00:13:46.620 --> 00:13:49.980
model like Claude or maybe GPT-4 when deep reasoning

00:13:49.980 --> 00:13:52.419
is really needed. Second is media generation,

00:13:52.679 --> 00:13:54.980
the actual cost of creating the images and videos.

00:13:55.179 --> 00:13:56.970
Using OpenAI for images, you're looking at

00:13:56.970 --> 00:14:00.169
maybe $0.04 per medium-quality image. For video,

00:14:00.470 --> 00:14:03.070
using something like fal.ai with Veo 3, it might

00:14:03.070 --> 00:14:06.190
be around $3.75 for a high-quality 8-second

00:14:06.190 --> 00:14:08.789
clip with synced audio. Those costs per asset

00:14:08.789 --> 00:14:10.809
seem surprisingly low, actually, for that kind

00:14:10.809 --> 00:14:12.950
of output. What about fixed costs? Subscriptions?

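NOTE
Editor's worked example of the per-asset arithmetic quoted above. The three prices are the
figures mentioned in the episode; the function name, the `token_spend_usd` parameter, and the
volumes (100 images, 10 videos) are assumptions chosen only for illustration.
```python
IMAGE_COST_USD = 0.04     # per medium-quality image, as quoted in the episode
VIDEO_COST_USD = 3.75     # per 8-second clip with synced audio, as quoted
POSTING_TOOL_USD = 29.00  # monthly posting-tool subscription, as quoted
def monthly_estimate(images: int, videos: int, token_spend_usd: float = 0.0) -> float:
    # Media cost scales with volume; the subscription is a flat monthly overhead.
    media = images * IMAGE_COST_USD + videos * VIDEO_COST_USD
    return round(media + POSTING_TOOL_USD + token_spend_usd, 2)
print(monthly_estimate(images=100, videos=10))
```
For that volume the media comes to 4.00 + 37.50 and the subscription adds 29.00, so the estimate is 70.50 USD before token usage and any scraping charges.
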
00:14:13.190 --> 00:14:15.350
Yeah, the per-asset cost is pretty manageable.

00:14:15.490 --> 00:14:18.049
Then you have the monthly subscriptions, your

00:14:18.049 --> 00:14:20.769
overhead, a solid social media posting tool like

00:14:20.769 --> 00:14:23.450
Blotato. Maybe around $29 a month to start.

00:14:23.789 --> 00:14:26.929
Apify for the web scraping has flexible pricing,

00:14:26.929 --> 00:14:29.509
and honestly their free tier is pretty generous

00:14:29.509 --> 00:14:31.750
for getting started or for moderate use. And a

00:14:31.750 --> 00:14:34.190
pro tip here: always hunt around for special deals

00:14:34.190 --> 00:14:36.549
or extended trials. Lots of these SaaS tools

00:14:36.549 --> 00:14:39.629
offer them for new users. Okay, and you mentioned

00:14:39.629 --> 00:14:41.970
a crucial piece earlier, the black box recorder.

00:14:41.970 --> 00:14:44.549
How does that fit into understanding costs and

00:14:44.549 --> 00:14:47.230
performance? Ah, yes, the black box recorder. This

00:14:47.230 --> 00:14:49.490
is absolutely critical, in my opinion. It gives

00:14:49.490 --> 00:14:52.210
you a complete, unchangeable audit trail. Proof

00:14:52.210 --> 00:14:54.909
of everything your AI army does. How it works

00:14:54.909 --> 00:14:57.909
is, it logs all the execution details, timestamp,

00:14:57.970 --> 00:15:00.190
which workflow ran, the inputs, the outputs,

00:15:00.389 --> 00:15:03.070
the token usage breakdown, success or failure

00:15:03.070 --> 00:15:06.549
status, any errors, all in real time to a dedicated

00:15:06.549 --> 00:15:09.110
Google Sheet. Why is this so critical? Well,

00:15:09.129 --> 00:15:10.809
for rapid debugging, if something goes wrong.

00:15:11.289 --> 00:15:13.470
For smart cost optimization, you can see exactly

00:15:13.470 --> 00:15:15.370
where your tokens are going. For full accountability,

00:15:15.590 --> 00:15:17.450
and ultimately for data-driven improvement,

00:15:17.710 --> 00:15:20.230
you can actually see what's working and optimize,

00:15:20.509 --> 00:15:23.250
just like managing a human team, but with perfect

00:15:23.250 --> 00:15:25.809
data. That level of logging sounds invaluable.

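NOTE
Editor's sketch of what one row of the execution log described above might look like.
The column names and the `ExecutionLog` class are assumptions based on the fields listed in the
episode (timestamp, workflow, inputs, outputs, token usage, status, errors); writing the row to a
spreadsheet is left out.
```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
COLUMNS = ["timestamp", "workflow", "input", "output", "prompt_tokens",
           "completion_tokens", "total_tokens", "status", "error"]
@dataclass
class ExecutionLog:
    workflow: str
    input_summary: str
    output_summary: str
    prompt_tokens: int
    completion_tokens: int
    status: str = "success"
    error: str = ""
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    def to_row(self) -> list:
        # One flat list per execution, in the same order as COLUMNS.
        total = self.prompt_tokens + self.completion_tokens
        return [self.timestamp, self.workflow, self.input_summary, self.output_summary,
                self.prompt_tokens, self.completion_tokens, total, self.status, self.error]
row = ExecutionLog("create_image", "studio shot of headphones", "preview sent", 420, 180).to_row()
print(dict(zip(COLUMNS, row)))
```
Logging prompt and completion tokens separately is one way to see which workflows drive cost, since the two are often priced differently.
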
00:15:26.210 --> 00:15:27.889
Okay, so for someone listening who's thinking,

00:15:27.929 --> 00:15:29.769
right, I'm ready to build my own media empire.

00:15:30.399 --> 00:15:32.299
What's the assembly manual? What are the practical

00:15:32.299 --> 00:15:34.360
first steps? All right. Step one is acquiring

00:15:34.360 --> 00:15:37.139
the schematics. You need the n8n workflow files.

00:15:37.480 --> 00:15:40.080
You can usually find these shared in online automation

00:15:40.080 --> 00:15:43.159
communities, often bundled as a ZIP file. These

00:15:43.159 --> 00:15:44.779
files are the blueprints. There are about nine

00:15:44.779 --> 00:15:47.940
essential ones. The main agent, edit image, create

00:15:47.940 --> 00:15:50.899
image, image to video, create video, the posting

00:15:50.899 --> 00:15:53.299
workflows for X, TikTok, Instagram, and the Google

00:15:53.299 --> 00:15:56.620
Doc creation workflow. Got it. Download the blueprints,

00:15:56.740 --> 00:15:58.740
then comes the assembly part. Then comes the

00:15:58.740 --> 00:16:01.639
assembly. This needs some careful, focused work.

00:16:02.179 --> 00:16:05.259
First, you import all those workflow files into

00:16:05.259 --> 00:16:08.139
your n8n setup. The absolutely critical part

00:16:08.139 --> 00:16:10.500
here is linking the tool nodes in the main agent

00:16:10.500 --> 00:16:13.279
to the correct sub-workflows. The names must

00:16:13.279 --> 00:16:16.110
match exactly for the delegation to work. Second,

00:16:16.269 --> 00:16:18.370
set up your Google environment. Create the specific

00:16:18.370 --> 00:16:20.429
folders in Google Drive like media and media

00:16:20.429 --> 00:16:22.549
analysis where it will store assets and reports.

00:16:22.830 --> 00:16:25.070
Set up the Google Sheet using the provided logging

00:16:25.070 --> 00:16:27.570
template. Third, API integration. You've got

00:16:27.570 --> 00:16:29.909
to gather all your API keys, OpenAI, Google,

00:16:30.129 --> 00:16:33.029
Telegram, Apify, your social posting tool, and

00:16:33.029 --> 00:16:35.049
add them securely into the n8n credentials section.

00:16:35.269 --> 00:16:38.549
Okay, that sounds methodical. What would you

00:16:38.549 --> 00:16:42.090
say is the biggest potential snag or hurdle someone

00:16:42.090 --> 00:16:44.529
might hit during that initial setup? The place

00:16:44.529 --> 00:16:48.320
to be extra careful. Honestly, linking all those

00:16:48.320 --> 00:16:50.480
workflows correctly in the main agent, making

00:16:50.480 --> 00:16:52.740
sure every connection is right, every name matches.

00:16:52.919 --> 00:16:55.279
It really requires attention to detail. It's

00:16:55.279 --> 00:16:57.700
like Lego blocks: one piece in the wrong spot

00:16:57.700 --> 00:16:59.580
and the whole thing might not work as expected.

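NOTE
Editor's sketch of a pre-flight check for the name-matching problem described above.
It assumes you can list each tool's target name and the names of the imported workflows;
every name below is hypothetical, and the check is an exact, case-sensitive string comparison.
```python
def find_broken_links(tool_targets: dict[str, str], workflow_names: set[str]) -> dict[str, str]:
    """Return tools whose target does not exactly match an imported workflow name."""
    return {tool: target for tool, target in tool_targets.items() if target not in workflow_names}
# Hypothetical names for illustration only.
imported = {"Create Image", "Edit Image", "Image to Video", "Create Video"}
tools = {"create_image": "Create Image", "edit_image": "Edit image", "create_video": "Create Video"}
print(find_broken_links(tools, imported))
```
Here the lowercase "i" in "Edit image" is enough to flag that tool, which is the kind of one-character mismatch that is easy to miss by eye.
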
00:16:59.799 --> 00:17:02.179
Right. Attention to detail is key there. So let's

00:17:02.179 --> 00:17:04.480
try and bring this all together. This system,

00:17:04.500 --> 00:17:06.420
it feels like more than just automation, doesn't

00:17:06.420 --> 00:17:08.279
it? It seems like a complete paradigm shift.

00:17:08.599 --> 00:17:10.759
You're moving from being just the artist, the

00:17:10.759 --> 00:17:14.400
creator, to being the director of an entire automated

00:17:14.400 --> 00:17:17.640
creative studio. Not just saving time, you're

00:17:17.640 --> 00:17:20.599
genuinely multiplying your creative output. Totally.

00:17:20.740 --> 00:17:22.759
And look, let's be real. This isn't a five-minute

00:17:22.759 --> 00:17:25.380
setup. You definitely need to plan for, you know,

00:17:25.380 --> 00:17:27.859
maybe a few hours of careful, focused configuration

00:17:27.859 --> 00:17:30.440
initially to get everything wired up right. But

00:17:30.440 --> 00:17:34.000
once it is running, you have this absolute creative

00:17:34.000 --> 00:17:37.220
powerhouse working for you. Something that if

00:17:37.220 --> 00:17:38.819
you try to replicate it with traditional tools

00:17:38.819 --> 00:17:41.299
and freelancers would easily cost thousands of

00:17:41.299 --> 00:17:43.579
dollars every single month. It's a massive competitive

00:17:43.579 --> 00:17:46.440
advantage. It really feels like the AI revolution

00:17:46.440 --> 00:17:49.680
in creative work isn't some future event. It's

00:17:49.680 --> 00:17:52.119
actually here now. And it's surprisingly accessible

00:17:52.119 --> 00:17:54.099
if you're willing to put in that initial effort

00:17:54.099 --> 00:17:56.960
to build the system. This media agent army concept

00:17:56.960 --> 00:17:59.380
gives individuals and small teams the power to

00:17:59.380 --> 00:18:01.859
compete with much larger agencies, potentially

00:18:01.859 --> 00:18:04.660
even working solo. So the final thought perhaps

00:18:04.660 --> 00:18:07.180
is stop just dreaming about automating your creativity.

00:18:07.440 --> 00:18:09.559
The tools and the blueprints are out there. It's

00:18:09.559 --> 00:18:11.559
time to start building it. Yeah, and remember,

00:18:11.660 --> 00:18:13.900
there are dedicated online communities, even

00:18:13.900 --> 00:18:16.079
advanced courses, if you really want to dive

00:18:16.079 --> 00:18:19.359
deep and achieve true mastery. Think about how

00:18:19.359 --> 00:18:22.380
these ideas, this capability, could fundamentally

00:18:22.380 --> 00:18:24.759
change how you approach content creation and

00:18:24.759 --> 00:18:27.299
marketing. It really opens up a whole new world.

00:18:27.480 --> 00:18:29.259
Well, thank you for joining us on this deep dive

00:18:29.259 --> 00:18:31.539
today. It's been fascinating. We'll be back soon

00:18:31.539 --> 00:18:33.779
with more insights to unpack.

00:18:33.779 --> 00:18:34.000
[Outro music]
