WEBVTT

00:00:00.000 --> 00:00:03.200
The days when you absolutely needed a computer

00:00:03.200 --> 00:00:05.440
science degree, you know, just to automate some

00:00:05.440 --> 00:00:08.119
basic tasks, those days are really fading. It's

00:00:08.119 --> 00:00:10.359
a pretty profound shift, actually. Today we're

00:00:10.359 --> 00:00:13.199
doing a deep dive into a platform that I think

00:00:13.199 --> 00:00:16.719
really embodies this change, the ChatGPT agent

00:00:16.719 --> 00:00:19.399
builder. It lets pretty much anyone build these

00:00:19.399 --> 00:00:22.890
sophisticated AI helpers using, well, just visual

00:00:22.890 --> 00:00:24.949
blocks and an idea. Welcome. Yeah, our mission

00:00:24.949 --> 00:00:27.469
today is to really unpack this thing, get to the

00:00:27.469 --> 00:00:30.429
core of how it works. And the tool itself? It's

00:00:30.429 --> 00:00:34.030
a platform for creating smart, task-specific

00:00:34.030 --> 00:00:36.350
AI assistants, these agents, without touching a

00:00:36.350 --> 00:00:38.929
line of code. Okay, so just for clarity then, define

00:00:38.929 --> 00:00:41.770
AI agent for us quickly. What is it, really? Think

00:00:41.770 --> 00:00:44.170
of it like a highly focused digital employee.

00:00:44.549 --> 00:00:46.570
Basically, smart helpers designed to automate

00:00:46.570 --> 00:00:49.009
very specific tasks and they operate purely based

00:00:49.009 --> 00:00:51.399
on the rules you give them. Perfect. So the plan

00:00:51.399 --> 00:00:55.159
today, first we'll look at the basic logic, the

00:00:55.159 --> 00:00:57.979
blueprint, using some examples. Then we'll actually

00:00:57.979 --> 00:00:59.979
walk through building an educational agent step

00:00:59.979 --> 00:01:02.759
by step, one with multiple paths. And finally,

00:01:03.520 --> 00:01:05.500
we'll look at how you connect these agents out

00:01:05.500 --> 00:01:10.019
to other apps, like Gmail or Shopify. All right,

00:01:10.019 --> 00:01:12.280
let's unpack that blueprint. The tool sounds

00:01:12.280 --> 00:01:15.269
really accessible, but, you know, accessibility

00:01:15.269 --> 00:01:18.290
isn't the same as it being easy to design something

00:01:18.290 --> 00:01:20.349
good. What's the actual biggest hurdle here?

00:01:20.750 --> 00:01:22.730
It's almost always conceptual, actually. I mean,

00:01:22.810 --> 00:01:24.650
the technical part, it's practically zero now.

00:01:25.030 --> 00:01:27.510
But your ability to clearly define the problem,

00:01:27.730 --> 00:01:30.049
the instructions you write, they have to be absolutely

00:01:30.049 --> 00:01:32.930
spot on, really specific for the agent to work

00:01:32.930 --> 00:01:36.299
reliably. It sounds incredibly powerful, but

00:01:36.299 --> 00:01:38.579
the mechanism you described, it's kind of like

00:01:38.579 --> 00:01:40.859
stacking Lego blocks almost. That's a perfect

00:01:40.859 --> 00:01:42.680
way to put it, yeah. If you can sketch out a

00:01:42.680 --> 00:01:44.739
simple flow chart, you can build an agent. Each

00:01:44.739 --> 00:01:47.239
block does one job. Maybe it collects data, searches

00:01:47.239 --> 00:01:49.739
the web, makes a decision based on a rule. You

00:01:49.739 --> 00:01:51.900
just connect them together to get the workflow

00:01:51.900 --> 00:01:54.640
you need. And getting started, people need to

00:01:54.640 --> 00:01:57.280
go into the OpenAI workspace, set up payment

00:01:57.280 --> 00:01:59.819
for usage first? That's right. Yeah, that's just

00:01:59.819 --> 00:02:02.060
to cover the computing resources the agent uses

00:02:02.060 --> 00:02:04.359
when it's running. Once you're in, you see the

00:02:04.359 --> 00:02:06.540
blocks on the left and this main canvas, the

00:02:06.540 --> 00:02:09.340
workspace where you arrange everything. OK, so

00:02:09.340 --> 00:02:13.020
if the tech barrier is basically gone and the

00:02:13.020 --> 00:02:15.639
main challenge is being clear in your thinking,

00:02:17.400 --> 00:02:19.580
what's the limiting factor then? Is it really

00:02:19.580 --> 00:02:22.259
just how well you understand your own process?

00:02:22.539 --> 00:02:24.939
Yeah, essentially. Your instructions must be

00:02:24.939 --> 00:02:26.819
absolutely clear for the agent to work well.

00:02:27.099 --> 00:02:30.080
That clarity piece leads nicely into agent logic:

00:02:30.659 --> 00:02:33.199
How these things actually make decisions. It

00:02:33.199 --> 00:02:35.280
feels like we're moving away from general chat

00:02:35.280 --> 00:02:37.810
towards something more structured. I like your

00:02:37.810 --> 00:02:40.569
idea of a complex agent being like a small, really

00:02:40.569 --> 00:02:42.509
specialized team. Exactly. Take the planning

00:02:42.509 --> 00:02:45.090
helper agent example they give. It kicks off

00:02:45.090 --> 00:02:46.990
with a start trigger. That's just the moment the

00:02:46.990 --> 00:02:49.330
user submits their request, like ringing a doorbell.

00:02:49.710 --> 00:02:51.810
Then you've got the triage agent. The information

00:02:51.810 --> 00:02:53.830
gatherer, right. Its only job is grabbing all

00:02:53.830 --> 00:02:56.669
the key project details right away. Right. And

00:02:56.669 --> 00:02:58.789
then comes the condition check. This is kind

00:02:58.789 --> 00:03:01.590
of the smart bit. It asks a simple yes-no question.

00:03:01.949 --> 00:03:04.909
Do I have all the info I need? If it's yes, yes,

00:03:05.169 --> 00:03:07.789
okay, proceed to planning. If it's no, it might

00:03:07.789 --> 00:03:10.030
loop back or trigger another agent like a get

00:03:10.030 --> 00:03:12.710
data agent to ask for what's missing. So you

00:03:12.710 --> 00:03:15.430
see branching logic based on the data flow. The

00:03:15.430 --> 00:03:17.409
customer service agent example is another good

00:03:17.409 --> 00:03:19.449
one for branching. The first agent just figures

00:03:19.449 --> 00:03:21.229
out the intent. Is this person trying to return

00:03:21.229 --> 00:03:24.740
something? Cancel? Just get info? And that intent

00:03:24.740 --> 00:03:27.520
immediately dictates which specialized agent

00:03:27.520 --> 00:03:29.800
takes over. The return agent, the retention agent,

00:03:29.919 --> 00:03:32.280
whatever it is. But here's maybe a question.

00:03:32.620 --> 00:03:34.979
Doesn't using multiple specialized agents potentially

00:03:34.979 --> 00:03:38.439
add latency or cost compared to just one big,

00:03:38.719 --> 00:03:40.860
smarter model? Ah, that's the interesting trade-off,

00:03:40.860 --> 00:03:43.000
isn't it? By using these specialized agents,

00:03:43.360 --> 00:03:45.340
you can actually assign different types of AI

00:03:45.340 --> 00:03:47.460
models to different parts of the job. A simple

00:03:47.460 --> 00:03:49.379
sorting task doesn't need the most powerful,

00:03:49.400 --> 00:03:52.539
most expensive AI. Precisely. You use the big

00:03:52.539 --> 00:03:55.819
guns, maybe like the latest GPT model, for the

00:03:55.819 --> 00:03:58.960
heavy lifting: complex reasoning, writing nuanced

00:03:58.960 --> 00:04:02.139
text. But for the simple stuff like basic classification

00:04:02.139 --> 00:04:05.199
or sorting, you program it to use the faster,

00:04:05.379 --> 00:04:08.379
much cheaper models like maybe GPT-4.1 mini

00:04:08.379 --> 00:04:10.879
or something similar. You trade a tiny bit of

00:04:10.879 --> 00:04:13.400
setup complexity for potentially huge savings

00:04:13.400 --> 00:04:17.259
on running costs. So the agent essentially knows

00:04:17.259 --> 00:04:21.019
whether to use the cheap fast AI or the more

00:04:21.019 --> 00:04:23.399
powerful one based on the task it gets handed from

00:04:23.399 --> 00:04:26.620
that branching logic? That dictates the model choice,

00:04:26.939 --> 00:04:29.000
optimizing for cost. OK, let's simplify for a

00:04:29.000 --> 00:04:30.319
second and actually build something or at least

00:04:30.319 --> 00:04:32.839
walk through it. That multipath AI learning helper

00:04:32.839 --> 00:04:34.600
agent you mentioned, the one that sorts user

00:04:34.600 --> 00:04:36.939
questions. Right. So step one, start trigger.

00:04:37.180 --> 00:04:39.279
We set that to text input because, well, the

00:04:39.279 --> 00:04:41.240
user's typing a question. Simple enough. Step

00:04:41.240 --> 00:04:43.699
two is the sorting agent, the classifier. Its

00:04:43.699 --> 00:04:45.920
job is to read that question and slot it into one

00:04:45.920 --> 00:04:49.120
of maybe four buckets: AI news, AI tool info,

00:04:49.439 --> 00:04:52.279
AI basics, or AI business ideas. And this brings

00:04:52.279 --> 00:04:54.990
us to a really crucial technical point: structured

00:04:54.990 --> 00:04:57.529
output. You have to instruct this agent very

00:04:57.529 --> 00:05:00.310
specifically to output its decision in JSON format.

00:05:00.509 --> 00:05:03.430
It's kind of mandatory for reliable flow. Why

00:05:03.430 --> 00:05:06.550
JSON though? Why is that specific format so important?

00:05:06.850 --> 00:05:09.649
Can't the agent just, you know, output text saying

00:05:09.649 --> 00:05:13.560
"category: AI news"? Because the next block in the

00:05:13.560 --> 00:05:15.399
chain, the conditional logic block, it needs

00:05:15.399 --> 00:05:18.519
something unambiguous. It can't easily or reliably

00:05:18.519 --> 00:05:21.920
parse natural language, which can be fuzzy. JSON,

00:05:21.920 --> 00:05:25.199
like {"category": "AI news"}, forces a clean, machine-readable

00:05:25.199 --> 00:05:27.980
structure. It removes the AI's creativity

00:05:27.980 --> 00:05:30.420
for that specific step, making the output totally

00:05:30.420 --> 00:05:32.560
predictable, which is essential for the workflow

00:05:32.560 --> 00:05:34.660
stability. OK, that makes sense. It guarantees

00:05:34.660 --> 00:05:37.120
the next step gets exactly what it expects, which

00:05:37.120 --> 00:05:39.259
leads right into step three, conditional logic.

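NOTE
A minimal sketch (not from the episode) of the routing step just described, in Python. It assumes the classifier returns JSON shaped like {"category": "AI news"}; the four agent functions and the route() helper are hypothetical names for illustration, not part of the builder itself. Only the four category labels come from the discussion.
```python
import json
# Hypothetical stand-ins for the four specialized agents named in the episode.
def news_agent(question):
    return f"[AI news] {question}"
def tool_agent(question):
    return f"[AI tool info] {question}"
def basics_agent(question):
    return f"[AI basics] {question}"
def business_agent(question):
    return f"[AI business ideas] {question}"
# The conditional block: one exact category label per path.
ROUTES = {
    "AI news": news_agent,
    "AI tool info": tool_agent,
    "AI basics": basics_agent,
    "AI business ideas": business_agent,
}
def route(classifier_output, question):
    """Parse the classifier's JSON and pass the question to the matching agent."""
    try:
        category = json.loads(classifier_output)["category"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Free text such as "category AI news" ends up here instead of being guessed at.
        return "unroutable: output was not the expected JSON"
    handler = ROUTES.get(category) if isinstance(category, str) else None
    if handler is None:
        return f"unroutable: unknown category {category!r}"
    return handler(question)
print(route('{"category": "AI basics"}', "What is a token?"))
```
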
00:05:39.399 --> 00:05:41.500
This is the traffic cop, basically, taking that

00:05:41.500 --> 00:05:44.579
precise JSON output and sending the request down

00:05:44.579 --> 00:05:47.439
the right path. Exactly. And each path leads

00:05:47.439 --> 00:05:50.040
to its own dedicated, super specialized agent

00:05:50.040 --> 00:05:53.160
with really specific instructions, like the AI

00:05:53.160 --> 00:05:55.740
news agent. It must use web search. It must find

00:05:55.740 --> 00:05:57.759
the five most important stories from the last

00:05:57.759 --> 00:05:59.879
week. And it must give you the headline, a short

00:05:59.879 --> 00:06:03.699
summary, and the source link. Very precise. Constraints

00:06:03.699 --> 00:06:06.079
are definitely key there. And the AI tool agent,

00:06:06.540 --> 00:06:08.819
it has to use web search to explain the tool,

00:06:08.839 --> 00:06:11.860
but also give three real concrete examples of

00:06:11.860 --> 00:06:13.660
how you'd use it to actually create something,

00:06:13.959 --> 00:06:16.279
not just theory. Then you've got the AI basics

00:06:16.279 --> 00:06:19.639
agent. The instruction here is cool. Explain

00:06:19.639 --> 00:06:22.139
things like a patient eighth grade teacher. Use

00:06:22.139 --> 00:06:25.199
everyday words, zero jargon, and this one doesn't

00:06:25.199 --> 00:06:27.560
even need web search access. And finally, the

00:06:27.560 --> 00:06:30.279
AI business ideas agent. Find three successful

00:06:30.279 --> 00:06:33.300
business ideas using AI right now. Explain how

00:06:33.300 --> 00:06:35.680
AI is used and give a real company example for

00:06:35.680 --> 00:06:38.019
each. Whoa. I mean, just imagine the precision

00:06:38.019 --> 00:06:40.459
needed for that one conditional block to reliably

00:06:40.459 --> 00:06:42.439
route, I don't know, maybe billions of queries

00:06:42.439 --> 00:06:44.740
over time purely based on those instructions

00:06:44.740 --> 00:06:47.540
feeding it clean JSON. That's real operational

00:06:47.540 --> 00:06:49.860
scale right there. So going back to the instructions

00:06:49.860 --> 00:06:52.500
for those final agents, why do they need to be

00:06:52.500 --> 00:06:54.860
so specific about the output format and even

00:06:54.860 --> 00:06:56.720
the style, like the eighth grade teacher part?

00:06:56.959 --> 00:06:59.470
Yeah, that specificity stops you from getting

00:06:59.470 --> 00:07:03.209
vague, unhelpful answers. It ensures consistency,

00:07:03.430 --> 00:07:05.389
which is really important for the person using

00:07:05.389 --> 00:07:08.550
it, you know, the end-user experience.

00:07:08.550 --> 00:07:10.470
Okay, let's

00:07:10.470 --> 00:07:12.629
power this up. Building the internal logic is

00:07:12.629 --> 00:07:15.269
one thing, but making these agents truly useful

00:07:15.269 --> 00:07:17.709
often means connecting them to the outside world.

00:07:17.879 --> 00:07:19.959
Right? Absolutely. And that's where MCP comes

00:07:19.959 --> 00:07:22.879
in, Model Context Protocol. Think of it simply

00:07:22.879 --> 00:07:25.199
as the secure handshake that lets these agents

00:07:25.199 --> 00:07:28.480
talk to external apps and services, stuff outside

00:07:28.480 --> 00:07:31.139
the OpenAI environment. So the agent is smart,

00:07:31.720 --> 00:07:33.319
but it's kind of stuck in its box until you use

00:07:33.319 --> 00:07:36.680
MCP. How does that work securely? MCP is the

00:07:36.680 --> 00:07:39.220
bridge. It enables connections to things like

00:07:39.220 --> 00:07:41.879
Gmail so it can read or send emails, Google Calendar

00:07:41.879 --> 00:07:44.959
for scheduling, e-commerce platforms like Shopify,

00:07:45.579 --> 00:07:48.220
payment processors like Stripe, and maybe most

00:07:48.220 --> 00:07:50.920
powerfully, Zapier, which connects to literally

00:07:50.920 --> 00:07:54.339
thousands of other apps. Wow, okay. That unlocks

00:07:54.339 --> 00:07:56.680
some seriously advanced possibilities then. Totally,

00:07:56.720 --> 00:07:59.500
like that email replying agent example. It could

00:07:59.500 --> 00:08:02.160
read incoming emails, use its logic to sort them,

00:08:02.160 --> 00:08:04.860
maybe urgent, needs review, just a notification,

00:08:05.439 --> 00:08:07.920
then draft detailed replies based on rules you

00:08:07.920 --> 00:08:10.259
set, and either send them automatically or just

00:08:10.259 --> 00:08:12.959
save them as drafts for a human to quickly check.

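NOTE
A minimal sketch (not from the episode) of the email flow just described: sort, draft, then either send or save as a draft. Everything here is an assumption for illustration: the Email class, the keyword rules in triage(), the canned reply text, and the choice to always hold urgent mail for human review. A real agent would classify with a model and reach the mailbox through a connector rather than return tuples.
```python
from dataclasses import dataclass
@dataclass
class Email:
    sender: str
    subject: str
    body: str
def triage(email):
    """Sort an email into urgent, notification, or needs_review (hypothetical rules)."""
    text = f"{email.subject} {email.body}".lower()
    if any(word in text for word in ("urgent", "asap", "outage")):
        return "urgent"
    if email.sender.startswith("no-reply@"):
        return "notification"
    return "needs_review"
def handle(email, auto_send=False):
    """Return (action, draft), where action is skip, send, or save_draft."""
    label = triage(email)
    if label == "notification":
        # Nothing to answer, so no draft is written.
        return ("skip", "")
    draft = f"Hi, thanks for your message about '{email.subject}'. We are looking into it."
    # Assumption: urgent mail is never sent automatically; a human checks the draft first.
    action = "send" if auto_send and label != "urgent" else "save_draft"
    return (action, draft)
print(handle(Email("pat@example.com", "Order question", "Where is my parcel?")))
```
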
00:08:13.040 --> 00:08:15.759
Okay, but hold on. If we're giving an automated

00:08:15.759 --> 00:08:19.019
agent access to read and write in my Gmail or

00:08:19.019 --> 00:08:23.620
see my Shopify orders, what about security? Governance?

00:08:23.759 --> 00:08:26.199
That seems like a big step. That's a super critical

00:08:26.199 --> 00:08:28.180
question. And the system is built with that in

00:08:28.180 --> 00:08:32.019
mind. It requires very specific, scoped permissions

00:08:32.019 --> 00:08:33.820
for every connection you set up. You have to

00:08:33.820 --> 00:08:36.139
think about that governance layer up front, setting

00:08:36.139 --> 00:08:38.299
rules that stop the agent from doing things it

00:08:38.299 --> 00:08:39.720
shouldn't. Like you wouldn't want your customer

00:08:39.720 --> 00:08:42.120
return agent to suddenly be able to read HR emails,

00:08:42.159 --> 00:08:44.299
right? So you define those permissions carefully

00:08:44.299 --> 00:08:46.360
before you let it run wild. Let's circle back

00:08:46.360 --> 00:08:48.600
to instruction clarity because honestly that

00:08:48.600 --> 00:08:50.720
feels like the hardest part for me too. I still

00:08:50.720 --> 00:08:53.559
wrestle with prompt drift myself sometimes. Getting

00:08:53.559 --> 00:08:57.080
instructions concise but also completely unambiguous,

00:08:57.320 --> 00:08:59.559
it's way harder than it sounds. Oh, I agree.

00:08:59.740 --> 00:09:01.720
It's a common struggle. That vulnerability is

00:09:01.720 --> 00:09:04.000
real. It's the difference between a bad instruction

00:09:04.000 --> 00:09:06.960
like "help with emails," which is useless, versus

00:09:06.960 --> 00:09:10.159
a good one: "Read incoming customer support emails,

00:09:10.480 --> 00:09:13.100
identify the core problem described, write a

00:09:13.100 --> 00:09:15.340
friendly and empathetic reply under 100 words,

00:09:15.519 --> 00:09:18.740
and then update the CRM record to resolved." That

00:09:18.740 --> 00:09:21.320
specificity is the design work. And you mentioned

00:09:21.320 --> 00:09:23.620
model choice earlier. That's critical, too, for

00:09:23.620 --> 00:09:25.620
performance but also cost, right? Absolutely

00:09:25.620 --> 00:09:28.440
crucial. Use your top-tier model, maybe GPT-5

00:09:28.440 --> 00:09:30.799
when available, for the really tough reasoning

00:09:30.799 --> 00:09:33.720
or generating complex creative text. But for

00:09:33.720 --> 00:09:36.600
simple sorting, classification, data extraction,

00:09:37.340 --> 00:09:39.820
stick to the faster, much cheaper models like

00:09:39.820 --> 00:09:42.879
GPT-4.1 mini. If you build an agent that's

00:09:42.879 --> 00:09:45.299
inefficient, maybe it loops unnecessarily or

00:09:45.299 --> 00:09:47.200
defaults to the expensive model for everything,

00:09:47.580 --> 00:09:49.720
you'll see those costs spike really fast. So

00:09:49.720 --> 00:09:52.320
if things do go wrong, the agent breaks, gives

00:09:52.320 --> 00:09:55.960
weird answers, or it's just slow, what are the

00:09:55.960 --> 00:09:58.279
quick troubleshooting checks? First place I'd

00:09:58.279 --> 00:10:00.799
look is often the conditional logic, the if-then

00:10:00.799 --> 00:10:04.200
stuff. If it's giving wrong answers, reread those

00:10:04.200 --> 00:10:07.000
rules carefully. Maybe the intent classification

00:10:07.000 --> 00:10:09.340
is slightly off. If it's just not answering,

00:10:09.559 --> 00:10:11.519
check your block connections are solid and make

00:10:11.519 --> 00:10:14.580
sure it's actually published. If it's slow, try

00:10:14.580 --> 00:10:17.360
reducing how much it relies on web search or

00:10:17.360 --> 00:10:19.919
swap in a faster model for simpler steps. If

00:10:19.919 --> 00:10:22.159
someone's just starting out building their very

00:10:22.159 --> 00:10:24.519
first simple agent, what's the single most important

00:10:24.519 --> 00:10:26.399
thing they should do right after building it?

00:10:26.559 --> 00:10:29.419
Test it with weird inputs. Seriously. See how

00:10:29.419 --> 00:10:32.080
it handles unexpected questions, confusing language,

00:10:32.340 --> 00:10:35.080
or maybe missing information. Always anticipate

00:10:35.080 --> 00:10:37.399
how users might break it. Okay, let's bring this

00:10:37.399 --> 00:10:39.559
all together. The really big idea here seems

00:10:39.559 --> 00:10:43.230
to be this: democratization. The power to build

00:10:43.230 --> 00:10:45.570
sophisticated tools isn't just for coders anymore.

00:10:46.129 --> 00:10:48.330
It's accessible to, you know, the business designer,

00:10:48.409 --> 00:10:51.389
the process owner. Exactly. And the practical

00:10:51.389 --> 00:10:55.070
uses are just huge. Personal research assistants,

00:10:55.669 --> 00:10:58.529
automated customer support flows, content strategy

00:10:58.529 --> 00:11:01.309
helpers, even sales assistants that could help

00:11:01.309 --> 00:11:04.409
qualify leads or schedule demos. And that final

00:11:04.409 --> 00:11:06.389
piece of advice from the material really resonates.

00:11:06.970 --> 00:11:09.779
Start simple. Pick one clear, well-defined job

00:11:09.779 --> 00:11:12.340
for your agent. Test it like crazy. Then and

00:11:12.340 --> 00:11:14.500
only then, start adding more features or complexity.

00:11:14.820 --> 00:11:16.399
Yeah, so the challenge to you, the listener,

00:11:16.720 --> 00:11:20.259
is maybe. Open up the builder, find that one

00:11:20.259 --> 00:11:22.480
repetitive, maybe kind of boring task you do

00:11:22.480 --> 00:11:25.720
often, and try automating just that. That's where

00:11:25.720 --> 00:11:27.580
you'll likely see the quickest win. Which leaves

00:11:27.580 --> 00:11:29.299
us with a final thought, something to maybe chew

00:11:29.299 --> 00:11:31.419
on. Perhaps the biggest value of this agent builder

00:11:31.419 --> 00:11:34.000
isn't just that it makes automation easier, but

00:11:34.000 --> 00:11:36.460
that it forces us, maybe for the first time for

00:11:36.460 --> 00:11:38.379
some processes, to really define and understand

00:11:38.379 --> 00:11:40.320
the tasks we should be automating in the first

00:11:40.320 --> 00:11:40.600
place.