WEBVTT

00:00:00.000 --> 00:00:02.339
So for the past few years, we've mostly interacted

00:00:02.339 --> 00:00:06.360
with AI through, well, a text box, right? Typing

00:00:06.360 --> 00:00:08.460
things in, getting text back. But I think it's

00:00:08.460 --> 00:00:11.880
time to stop typing because the really big shift

00:00:11.880 --> 00:00:15.640
happening like right now is AI moving, moving

00:00:15.640 --> 00:00:19.059
its brain out into the real world, having actual

00:00:19.059 --> 00:00:22.359
natural phone conversations in real time. Exactly.

00:00:22.460 --> 00:00:23.920
And that's what this deep dive is all about.

00:00:24.000 --> 00:00:25.899
There's this huge automation opportunity opening

00:00:25.899 --> 00:00:27.980
up. We're talking about how you can build functional,

00:00:29.379 --> 00:00:32.100
autonomous AI assistants, like starting today,

00:00:32.259 --> 00:00:34.799
really capitalize on this market that, well,

00:00:34.859 --> 00:00:37.579
we think is going to explode by 2026. Yeah, that's

00:00:37.579 --> 00:00:39.539
the mission here. We want to take you from maybe

00:00:39.539 --> 00:00:42.119
knowing nothing about this to really understanding

00:00:42.119 --> 00:00:45.840
the infrastructure of a voice agent, a functional

00:00:45.840 --> 00:00:47.979
one, built with no-code tools. We'll kind of

00:00:47.979 --> 00:00:49.960
unpack its technical anatomy, look at the tools

00:00:49.960 --> 00:00:52.280
you need, specifically Vapi and n8n, and then

00:00:52.280 --> 00:00:54.359
walk through the actual steps to build your first

00:00:54.359 --> 00:00:56.479
one. It's a massive skill gap out there, honestly.

00:00:56.640 --> 00:00:58.939
Huge demand, very lucrative. So yeah, let's start

00:00:58.939 --> 00:01:00.399
with the basics, the fundamental architecture.

00:01:00.799 --> 00:01:03.979
Okay, so the first thing is, how do you actually

00:01:03.979 --> 00:01:07.200
give a large language model, you know, that

00:01:07.200 --> 00:01:09.780
smart text brain we've all used, how do you give

00:01:09.780 --> 00:01:11.760
it a mouth and ears? Well, the basic definition

00:01:11.760 --> 00:01:13.920
is pretty simple. A voice agent is essentially

00:01:13.920 --> 00:01:17.379
a virtual assistant, one that can hold, like, complex

00:01:17.379 --> 00:01:19.439
natural-sounding conversations over the phone

00:01:19.439 --> 00:01:22.239
or even the web. Think of it like a chatbot on

00:01:22.239 --> 00:01:25.099
the inside, but the interaction layer, that's all

00:01:25.099 --> 00:01:27.540
speech. So this isn't the same as when you call

00:01:27.540 --> 00:01:31.099
a company and get that clunky press one for sales,

00:01:31.260 --> 00:01:33.319
press two for service kind of thing. The thing

00:01:33.319 --> 00:01:35.260
everyone hates. Oh, absolutely not. That's the

00:01:35.260 --> 00:01:37.920
key difference. These agents, they can make and

00:01:37.920 --> 00:01:41.000
receive calls 24/7. They can maintain a genuinely

00:01:41.000 --> 00:01:44.359
human-like conversational flow. And this is

00:01:44.359 --> 00:01:46.379
critical. They integrate with your business systems,

00:01:46.480 --> 00:01:48.560
your CRM, your calendar, whatever. Okay. And

00:01:48.560 --> 00:01:51.180
the way it works, it relies on three core pieces

00:01:51.180 --> 00:01:53.299
working together, right? You called it the LLM

00:01:53.299 --> 00:01:55.000
with ears and a mouth. That's pretty much it.

00:01:55.099 --> 00:01:58.709
The ear is step one, speech-to-text, or STT. That

00:01:58.709 --> 00:02:00.769
takes what the person says, their spoken words,

00:02:00.930 --> 00:02:03.209
and turns it into written text the AI can understand.

00:02:03.829 --> 00:02:06.829
And that text then feeds into the brain, the

00:02:06.829 --> 00:02:09.009
LLM. That's where the AI figures out the right

00:02:09.009 --> 00:02:10.990
response, something conversational and nuanced.

00:02:11.370 --> 00:02:14.330
Yep. And then finally, the mouth. That's text-to-speech,

00:02:14.330 --> 00:02:18.080
TTS. It converts the AI's text answer

00:02:18.080 --> 00:02:21.219
back into really natural-sounding spoken words

00:02:21.219 --> 00:02:23.319
for the person to hear. What's kind of interesting

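NOTE
Editor's sketch, not from the speakers: the ear, brain, and mouth steps described above can be pictured as one function per stage. All three functions are hypothetical stand-ins (no real STT, LLM, or TTS provider is called), and the "audio" is plain UTF-8 bytes so the example runs offline.
```python
def speech_to_text(audio: bytes) -> str:
    # Assumption: the "audio" is UTF-8 text standing in for recorded speech.
    return audio.decode("utf-8")
def llm_reply(transcript: str) -> str:
    # Placeholder "brain": a real system would send the transcript to an LLM.
    return f"You said: {transcript}"
def text_to_speech(text: str) -> bytes:
    # Placeholder "mouth": returns bytes standing in for synthesized audio.
    return text.encode("utf-8")
def handle_turn(audio_in: bytes) -> bytes:
    # One conversational turn: ears, then brain, then mouth.
    transcript = speech_to_text(audio_in)
    reply = llm_reply(transcript)
    return text_to_speech(reply)
```
Calling handle_turn once per caller utterance gives the basic loop; the three stages are kept separate so each can be swapped for a different provider.
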
00:02:23.319 --> 00:02:25.719
here is that the skills you might already have

00:02:25.719 --> 00:02:29.020
from working with text-based LLMs, like prompting,

00:02:29.020 --> 00:02:31.639
figuring out the logic, those still totally apply.

00:02:31.860 --> 00:02:33.500
It's like you already get the brain part. Exactly.

00:02:33.719 --> 00:02:35.759
Now we just need to layer on the logistics and

00:02:35.759 --> 00:02:39.060
crucially deal with this new factor, latency.

00:02:39.439 --> 00:02:41.819
Ah, right. Speed. Yeah. If you're building a

00:02:41.819 --> 00:02:43.979
chatbot, you know, a delay of a second or two

00:02:43.979 --> 00:02:47.419
might be OK. But with voice, even half a second

00:02:47.419 --> 00:02:49.960
delay feels weird. Like the agent's buffering

00:02:49.960 --> 00:02:52.800
or it's confused. Latency just destroys user

00:02:52.800 --> 00:02:55.479
trust in a real-time conversation. That's honestly

00:02:55.479 --> 00:02:57.379
why voice agents are fundamentally harder to

00:02:57.379 --> 00:02:59.199
get right than chatbots. That makes total sense.

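NOTE
Editor's sketch, not from the speakers: one way to reason about the half-second point made above is a simple latency budget. The per-stage numbers are invented for illustration, not measurements, and the 500 ms budget just encodes the "even half a second feels weird" remark.
```python
# Assumed per-stage latencies in milliseconds; illustrative values only.
STAGE_LATENCY_MS = {"stt": 150, "llm": 250, "tts": 120}
# Half a second, taken from the remark in the conversation.
BUDGET_MS = 500
def total_latency_ms(stages: dict[str, int]) -> int:
    # The stages run one after another, so their delays add up.
    return sum(stages.values())
def over_budget(stages: dict[str, int], budget_ms: int = BUDGET_MS) -> bool:
    return total_latency_ms(stages) > budget_ms
```
With these made-up numbers the total is 520 ms, so over_budget returns True even though no single stage looks slow on its own.
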
00:02:59.280 --> 00:03:01.539
Voice really demands that immediate back-and-forth

00:03:01.539 --> 00:03:04.020
feeling. Yeah, speed is completely non-negotiable

00:03:04.020 --> 00:03:05.900
if you want people to actually

00:03:06.139 --> 00:03:08.620
accept it and use it. Okay. So let's dig a bit

00:03:08.620 --> 00:03:11.460
deeper then into the structure, the anatomy of

00:03:11.460 --> 00:03:13.639
these agents. You're saying there are four essential

00:03:13.639 --> 00:03:16.330
parts we need to understand. Right. Four key

00:03:16.330 --> 00:03:19.229
pieces. Part one is the LLM, the brain, that's

00:03:19.229 --> 00:03:21.650
the intelligence engine. And choosing the right

00:03:21.650 --> 00:03:24.870
model, maybe GPT-4, maybe Claude, maybe something

00:03:24.870 --> 00:03:27.650
else optimized for speed, that choice is critical.

00:03:27.750 --> 00:03:29.310
You're always doing this balancing act between

00:03:29.310 --> 00:03:32.509
how smart it is, how sophisticated, versus that

00:03:32.509 --> 00:03:34.990
latency, the speed, and of course how much the

00:03:34.990 --> 00:03:37.680
call costs to run. And part two, you said this

00:03:37.680 --> 00:03:39.719
is maybe the most important part, the system

00:03:39.719 --> 00:03:42.099
prompt, the playbook. Definitely the most important,

00:03:42.199 --> 00:03:44.080
yeah. This is basically the agent's instruction

00:03:44.080 --> 00:03:46.360
manual. It defines everything. Its role, like

00:03:46.360 --> 00:03:48.780
you are a warm, patient customer support rep.

00:03:48.919 --> 00:03:51.979
Its personality, its style, and maybe most importantly,

00:03:52.099 --> 00:03:54.300
its rules and boundaries. This is where you explicitly

00:03:54.300 --> 00:03:57.360
state things like, under no circumstances give

00:03:57.360 --> 00:04:00.020
financial advice. You know, I have to admit something

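NOTE
Editor's sketch, not from the speakers: the role, style, and rules described above can be assembled into one system prompt string. The helper name and the "Role / Style / Rules" layout are assumptions for illustration, not a required format on any platform.
```python
def build_system_prompt(role: str, style: str, rules: list[str]) -> str:
    # Hypothetical helper: one labeled line per part, rules as a bullet list.
    lines = [f"Role: {role}", f"Style: {style}", "Rules:"]
    lines += [f"- {rule}" for rule in rules]
    return "\n".join(lines)
prompt = build_system_prompt(
    role="You are a warm, patient customer support rep.",
    style="Friendly and concise.",
    rules=["Under no circumstances give financial advice."],
)
```
Keeping the rules as a separate list makes it easier to add or remove one boundary at a time while testing.
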
00:04:00.020 --> 00:04:02.080
here. Even with all the time I've spent working

00:04:02.080 --> 00:04:05.469
with LLMs, getting that prompt perfect on the

00:04:05.469 --> 00:04:08.110
first try, it's basically impossible. I still

00:04:08.110 --> 00:04:10.969
wrestle with prompt drift myself. Real conversation

00:04:10.969 --> 00:04:14.669
is just messy. You have to constantly test, tweak,

00:04:14.750 --> 00:04:16.829
iterate. That's honestly the hardest part of

00:04:16.829 --> 00:04:18.930
making these things work well in the real world.

00:04:19.149 --> 00:04:20.769
No, I appreciate you saying that. It's absolutely

00:04:20.769 --> 00:04:22.509
the truth when you actually deploy these things.

00:04:22.870 --> 00:04:25.410
Okay, so part three is the voice, the persona.

00:04:25.629 --> 00:04:28.350
This really impacts how users perceive the agent,

00:04:28.449 --> 00:04:30.980
whether they trust it. You can choose gender,

00:04:31.160 --> 00:04:35.560
age, accent from providers like ElevenLabs. It brings

00:04:35.560 --> 00:04:38.060
the persona to life. And the last piece, number

00:04:38.060 --> 00:04:41.000
four, is the tools, the superpowers. This is

00:04:41.000 --> 00:04:43.199
the action layer, right? What lets it do more

00:04:43.199 --> 00:04:45.459
than just talk? Exactly. This is where it gets

00:04:45.459 --> 00:04:46.980
really powerful. We're talking about the agent

00:04:46.980 --> 00:04:49.220
being able to, say, view internal databases,

00:04:49.579 --> 00:04:52.160
create appointments directly in a calendar, process

00:04:52.160 --> 00:04:54.899
payments, or trigger literally any kind of automation

00:04:54.899 --> 00:04:58.009
through some connected service. So let's take

00:04:58.009 --> 00:05:01.149
that dentist office agent example. The brain

00:05:01.149 --> 00:05:04.610
might be a fast LLM like GPT-4. The playbook

00:05:04.610 --> 00:05:06.910
is super clear. You only schedule appointments,

00:05:07.269 --> 00:05:11.069
nothing else. The persona is maybe friendly, reassuring.

00:05:11.110 --> 00:05:13.889
And the superpowers? The superpowers would be

00:05:13.889 --> 00:05:16.470
check the dentist's calendar availability, book

00:05:16.470 --> 00:05:18.810
the appointment slot, and then maybe send a confirmation

00:05:18.810 --> 00:05:22.019
email or text, all done autonomously, instantly,

00:05:22.120 --> 00:05:24.279
during the call. And that system prompt, the

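NOTE
Editor's sketch, not from the speakers: the dentist-office "superpowers" above can be pictured as named tools the brain is allowed to call. The in-memory calendar, the tool names, and the call_tool dispatcher are all hypothetical; a real agent would reach an actual calendar service.
```python
# Hypothetical in-memory calendar with one slot already taken.
BOOKED: set[str] = {"2025-01-10T09:00"}
def check_availability(slot: str) -> bool:
    return slot not in BOOKED
def book_appointment(slot: str, patient: str) -> str:
    if not check_availability(slot):
        return f"Sorry, {slot} is taken."
    BOOKED.add(slot)
    return f"Booked {patient} at {slot}."
# Tool registry: the LLM would choose a tool name and its arguments.
TOOLS = {"check_availability": check_availability, "book_appointment": book_appointment}
def call_tool(name: str, **kwargs):
    # Refuse anything outside the registry, mirroring "only schedule appointments".
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**kwargs)
```
Limiting the agent to an explicit registry is one way to back up the playbook's boundaries in code rather than relying on the prompt alone.
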
00:05:24.279 --> 00:05:26.060
playbook, is kind of like the agent's constitution.

00:05:26.379 --> 00:05:28.600
It sets all the rules. Exactly. It dictates everything

00:05:28.600 --> 00:05:30.839
it should and shouldn't do. Now, to build something

00:05:30.839 --> 00:05:33.139
professional like that, especially using no-code

00:05:33.139 --> 00:05:36.660
tools, we need specific platforms. The sources

00:05:36.660 --> 00:05:39.180
you looked at point towards a combination, Vapi

00:05:39.180 --> 00:05:41.939
and n8n. Yeah, Vapi is the specialized platform

00:05:41.939 --> 00:05:44.139
for the voice piece. It's designed for this.

00:05:44.199 --> 00:05:46.360
It handles all the complex voice logistics, the

00:05:46.360 --> 00:05:49.560
phone calls, the real-time STT and TTS, managing

00:05:49.610 --> 00:05:51.949
that whole conversational interface. Think of

00:05:51.949 --> 00:05:53.850
it as the front end for the conversation. Okay.

00:05:53.949 --> 00:05:57.629
Vapi for voice. And then n8n. n8n is this really

00:05:57.629 --> 00:06:01.310
robust visual automation platform. It's for building

00:06:01.310 --> 00:06:03.730
out all the backend logic, the workflows. And

00:06:03.730 --> 00:06:06.029
the key thing is it connects to pretty much any

00:06:06.029 --> 00:06:08.550
service or API you can think of. It's the integration

00:06:08.550 --> 00:06:11.730
engine. So why put them together? What's the

00:06:11.730 --> 00:06:14.589
magic in combining Vapi and n8n? Well, it's

00:06:14.589 --> 00:06:17.069
like peanut butter and jelly, really. Vapi handles

00:06:17.069 --> 00:06:19.529
the conversation, the talking and listening,

00:06:19.709 --> 00:06:22.930
the mouth and ears. But n8n gives it unlimited

00:06:22.930 --> 00:06:25.129
tools, the ability to actually do things. Ah,

00:06:25.290 --> 00:06:28.470
okay. So Vapi is like the interface managing

00:06:28.470 --> 00:06:31.410
the user experience of the call. And n8n is

00:06:31.600 --> 00:06:33.579
everything happening behind the scenes, the business

00:06:33.579 --> 00:06:36.079
logic connecting to other systems. That's a really

00:06:36.079 --> 00:06:38.540
good way to put it. Vapi chats and n8n acts.

00:06:38.600 --> 00:06:40.379
So the agent goes from just answering a question

00:06:40.379 --> 00:06:43.060
like, what's your return policy, to actually taking

00:06:43.060 --> 00:06:45.500
action. Maybe checking inventory in one system,

00:06:45.639 --> 00:06:48.100
logging the call details in Salesforce, and sending

00:06:48.100 --> 00:06:50.339
a follow-up text message, all within that single

00:06:50.339 --> 00:06:53.199
phone call. Got it. Vapi for the natural dialogue

00:06:53.199 --> 00:06:55.899
and n8n to let the agent interact with the messy

00:06:55.899 --> 00:06:58.709
real-world stuff. Precisely. That's the power

00:06:58.709 --> 00:07:01.290
combo.

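NOTE
Editor's sketch, not from the speakers: the hand-off described above, where the voice side asks the automation side to do something, is commonly done with an HTTP POST to a webhook. The URL is a placeholder and the JSON payload shape is an assumption for illustration, not a documented Vapi or n8n schema. The request is built but never sent, so the example runs offline.
```python
import json
import urllib.request
# Placeholder address: a real workflow's webhook would have its own URL.
WEBHOOK_URL = "https://example.com/webhook/check-inventory"
def build_tool_request(tool: str, arguments: dict) -> urllib.request.Request:
    # Hypothetical payload: a tool name plus its arguments, encoded as JSON.
    body = json.dumps({"tool": tool, "arguments": arguments}).encode("utf-8")
    return urllib.request.Request(
        WEBHOOK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
req = build_tool_request("check_inventory", {"sku": "A-100"})
# urllib.request.urlopen(req) would send it; left out on purpose here.
```
The workflow on the receiving end would do the actual work (inventory lookup, CRM logging, follow-up text) and return a result for the agent to speak.
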
00:07:01.850 --> 00:07:04.189
Okay, so we've covered the how. Now let's look

00:07:04.189 --> 00:07:06.449
at the two main ways these agents typically operate.

00:07:06.629 --> 00:07:08.970
You've basically got inbound and outbound. Right.

00:07:09.069 --> 00:07:11.930
Inbound agents, that's like your virtual receptionist,

00:07:11.949 --> 00:07:15.209
available 24/7. This is where someone calls

00:07:15.209 --> 00:07:17.589
the agent's dedicated number and the agent picks

00:07:17.589 --> 00:07:20.209
up. Common uses are, you know, round-the-clock

00:07:20.209 --> 00:07:23.579
customer support, answering complex FAQs, maybe

00:07:23.579 --> 00:07:26.279
checking on an order status. The key is the customer

00:07:26.279 --> 00:07:28.639
initiates the call. Then you flip it and you

00:07:28.639 --> 00:07:30.980
have outbound agents. Think of these as your

00:07:30.980 --> 00:07:33.379
autonomous assistant. In this case, the agent

00:07:33.379 --> 00:07:36.540
proactively calls people for you. This area has

00:07:36.540 --> 00:07:39.490
huge ROI potential. Things like sales follow-ups,

00:07:39.490 --> 00:07:42.769
payment reminders or collections, and a

00:07:42.769 --> 00:07:44.430
really big one, appointment reminders. Yeah.

00:07:44.490 --> 00:07:46.850
Those drastically cut down on costly no-shows.

00:07:46.949 --> 00:07:49.329
Here, the agent initiates. And there's that hybrid

00:07:49.329 --> 00:07:52.089
model too, right? The web widget agent. Yeah,

00:07:52.149 --> 00:07:53.490
that's a neat one. It's like a little chat bubble

00:07:53.490 --> 00:07:55.850
on your website, but instead of typing, the user

00:07:55.850 --> 00:07:58.449
clicks a button and it starts an instant voice

00:07:58.449 --> 00:08:00.949
call right through their browser. Super convenient.

00:08:01.670 --> 00:08:04.490
Bypasses needing a phone number entirely. So

00:08:04.490 --> 00:08:06.310
if a small business is looking for the quickest

00:08:06.310 --> 00:08:09.750
win, the clearest ROI, which type usually delivers

00:08:09.750 --> 00:08:13.230
that? Generally, inbound support for off hours

00:08:13.230 --> 00:08:16.250
or overflow and outbound appointment reminders.

00:08:16.730 --> 00:08:19.470
Those tend to show immediate measurable financial

00:08:19.470 --> 00:08:21.889
value pretty quickly. We've laid out that the

00:08:21.889 --> 00:08:24.889
tech is here. It's functional. But maybe let's

00:08:24.889 --> 00:08:26.970
zoom out for a second. Why should someone listening

00:08:26.970 --> 00:08:29.949
right now really focus on learning about voice

00:08:29.949 --> 00:08:33.200
agents now? Because this whole space, voice AI

00:08:33.200 --> 00:08:35.700
agents, it feels like it's right at the tipping

00:08:35.700 --> 00:08:38.279
point for massive mainstream adoption by businesses.

00:08:38.580 --> 00:08:41.080
Probably around 2026 is when we'll see it everywhere.

00:08:41.159 --> 00:08:43.879
It's not some far-off future thing. The technical

00:08:43.879 --> 00:08:46.039
hurdles, especially latency, they've largely

00:08:46.039 --> 00:08:47.820
been overcome recently. We're at an inflection

00:08:47.820 --> 00:08:50.600
point. And the value proposition for businesses

00:08:50.600 --> 00:08:53.080
seems incredibly straightforward. It's easy for

00:08:53.080 --> 00:08:56.100
them to grasp. You save potentially a lot of

00:08:56.100 --> 00:08:59.200
money like that $50 receptionist salary example

00:08:59.200 --> 00:09:02.519
while also getting 24/7 service. It's just pure

00:09:02.519 --> 00:09:05.460
efficiency. Absolutely. And that clear value

00:09:05.460 --> 00:09:07.799
creates this perfect storm situation for anyone

00:09:07.799 --> 00:09:10.340
learning this now. You have extremely high demand

00:09:10.340 --> 00:09:12.600
from businesses desperate to automate things

00:09:12.600 --> 00:09:15.379
like call centers. But you have very low competition

00:09:15.379 --> 00:09:17.720
because, frankly, not many people know how to

00:09:17.740 --> 00:09:19.779
build these things properly yet using the right

00:09:19.779 --> 00:09:22.580
tools like Vapi and n8n. It really gets interesting

00:09:22.580 --> 00:09:24.299
when you think about the kinds of problems this

00:09:24.299 --> 00:09:26.100
solves. These aren't trivial things. They're

00:09:26.100 --> 00:09:29.799
real, expensive, often painful business bottlenecks,

00:09:29.820 --> 00:09:32.000
scheduling nightmares, rising customer service

00:09:32.000 --> 00:09:35.080
costs. Whoa. Yeah, just imagine scaling that

00:09:35.080 --> 00:09:37.639
one agent. You build it once, refine it, and

00:09:37.639 --> 00:09:39.679
suddenly it can handle, I don't know, a billion

00:09:39.679 --> 00:09:42.500
routine customer interactions a year. Perfectly.

00:09:42.500 --> 00:09:44.840
Every time. No sick days, no vacations. Being

00:09:44.840 --> 00:09:46.620
able to build solutions for those kinds of expensive

00:09:46.620 --> 00:09:49.519
problems, that positions you as an expert really

00:09:49.519 --> 00:09:51.639
early on in a field that's about to take off.

00:09:51.860 --> 00:09:53.919
Okay, so if the opportunity is really that big,

00:09:54.080 --> 00:09:56.539
what's the main hurdle? What's stopping more

00:09:56.539 --> 00:09:58.500
people from jumping in and building these right

00:09:58.500 --> 00:10:01.500
now? Honestly, it mostly comes down to the lack

00:10:01.500 --> 00:10:04.039
of specialized knowledge. People aren't familiar

00:10:04.039 --> 00:10:06.320
with the specific tools and techniques needed

00:10:06.320 --> 00:10:09.340
to connect that conversational part, the voice,

00:10:09.480 --> 00:10:13.080
to the action part, the back-end systems. So let's

00:10:13.080 --> 00:10:15.279
try and fix that right now. Let's get practical.

00:10:15.279 --> 00:10:17.639
How about we walk through building a very basic

00:10:17.639 --> 00:10:20.679
customer support agent, one that just uses a knowledge

00:10:20.679 --> 00:10:24.059
base, all inside Vapi? Okay, sounds good. So step

00:10:24.059 --> 00:10:27.120
one and two are just setup, right? Get a Vapi

00:10:27.120 --> 00:10:29.240
account, grab the free credits they offer for

00:10:29.240 --> 00:10:31.419
testing, then just poke around the default agent

00:10:31.419 --> 00:10:33.500
settings, see where you set the LLM model, the

00:10:33.500 --> 00:10:36.080
first message the caller hears. Exactly. And

00:10:36.080 --> 00:10:37.919
pay close attention to the system prompt structure

00:10:37.919 --> 00:10:40.379
in that default agent. You'll see those key parts

00:10:40.379 --> 00:10:43.059
we talked about. The role, the identity, the

00:10:43.059 --> 00:10:45.279
voice characteristics, and that step -by -step

00:10:45.279 --> 00:10:48.799
conversational flow. That flow often uses a kind

00:10:48.799 --> 00:10:51.320
of if this, then that logic to try and guide

00:10:51.320 --> 00:10:53.179
the conversation, keep it on track, even when

00:10:53.179 --> 00:10:55.759
the user says unexpected things. Then step three,

00:10:55.820 --> 00:10:58.700
this is where it gets useful. Adding a knowledge

00:10:58.700 --> 00:11:01.940
base, like the agent's cheat sheet. Yep. Inside

00:11:01.940 --> 00:11:04.179
Boppy, there's usually a file section. You just

00:11:04.179 --> 00:11:07.500
upload your company documents, FAQs, PDFs, text

00:11:07.500 --> 00:11:09.919
files with product info, warranty details, whatever.

00:11:10.279 --> 00:11:12.600
Then you connect those files to your agent. And

00:11:12.600 --> 00:11:15.159
just like that, the agent can now reference that

00:11:15.159 --> 00:11:18.279
specific information during a call. Like if someone

00:11:18.279 --> 00:11:21.019
asks about the shipping policy. Instantly. It

00:11:21.019 --> 00:11:23.840
basically gives the agent an internal search

00:11:23.840 --> 00:11:27.039
engine for your company info. So instead of putting

00:11:27.039 --> 00:11:28.919
the caller on hold while a human looks it up,

00:11:29.019 --> 00:11:32.039
the AI can find and relay the info right away.

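NOTE
Editor's sketch, not from the speakers: the "internal search engine" idea above, reduced to its simplest form. The two knowledge-base entries are invented examples, and the keyword match is a naive stand-in for whatever retrieval a real platform performs over uploaded files.
```python
# Hypothetical knowledge base: topic keyword mapped to a stored answer.
KNOWLEDGE_BASE = {
    "shipping": "Orders ship within 2 business days.",
    "warranty": "All products carry a 1-year warranty.",
}
def lookup(question: str) -> str:
    # Naive keyword match standing in for real document search.
    q = question.lower()
    for topic, answer in KNOWLEDGE_BASE.items():
        if topic in q:
            return answer
    # Fall back to an honest "don't know" instead of guessing.
    return "I don't have that information."
```
The explicit fallback matters in practice: an agent that admits a gap is easier to trust than one that improvises policy details.
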
00:11:32.220 --> 00:11:35.029
Okay. Step four is obviously testing, talking

00:11:35.029 --> 00:11:37.730
to it. Absolutely crucial. Use the talk to assistant

00:11:37.730 --> 00:11:40.090
button Vapi provides. Have a real conversation

00:11:40.090 --> 00:11:42.570
and, importantly, watch the live transcription

00:11:42.570 --> 00:11:45.090
logs as you talk. You can literally see the agent

00:11:45.090 --> 00:11:47.629
thinking, see when it decides to query that knowledge

00:11:47.629 --> 00:11:49.990
base tool to find an answer. And step five, which

00:11:49.990 --> 00:11:51.629
you mentioned, is where the real work happens,

00:11:51.629 --> 00:11:56.289
iterating on the prompt. Yes, 90% of the value comes

00:11:56.289 --> 00:11:59.490
from this loop. The first version will rarely

00:11:59.490 --> 00:12:03.139
be perfect. Maybe it sounds too robotic. You go

00:12:03.139 --> 00:12:05.220
back to the prompt, add instructions like be

00:12:05.220 --> 00:12:07.500
friendly and conversational. Maybe it talks too

00:12:07.500 --> 00:12:09.860
much. Add keep your answers concise and direct.

00:12:10.159 --> 00:12:13.120
So it's this constant cycle. Test it. See what's

00:12:13.120 --> 00:12:16.039
wrong. Document the issues. Refine the system

00:12:16.039 --> 00:12:18.580
prompt and test it again. That's where the human

00:12:18.580 --> 00:12:21.679
really shapes the AI's performance. Exactly.

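NOTE
Editor's sketch, not from the speakers: the test, document, refine cycle above, with the two fixes quoted in the conversation. The issue labels and the refine_prompt helper are hypothetical; in practice a person reads the transcript logs and decides which issue applies.
```python
# Maps an observed issue to the instruction the speakers suggest adding.
FIXES = {
    "too robotic": "Be friendly and conversational.",
    "too long": "Keep your answers concise and direct.",
}
def refine_prompt(prompt: str, issues: list[str]) -> str:
    # Append each known fix once; unknown issues are left for manual review.
    for issue in issues:
        fix = FIXES.get(issue)
        if fix and fix not in prompt:
            prompt += "\n" + fix
    return prompt
```
Running refine_prompt after each test call and then testing again is the loop in miniature; the duplicate check keeps the prompt from growing with repeated instructions.
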
00:12:21.679 --> 00:12:23.600
Now, the agent we just described, it can talk.

00:12:23.639 --> 00:12:25.500
It can reference files from the knowledge base.

00:12:26.440 --> 00:12:28.620
But it still has one big limitation at this stage.

00:12:28.720 --> 00:12:30.460
Right. It can't actually do anything yet, can

00:12:30.460 --> 00:12:33.000
it? It can't take real action in the world. Precisely.

00:12:33.000 --> 00:12:34.720
It's still just talking and looking things up.

00:12:34.970 --> 00:12:37.210
Okay, so that kind of brings us back to the start,

00:12:37.289 --> 00:12:39.990
but with a tangible result. We've outlined how

00:12:39.990 --> 00:12:42.669
to build a functional AI receptionist, one that

00:12:42.669 --> 00:12:45.330
can answer calls and use company info, all without

00:12:45.330 --> 00:12:48.110
writing code. We went beyond just the LLM as

00:12:48.110 --> 00:12:50.309
a text box. We looked at its anatomy, the brain,

00:12:50.450 --> 00:12:52.649
the playbook, the persona, and the potential

00:12:52.649 --> 00:12:55.549
for superpowers using tools. Yeah, we proved

00:12:55.549 --> 00:12:57.409
it can handle answering calls based on documents.

00:12:57.870 --> 00:13:01.370
But the real game changer, the automation powerhouse.

00:13:01.950 --> 00:13:04.149
That gets unlocked when we give it those superpowers.

00:13:04.250 --> 00:13:06.950
Yeah. Likely using something like n8n. Just think

00:13:06.950 --> 00:13:09.029
about that difference for a moment. An agent

00:13:09.029 --> 00:13:11.710
that can only talk versus one that can all on

00:13:11.710 --> 00:13:14.750
the same call, simultaneously check a real-time

00:13:14.750 --> 00:13:17.269
calendar, book the appointment right into the

00:13:17.269 --> 00:13:20.490
system, log the interaction in a CRM or spreadsheet,

00:13:20.809 --> 00:13:23.769
and send out the confirmation email, all while

00:13:23.769 --> 00:13:26.440
the person is still on the line. That's the moment

00:13:26.440 --> 00:13:28.720
the agent transforms from just an answering machine

00:13:28.720 --> 00:13:31.779
into a truly autonomous employee. And that is

00:13:31.779 --> 00:13:34.399
the enormous opportunity sitting there for anyone

00:13:34.399 --> 00:13:36.919
who decides to learn these tools right now. So

00:13:36.919 --> 00:13:38.620
we really encourage you to take what we've discussed

00:13:38.620 --> 00:13:40.720
today, start exploring, and see the potential

00:13:40.720 --> 00:13:42.899
of voice automation for yourself. Absolutely.

00:13:43.000 --> 00:13:46.100
Until the next deep dive, keep learning, keep

00:13:46.100 --> 00:13:46.460
building.
