WEBVTT

00:00:00.000 --> 00:00:02.940
Imagine an AI not just, you know, answering your

00:00:02.940 --> 00:00:04.839
questions, but actually doing things for you

00:00:04.839 --> 00:00:07.660
online. Popfully. We're really stepping into

00:00:07.660 --> 00:00:10.160
an era where artificial intelligence takes action

00:00:10.160 --> 00:00:13.859
on its own. Welcome, curious minds, to another

00:00:13.859 --> 00:00:16.379
deep dive. Today, we're exploring a pretty profound

00:00:16.379 --> 00:00:19.160
shift happening in AI. Some are calling it the

00:00:19.160 --> 00:00:22.449
agent era. AI is no longer just a passive encyclopedia.

00:00:22.829 --> 00:00:25.289
It's becoming, well, a remarkably active participant

00:00:25.289 --> 00:00:27.309
in our digital world. We've looked through a

00:00:27.309 --> 00:00:29.129
whole stack of sources for this, including a

00:00:29.129 --> 00:00:31.949
really detailed look at OpenAI's new ChatGPT

00:00:31.949 --> 00:00:34.710
agent feature, and also a flurry of other, frankly,

00:00:35.250 --> 00:00:38.009
revolutionary AI tools that popped up just this

00:00:38.009 --> 00:00:41.369
past week. Our mission today? To understand what's

00:00:41.369 --> 00:00:43.670
truly possible with AI right now, where these

00:00:43.670 --> 00:00:45.770
new capabilities really shine, and maybe most

00:00:45.770 --> 00:00:48.149
importantly, where human ingenuity, human oversight

00:00:48.149 --> 00:00:51.130
remains absolutely crucial. Okay, so for years,

00:00:51.270 --> 00:00:52.969
our interaction with AI has mostly been about

00:00:52.969 --> 00:00:55.049
asking questions, right? Getting information

00:00:55.049 --> 00:00:57.070
back. But from what our sources suggest, that

00:00:57.070 --> 00:00:59.450
paradigm, that fundamental way we use AI, it

00:00:59.450 --> 00:01:01.450
seems like it's genuinely shifted. It really

00:01:01.450 --> 00:01:04.269
has. Think of the new ChatGPT agent feature

00:01:04.450 --> 00:01:07.269
like a highly skilled digital assistant, an

00:01:07.269 --> 00:01:10.349
assistant that can essentially borrow your computer

00:01:10.349 --> 00:01:13.250
to get things done. Previous AI models were pretty

00:01:13.250 --> 00:01:16.409
much stuck in a chat window. An AI agent, though,

00:01:16.790 --> 00:01:19.790
operates inside a simulated web browser. That's

00:01:19.790 --> 00:01:22.829
a big leap, a significant jump in autonomy. A

00:01:22.829 --> 00:01:25.430
simulated web browser. OK, so if I'm getting

00:01:25.430 --> 00:01:28.349
this right. It can actually navigate websites,

00:01:28.590 --> 00:01:30.890
click buttons, fill out forms, kind of like a

00:01:30.890 --> 00:01:33.310
person would, but all within its own, like, secure

00:01:33.310 --> 00:01:36.209
space. The potential applications there seem...

00:01:36.170 --> 00:01:39.269
Huge. Precisely, yeah. This capability opens

00:01:39.269 --> 00:01:41.269
up just a universe of possibilities for you.

00:01:41.670 --> 00:01:44.250
It means intelligent web browsing, actually executing

00:01:44.250 --> 00:01:46.549
transactions, performing really complex multi-

00:01:46.549 --> 00:01:49.790
step research, and handling sequences of tasks

00:01:49.790 --> 00:01:52.590
all autonomously. It's almost like having a dedicated

00:01:52.590 --> 00:01:54.950
digital employee for certain workflows. That

00:01:54.950 --> 00:01:57.709
sounds incredibly powerful. OpenAI CEO Sam Altman,

00:01:57.709 --> 00:01:59.670
from what we read, he even highlighted its potential

00:01:59.670 --> 00:02:02.849
to handle financial stuff, transactions. But

00:02:02.849 --> 00:02:04.609
at the same time, he gave a pretty serious warning,

00:02:04.730 --> 00:02:07.670
didn't he? Yes, a very important one. Users absolutely

00:02:07.670 --> 00:02:10.050
must proceed with extreme caution, especially

00:02:10.050 --> 00:02:12.330
when dealing with sensitive info like credit

00:02:12.330 --> 00:02:14.729
card details, login credentials, that kind of

00:02:14.729 --> 00:02:17.449
thing. This immense power, it just comes with

00:02:17.449 --> 00:02:19.849
significant risk. You really need to be vigilant,

00:02:20.169 --> 00:02:22.729
not complacent. So that virtual browser is key

00:02:22.729 --> 00:02:24.389
for security then. What makes it different from

00:02:24.389 --> 00:02:27.389
just my regular Chrome window? It's a secure

00:02:27.389 --> 00:02:30.050
temporary sandbox, totally separate from your

00:02:30.050 --> 00:02:32.770
personal computer. OK, now here's where the mechanics

00:02:32.770 --> 00:02:35.330
get really fascinating for me. How do these agents

00:02:35.330 --> 00:02:38.370
actually see and do things? It's not some kind

00:02:38.370 --> 00:02:40.810
of digital magic, right? No, no magic at all.

00:02:40.889 --> 00:02:43.270
It's actually a pretty sophisticated iterative

00:02:43.270 --> 00:02:45.449
process. Imagine you've hired a remote worker,

00:02:45.490 --> 00:02:47.530
maybe, and you're watching their screen through

00:02:47.530 --> 00:02:49.669
something like TeamViewer. You see them observe,

00:02:50.050 --> 00:02:53.110
decide, then act. The AI agent works in a remarkably

00:02:53.110 --> 00:02:55.879
similar way, but its computer is that virtual

00:02:55.879 --> 00:02:58.020
browser environment we talked about. And it's

00:02:58.020 --> 00:03:01.099
secure, isolated, a clean instance of a browser,

00:03:01.800 --> 00:03:05.159
a sandbox. That means it absolutely cannot access

00:03:05.159 --> 00:03:07.219
your personal files or settings. It's really

00:03:07.219 --> 00:03:10.240
contained, which is vital for security. And its

00:03:10.240 --> 00:03:13.219
operation follows this loop, like observe, think,

00:03:13.539 --> 00:03:16.620
act, over and over. That's exactly it. First,

00:03:16.780 --> 00:03:19.740
the agent observes. It sees a simplified version

00:03:19.740 --> 00:03:22.159
of the web page, think the underlying HTML, the

00:03:22.159 --> 00:03:24.939
visible text. It carefully labels interactive

00:03:24.939 --> 00:03:27.879
bits, maybe button ID 25, so it knows what's

00:03:27.879 --> 00:03:30.840
clickable. Then it thinks. This is the large

00:03:30.840 --> 00:03:32.860
language model, the brain, basically, like

00:03:32.860 --> 00:03:36.319
GPT-4, the part that reasons. It compares what

00:03:36.319 --> 00:03:38.379
it sees against its goal, let's say, book a flight

00:03:38.379 --> 00:03:40.919
to Da Nang and figures out the next logical step,

00:03:40.939 --> 00:03:43.300
like, OK, my next move should be to put SGN in

00:03:43.300 --> 00:03:46.039
the from field. And finally, it acts. Based on

00:03:46.039 --> 00:03:47.919
that thought, it executes a command, something

00:03:47.919 --> 00:03:50.719
like click button ID 25 or maybe type text input

00:03:50.719 --> 00:03:53.599
field search, nonstop flights to Da Nang. Right.

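NOTE
The observe, think, act loop just described can be sketched in a few lines of Python.
This is a toy illustration under stated assumptions, not how ChatGPT agent is implemented:
the Page class, the element ids "from" and "button_25", and every function name are
hypothetical, and a hard-coded rule stands in for the language model.
```python
from dataclasses import dataclass, field
# Hypothetical toy model of a page: typed inputs and clicked button ids.
@dataclass
class Page:
    fields: dict = field(default_factory=dict)
    clicked: list = field(default_factory=list)
def observe(page):
    # Observe: build a simplified, labeled snapshot of the page.
    return {"from_field": page.fields.get("from"), "searched": "button_25" in page.clicked}
def choose_action(observation, goal):
    # Think: a fixed rule stands in for the language model's reasoning.
    if observation["from_field"] is None:
        return ("type", "from", goal["origin"])
    if not observation["searched"]:
        return ("click", "button_25", None)
    return ("done", None, None)
def act(page, action):
    # Act: apply the chosen command to the toy page.
    kind, target, text = action
    if kind == "type":
        page.fields[target] = text
    elif kind == "click":
        page.clicked.append(target)
def run_agent(goal, max_steps=10):
    # Repeat observe, think, act until the policy says it is done.
    page, history = Page(), []
    for _ in range(max_steps):
        action = choose_action(observe(page), goal)
        if action[0] == "done":
            break
        act(page, action)
        history.append(action)
    return page, history
page, history = run_agent({"origin": "SGN"})
print(history)
```
Each pass through the for loop is one round trip; a real agent would replace choose_action with a call to the model.
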
00:03:53.659 --> 00:03:56.319
And this cycle repeats maybe hundreds of times

00:03:56.319 --> 00:03:58.740
for something complicated, this methodical process.

00:03:59.120 --> 00:04:01.620
That must be why they can sometimes feel a bit

00:04:01.620 --> 00:04:04.400
slow, I guess. Precisely. Yeah, the latency you

00:04:04.400 --> 00:04:06.639
might notice. It isn't the agent getting stuck

00:04:06.639 --> 00:04:09.319
or anything. It's the cumulative time for each

00:04:09.319 --> 00:04:11.979
of those round trips back and forth between the

00:04:11.979 --> 00:04:15.379
virtual browser and the AI model's brain. Each

00:04:15.379 --> 00:04:18.319
action, each little decision needs a new communication

00:04:18.319 --> 00:04:21.319
cycle. So this deliberate step-by-step thing

00:04:21.319 --> 00:04:24.560
ensures accuracy. But yeah, it definitely sacrifices

00:04:24.560 --> 00:04:26.439
that instant speed we're used to with simple

00:04:26.439 --> 00:04:28.740
AI questions. It's a methodical approach. Yeah.

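NOTE
A rough back-of-the-envelope sketch of why the round trips add up. The function name
and all timings below are hypothetical, chosen only to illustrate the arithmetic;
the episode gives no per-step figures.
```python
def total_latency_seconds(steps, observe_s, model_round_trip_s, act_s):
    """Cumulative time when every step pays one full observe/think/act round trip."""
    return steps * (observe_s + model_round_trip_s + act_s)
# Hypothetical timings: 200 steps at 1 s observe, 8 s model call, 1 s act.
estimate = total_latency_seconds(steps=200, observe_s=1.0, model_round_trip_s=8.0, act_s=1.0)
print(f"{estimate / 60:.1f} minutes")
```
Because the cost is multiplied by the step count, shaving a second off each round trip matters more than any single fast action.
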
00:04:29.490 --> 00:04:31.610
What's the main trade-off for getting that accuracy?

00:04:31.790 --> 00:04:34.430
Accuracy comes at the cost of speed. It's a step-

00:04:34.430 --> 00:04:36.769
by-step process. Okay, so to really, you know,

00:04:36.769 --> 00:04:39.110
kick the tires on this, our sources set up a

00:04:39.110 --> 00:04:41.269
complex real-world challenge. This wasn't just

00:04:41.269 --> 00:04:42.910
asking for facts. It was about planning a whole

00:04:42.910 --> 00:04:45.170
weekend trip. Yeah, they really threw down the

00:04:45.170 --> 00:04:47.069
gauntlet. They tasked it with planning a three-

00:04:47.069 --> 00:04:49.790
day weekend trip for two people. Destination:

00:04:50.089 --> 00:04:53.009
Da Nang, Vietnam. Time frame: the second weekend

00:04:53.009 --> 00:04:56.029
of next month. And the budget was tight. Flights

00:04:56.029 --> 00:05:00.089
and hotel combined. Maximum $700. It had to find

00:05:00.089 --> 00:05:02.709
round-trip, nonstop flights from Ho Chi Minh

00:05:02.709 --> 00:05:05.129
City to Da Nang, find a four-star hotel with

00:05:05.129 --> 00:05:07.910
a pool and good reviews near My Khe Beach, and

00:05:07.910 --> 00:05:10.610
find a unique local food tour for Saturday evening.

00:05:11.009 --> 00:05:13.509
Then, here's the kicker, actually book the flights

00:05:13.509 --> 00:05:16.240
and hotel. Wow. Okay, so how did the agent do

00:05:16.240 --> 00:05:18.759
on this real-world gauntlet? Did it manage to

00:05:18.759 --> 00:05:20.680
actually book everything? Well, it immediately

00:05:20.680 --> 00:05:22.980
got to work inside its dedicated virtual browser.

00:05:23.240 --> 00:05:25.240
Researchers could watch it. It tackled the request

00:05:25.240 --> 00:05:27.620
really methodically over about 50 minutes. It

00:05:27.620 --> 00:05:30.100
went to Google Flights, Kayak, found some decent

00:05:30.100 --> 00:05:32.500
nonstop options on Vietjet Air and Bamboo Airways.

00:05:32.639 --> 00:05:34.199
Price-wise, they looked okay. For hotels, it

00:05:34.199 --> 00:05:36.800
used Booking.com, a go-to, filtering for four-

00:05:36.800 --> 00:05:39.540
star, pool, near the beach. It even shortlisted

00:05:39.540 --> 00:05:41.819
a few, like the Sala Danang Beach Hotel. And

00:05:41.819 --> 00:05:44.089
yeah, it successfully found a motorbike street

00:05:44.089 --> 00:05:46.329
food tour for the activity. Pretty cool. That's

00:05:46.329 --> 00:05:47.930
honestly impressive, getting all those pieces

00:05:47.930 --> 00:05:50.550
lined up, finding the options, checking criteria.

00:05:50.930 --> 00:05:53.610
But this is where we hit that snag, right? What

00:05:53.610 --> 00:05:55.850
the sources called the last mile problem. It

00:05:55.850 --> 00:05:59.970
got so far, but then. Exactly. Yes. After 50

00:05:59.970 --> 00:06:02.410
minutes of really impressive planning, the agent

00:06:02.410 --> 00:06:04.610
couldn't complete a single actual transaction.

00:06:05.050 --> 00:06:07.129
It got right up to the final payment screen for

00:06:07.129 --> 00:06:10.199
both the flights and the hotel. and then it just

00:06:10.199 --> 00:06:12.379
stopped. It needed passenger details, credit

00:06:12.379 --> 00:06:15.220
card info, sensitive stuff. It gave the link

00:06:15.220 --> 00:06:17.920
for the food tour like requested, but it just

00:06:17.920 --> 00:06:20.800
couldn't finalize anything requiring that sensitive

00:06:20.800 --> 00:06:22.720
data. It really hit that security wall, which

00:06:22.720 --> 00:06:25.120
is there for a reason, of course. So what's the

00:06:25.120 --> 00:06:27.480
main takeaway from that experiment then? What

00:06:27.480 --> 00:06:29.740
worked exceptionally well and where did it, you

00:06:29.740 --> 00:06:32.480
know, fall short? Okay, it excelled at a complex

00:06:32.480 --> 00:06:34.920
understanding, really grasping the multi-part

00:06:34.920 --> 00:06:37.420
request. Intelligent research too, comparing

00:06:37.420 --> 00:06:39.399
prices and reviews across different sites, that

00:06:39.399 --> 00:06:42.060
was good. Handling simultaneous tasks, problem

00:06:42.060 --> 00:06:44.920
solving. Like, it pivoted pretty smoothly when

00:06:44.920 --> 00:06:47.500
one booking site was slow. It truly acted like

00:06:47.500 --> 00:06:49.600
a tireless digital assistant for all the planning

00:06:49.600 --> 00:06:52.740
stuff. And its biggest limitation? The place

00:06:52.740 --> 00:06:56.209
it fell short. The last mile problem. It's that

00:06:56.209 --> 00:06:59.209
critical security safeguard, stopping it before

00:06:59.209 --> 00:07:01.569
payment. You know, I still kind of wrestle with

00:07:01.569 --> 00:07:03.490
the practical friction of that last mile problem

00:07:03.490 --> 00:07:06.170
myself. I mean, it's obviously a necessary hurdle

00:07:06.170 --> 00:07:09.009
for security, right? But it does mean it's not

00:07:09.009 --> 00:07:12.470
truly set it and forget it. Not yet. So basically,

00:07:12.569 --> 00:07:15.209
it's a phenomenal planner, a great research assistant,

00:07:15.730 --> 00:07:17.970
but not quite a fully autonomous booker. Is that

00:07:17.970 --> 00:07:20.610
fair? Exactly. Think of it as a co-pilot. Gets

00:07:20.610 --> 00:07:22.810
you maybe 90% of the way there, but you still

00:07:22.810 --> 00:07:24.649
have to take the controls for landing. Okay,

00:07:24.649 --> 00:07:27.050
let's clarify the difference here. How do these

00:07:27.050 --> 00:07:30.389
new AI agents really compare to the traditional

00:07:30.389 --> 00:07:33.389
AI assistants we've used for years, like Siri

00:07:33.389 --> 00:07:37.009
or classic ChatGPT? Oh, it's a fundamental paradigm

00:07:37.009 --> 00:07:40.240
shift, really. Traditional assistants, mostly

00:07:40.240 --> 00:07:42.540
for information retrieval, answering questions,

00:07:43.160 --> 00:07:46.300
usually stuck inside their own app. Agents, though,

00:07:46.439 --> 00:07:49.360
are about task execution. Getting things done.

00:07:49.759 --> 00:07:52.139
Goal completion across the open web. Traditional

00:07:52.139 --> 00:07:54.980
is usually single turn, mostly stateless. Doesn't

00:07:54.980 --> 00:07:56.920
remember much from one question to the next.

00:07:57.379 --> 00:07:59.800
Agents are multi-step, autonomous processes.

00:08:00.079 --> 00:08:02.319
They keep track of the state, the context throughout

00:08:02.319 --> 00:08:04.560
long tasks. They kind of remember what they're

00:08:04.560 --> 00:08:07.139
doing and why. That distinction is huge. It's

00:08:07.139 --> 00:08:09.939
a different category of tool. Knowing that, our

00:08:09.939 --> 00:08:12.279
sources decided to push it even further. Could

00:08:12.279 --> 00:08:14.720
you run multiple agents at the same time? They

00:08:14.720 --> 00:08:16.639
tested this with three separate tasks running

00:08:16.639 --> 00:08:18.920
in parallel. They did, yeah. They launched three

00:08:18.920 --> 00:08:21.800
agents with pretty different complex goals. One

00:08:21.800 --> 00:08:23.779
was the Da Nang weekend trip we just talked about.

00:08:24.120 --> 00:08:26.740
Another was creating a 10-slide PowerPoint presentation

00:08:26.740 --> 00:08:29.680
on content marketing. And the third was analyzing

00:08:29.680 --> 00:08:32.440
a competitor's YouTube channel to pull data into

00:08:32.440 --> 00:08:35.279
a spreadsheet, all running simultaneously. OK.

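NOTE
Running several agents at once can be sketched with Python's asyncio. This is only an
illustration of the concurrency pattern: run_agent is a hypothetical stand-in that
sleeps instead of doing real work, and the durations are scaled-down placeholders,
not the timings reported in the episode.
```python
import asyncio
async def run_agent(name, simulated_seconds):
    # Stand-in for a real agent run; the sleep simulates work in progress.
    await asyncio.sleep(simulated_seconds)
    return f"{name}: finished"
async def main():
    # Task names mirror the experiment; durations are placeholders.
    tasks = [
        run_agent("trip planning", 0.05),
        run_agent("slide deck", 0.04),
        run_agent("channel analysis", 0.01),
    ]
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*tasks)
results = asyncio.run(main())
print(results)
```
The total wall-clock time is roughly that of the slowest task, not the sum of all three.
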
00:08:35.379 --> 00:08:37.820
And what were the results of this multi-agent

00:08:37.820 --> 00:08:40.500
test? Was it a mixed bag? Did some tasks work

00:08:40.500 --> 00:08:42.639
better than others? It was a very mixed bag,

00:08:42.759 --> 00:08:45.889
yeah. And super revealing. The web-heavy Da

00:08:45.889 --> 00:08:48.330
Nang trip took the 50 minutes and like we said,

00:08:48.690 --> 00:08:50.769
still needed human help at the end. The creative

00:08:50.769 --> 00:08:53.450
task, the presentation, that took 41 minutes.

00:08:54.090 --> 00:08:55.789
But the design was kind of generic, the content

00:08:55.789 --> 00:08:57.950
pretty basic, needed a lot of human editing.

00:08:58.669 --> 00:09:01.350
But here's the really interesting bit. The data

00:09:01.350 --> 00:09:04.389
analysis task. Scraping YouTube data, making

00:09:04.389 --> 00:09:07.500
a spreadsheet. That took only four minutes. And

00:09:07.500 --> 00:09:10.500
it produced a perfectly formatted, accurate spreadsheet.

00:09:11.039 --> 00:09:13.799
Even had some insightful summaries. Whoa. I mean,

00:09:14.019 --> 00:09:16.019
just imagine a world where you could orchestrate

00:09:16.019 --> 00:09:18.919
like a dozen specialized agents, each one nailing

00:09:18.919 --> 00:09:20.860
their specific task, all running in parallel.

00:09:21.240 --> 00:09:23.320
The potential is kind of mind-bending. That

00:09:23.320 --> 00:09:25.500
really highlights where these agents seem to

00:09:25.500 --> 00:09:27.159
shine right now, doesn't it? Structured, data-

00:09:27.159 --> 00:09:29.700
driven tasks. They seem slower and less refined when

00:09:29.700 --> 00:09:32.240
it comes to maybe more creative work or really

00:09:32.240 --> 00:09:34.919
complex, nuanced web navigation. So agents are

00:09:34.919 --> 00:09:36.759
really at their best when the task involves a

00:09:36.759 --> 00:09:39.100
lot of structured data. Yes, they truly shine

00:09:39.100 --> 00:09:41.500
with data analysis and clear, objective tasks.

00:09:41.879 --> 00:09:43.940
[Mid-roll sponsor read marker: sponsor content

00:09:43.940 --> 00:09:46.940
provided separately.] OK, so beyond what we've

00:09:46.940 --> 00:09:49.480
dived into with ChatGPT agents, the whole AI

00:09:49.480 --> 00:09:52.519
space is just exploding with innovation. It feels

00:09:52.519 --> 00:09:55.299
like every week. What other groundbreaking AI

00:09:55.299 --> 00:09:57.519
launches from just this past week should we know

00:09:57.519 --> 00:10:00.480
about? It's hard to keep up. It really has been

00:10:00.480 --> 00:10:03.139
an incredible week. So much happened. First,

00:10:03.340 --> 00:10:06.100
ChatGPT's record feature is now for everyone.

00:10:06.179 --> 00:10:08.840
Well, for Plus users anyway. It was Pro-exclusive.

00:10:08.940 --> 00:10:11.419
This lets you record any system audio on your

00:10:11.419 --> 00:10:14.080
Mac, like a Zoom call, a lecture, whatever. And

00:10:14.080 --> 00:10:16.539
it automatically generates a really detailed

00:10:16.539 --> 00:10:19.299
summary. Super powerful for meeting notes or

00:10:19.299 --> 00:10:22.120
repurposing content quickly. Then, Anthropic's

00:10:22.120 --> 00:10:24.620
Claude is positioning itself as a hub. They launched

00:10:24.620 --> 00:10:26.460
a directory of tools that integrate directly

00:10:26.460 --> 00:10:28.360
with Claude, seamless connections with stuff

00:10:28.360 --> 00:10:30.720
like Asana, Canva, Gmail, Google Drive, even

00:10:30.720 --> 00:10:33.200
Stripe. Early testing showed some bugs apparently,

00:10:33.899 --> 00:10:35.980
but you gotta expect stability will improve fast.

00:10:36.159 --> 00:10:38.559
That idea of Claude becoming a central work hub,

00:10:39.480 --> 00:10:41.000
that feels like a significant step towards a

00:10:41.000 --> 00:10:43.840
more unified AI assistant. What else is making

00:10:43.840 --> 00:10:45.879
waves, maybe in the more personalized AI space?

00:10:46.139 --> 00:10:50.070
Okay, check this out. InVideo AI launched AI Twin.

00:10:50.549 --> 00:10:53.190
Version 4.0 creates a digital avatar of you.

00:10:53.529 --> 00:10:55.549
You just record at least 60 seconds of yourself,

00:10:55.950 --> 00:10:58.610
give verbal permission, and within minutes, boom,

00:10:58.809 --> 00:11:01.250
you have a digital clone. Imagine, like a real

00:11:01.250 --> 00:11:02.970
estate agent writes a script for their weekly

00:11:02.970 --> 00:11:05.509
market update, pastes it in, and their digital

00:11:05.509 --> 00:11:08.289
clone presents it, flawlessly. It could turn

00:11:08.289 --> 00:11:10.450
what was maybe a half-day task into just 10

00:11:10.450 --> 00:11:14.220
minutes, and even more personal. Hume AI is cloning

00:11:14.220 --> 00:11:17.299
personality. Their EVI 3 model replicates not

00:11:17.299 --> 00:11:19.899
just your voice, but your actual speaking style.

00:11:20.100 --> 00:11:22.759
It analyzes, like, a 30- to 90-second voice sample,

00:11:23.100 --> 00:11:25.139
learns your cadence, your filler words, your

00:11:25.139 --> 00:11:26.879
uhs and ums, your conversational patterns.

00:11:27.399 --> 00:11:29.019
A podcaster maybe could create a digital version

00:11:29.019 --> 00:11:31.080
of themselves for interactive Q&As with fans,

00:11:31.399 --> 00:11:33.360
keeping their unique style. That's fascinating.

00:11:33.419 --> 00:11:35.080
Not just the voice, but the little mannerisms,

00:11:35.259 --> 00:11:37.779
the speech patterns. Wow. What about for more

00:11:37.779 --> 00:11:39.960
traditional creative fields, like filmmaking

00:11:39.960 --> 00:11:42.899
or audio work? Yeah, for filmmakers sound designers,

00:11:43.539 --> 00:11:45.860
Adobe Firefly now hears your voice and creates

00:11:45.860 --> 00:11:48.840
sound effects. This is genuinely kind of mind-

00:11:48.840 --> 00:11:50.659
blowing. You can literally record yourself making

00:11:50.659 --> 00:11:53.860
a noise like just swoosh or flutter flutter and

00:11:53.860 --> 00:11:56.320
tell the AI what you want it to become. So a

00:11:56.320 --> 00:11:58.620
filmmaker sees a bird take flight, makes that

00:11:58.620 --> 00:12:01.559
sound, and Firefly turns it into a high fidelity,

00:12:02.019 --> 00:12:04.240
perfectly synced audio track of realistic wing

00:12:04.240 --> 00:12:06.950
beats. Crazy. And we also saw other things popping

00:12:06.950 --> 00:12:09.830
up, right? Like Runway's Act-Two for motion capture,

00:12:09.830 --> 00:12:12.590
animating characters with your movements;

00:12:12.590 --> 00:12:14.789
MirageLSD for real-time video transformation,

00:12:14.789 --> 00:12:18.529
turning your feed into visual art; plus Grok AI's

00:12:18.529 --> 00:12:21.149
somewhat controversial Ani and Rudi companions,

00:12:21.149 --> 00:12:23.710
which kind of point to this demand for more personalized

00:12:23.710 --> 00:12:26.490
AI, even if some options are less filtered. OK,

00:12:26.529 --> 00:12:28.250
so looking at all these, what's the common thread?

00:12:28.330 --> 00:12:30.909
What ties these diverse new tools together? I

00:12:30.909 --> 00:12:33.230
think they're all about automating and customizing

00:12:33.230 --> 00:12:35.629
creative or repetitive tasks. That seems to be

00:12:35.629 --> 00:12:39.250
the core focus. This just dizzying level of innovation,

00:12:39.250 --> 00:12:41.830
it also speaks to the intense competition heating

00:12:41.830 --> 00:12:44.490
up in the AI world, right? The talent wars, from

00:12:44.490 --> 00:12:46.730
what our sources indicate, sound absolutely real.

00:12:46.970 --> 00:12:49.129
Oh, they are. Our sources detailed this high-

00:12:49.129 --> 00:12:52.090
stakes saga in AI coding. Really interesting.

00:12:52.690 --> 00:12:55.029
OpenAI was apparently in talks to acquire a company

00:12:55.029 --> 00:12:57.669
called Windsurf, a promising AI coding tool.

00:12:58.250 --> 00:13:01.429
But then, boom, its CEO and top talent abruptly

00:13:01.429 --> 00:13:04.190
left for Google DeepMind. But even after losing

00:13:04.190 --> 00:13:06.389
its leadership, Windsurf still got acquired,

00:13:06.629 --> 00:13:09.309
but by Cognition, the company behind the Devin

00:13:09.309 --> 00:13:11.929
AI agent. This whole scenario just highlights

00:13:11.929 --> 00:13:15.610
how incredibly valuable elite AI talent has become

00:13:15.610 --> 00:13:18.120
in the fierce competition between these big players

00:13:18.120 --> 00:13:20.840
to own the future of software development, especially

00:13:20.840 --> 00:13:23.159
in this agent space. It's like a high-stakes

00:13:23.159 --> 00:13:26.600
game of digital chess. OK, so for you, our listener

00:13:26.600 --> 00:13:29.019
listening to all this, what does it actually mean?

00:13:29.179 --> 00:13:31.279
How do we effectively use these powerful new

00:13:31.279 --> 00:13:33.840
tools in our own lives or businesses without

00:13:33.840 --> 00:13:35.740
getting overwhelmed or making mistakes? Right.

00:13:35.919 --> 00:13:37.500
AI agents are definitely here. They're real.

00:13:37.840 --> 00:13:39.480
But it's crucial to understand they are not yet

00:13:39.480 --> 00:13:41.220
fully autonomous. Not really. They're powerful

00:13:41.220 --> 00:13:44.029
force multipliers. Absolutely. But they consistently

00:13:44.029 --> 00:13:47.169
require human strategy, human oversight. That,

00:13:47.250 --> 00:13:49.730
to me, is the core message our sources keep hammering

00:13:49.730 --> 00:13:52.009
home. Could you maybe offer some practical guidance?

00:13:52.590 --> 00:13:54.590
How to actually integrate these into our daily

00:13:54.590 --> 00:13:57.350
routines, whether it's for work or just personal

00:13:57.350 --> 00:14:00.960
life? Sure. OK, for business professionals. Think

00:14:00.960 --> 00:14:03.500
of and use these agents like tireless research

00:14:03.500 --> 00:14:06.100
interns. Have them gather huge amounts of data,

00:14:06.299 --> 00:14:08.639
compare vendors super efficiently, maybe create

00:14:08.639 --> 00:14:11.519
initial drafts of reports, delegate routine data

00:14:11.519 --> 00:14:14.240
entry. But, and this is the critical part, never

00:14:14.240 --> 00:14:17.100
allow an agent to make a final unsupervised decision

00:14:17.100 --> 00:14:20.419
on important business matters or especially financial

00:14:20.419 --> 00:14:23.399
transactions. Always, always review its work

00:14:23.399 --> 00:14:25.440
with your own judgment. For personal productivity:

00:14:26.000 --> 00:14:27.700
Yeah, let an agent plan your vacation outline,

00:14:27.860 --> 00:14:29.899
find recipes that fit criteria, create a detailed

00:14:29.929 --> 00:14:32.289
shopping list. It'll save you countless hours

00:14:32.289 --> 00:14:34.450
of just tedious drudgery, freeing you up to make

00:14:34.450 --> 00:14:36.750
that final 10% of decisions requiring your personal

00:14:36.750 --> 00:14:38.669
taste, your judgment. Just be really vigilant

00:14:38.669 --> 00:14:41.399
about your data. Use unique passwords. Be present

00:14:41.399 --> 00:14:43.320
for any stuff that needs personal or financial

00:14:43.320 --> 00:14:45.360
info. Don't just let it run wild with that stuff.

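NOTE
The "be present for anything sensitive" advice can be expressed as a simple approval gate.
This is a minimal sketch, assuming a hypothetical set of action names and a hypothetical
approve callback; it does not describe how any particular product enforces its safeguards.
```python
SENSITIVE_ACTIONS = {"enter_payment", "enter_credentials", "confirm_purchase"}
def execute(action, approve):
    """Run an action, pausing for human approval when it is sensitive.
    `approve` is a callback that returns True only if a person signs off."""
    if action in SENSITIVE_ACTIONS and not approve(action):
        return f"{action}: blocked, waiting for human"
    return f"{action}: executed"
# Example: a reviewer who declines everything the agent escalates.
log = [execute(a, approve=lambda _: False) for a in ("search_hotels", "enter_payment")]
print(log)
```
Routine steps run unattended, while anything touching payment or credentials stops until a person says yes.
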
00:14:45.399 --> 00:14:47.559
And for content creators, tools like that InVideo

00:14:47.559 --> 00:14:49.980
AI Twin or Adobe's sound effect generator,

00:14:50.000 --> 00:14:51.899
they can dramatically speed up your production

00:14:51.899 --> 00:14:54.159
workflow. Use real-time video effects for unique

00:14:54.159 --> 00:14:57.159
live content, maybe. But always, always treat

00:14:57.159 --> 00:14:59.080
AI-generated content, text, scripts, images,

00:14:59.120 --> 00:15:01.519
whatever, as a first draft. You have to infuse

00:15:01.519 --> 00:15:03.980
it with your unique voice, your style, your perspective.

00:15:04.360 --> 00:15:07.639
That human touch is still totally irreplaceable.

00:15:07.840 --> 00:15:10.519
And we also saw a brief mention of things like Google's

00:15:10.519 --> 00:15:13.440
AI business caller, where AI can call local businesses

00:15:13.440 --> 00:15:16.879
for you, China's Kimi K2 model ranking high globally,

00:15:17.559 --> 00:15:19.399
specialized financial AI tools from Anthropic

00:15:19.399 --> 00:15:23.000
and Mistral, Amazon's Kiro IDE for coding, planning

00:15:23.000 --> 00:15:26.000
project architecture first. It's just clear that

00:15:26.000 --> 00:15:28.799
innovation is bursting out everywhere, in every

00:15:28.799 --> 00:15:31.759
sector. So the bottom line message seems to be...

00:15:31.519 --> 00:15:34.000
Leverage AI's capabilities, definitely use them,

00:15:34.320 --> 00:15:36.279
but always keep a human in the loop for the critical

00:15:36.279 --> 00:15:38.100
thinking and final decisions. Absolutely. They

00:15:38.100 --> 00:15:39.860
are co-pilots. They are not replacements for

00:15:39.860 --> 00:15:42.279
critical thinking. Not yet, anyway. So the big

00:15:42.279 --> 00:15:44.460
idea here, pulling it all together, it feels

00:15:44.460 --> 00:15:47.320
like ChatGPT's agent feature and really all these

00:15:47.320 --> 00:15:50.580
new AI advancements signal a fundamental change,

00:15:51.019 --> 00:15:52.500
a change in our relationship with technology.

00:15:52.600 --> 00:15:55.159
We're moving from just being users to becoming

00:15:55.159 --> 00:15:57.779
managers, maybe even skilled orchestrators of

00:15:57.779 --> 00:16:00.269
AI. Yeah, and what's truly fascinating now is

00:16:00.269 --> 00:16:02.210
thinking about what's coming next, say in the

00:16:02.210 --> 00:16:04.909
next six to 12 months. Expect big increases in

00:16:04.909 --> 00:16:07.490
speed, reliability, that's almost a given, and

00:16:07.490 --> 00:16:10.549
a rapid move towards true multimodality or vision.

00:16:11.090 --> 00:16:13.409
Agents being able to interpret entire page layouts,

00:16:13.850 --> 00:16:16.009
recognize icons, maybe even learn by watching

00:16:16.009 --> 00:16:18.649
video tutorials. We'll probably also see more

00:16:18.649 --> 00:16:20.990
advanced long-term memory and deep personalization.

00:16:21.350 --> 00:16:23.529
Agents learning your preferences over time, becoming

00:16:23.529 --> 00:16:26.250
proactive, maybe even anticipating needs before

00:16:26.250 --> 00:16:28.779
you state them. And certainly, expect the rise

00:16:28.779 --> 00:16:31.000
of highly specialized agents, agents for specific

00:16:31.000 --> 00:16:33.240
jobs like legal research or marketing campaign

00:16:33.240 --> 00:16:35.299
creation, that could fundamentally change how

00:16:35.299 --> 00:16:39.179
work gets done. While that dream of a fully autonomous

00:16:39.179 --> 00:16:41.840
AI handling absolutely everything, it isn't quite

00:16:41.840 --> 00:16:43.840
reality yet. But the progress we're seeing now

00:16:43.840 --> 00:16:46.399
is just staggering. These tools, even as they

00:16:46.399 --> 00:16:48.799
are, are already capable of absorbing a huge

00:16:48.799 --> 00:16:50.879
chunk of the tedious, time-consuming work that

00:16:50.879 --> 00:16:53.539
fills up our days. It seems like the most valuable

00:16:53.539 --> 00:16:56.159
skill in the coming years might actually be AI

00:16:56.159 --> 00:16:58.409
orchestration, you know, the ability to effectively

00:16:58.409 --> 00:17:01.669
define goals, delegate complex tasks to a team

00:17:01.669 --> 00:17:04.170
of specialized AI agents, and provide that critical

00:17:04.170 --> 00:17:06.630
human oversight needed for quality, accuracy,

00:17:06.750 --> 00:17:10.650
and security. So, my advice, just start experimenting

00:17:10.650 --> 00:17:13.250
now. Give an agent a small, low-stakes task.

00:17:13.529 --> 00:17:15.650
See what happens. Learn its strengths, its weaknesses,

00:17:15.769 --> 00:17:17.670
its quirks. The future, I think, really belongs

00:17:17.670 --> 00:17:19.910
to those who don't just use AI, but learn how

00:17:19.910 --> 00:17:22.369
to lead it. Thank you for joining us on this

00:17:22.369 --> 00:17:23.710
deep dive. [Outro music]