WEBVTT

00:00:00.000 --> 00:00:02.040
You know, sometimes the really big shifts, they

00:00:02.040 --> 00:00:04.419
happen kind of quietly, away from the main headlines.

00:00:04.519 --> 00:00:06.860
Imagine a giant in the tech world, someone everyone

00:00:06.860 --> 00:00:11.529
knows, getting sort of overtaken in their most

00:00:11.529 --> 00:00:13.570
important fight. That's what we're unpacking

00:00:13.570 --> 00:00:15.449
today. Right. And it's not just about, you know,

00:00:15.449 --> 00:00:17.210
who's winning right now. It's really about how

00:00:17.210 --> 00:00:19.730
AI itself is evolving. We're going to deep dive

00:00:19.730 --> 00:00:22.129
into a whole stack of fresh insights. We'll look

00:00:22.129 --> 00:00:24.489
at new market dynamics and breakthrough tools

00:00:24.489 --> 00:00:27.789
and even these autonomous AI agents that might

00:00:27.789 --> 00:00:30.449
just change, well, everything. So what does all

00:00:30.449 --> 00:00:32.609
this mean for you? For anyone trying to navigate

00:00:32.609 --> 00:00:35.689
this fast-paced world, we've got the nuggets,

00:00:35.810 --> 00:00:38.189
the insights, and maybe a few surprising facts

00:00:38.189 --> 00:00:41.780
along the way. Let's explore it. Okay, so our

00:00:41.780 --> 00:00:43.520
deep dive kicks off with what a lot of people

00:00:43.520 --> 00:00:45.039
are calling the most surprising shift in the

00:00:45.039 --> 00:00:48.240
AI market, looking at mid-2025. For the very

00:00:48.240 --> 00:00:50.520
first time, it seems Anthropic's Claude models have

00:00:50.520 --> 00:00:53.960
jumped to number one for enterprise AI, and

00:00:53.960 --> 00:00:55.840
this isn't just a small nudge, it feels like

00:00:55.840 --> 00:00:58.679
a pretty monumental change. It really is. And

00:00:58.679 --> 00:01:01.000
what's fascinating is the data that backs this

00:01:01.000 --> 00:01:03.579
up. This comes from Menlo Ventures. Claude now

00:01:03.579 --> 00:01:06.280
apparently holds about 32% of enterprise usage.

00:01:06.519 --> 00:01:08.959
That actually puts it ahead of OpenAI, which

00:01:08.959 --> 00:01:12.260
is sitting at 25%. And if you zoom in specifically

00:01:12.260 --> 00:01:14.560
on coding workloads, you know, where AI is really

00:01:14.560 --> 00:01:16.959
being put through its paces daily, Claude's lead

00:01:16.959 --> 00:01:21.469
is even bigger: 42% compared to OpenAI's 21%.

00:01:21.469 --> 00:01:24.030
And this is a complete flip from just last year,

00:01:24.150 --> 00:01:27.670
2023. Back then, OpenAI had, what, 50%? Dominant.

00:01:27.689 --> 00:01:31.250
Totally. And Claude was way back at 12%. So,

00:01:31.250 --> 00:01:34.689
yeah. A seismic shift in barely a year. It is

00:01:34.689 --> 00:01:36.890
a really striking contrast, isn't it? Yeah. I

00:01:36.890 --> 00:01:38.650
mean, OpenAI has definitely captured the public

00:01:38.650 --> 00:01:40.689
imagination, dominated the consumer headlines,

00:01:40.810 --> 00:01:43.909
processing, what was it, an incredible 2.5 billion

00:01:43.909 --> 00:01:47.010
prompts a day. Wow. Huge numbers. But while all

00:01:47.010 --> 00:01:48.790
that consumer spotlight was burning so bright,

00:01:48.890 --> 00:01:51.849
Claude was quietly, almost like stealthily, rising

00:01:51.849 --> 00:01:53.450
up, becoming what a lot of developers are now

00:01:53.450 --> 00:01:56.170
calling their best friend. Especially for companies

00:01:56.170 --> 00:01:58.010
actually building and shipping real products.

00:01:58.109 --> 00:01:59.829
It highlights a very different battleground for

00:01:59.829 --> 00:02:02.219
AI. Exactly. And the dev chatter, you know, the

00:02:02.219 --> 00:02:05.439
buzz among developers really backs this up. We're

00:02:05.439 --> 00:02:07.819
hearing companies consistently choose Claude

00:02:07.819 --> 00:02:10.460
because it's better at long-form tasks. It also

00:02:10.460 --> 00:02:12.860
seems to be easier to plug into their existing

00:02:12.860 --> 00:02:15.840
workflows, experiences fewer failures, apparently,

00:02:16.020 --> 00:02:19.370
and just generally feels, well, more enterprise

00:02:19.370 --> 00:02:21.210
ready. That's the phrase people use. These aren't

00:02:21.210 --> 00:02:23.250
just minor points. They speak to that core reliability,

00:02:23.530 --> 00:02:26.150
that predictability businesses really need. And

00:02:26.150 --> 00:02:28.629
building that kind of deep reliability, that

00:02:28.629 --> 00:02:31.050
trust, especially in the enterprise world, that's

00:02:31.050 --> 00:02:33.569
incredibly hard work. Once you've got it, it's

00:02:33.569 --> 00:02:36.270
even harder for someone else to unseat you. So

00:02:36.270 --> 00:02:38.669
this signals that, OK, OpenAI might still be

00:02:38.669 --> 00:02:42.030
the king of consumer AI for now. But they absolutely

00:02:42.030 --> 00:02:45.090
need a stronger B2B strategy. Yeah. And probably

00:02:45.090 --> 00:02:47.969
fast. Yeah. Because the enterprise AI space is

00:02:47.969 --> 00:02:50.409
evolving super quickly and others are clearly

00:02:50.409 --> 00:02:52.669
leapfrogging them right here. Yeah. And if you

00:02:52.669 --> 00:02:54.289
connect this to the bigger picture, like why

00:02:54.289 --> 00:02:57.409
it matters, it basically means Claude is increasingly

00:02:57.409 --> 00:03:00.289
seen as the safe bet for both startups and big

00:03:00.289 --> 00:03:02.370
enterprises when they're picking their first

00:03:02.370 --> 00:03:05.150
or their main large language model, their LLM.

00:03:05.250 --> 00:03:07.349
Right. That foundational AI model trained on

00:03:07.349 --> 00:03:10.069
huge amounts of text to understand and talk like

00:03:10.069 --> 00:03:12.430
a human. Exactly. So Claude's got this image now:

00:03:12.430 --> 00:03:16.629
quiet, reliable, well supported, constantly improving.

00:03:16.629 --> 00:03:19.770
Makes it a very appealing kind of low-risk choice

00:03:19.770 --> 00:03:21.389
when you're putting something into production.

00:03:21.389 --> 00:03:24.729
So this quiet reliability of Claude, what's the

00:03:24.729 --> 00:03:27.289
fundamental impact for businesses, then? It lets

00:03:27.289 --> 00:03:30.889
companies ship products faster, more confidently.

00:03:30.909 --> 00:03:33.229
It's just a dependable choice. Okay, let's switch

00:03:33.229 --> 00:03:35.770
gears. Let's talk about some other quickfire

00:03:35.770 --> 00:03:38.009
AI highlights that caught our eye, starting with

00:03:38.009 --> 00:03:40.710
a really important one about privacy. So it came

00:03:40.710 --> 00:03:44.050
out that Google had kind of quietly indexed shared

00:03:44.050 --> 00:03:47.689
links from ChatGPT, which potentially made some

00:03:47.689 --> 00:03:51.129
private chats public. OpenAI stepped in pretty

00:03:51.129 --> 00:03:53.879
quickly, shut that down. Yeah. But

00:03:53.879 --> 00:03:55.819
it's a really potent reminder for you listening.

00:03:55.939 --> 00:03:58.419
If you've ever shared a GPT link, maybe double

00:03:58.419 --> 00:04:00.659
check your settings. Be mindful of what goes

00:04:00.659 --> 00:04:02.879
public. Yeah, definitely a good wake up call

00:04:02.879 --> 00:04:05.400
for, you know, digital hygiene. Right. And speaking

00:04:05.400 --> 00:04:07.620
of shifts, Microsoft put out these fascinating

00:04:07.620 --> 00:04:10.699
lists recently looking at AI's impact on jobs.

00:04:10.819 --> 00:04:14.300
They flagged, like, 40 jobs potentially impacted,

00:04:14.539 --> 00:04:16.680
maybe doomed, as some headlines put it. A little

00:04:16.680 --> 00:04:19.300
dramatic, maybe. Ah, yeah. And then 40 jobs considered

00:04:19.300 --> 00:04:21.839
safe. But what's really insightful isn't just

00:04:21.839 --> 00:04:24.779
the specific jobs. It's the types of tasks AI

00:04:24.779 --> 00:04:27.240
is automating. You know, repetitive stuff,

00:04:27.240 --> 00:04:29.920
data-heavy, predictable tasks. Versus the jobs proving

00:04:29.920 --> 00:04:32.379
resilient, which often need complex problem solving,

00:04:32.540 --> 00:04:34.579
creativity, that kind of unique human interaction.

00:04:34.939 --> 00:04:37.319
It's a snapshot of how work itself is changing.

00:04:37.540 --> 00:04:39.899
That's a great way to look at it. And talking

00:04:39.899 --> 00:04:42.300
about breakthroughs that show AI's evolving

00:04:42.300 --> 00:04:44.860
power. This is where it gets really interesting.

00:04:45.040 --> 00:04:47.480
There's this viral super agent called Manus.

00:04:47.600 --> 00:04:50.100
It just launched a feature called wide research.

00:04:50.300 --> 00:04:53.540
Now, Manus is this AI system focused on automated

00:04:53.540 --> 00:04:56.579
research, putting knowledge together and wide

00:04:56.579 --> 00:04:59.519
research. It lets multiple AI agents work together

00:04:59.519 --> 00:05:02.660
simultaneously, processing truly massive amounts

00:05:02.660 --> 00:05:06.139
of data. Whoa. I mean, imagine scaling that.

00:05:06.750 --> 00:05:09.029
billion queries, you let multiple specialized

00:05:09.029 --> 00:05:12.250
agents just attack that data together, each from

00:05:12.250 --> 00:05:14.610
a different angle. That's a whole new level of

00:05:14.610 --> 00:05:16.889
scalable, intelligent insight. It's pretty incredible

00:05:16.889 --> 00:05:19.589
to think about. And, uh, on the app development

00:05:19.589 --> 00:05:23.019
side, there was this cool real-world test. A founder

00:05:23.019 --> 00:05:25.800
took Lovable, that's an AI app builder, and Replit,

00:05:25.879 --> 00:05:27.920
the online coding platform, and tried to clone

00:05:27.920 --> 00:05:30.920
five viral apps using both head to head. Apparently,

00:05:30.920 --> 00:05:33.019
a clear winner emerged pretty quickly for speed

00:05:33.019 --> 00:05:35.259
and ease of replication. Just highlights how

00:05:35.259 --> 00:05:36.899
accessible app development is getting, even if

00:05:36.899 --> 00:05:39.060
you're not a hardcore coder. Right. And Apple.

00:05:39.120 --> 00:05:41.240
They've been getting some heat for maybe lagging

00:05:41.240 --> 00:05:43.939
a bit in the AI race. But it sounds like they're

00:05:43.939 --> 00:05:47.310
making significant moves now, planning to significantly

00:05:47.310 --> 00:05:50.029
grow their AI investments, being very open to

00:05:50.029 --> 00:05:52.850
acquisitions. Their goal isn't necessarily like

00:05:52.850 --> 00:05:55.990
launching standalone AI gadgets. It seems more

00:05:55.990 --> 00:05:58.410
about deeply integrating AI into their existing

00:05:58.410 --> 00:06:01.170
stuff. Yeah, the ecosystem. Exactly. Enhancing

00:06:01.170 --> 00:06:04.529
Siri, maybe better photo processing, just making

00:06:04.529 --> 00:06:06.949
user experiences smoother across their products.

00:06:07.009 --> 00:06:09.170
It's more about that pervasive, maybe subtle

00:06:09.170 --> 00:06:12.050
AI integration. And underpinning all these advances,

00:06:12.149 --> 00:06:14.430
you need the infrastructure, right? So Fal, an

00:06:14.430 --> 00:06:16.139
AI infrastructure company, pretty significant

00:06:16.139 --> 00:06:18.680
one, they just raised a big round, $125 million,

00:06:18.959 --> 00:06:22.259
led by Meritech, but Salesforce Ventures and

00:06:22.259 --> 00:06:25.600
Google's AI Futures Fund chipped in too. Shows

00:06:25.600 --> 00:06:27.420
there's still huge investment going into not

00:06:27.420 --> 00:06:29.660
just the AI models themselves, but the kind of

00:06:29.660 --> 00:06:32.199
plumbing, the tech that lets them scale and run.

00:06:32.439 --> 00:06:34.480
So thinking about all these different pieces,

00:06:34.680 --> 00:06:37.779
privacy fixes, job impacts, scalable research,

00:06:38.000 --> 00:06:40.720
app building, how quickly are these advancements

00:06:40.720 --> 00:06:43.129
really shaping our daily tech lives? Oh, they're

00:06:43.129 --> 00:06:45.949
integrating super fast, making everyday tasks

00:06:45.949 --> 00:06:48.430
and even complex development feel more intelligent.

00:06:48.910 --> 00:06:51.569
Okay, let's pivot now. Let's talk about how AI

00:06:51.569 --> 00:06:55.509
is moving beyond just chatting. Moving into real

00:06:55.509 --> 00:06:58.069
automation, we're starting to see this new standard

00:06:58.069 --> 00:07:00.550
emerge. It's called the Model Context Protocol,

00:07:00.829 --> 00:07:03.750
or MCP. Yeah, MCP. What's cool about it is how

00:07:03.750 --> 00:07:06.129
much it simplifies things, potentially. Think

00:07:06.129 --> 00:07:08.810
of it as giving an AI hands to interact with

00:07:08.810 --> 00:07:10.870
the digital world, not just a voice. That's the

00:07:10.870 --> 00:07:14.089
analogy people use. It lets AI use your existing

00:07:14.089 --> 00:07:16.329
tools, connect to different applications, and

00:07:16.329 --> 00:07:19.360
really automate complex workflows. So before

00:07:19.360 --> 00:07:21.519
MCP, maybe an AI could tell you how to do something

00:07:21.519 --> 00:07:24.259
complex online. Now it could actually do it for

00:07:24.259 --> 00:07:26.720
you. That's a massive leap in practical automation

00:07:26.720 --> 00:07:29.420
from just advising to actually executing. It

00:07:29.420 --> 00:07:31.660
really is. And that leap brings us to things

00:07:31.660 --> 00:07:33.980
like building AI web apps. You can actually build

00:07:33.980 --> 00:07:36.620
custom lead-magnet web apps now. You know, those

00:07:36.620 --> 00:07:38.540
little tools websites use to capture customer

00:07:38.540 --> 00:07:41.639
info. You can build those in minutes using tools

00:07:41.639 --> 00:07:45.139
like ChatGPT or Claude, often with no coding

00:07:45.139 --> 00:07:47.680
required, which means even if you're not a developer,

00:07:47.899 --> 00:07:51.019
you can create these powerful customer-attracting

00:07:51.019 --> 00:07:54.019
tools. That lowers the barrier to entry for digital

00:07:54.019 --> 00:07:57.100
business so much. And Google Gemini is also pushing

00:07:57.100 --> 00:07:59.319
workflows forward. They've got this suite of

00:07:59.319 --> 00:08:01.339
like 28 free features. And these aren't just

00:08:01.339 --> 00:08:03.360
minor tricks. They let you automate tasks, analyze

00:08:03.360 --> 00:08:05.959
data, even build working apps pretty quickly.

00:08:06.439 --> 00:08:10.360
Imagine an AI quickly going through a huge spreadsheet

00:08:10.360 --> 00:08:13.579
of customer feedback, identifying the key sentiment

00:08:13.579 --> 00:08:16.120
patterns, and then just automatically drafting

00:08:16.120 --> 00:08:18.220
a summary report for you. It's all about working

00:08:18.220 --> 00:08:21.339
smarter, you know, not necessarily harder. Exactly.

00:08:21.600 --> 00:08:26.040
And here are just a few specific new AI tools

00:08:26.040 --> 00:08:28.360
that caught our eye, really illustrating this

00:08:28.360 --> 00:08:31.560
shift to practical automation. First, there's

00:08:31.560 --> 00:08:35.740
Vibe n8n. This helps you build and tweak n8n

00:08:35.740 --> 00:08:37.539
workflows. Which is that open source automation

00:08:37.539 --> 00:08:40.139
software. Right. But you do it just by prompting.

00:08:40.350 --> 00:08:42.730
Using natural language, like telling your automation

00:08:42.730 --> 00:08:45.009
software what to build instead of clicking through

00:08:45.009 --> 00:08:47.750
endless menus. Then there's Project OS. This

00:08:47.750 --> 00:08:49.690
one helps you get a personalized resume that

00:08:49.690 --> 00:08:51.710
really grabs attention. It doesn't just format.

00:08:51.769 --> 00:08:54.750
It kind of crafts your story. Next up, Buki.

00:08:55.450 --> 00:08:58.110
Plans, writes, and even distributes content that's

00:08:58.110 --> 00:09:00.149
designed to be authentic and convert customers.

00:09:01.049 --> 00:09:04.029
Huge for marketing. Content creation is big.

00:09:04.210 --> 00:09:07.639
And finally, Launch. This one creates fully functional

00:09:07.639 --> 00:09:10.179
apps, but with AI assistance and human support.

00:09:10.740 --> 00:09:13.059
Kind of bridging the gap between just an idea

00:09:13.059 --> 00:09:16.320
and a real deployable product. Makes app creation

00:09:16.320 --> 00:09:18.720
way more accessible. So looking at all these

00:09:18.720 --> 00:09:20.899
new tools, what's the main thing someone should

00:09:20.899 --> 00:09:22.500
take away if they're thinking about exploring

00:09:22.500 --> 00:09:25.700
them? They really simplify complex jobs. From

00:09:25.700 --> 00:09:28.259
automating workflows to creating content, powerful

00:09:28.259 --> 00:09:31.100
tools are becoming much more accessible.

00:09:31.480 --> 00:09:33.659
Okay, we've got a couple more AI quick hits for

00:09:33.659 --> 00:09:35.620
you before we move on. There's this clever new

00:09:35.620 --> 00:09:37.580
tutorial floating around, shows you how to turn

00:09:37.580 --> 00:09:40.080
your resume into a Netflix-style web page using

00:09:40.080 --> 00:09:43.460
AI. Huh, like browser skills? Kind of. It treats

00:09:43.460 --> 00:09:45.700
your profile like a personalized streaming experience.

00:09:45.820 --> 00:09:47.980
It's creative, right? Speaks to that broader

00:09:47.980 --> 00:09:50.639
trend of AI enabling really customized digital

00:09:50.639 --> 00:09:54.179
experiences. Interesting. And back to privacy

00:09:54.179 --> 00:09:56.860
again for a second. OpenAI recently canceled

00:09:56.860 --> 00:09:58.559
a feature they were working on. Yeah. It would

00:09:58.559 --> 00:10:00.919
let users search through their private GPT chats.

00:10:01.470 --> 00:10:04.590
Oh, wow. Yeah. So canceling it reinforces that

00:10:04.590 --> 00:10:06.490
theme we talked about earlier. Companies are

00:10:06.490 --> 00:10:09.129
really having to react quickly to public and

00:10:09.129 --> 00:10:11.990
regulatory concerns about data access security.

00:10:12.309 --> 00:10:14.549
Shows that constant tension, doesn't it? Between

00:10:14.549 --> 00:10:18.090
convenience and privacy in AI. Definitely. And,

00:10:18.210 --> 00:10:20.169
you know, there's also this growing chatter about

00:10:20.169 --> 00:10:23.470
the real cost of chasing AGI. Artificial General

00:10:23.470 --> 00:10:26.370
Intelligence. Machines with like human level

00:10:26.370 --> 00:10:29.039
thinking. Exactly. And the consensus seems to

00:10:29.039 --> 00:10:31.379
be that power consolidation is just the beginning.

00:10:31.600 --> 00:10:34.659
The sheer resources needed, computation, money

00:10:34.659 --> 00:10:37.980
to even attempt AGI means maybe only a handful

00:10:37.980 --> 00:10:40.580
of big players can truly compete. Raises some

00:10:40.580 --> 00:10:42.419
big questions about who controls that kind of

00:10:42.419 --> 00:10:44.899
power, who gets access. Yeah, deep questions.

00:10:45.139 --> 00:10:47.679
And speaking of resources and expansion, OpenAI

00:10:47.679 --> 00:10:50.620
also announced Stargate Norway. That's going

00:10:50.620 --> 00:10:52.700
to be its first European AI data center, which

00:10:52.700 --> 00:10:55.080
is significant, not just for raw computing power,

00:10:55.299 --> 00:10:58.049
but for things like data sovereignty. Regional

00:10:58.049 --> 00:11:01.049
AI development. Could kickstart more localized

00:11:01.049 --> 00:11:04.490
AI hubs in Europe. And one more quick one. Poe,

00:11:04.490 --> 00:11:06.590
that's the platform letting you use a bunch of

00:11:06.590 --> 00:11:08.470
different AI models. Right, like a buffet of

00:11:08.470 --> 00:11:11.090
AIs. Ha, yeah. They just released a developer

00:11:11.090 --> 00:11:13.950
API. So now developers can programmatically access

00:11:13.950 --> 00:11:15.789
all those different cutting-edge models through

00:11:15.789 --> 00:11:18.710
one single point. Makes it way easier to experiment,

00:11:18.889 --> 00:11:20.909
mix and match, integrate different AI skills into

00:11:20.909 --> 00:11:23.950
their own apps. Okay, but now here's where things

00:11:23.950 --> 00:11:28.259
get maybe really interesting. And

00:11:28.259 --> 00:11:30.700
perhaps a little mind-bending. Are you tired

00:11:30.700 --> 00:11:34.179
of cleaning data sets, painstakingly tuning models,

00:11:34.419 --> 00:11:37.399
debugging those awful CUDA errors yourself? Yeah,

00:11:37.399 --> 00:11:39.399
been there. Imagine an AI that just does all

00:11:39.399 --> 00:11:41.440
that for you. Meet NEO. They're calling it the

00:11:41.440 --> 00:11:44.620
Kaggle Killer AI agent. Yeah, this is genuinely

00:11:44.620 --> 00:11:46.940
remarkable stuff from a company called NeoAI.

00:11:47.159 --> 00:11:49.799
So NEO is an AI agent, but it's actually made

00:11:49.799 --> 00:11:51.919
of 11 specialized sub-agents working together.

00:11:52.000 --> 00:11:54.460
And it's designed meticulously to do basically

00:11:54.460 --> 00:11:56.960
everything a full-stack machine learning engineer

00:11:56.960 --> 00:12:01.460
does. And here's the kicker. Autonomously. Autonomously?

00:12:01.559 --> 00:12:03.580
Yeah. It's like having a whole AI engineering

00:12:03.580 --> 00:12:05.860
team just ready to go. Okay, let's break that

00:12:05.860 --> 00:12:08.399
down. So you give it a problem like, build me

00:12:08.399 --> 00:12:12.200
a predictive model using this data, right? Then

00:12:12.200 --> 00:12:16.100
NEO plans the whole project, writes the code.

00:12:16.610 --> 00:12:19.490
Debugs the code, deploys the solution, all in

00:12:19.490 --> 00:12:22.309
this continuous loop, iterating. And it runs

00:12:22.309 --> 00:12:24.570
in a safe sandbox environment so it can try stuff

00:12:24.570 --> 00:12:27.269
without messing things up. It communicates with

00:12:27.269 --> 00:12:29.309
you through chat. You can jump in, guide it if

00:12:29.309 --> 00:12:31.830
you want, or you can just watch it work, solving

00:12:31.830 --> 00:12:34.090
complex problems by itself. It's pretty incredible

00:12:34.090 --> 00:12:35.730
to think about. Yeah, it's way more than just

00:12:35.730 --> 00:12:38.870
one tool. It's like you took ChatGPT plus AutoML

00:12:38.870 --> 00:12:41.970
that's automated machine learning, simplifies

00:12:41.970 --> 00:12:44.450
building models. Right. Plus DevOps, the whole

00:12:44.450 --> 00:12:46.289
software development and operations pipeline

00:12:46.289 --> 00:12:48.909
and put them all together. But like on steroids,

00:12:49.129 --> 00:12:51.490
it's this integrated intelligence system handling

00:12:51.490 --> 00:12:54.009
the whole workflow. And the performance. This

00:12:54.009 --> 00:12:56.730
is what's truly astonishing. They benchmarked

00:12:56.730 --> 00:13:00.350
NEO across 75 Kaggle competitions. Yeah. Used

00:13:00.350 --> 00:13:03.929
a standard framework, MLE-bench. And it didn't

00:13:03.929 --> 00:13:08.169
just, like, participate. It medaled in 34.2%

00:13:08.169 --> 00:13:11.809
of them. Wow. Yeah. Actual competition grade

00:13:11.809 --> 00:13:15.549
performance. often beating human teams. It even

00:13:15.549 --> 00:13:18.330
outperformed other advanced AI agents like RD

00:13:18.330 --> 00:13:22.210
Agent and OpenAI's own AIDE stack. You know, I

00:13:22.210 --> 00:13:24.289
still wrestle with prompt drift myself sometimes

00:13:24.289 --> 00:13:26.789
where the AI output just kind of changes unexpectedly.

00:13:27.029 --> 00:13:29.149
Oh yeah, frustrating. So an agent that can actually

00:13:29.149 --> 00:13:32.190
course correct autonomously, learn and adapt.

00:13:32.700 --> 00:13:35.139
as it hits problems. That feels like a real game

00:13:35.139 --> 00:13:37.620
changer for the field. And crucially, they built

00:13:37.620 --> 00:13:41.299
in this human-in-the-loop mode, which means you

00:13:41.299 --> 00:13:43.919
can intervene any time. You can tweak its logic,

00:13:44.000 --> 00:13:46.120
give it feedback, add constraints midstream,

00:13:46.200 --> 00:13:48.679
or just chat with it like it's another member

00:13:48.679 --> 00:13:50.940
of the team. It's designed to be fast, iterative,

00:13:51.080 --> 00:13:53.120
and really importantly, it's not a black box.

00:13:53.360 --> 00:13:56.539
Ah, transparency. Good. Yeah, its processes

00:13:56.539 --> 00:13:58.559
are transparent, so you can oversee it, collaborate

00:13:58.559 --> 00:14:00.620
with it. This feels like more than just a neat

00:14:00.620 --> 00:14:03.039
tool, though. It feels like a clear signal. OpenAI,

00:14:03.299 --> 00:14:07.000
Google, Adept. You can bet someone's already

00:14:07.000 --> 00:14:09.120
working hard on their own version of Neo. Oh,

00:14:09.120 --> 00:14:11.519
for sure. The agent wars, where these autonomous

00:14:11.519 --> 00:14:14.179
AI systems compete to solve really complex problems,

00:14:14.419 --> 00:14:17.600
they feel like they're truly just getting started

00:14:17.600 --> 00:14:21.929
now. So thinking about NEO. How does this level

00:14:21.929 --> 00:14:24.830
of autonomy really change the game for human

00:14:24.830 --> 00:14:28.409
ML engineers? It handles the complex, often tedious

00:14:28.409 --> 00:14:31.370
tasks, freeing up human engineers for higher-level

00:14:31.370 --> 00:14:34.570
innovation and strategy. Okay, let's try and

00:14:34.570 --> 00:14:36.490
bring this all together. What's really clear

00:14:36.490 --> 00:14:38.850
from all this is, wow, the AI landscape is shifting

00:14:38.850 --> 00:14:42.139
fast, almost daily. And in ways that are both,

00:14:42.220 --> 00:14:44.840
you know, subtle and really profound. Yeah, we're

00:14:44.840 --> 00:14:46.360
definitely seeing that big pivot, aren't we?

00:14:46.419 --> 00:14:49.460
From AI as maybe a fun consumer novelty towards

00:14:49.460 --> 00:14:52.399
its role in serious enterprise reliability, with

00:14:52.399 --> 00:14:54.279
players like Claude just quietly gaining massive

00:14:54.279 --> 00:14:56.639
ground in the business world, really challenging

00:14:56.639 --> 00:14:58.600
the established leaders there.

00:14:59.460 --> 00:15:02.059
And then beyond just those market battles, these

00:15:02.059 --> 00:15:04.779
AI tools themselves are becoming incredibly practical,

00:15:05.000 --> 00:15:06.639
really accessible. They're empowering pretty

00:15:06.639 --> 00:15:08.659
much anyone, really, regardless of their tech

00:15:08.659 --> 00:15:11.299
background. To automate complex tasks, analyze

00:15:11.299 --> 00:15:13.679
data, even build functional apps without needing

00:15:13.679 --> 00:15:16.539
to write a single line of code. But maybe the

00:15:16.539 --> 00:15:18.419
most profound development, the one that really

00:15:18.419 --> 00:15:22.860
sparks the most curiosity, is the rise of these

00:15:22.860 --> 00:15:26.559
truly autonomous agents like NEO, systems that

00:15:26.559 --> 00:15:28.779
are now capable of handling complex engineering

00:15:28.779 --> 00:15:32.279
tasks, often with minimal human oversight. This

00:15:32.279 --> 00:15:34.740
feels like where AI really starts to genuinely

00:15:34.740 --> 00:15:37.899
collaborate with us or maybe even lead in the

00:15:37.899 --> 00:15:40.179
act of creation itself. It's an incredibly exciting

00:15:40.179 --> 00:15:42.120
time, isn't it? Just watching these technologies

00:15:42.120 --> 00:15:44.799
grow up, mature from like fascinating concepts

00:15:44.799 --> 00:15:49.059
into tools that feel indispensable, redefining

00:15:49.059 --> 00:15:51.559
what's even possible. And leaves us with an important

00:15:51.559 --> 00:15:53.600
question, I think, for you, our listener, to

00:15:53.600 --> 00:15:56.039
consider. As AI does become more autonomous,

00:15:56.220 --> 00:15:59.320
more capable, what new forms of human-AI collaboration

00:15:59.320 --> 00:16:01.779
are really going to emerge? And maybe more importantly,

00:16:01.940 --> 00:16:04.360
what skills become the most valuable for us humans

00:16:04.360 --> 00:16:06.740
in this rapidly evolving landscape? Definitely

00:16:06.740 --> 00:16:08.360
something to chew on as you go about your day.

00:16:08.519 --> 00:16:10.379
Thanks so much for joining us for this deep dive

00:16:10.379 --> 00:16:12.360
into the latest in AI. Yeah, we hope you found

00:16:12.360 --> 00:16:14.019
some valuable nuggets in there, something to

00:16:14.019 --> 00:16:15.879
help you stay well-informed and hopefully curious.

00:16:16.279 --> 00:16:17.960
Until next time.