WEBVTT

00:00:00.000 --> 00:00:02.180
So imagine this. You've got an advanced AI, something

00:00:02.180 --> 00:00:03.779
like Claude. It's winning hacking competitions,

00:00:04.019 --> 00:00:06.599
really impressive stuff. But then here's the

00:00:06.599 --> 00:00:10.060
twist. It actually helps attackers find critical

00:00:10.060 --> 00:00:13.500
security flaws in itself. It's this really deep

00:00:13.500 --> 00:00:16.699
paradox, isn't it? AI evolves so fast. Amazing

00:00:16.699 --> 00:00:19.500
capabilities emerge. But then you see these inherent

00:00:19.500 --> 00:00:22.879
kind of unsettling risks laid bare, this tension.

00:00:23.659 --> 00:00:25.940
creation versus vulnerability. That's something

00:00:25.940 --> 00:00:27.760
we're really going to dig into today. Welcome

00:00:27.760 --> 00:00:29.559
to the Deep Dive. We're taking all the latest

00:00:29.559 --> 00:00:31.579
AI news and challenges from a pretty packed newsletter,

00:00:31.760 --> 00:00:33.579
and we're breaking it down for you, making it

00:00:33.579 --> 00:00:35.780
clear, making it digestible. Our plan. First,

00:00:35.920 --> 00:00:38.520
unpack OpenAI's big kind of surprising shift

00:00:38.520 --> 00:00:40.759
towards open models. Then, look at the sheer

00:00:40.759 --> 00:00:43.020
speed of AI innovation right now. It's dizzying.

00:00:43.100 --> 00:00:44.640
We'll check out some new tools, new techniques

00:00:44.640 --> 00:00:46.539
people are using, and then, yeah, circle back

00:00:46.539 --> 00:00:49.020
to that really fascinating, maybe slightly terrifying

00:00:49.020 --> 00:00:51.079
security thing with Claude. We want to give you

00:00:51.079 --> 00:00:53.299
the clarity, the details that stick, and really

00:00:53.299 --> 00:00:55.340
connect it all back to what it means for your

00:00:55.340 --> 00:00:57.960
grasp of this whole AI landscape. Yeah, absolutely.

00:00:58.240 --> 00:01:00.520
And just quickly, before we jump right in, maybe

00:01:00.520 --> 00:01:02.579
clarify a couple of terms we'll be throwing around.

00:01:02.679 --> 00:01:06.000
So, open-weight models. Think of these like, well,

00:01:06.040 --> 00:01:08.340
the core AI mechanics, the weights, they're public.

00:01:08.560 --> 00:01:10.640
You can grab them, use them, tweak them for what

00:01:10.640 --> 00:01:14.079
you need. But, and this is key, it's not the

00:01:14.079 --> 00:01:16.709
full picture. You don't get the original training

00:01:16.709 --> 00:01:19.849
data or the whole architecture blueprint. That's

00:01:19.849 --> 00:01:22.109
what makes something truly open source. And this

00:01:22.109 --> 00:01:25.709
isn't quite that. Okay. And then agentic tasks.

00:01:26.049 --> 00:01:28.269
This is basically when an AI isn't just answering

00:01:28.269 --> 00:01:31.109
your question, right? It's given a goal. And

00:01:31.109 --> 00:01:33.090
then it actually uses external tools, maybe searches

00:01:33.090 --> 00:01:35.469
the web, runs some code to go out and achieve

00:01:35.469 --> 00:01:38.090
that goal itself. It's like giving the AI a job

00:01:38.090 --> 00:01:40.269
to do with tools. Got it. OK, so let's start

00:01:40.269 --> 00:01:42.310
with that big news. OpenAI. It feels like a

00:01:42.310 --> 00:01:44.170
major pivot. I mean, five years, they've mostly

00:01:44.170 --> 00:01:46.269
kept their best stuff locked down. And now, boom,

00:01:46.450 --> 00:01:49.409
new open-weight models for everyone. For

00:01:49.409 --> 00:01:51.609
OpenAI, that feels huge. Oh, it's absolutely huge.

00:01:52.120 --> 00:01:53.579
And they're specific about what they're offering.

00:01:53.739 --> 00:01:58.019
Two main sizes. You've got gpt-oss-120b, super powerful,

00:01:58.219 --> 00:02:01.879
but get this, designed to run on just one

00:02:01.879 --> 00:02:05.299
high-end NVIDIA GPU. That's pretty efficient. Then

00:02:05.299 --> 00:02:08.139
there's the smaller one, gpt-oss-20b, lightweight

00:02:08.139 --> 00:02:11.259
enough for a regular laptop, just needs 16 gigs

00:02:11.259 --> 00:02:13.819
of RAM. And the important part for both, you

00:02:13.819 --> 00:02:15.860
can fine-tune them, customize them. They're

00:02:15.860 --> 00:02:17.639
built for those agentic tasks we just talked

00:02:17.639 --> 00:02:19.659
about. Plus, and this is a big deal for businesses,

00:02:19.900 --> 00:02:22.819
fully commercially usable. Apache 2.0 license.

00:02:23.039 --> 00:02:24.939
You can find them on Hugging Face right now.

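NOTE
Since the checkpoints are on Hugging Face, here is a minimal sketch of picking and loading one. The model IDs and the ~16 GB laptop versus single high-end GPU sizing come from the discussion above; the transformers usage is a hedged illustration, not official sample code.
```python
def pick_gpt_oss(available_ram_gb: float) -> str:
    """Pick the gpt-oss checkpoint that plausibly fits the hardware.
    Per the episode: the 20B model targets laptops with ~16 GB of RAM,
    while the 120B model wants a single high-end NVIDIA GPU (~80 GB)."""
    return "openai/gpt-oss-120b" if available_ram_gb >= 80 else "openai/gpt-oss-20b"
# Loading sketch (assumes a recent `transformers`; downloads many GB of weights,
# so it is shown here but not executed):
#   from transformers import pipeline
#   generate = pipeline("text-generation", model=pick_gpt_oss(16))
#   print(generate("Explain the Apache 2.0 license in one line.")[0]["generated_text"])
```
The Apache 2.0 license mentioned above is what makes running these commercially, as sketched here, permissible.
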
00:02:25.060 --> 00:02:27.860
Okay. Accessible, tunable, sounds promising.

00:02:28.120 --> 00:02:29.979
But how do they actually stack up performance

00:02:29.979 --> 00:02:32.659
-wise? The sources suggest they're actually outperforming

00:02:32.659 --> 00:02:35.300
some existing open models, like DeepSeek R1 and

00:02:35.479 --> 00:02:37.219
Qwen, on certain things. Yeah, that's right.

00:02:37.280 --> 00:02:39.439
On specific benchmarks, like coding, think Codeforces,

00:02:39.439 --> 00:02:42.379
and some reasoning tasks like HLE. They

00:02:42.379 --> 00:02:44.560
seem to hold their own or even beat them. Which

00:02:44.560 --> 00:02:46.159
is impressive for new kids on the block. But

00:02:46.159 --> 00:02:48.319
there's always a but, right? Yeah, pretty much.

00:02:48.379 --> 00:02:50.740
They definitely fall short of OpenAI's own closed

00:02:50.740 --> 00:02:53.639
models, like their o3 or o4-mini versions,

00:02:53.780 --> 00:02:57.939
especially on more complex stuff. And hallucinations,

00:02:57.939 --> 00:02:59.379
you've got to watch out for those. The rate's

00:02:59.379 --> 00:03:02.300
higher, up to 53% on PersonQA, which tests

00:03:02.300 --> 00:03:04.960
factual stuff about people. So powerful, yes,

00:03:05.159 --> 00:03:08.000
but limitations are there. It's expected with

00:03:08.000 --> 00:03:09.780
these smaller models, really. Also text only.

00:03:10.080 --> 00:03:12.259
No images, no audio yet, but definitely built

00:03:12.259 --> 00:03:14.020
for those agent workflows calling external tools.

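NOTE
A toy sketch of the agentic, tool-calling loop described here: the model is given a goal and works through external tools instead of just answering. The tools and the fixed plan below are illustrative stand-ins; a real agent would let the LLM choose each step from its observations so far.
```python
def search_web(query: str) -> str:        # stand-in for a real web-search tool
    return f"results for {query!r}"
def run_code(source: str) -> str:         # stand-in for a sandboxed code executor
    return str(eval(source))              # toy only; never eval untrusted input
TOOLS = {"search": search_web, "python": run_code}
def toy_agent(goal: str, plan: list[tuple[str, str]]) -> list[str]:
    """Run a fixed plan of (tool, argument) steps and collect observations."""
    observations = []
    for tool_name, arg in plan:
        observations.append(TOOLS[tool_name](arg))
    return observations
obs = toy_agent("add 2 and 3, then search for the sum",
                [("python", "2 + 3"), ("search", "the number 5")])
```
The "job with tools" framing from earlier is exactly this loop, just with a real model deciding the plan.
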
00:03:14.219 --> 00:03:17.060
Interesting tradeoffs. So stepping back, why?

00:03:17.219 --> 00:03:20.099
Why would OpenAI, the company practically synonymous

00:03:20.099 --> 00:03:23.620
with closed AI, make this move now? Well, the

00:03:23.620 --> 00:03:25.740
why it matters here is pretty fascinating. It

00:03:25.740 --> 00:03:27.780
looks like OpenAI is deliberately getting in

00:03:27.780 --> 00:03:30.509
the ring with companies like Meta. And also some

00:03:30.509 --> 00:03:33.830
big Chinese labs: DeepSeek, Qwen, Moonshot. These

00:03:33.830 --> 00:03:35.889
guys have been winning over developers like crazy

00:03:35.889 --> 00:03:38.889
with their open models. And Sam Altman, OpenAI's

00:03:38.889 --> 00:03:41.430
CEO, even said before they felt they were kind

00:03:41.430 --> 00:03:43.689
of on the wrong side of history being so closed

00:03:43.689 --> 00:03:47.469
off. So this move, it lines up way better with

00:03:47.469 --> 00:03:50.310
their original 2015 mission, remember. AI for

00:03:50.310 --> 00:03:52.969
wide benefit. Whether you see it as smart PR

00:03:52.969 --> 00:03:55.810
or a real change of heart, it definitely shakes

00:03:55.810 --> 00:03:58.289
up the power dynamics. For developers, one source

00:03:58.289 --> 00:04:01.810
called it gold, pure gold. It kind of re-decentralizes

00:04:01.810 --> 00:04:03.870
things a bit, doesn't it? Let's innovation happen

00:04:03.870 --> 00:04:06.210
outside the usual big tech players. So what's

00:04:06.210 --> 00:04:08.870
the real impact of OpenAI's shift to these more

00:04:08.870 --> 00:04:11.870
accessible models for you? Simply put, it gives

00:04:11.870 --> 00:04:14.509
developers way more power and options, which

00:04:14.509 --> 00:04:16.310
directly shapes what they can actually build.

00:04:16.490 --> 00:04:18.990
Right. That shift is huge. And it kind of leads

00:04:18.990 --> 00:04:21.290
us into just the sheer amount of stuff happening

00:04:21.290 --> 00:04:23.490
in AI generally. It's almost impossible to keep

00:04:23.490 --> 00:04:25.670
up, isn't it? Ooh, tell me about it. It's just

00:04:25.670 --> 00:04:28.350
constant. But some cool highlights recently

00:04:28.829 --> 00:04:30.910
really show the range. Like there was this viral

00:04:30.910 --> 00:04:33.610
prompt going around from Bucko Capital for doing

00:04:33.610 --> 00:04:36.189
super deep company research, like professional

00:04:36.189 --> 00:04:38.550
level stuff. Just got an update, apparently sharper

00:04:38.550 --> 00:04:41.689
now. Then, Claude Opus 4.1 came out. Not quite

00:04:41.689 --> 00:04:44.589
the next big generation, 5.0 or whatever, but

00:04:44.589 --> 00:04:47.170
a more precise updated version of their best

00:04:47.170 --> 00:04:50.529
model. That's live now. Character.ai. They launched

00:04:50.529 --> 00:04:53.990
this new social feed, but it's all AI characters

00:04:53.990 --> 00:04:56.850
interacting. Kind of wild. And Google Gemini,

00:04:56.990 --> 00:04:59.069
this is neat, lets you create personalized kids'

00:04:59.069 --> 00:05:02.189
storybooks with illustrations in minutes across

00:05:02.189 --> 00:05:04.370
like 45 languages. You can even use your own

00:05:04.370 --> 00:05:06.949
photos to set the style. Crazy creative power.

00:05:07.230 --> 00:05:09.029
It's not just cool features, though, is it? There's

00:05:09.029 --> 00:05:11.589
this growing economic story, too. Axios was saying

00:05:11.589 --> 00:05:13.529
AI is basically propping up the U.S. economy

00:05:13.529 --> 00:05:15.649
right now. And I saw Derek Thompson put it super

00:05:15.649 --> 00:05:19.949
bluntly. GDP growth equals AI capex. Ha! He's

00:05:19.949 --> 00:05:22.290
directly linking investment in AI infrastructure

00:05:22.290 --> 00:05:25.189
to, well, the whole economy growing. That's fundamental.

00:05:25.410 --> 00:05:26.810
And you see it in the real world, absolutely.

00:05:27.069 --> 00:05:29.870
Big funding rounds, real impact. Look at Carbyne.

00:05:29.970 --> 00:05:32.649
They just landed $100 million to improve their

00:05:32.649 --> 00:05:35.389
AI system for 911 calls. Their revenue growth

00:05:35.389 --> 00:05:39.470
is like 477%, operating in nearly 300 places

00:05:39.470 --> 00:05:42.290
now. That's AI directly touching public safety,

00:05:42.430 --> 00:05:45.050
making emergency response better. It's not just

00:05:45.050 --> 00:05:48.129
abstract tech anymore. How do these varied AI

00:05:48.129 --> 00:05:51.029
advances reshape your daily interactions or even

00:05:51.029 --> 00:05:53.959
the economy? Well, AI is subtly changing how

00:05:53.959 --> 00:05:56.699
we work, how we create things, and yeah, even

00:05:56.699 --> 00:05:58.879
how our economies tick. It really is creeping

00:05:58.879 --> 00:06:01.439
into everything. So, OK, beyond the big headlines,

00:06:01.660 --> 00:06:03.680
how are people actually using this stuff, building

00:06:03.680 --> 00:06:05.579
with it, finding new ways to make things work,

00:06:05.639 --> 00:06:08.000
maybe even make money? Yeah, good question. On

00:06:08.000 --> 00:06:09.819
the practical side, you see lots of

00:06:09.819 --> 00:06:13.100
head-to-head comparisons now, like ChatGPT agents versus

00:06:13.100 --> 00:06:16.420
Genspark AI. Which one's better for coding? For

00:06:16.420 --> 00:06:19.019
research, for writing stuff. People are figuring

00:06:19.019 --> 00:06:21.800
out the right tool for the job. But there's also

00:06:21.800 --> 00:06:24.379
this deeper idea emerging called context engineering.

00:06:24.639 --> 00:06:26.879
It's about moving past just simple prompts. You

00:06:26.879 --> 00:06:29.759
give the AI much richer information, better strategies,

00:06:29.819 --> 00:06:32.600
so it becomes less of a tool and more of an actual

00:06:32.600 --> 00:06:34.660
intelligent partner. Building smarter agents

00:06:34.660 --> 00:06:38.139
faster. Think of it like giving the AI a really

00:06:38.139 --> 00:06:39.920
good briefing before a mission. That makes sense.

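NOTE
A minimal sketch of the context-engineering idea above: instead of a bare question, you assemble a structured briefing, role, facts, constraints, into the prompt. The field names and example values here are our own illustration, not a standard schema.
```python
def build_context(role: str, facts: list[str], constraints: list[str], task: str) -> str:
    """Assemble a structured briefing into a single prompt string."""
    fact_lines = "\n".join(f"- {f}" for f in facts)
    rule_lines = "\n".join(f"- {c}" for c in constraints)
    return (f"You are {role}.\n"
            f"Known facts:\n{fact_lines}\n"
            f"Constraints:\n{rule_lines}\n"
            f"Task: {task}")
prompt = build_context(
    role="a financial research assistant",
    facts=["ACME reported Q2 revenue of $10M"],
    constraints=["cite a source for every figure"],
    task="Summarize ACME's quarter in two sentences.",
)
```
The same briefing would go out as the system or user message to whichever model you are driving; that's the "good briefing before a mission."
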
00:06:40.100 --> 00:06:42.399
And I saw something really intriguing for anyone

00:06:42.399 --> 00:06:46.879
who creates content online: the idea of AI SEO, coupled

00:06:46.879 --> 00:06:49.920
with Cloudflare's pay-per-crawl model. The

00:06:49.920 --> 00:06:51.819
suggestion is your content could actually get

00:06:51.819 --> 00:06:53.639
paid just for being crawled and found useful

00:06:53.639 --> 00:06:56.180
by an AI, even if no human clicks on it. It's

00:06:56.180 --> 00:06:58.759
a direct response to Google doing more zero-click

00:06:58.759 --> 00:07:01.500
searches, right? Imagine getting paid just because

00:07:01.500 --> 00:07:04.959
your info is valuable to an AI. Wild. Super interesting

00:07:04.959 --> 00:07:07.759
concept. And yeah, the tools keep coming. Rapid

00:07:07.759 --> 00:07:11.060
fire. Covertro lets you build AI agents fast,

00:07:11.259 --> 00:07:13.920
no coding needed, kind of democratizing that

00:07:13.920 --> 00:07:16.480
context engineering thing. Embeddable helps you

00:07:16.480 --> 00:07:18.459
build interactive web tools just by chatting

00:07:18.459 --> 00:07:21.220
with an AI, making web dev easier. WritingMate

00:07:21.220 --> 00:07:24.259
3.0, one subscription, access to multiple different

00:07:24.259 --> 00:07:27.519
AI models, simplifies things. And Qwen-Image

00:07:27.519 --> 00:07:29.660
is apparently getting really good at drawing

00:07:29.660 --> 00:07:32.319
text accurately within images and precise editing.

00:07:32.439 --> 00:07:35.000
That's always been tricky for image AI. It just

00:07:35.000 --> 00:07:37.180
shows how specialized and diverse the tools are

00:07:37.180 --> 00:07:39.220
getting. Other quick updates that caught my eye?

00:07:39.220 --> 00:07:42.600
Just some quick hits. OpenAI models now on AWS

00:07:42.600 --> 00:07:44.000
for the first time. That's big for enterprise

00:07:44.000 --> 00:07:47.220
adoption. Google's NotebookLM, their research

00:07:47.220 --> 00:07:49.279
assistant tool, is opening up to younger users,

00:07:49.279 --> 00:07:52.480
13 and up. Google also mentioned their new Genie

00:07:52.480 --> 00:07:55.959
3 model, framing it as a step towards AGI. Always

00:07:55.959 --> 00:07:58.519
interesting when they talk AGI. ElevenLabs dropped

00:07:58.620 --> 00:08:00.300
Eleven Music, letting you generate your own music

00:08:00.300 --> 00:08:02.839
tracks. And the GSA, the U.S. government procurement

00:08:02.839 --> 00:08:06.120
agency, added OpenAI, Google, and Anthropic to

00:08:06.120 --> 00:08:08.740
their list of approved AI vendors. So AI is definitely

00:08:08.740 --> 00:08:10.819
going government mainstream. Which of these new

00:08:10.819 --> 00:08:13.360
tools or methods do you think will be most impactful

00:08:13.360 --> 00:08:16.740
for daily workflows? I think tools simplifying

00:08:16.740 --> 00:08:19.579
AI agent creation and maybe those new content

00:08:19.579 --> 00:08:22.899
monetization models seem pretty promising. Okay.

00:08:23.230 --> 00:08:25.490
Let's shift to our final segment. And honestly,

00:08:25.629 --> 00:08:28.069
this one is the most mind-bending. The story

00:08:28.069 --> 00:08:30.589
about Claude AI discovering its own security

00:08:30.589 --> 00:08:34.529
flaws. It's both brilliant and deeply alarming.

00:08:34.769 --> 00:08:36.970
Right. It's exactly that paradox we started with.

00:08:37.169 --> 00:08:39.450
So researchers found two pretty critical security

00:08:39.450 --> 00:08:42.149
holes in Claude code. And the wild part is Claude

00:08:42.149 --> 00:08:43.909
itself kind of helped them find these weaknesses

00:08:43.909 --> 00:08:46.490
during the testing process. So, flaw number

00:08:46.490 --> 00:08:50.710
one. It's got a fancy number: CVE-2025-54794.

00:08:51.850 --> 00:08:53.830
Basically, Claude was tricked into escaping its

00:08:53.830 --> 00:08:56.509
sandbox, its designated working area. The path

00:08:56.509 --> 00:08:58.149
checking wasn't strong enough. It wandered off

00:08:58.149 --> 00:08:59.529
where it shouldn't have. So it broke out of its

00:08:59.529 --> 00:09:01.909
cage. Yeah, very much. And then flaw number

00:09:01.909 --> 00:09:06.389
two, CVE-2025-54795. Claude has a list of safe

00:09:06.389 --> 00:09:08.730
commands it can use. But attackers figured out

00:09:08.730 --> 00:09:11.190
how to sneak malicious code inside those supposedly

00:09:11.190 --> 00:09:14.230
safe commands. It really hammers home. These

00:09:14.230 --> 00:09:16.409
AI systems can reason. And that reasoning ability

00:09:16.409 --> 00:09:18.029
can be turned back on them to break their own

00:09:18.029 --> 00:09:20.029
rules. Okay, but here's the part that just...

00:09:20.529 --> 00:09:23.190
floors me. Despite having these flaws, Claude

00:09:23.190 --> 00:09:25.309
is simultaneously winning hacking competitions.

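NOTE
To make the two flaw classes just described concrete, here is a generic sketch, our own illustration, not Anthropic's actual code or the real fixes: a path check that misses ".." traversal out of the sandbox, and a command allowlist that only inspects the first word.
```python
import os
SANDBOX = "/tmp/sandbox"
def naive_path_ok(path: str) -> bool:
    return path.startswith(SANDBOX)       # fooled by /tmp/sandbox/../etc/passwd
def strict_path_ok(path: str) -> bool:
    resolved = os.path.normpath(path)     # collapse ".." first (real fixes also resolve symlinks)
    return resolved == SANDBOX or resolved.startswith(SANDBOX + os.sep)
ALLOWED = {"ls", "cat", "echo"}
def naive_command_ok(command: str) -> bool:
    parts = command.split()
    return bool(parts) and parts[0] in ALLOWED  # "echo hi; rm -rf /" still passes
```
The naive checks accept exactly the inputs an attacker would try, which is the general shape of both CVEs.
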
00:09:25.509 --> 00:09:27.450
Exactly. That's the stunning part. It ranked

00:09:27.450 --> 00:09:30.470
in the top 3% in picoCTF, which is a big student

00:09:30.470 --> 00:09:33.210
hacking contest. And on Hack the Box, a cybersecurity

00:09:33.210 --> 00:09:35.990
training site, it solved 19 out of 20 challenges

00:09:35.990 --> 00:09:39.620
they gave it. Whoa. Just pause and think about

00:09:39.620 --> 00:09:42.620
that. An AI that can not only solve complex hacking

00:09:42.620 --> 00:09:45.059
challenges better than most humans, but also

00:09:45.059 --> 00:09:47.559
help discover its own ways to be hacked. Yeah,

00:09:47.820 --> 00:09:49.860
brilliant and terrifying is the only way to put

00:09:49.860 --> 00:09:52.360
it. It shows this incredible, almost self-aware

00:09:52.360 --> 00:09:54.960
level of problem solving, but the implications

00:09:54.960 --> 00:09:57.759
for security are huge. The source we looked at

00:09:57.759 --> 00:09:59.799
put it really starkly. If Claude can reverse

00:09:59.799 --> 00:10:02.220
engineer its own sandbox and start solving CTFs

00:10:02.220 --> 00:10:04.279
better than most humans, we've crossed a line.

00:10:04.419 --> 00:10:06.179
We definitely have. It means we're essentially

00:10:06.179 --> 00:10:09.809
building AI hackers now. And whether they end

00:10:09.809 --> 00:10:11.889
up being security tools for us or tools used

00:10:11.889 --> 00:10:14.509
against us. Well, that depends entirely on how

00:10:14.509 --> 00:10:16.710
well we can build those sandboxes, those controls.

00:10:16.870 --> 00:10:20.169
And honestly, I still wrestle with how you truly

00:10:20.169 --> 00:10:22.809
sandbox something this intelligent, especially

00:10:22.809 --> 00:10:25.330
when it learns and adapts so damn fast. It's

00:10:25.330 --> 00:10:27.110
a massive challenge. It keeps me up sometimes

00:10:27.110 --> 00:10:29.149
thinking about it. That vulnerability. Yeah.

00:10:29.210 --> 00:10:32.820
That challenge is real. So boiling it down. What

00:10:32.820 --> 00:10:35.200
are the immediate concerns or opportunities when

00:10:35.200 --> 00:10:38.580
AI can both hack systems and find its own flaws?

00:10:38.799 --> 00:10:41.080
It means cybersecurity just got exponentially

00:10:41.080 --> 00:10:43.259
more complex. We need way more sophisticated

00:10:43.259 --> 00:10:45.919
guardrails, constant vigilance, the opportunity.

00:10:46.139 --> 00:10:48.379
Maybe AI could be the ultimate security auditor,

00:10:48.500 --> 00:10:51.139
finding flaws we miss. The concern, that same

00:10:51.139 --> 00:10:53.080
intelligence could be the ultimate weapon if

00:10:53.080 --> 00:10:55.279
not contained. It's a double-edged sword, isn't

00:10:55.279 --> 00:10:56.879
it? All right, let's try and pull

00:10:56.879 --> 00:10:59.440
the threads together here. What are the big ideas

00:10:59.440 --> 00:11:01.669
from this deep dive? We've definitely seen this

00:11:01.669 --> 00:11:04.690
ongoing trend towards democratizing AI, right?

00:11:04.789 --> 00:11:07.929
OpenAI's move with open models puts serious

00:11:07.929 --> 00:11:10.950
power into more hands, which should spur innovation.

00:11:11.289 --> 00:11:15.269
And just the sheer pace, it's relentless. From

00:11:15.269 --> 00:11:18.309
AI creating kids' books to literally driving

00:11:18.309 --> 00:11:21.250
economic growth figures. It's everywhere. And

00:11:21.250 --> 00:11:23.669
threaded through it all is that core duality,

00:11:23.690 --> 00:11:26.850
that paradox. Immense power for good, better

00:11:26.850 --> 00:11:29.870
research tools, safer emergency responses. But

00:11:29.870 --> 00:11:32.309
right alongside it, these incredibly complex

00:11:32.309 --> 00:11:36.090
challenges of control, security, safety. Claude

00:11:36.090 --> 00:11:37.750
finding its own flaws is just the most vivid

00:11:37.750 --> 00:11:39.549
example. It underscores both the brilliance we're

00:11:39.549 --> 00:11:41.450
unlocking and the absolute need for caution,

00:11:41.529 --> 00:11:43.769
for thoughtful governance as we build these things.

00:11:43.970 --> 00:11:46.490
So here's the thought to leave you with. As these

00:11:46.490 --> 00:11:48.970
AI models get more open, more powerful, and even

00:11:48.970 --> 00:11:51.500
capable of analyzing themselves, What does that

00:11:51.500 --> 00:11:54.259
mean for our responsibility? All of us. How do

00:11:54.259 --> 00:11:56.460
we collectively shape where this goes, make sure

00:11:56.460 --> 00:11:58.940
it develops safely? Something to chew on. Definitely.

00:11:58.980 --> 00:12:00.600
Keep digging into it yourself. The conversation's

00:12:00.600 --> 00:12:02.740
only getting started. Thanks for joining us for

00:12:02.740 --> 00:12:05.200
this deep dive into the world of AI. Yeah, thanks

00:12:05.200 --> 00:12:07.200
for listening. We appreciate it.

00:12:07.200 --> 00:12:07.559
[Outro music]
