WEBVTT

00:00:00.000 --> 00:00:02.740
Imagine for a second an AI, one that finishes

00:00:02.740 --> 00:00:06.639
those really complex Excel tasks, but in seconds.

00:00:07.160 --> 00:00:10.400
Well, our sources suggest it's not just theory

00:00:10.400 --> 00:00:12.919
anymore. It's reportedly here. And actually beating

00:00:12.919 --> 00:00:15.539
human experts at it. That's the kicker. We're

00:00:15.539 --> 00:00:18.339
digging into AI agents today. Yeah. From those

00:00:18.339 --> 00:00:20.920
spreadsheet wizards to this tiny AI that seems

00:00:20.920 --> 00:00:23.679
to kind of think like a brain. Get ready for

00:00:23.679 --> 00:00:26.359
some big shifts. Welcome back to the Deep Dive,

00:00:26.359 --> 00:00:28.760
everyone. Today, we are really unpacking your

00:00:28.760 --> 00:00:31.140
latest sources, focusing hard on the cutting

00:00:31.140 --> 00:00:33.780
edge of AI agents. Okay, let's get into it. Yeah,

00:00:33.840 --> 00:00:36.259
let's do it. Our mission, like always, is navigating

00:00:36.259 --> 00:00:39.880
this really fast-moving AI landscape. We try

00:00:39.880 --> 00:00:41.799
to boil down what matters most for you. Think

00:00:41.799 --> 00:00:43.399
of it as your shortcut, you know, understanding

00:00:43.399 --> 00:00:45.719
where AI is heading and how it's changing work,

00:00:45.780 --> 00:00:48.509
like right now. We'll kick off with an AI agent

00:00:48.509 --> 00:00:50.549
that's making some serious waves in finance.

00:00:51.049 --> 00:00:53.630
Then we'll kind of zoom out, look at how AI is

00:00:53.630 --> 00:00:56.649
shaping other areas, video creation, corporate

00:00:56.649 --> 00:00:59.289
strategy, that sort of thing. And finally, we've

00:00:59.289 --> 00:01:02.250
got this really surprising new AI architecture.

00:01:03.200 --> 00:01:05.200
Honestly, it could change everything we thought

00:01:05.200 --> 00:01:08.019
we knew about building intelligent systems. It's

00:01:08.019 --> 00:01:10.319
pretty wild. Okay. Let's jump right into segment

00:01:10.319 --> 00:01:14.340
one. Our sources are calling it an Excel revolution.

00:01:14.540 --> 00:01:17.680
Big words. Yeah. And it centers on this new spreadsheet

00:01:17.680 --> 00:01:20.719
native AI agent. It's called Shortcut. Right.

00:01:20.819 --> 00:01:23.879
And this isn't just like a fancy macro. No, not

00:01:23.879 --> 00:01:26.120
at all. It's way beyond that. This agent, it

00:01:26.120 --> 00:01:28.859
actually builds entire financial models, fills

00:01:28.859 --> 00:01:32.439
out data. And get this: it one-shots tasks, tasks

00:01:32.439 --> 00:01:34.799
usually given to junior analysts. Exactly. There's

00:01:34.799 --> 00:01:38.319
this demo where it auto populated a whole discounted

00:01:38.319 --> 00:01:41.780
cash flow model, a DCF. Which is... Pretty complex

00:01:41.780 --> 00:01:44.159
stuff, yeah. Totally. And it did it using raw

00:01:44.159 --> 00:01:46.640
data, just pulled straight from SEC filings,

00:01:46.680 --> 00:01:49.819
no human input. Wow. So it handles the data gathering,

00:01:49.959 --> 00:01:52.099
the formatting, the formulas, everything. The

00:01:52.099 --> 00:01:55.180
whole workflow. Data, formatting, formulas, checking,

00:01:55.420 --> 00:01:58.260
all in one go. It's kind of like watching a whole

00:01:58.260 --> 00:02:01.340
team work, but super fast, like a bot. Okay,

00:02:01.379 --> 00:02:03.280
that's impressive. What about performance? How

00:02:03.280 --> 00:02:05.459
does it stack up? Well, the sources detailed

00:02:05.459 --> 00:02:08.280
these benchmarks. They put Shortcut up against

00:02:08.280 --> 00:02:11.400
actual junior analysts. From like... Top firms.

00:02:11.539 --> 00:02:14.060
Yeah. Private equity, banking, consulting, product

00:02:14.060 --> 00:02:18.280
roles, the big leagues. OK. And here's the kicker.

00:02:18.500 --> 00:02:21.340
Shortcut reportedly won 89.1

00:02:21.340 --> 00:02:23.879
percent of the time. 89 percent, scored

00:02:23.879 --> 00:02:26.120
by managers from those firms. That's what the

00:02:26.120 --> 00:02:28.740
sources say. It's pretty dominant. OK, well.

00:02:29.159 --> 00:02:31.680
Any other comparisons? It also apparently beat

00:02:31.680 --> 00:02:35.599
the ChatGPT agent in 90% of head-to-head tests.

00:02:35.680 --> 00:02:38.180
90%. Okay. Got to add the usual caveats, right?

00:02:38.199 --> 00:02:41.419
Early benchmarks, maybe fuzzy criteria for winning.

00:02:41.599 --> 00:02:43.780
Sure. Absolutely. Need to see more verification.

00:02:43.780 --> 00:02:46.460
But the buzz, it's definitely there. Oh, yeah.

00:02:46.919 --> 00:02:49.460
Finance influencers calling it a ChatGPT moment

00:02:49.460 --> 00:02:52.199
for Excel. You can practically hear the jaws

00:02:52.199 --> 00:02:54.379
dropping in finance departments. It really makes

00:02:54.379 --> 00:02:56.860
you think about that warning from Dario Amodei,

00:02:56.860 --> 00:02:59.319
you know, about... potentially half of junior

00:02:59.319 --> 00:03:03.159
office jobs just vanishing because of AI. Because

00:03:03.159 --> 00:03:05.759
this isn't just automating one cell, it's full

00:03:05.759 --> 00:03:08.159
task automation. Right. It's the whole process,

00:03:08.460 --> 00:03:11.180
getting data, building the model, writing analysis,

00:03:11.340 --> 00:03:14.340
even exporting it. It's a huge shift, like machines

00:03:14.340 --> 00:03:18.199
and factories. But now it's bots, one-shotting

00:03:18.199 --> 00:03:21.080
spreadsheets. Makes you really consider the skills

00:03:21.080 --> 00:03:23.900
we'll need going forward. So with Shortcut showing

00:03:23.900 --> 00:03:27.379
this kind of power, what does this really mean

00:03:27.379 --> 00:03:30.139
right now for those detailed, repetitive financial

00:03:30.139 --> 00:03:33.599
analysis tasks? Is it a total rewrite? Oh, absolutely.

00:03:33.840 --> 00:03:36.340
It signals a clear shift. Yeah. That routine

00:03:36.340 --> 00:03:39.560
spreadsheet grunt work. It's automatable now,

00:03:39.719 --> 00:03:41.840
plain and simple. Okay, so Shortcut's shaping

00:03:41.840 --> 00:03:44.159
up Excel, but that's just one piece, right? Exactly.

00:03:44.199 --> 00:03:46.680
The sources show a much bigger picture. AI's

00:03:46.680 --> 00:03:49.379
capabilities are expanding everywhere. The industry's

00:03:49.379 --> 00:03:51.939
shifting fast. So moving beyond spreadsheets,

00:03:51.939 --> 00:03:53.969
what else is happening? Well... For one, we're

00:03:53.969 --> 00:03:56.050
seeing more complex systems. Take Claude, right?

00:03:56.189 --> 00:03:58.610
They've added this sub-agents feature. Sub-agents.

00:03:58.610 --> 00:04:00.509
What's that? It lets you build basically teams

00:04:00.509 --> 00:04:02.969
of agents. They can tackle multiple tasks at

00:04:02.969 --> 00:04:05.810
the same time working in parallel. So you give

00:04:05.810 --> 00:04:07.789
it a project and it kind of assembles its own

00:04:07.789 --> 00:04:10.729
digital team. Pretty much. Imagine that for complex

00:04:10.729 --> 00:04:12.770
workflows. And then there's this thing called

00:04:12.770 --> 00:04:16.269
JSON prompting. Sources are calling it the most

00:04:16.269 --> 00:04:20.089
underrated AI skill of 2025. Yeah, JSON prompting.

00:04:20.410 --> 00:04:22.470
It sounds technical, but it's basically a way

00:04:22.470 --> 00:04:25.089
to give AI really precise instructions. So it

00:04:25.089 --> 00:04:27.250
knows exactly what you need. Exactly. Structured

00:04:27.250 --> 00:04:30.670
instructions. Leads to reliable answers and, crucially,

00:04:30.670 --> 00:04:34.449
no hallucinations. Turns models like GPT, Gemini,

00:04:34.449 --> 00:04:38.009
Claude into consistent, reliable agents. Super

00:04:38.009 --> 00:04:40.490
important for professional use. Okay, makes sense.

00:04:40.490 --> 00:04:43.189
What else? We're also seeing AI just blend into

00:04:43.189 --> 00:04:45.949
everyday tools more. Google Search's AI Mode.

00:04:45.949 --> 00:04:48.899
Yeah, it can now see homework. You can upload a

00:04:48.899 --> 00:04:51.899
PDF or just use your camera. And ask AI mode

00:04:51.899 --> 00:04:54.339
for answers based on it. Yep. Makes research

00:04:54.339 --> 00:04:57.139
or figuring stuff out way faster. And ChatGPT

00:04:57.139 --> 00:05:00.259
has that new study mode. Right. Like a personalized

00:05:00.259 --> 00:05:02.879
tutor in your pocket. Adaptive lessons, the works.

00:05:03.040 --> 00:05:04.899
Feels like a real study partner. Interesting.

00:05:04.980 --> 00:05:06.879
What about the creative side? Runway just launched

00:05:06.879 --> 00:05:09.740
Aleph. It's a new video model. In-context video

00:05:09.740 --> 00:05:13.240
generation. Multitask visual stuff. So better

00:05:13.240 --> 00:05:17.000
AI video tools. Letting creators do wilder things,

00:05:17.160 --> 00:05:20.449
basically. More control, more flexibility for

00:05:20.449 --> 00:05:22.709
imaginative content. And then there was that

00:05:22.709 --> 00:05:25.110
story about Meta. Oh, yeah. The poaching attempt.

00:05:25.629 --> 00:05:28.750
Wild story. They reportedly tried to hire away

00:05:28.750 --> 00:05:31.149
like half the staff from Thinking Machines Lab.

00:05:31.350 --> 00:05:33.509
A top research group. And Zuckerberg himself

00:05:33.509 --> 00:05:35.649
was apparently messaging people. Offers over

00:05:35.649 --> 00:05:39.769
a billion dollars mentioned. Yeah. Not one person

00:05:39.769 --> 00:05:43.209
left. Wow. Tells you a lot about what top AI

00:05:43.209 --> 00:05:45.370
talent actually wants, doesn't it? Yeah. It's

00:05:45.370 --> 00:05:47.279
not just the money. Meanwhile, you've got the

00:05:47.279 --> 00:05:49.699
National Science Foundation, the NSF, putting

00:05:49.699 --> 00:05:52.139
serious cash into AI research. $100 million.

00:05:52.579 --> 00:05:54.399
That's significant. And they're partnering with

00:05:54.399 --> 00:05:57.620
big names, Simons Foundation, NIST, Capital One,

00:05:57.800 --> 00:05:59.939
Intel. Yeah, it's a big deal. It's not just about

00:05:59.939 --> 00:06:02.120
apps. It's foundational stuff. Material science,

00:06:02.459 --> 00:06:05.199
new tech, pushing boundaries in the physical

00:06:05.199 --> 00:06:07.480
world with AI's help. Okay, let's connect that

00:06:07.480 --> 00:06:09.180
back. If we look at that Meta poaching attempt

00:06:09.180 --> 00:06:12.779
failing, what does that tell us about the AI

00:06:12.779 --> 00:06:16.259
talent market beyond just the insane money? It

00:06:16.259 --> 00:06:19.259
really says top AI talent seeks more than just

00:06:19.259 --> 00:06:22.000
cash. They want meaningful work, innovative places,

00:06:22.220 --> 00:06:24.459
real frontier stuff. All right, let's shift gears

00:06:24.459 --> 00:06:26.560
slightly. How about a quick round of some new

00:06:26.560 --> 00:06:29.040
AI tools and maybe some industry quick hits,

00:06:29.079 --> 00:06:30.819
stuff that's changing things day to day? Yeah,

00:06:30.860 --> 00:06:35.019
let's do it. Rapid fire. First up. Kasi AI. Kasi

00:06:35.019 --> 00:06:38.300
AI. Turns long videos into short viral clips.

00:06:38.519 --> 00:06:41.720
It finds the good bits automatically. Think AI

00:06:41.720 --> 00:06:45.240
content strategist for creators. Big time saver.

00:06:45.259 --> 00:06:48.019
Potential reach booster. Okay. Useful. Next.

00:06:48.339 --> 00:06:51.420
Cache Scene. Removes backgrounds. Enhances photos

00:06:51.420 --> 00:06:54.139
instantly. Super common task. Now super fast.

00:06:54.339 --> 00:06:56.600
For designers. Or just, you know, touching up

00:06:56.600 --> 00:06:59.180
photos. Handy. And Immersity for mobile. Turn

00:06:59.180 --> 00:07:01.680
your flat 2D images into immersive 3D videos.

00:07:01.920 --> 00:07:03.680
Kind of cool for making more dynamic content

00:07:03.680 --> 00:07:05.959
right on your phone. Neat. What about layout.dev?

00:07:05.959 --> 00:07:08.720
This one's for developers, designers maybe.

00:07:08.879 --> 00:07:11.699
Turn simple ideas like a sketch or just text

00:07:11.699 --> 00:07:14.120
into working prototypes really fast. So speeds

00:07:14.120 --> 00:07:17.079
up that early ideation phase. Massively. Could

00:07:17.079 --> 00:07:18.920
really accelerate getting projects off the ground.

00:07:19.100 --> 00:07:20.740
Okay, those are some tools. Any quick industry

00:07:20.740 --> 00:07:24.250
news bites? Yeah, a few. Z.ai released a model

00:07:24.250 --> 00:07:26.449
cheaper than DeepSeek. Yeah. And it's free to

00:07:26.449 --> 00:07:28.250
download. Shows things are getting more accessible,

00:07:28.410 --> 00:07:30.610
maybe more decentralized. Interesting. What else?

00:07:30.670 --> 00:07:33.269
Meta's going to let job candidates use AI during

00:07:33.269 --> 00:07:36.350
coding tests. Really? Yeah. Kind of acknowledging

00:07:36.350 --> 00:07:38.430
it's just part of the workflow now. Pretty pragmatic.

00:07:38.750 --> 00:07:41.050
Makes sense. Google. They revealed an internal

00:07:41.050 --> 00:07:44.350
site, AI Savvy Google, for their non-tech employees.

00:07:44.910 --> 00:07:47.170
Just trying to get everyone up to speed on AI.

00:07:47.569 --> 00:07:50.069
Smart internal education. Good idea. Anthropic.

00:07:50.600 --> 00:07:53.339
They published research on using automated agents

00:07:53.339 --> 00:07:57.620
to audit other AI models. AI auditing AI. Yeah.

00:07:58.000 --> 00:08:00.779
Crucial for safety, reliability, making sure

00:08:00.779 --> 00:08:02.680
these things behave, especially as they get more

00:08:02.680 --> 00:08:04.800
autonomous. Definitely important. And finally,

00:08:04.939 --> 00:08:07.860
Spotify. Just hinting at a more conversational

00:08:07.860 --> 00:08:10.800
voice AI interface down the line. You know, chat

00:08:10.800 --> 00:08:13.339
with your AI DJ about playlists based on your

00:08:13.339 --> 00:08:15.569
mood. Kind of like talking to a friend. So all

00:08:15.569 --> 00:08:17.970
these different tools and updates. Yeah. How

00:08:17.970 --> 00:08:20.949
do these smaller things impact us, you know,

00:08:20.949 --> 00:08:24.189
directly? They basically bring powerful AI into

00:08:24.189 --> 00:08:27.050
everyday stuff, often behind the scenes. Makes

00:08:27.050 --> 00:08:29.410
digital life smoother, more intuitive. Okay.

00:08:29.670 --> 00:08:32.450
Fascinating stuff. Before we get to maybe the

00:08:32.450 --> 00:08:34.730
most mind-bending part, this new AI architecture,

00:08:35.110 --> 00:08:37.529
let's just take a quick moment for our sponsor.

00:08:38.269 --> 00:08:42.259
[Sponsor break] And we are back. Okay, that revolutionary

00:08:42.259 --> 00:08:45.360
architecture we teased. Seriously, this could

00:08:45.360 --> 00:08:47.500
shake things up. It challenges some basic ideas

00:08:47.500 --> 00:08:50.299
about how we build AI. Right. The sources talk

00:08:50.299 --> 00:08:52.399
about something called the Hierarchical Reasoning

00:08:52.399 --> 00:08:55.899
Model, HRM, from a group called Sapient Intelligence.

00:08:56.299 --> 00:09:00.399
And the key thing, it's tiny, but it uses this

00:09:00.399 --> 00:09:02.840
brain-like architecture. That's the buzz. Brain-like?

00:09:02.840 --> 00:09:05.559
How so? It's got two main parts working

00:09:05.559 --> 00:09:08.669
together. A planner module. That's the slow thinking

00:09:08.669 --> 00:09:11.570
part. Strategizing, like planning chess moves.

00:09:11.669 --> 00:09:13.289
Okay, thinking ahead. And then a worker module.

00:09:13.509 --> 00:09:16.190
That's the fast-acting part. Executing, like

00:09:16.190 --> 00:09:18.610
how your brain instantly recognizes a face, you

00:09:18.610 --> 00:09:20.690
know, quick processing. So they work together

00:09:20.690 --> 00:09:23.450
in a hierarchy. Exactly. They plan and solve

00:09:23.450 --> 00:09:26.409
in one go, one single forward pass. It's different

00:09:26.409 --> 00:09:28.330
from how many big language models work. And you

00:09:28.330 --> 00:09:31.169
said it's tiny. Unbelievably tiny. Only 27 million

00:09:31.169 --> 00:09:33.409
parameters. 27 million. How does that compare?

00:09:33.970 --> 00:09:37.110
GPT-1, the original: 117 million parameters.

00:09:37.679 --> 00:09:39.879
HRM is less than a quarter of the size. Wow.

00:09:40.080 --> 00:09:43.360
Okay. Tiny size. But what about performance?

00:09:43.580 --> 00:09:46.200
This is where it gets crazy. On the ARC-AGI

00:09:46.200 --> 00:09:49.240
benchmark, think of it like an AI IQ test for

00:09:49.240 --> 00:09:52.960
fluid intelligence. Yeah. HRM scored 40.3%.

00:09:52.960 --> 00:09:58.200
Claude 3.7, 21.2%. Another model, o3-mini-high,

00:09:58.200 --> 00:10:02.879
got 34.5%. Wait. So this tiny model, it

00:10:02.879 --> 00:10:04.740
outperformed much bigger ones on this reasoning

00:10:04.740 --> 00:10:06.639
test. Significantly outperformed them. Okay.

00:10:06.679 --> 00:10:08.940
Are there specific examples? Yeah. And they really

00:10:08.940 --> 00:10:12.019
highlight the difference. It solved 55% of Sudoku

00:10:12.019 --> 00:10:14.259
extreme puzzles. And other models? Claude and

00:10:14.259 --> 00:10:17.320
OpenAI scored 0% on those. Zero. Okay. Any others?

00:10:17.580 --> 00:10:20.799
It found the optimal path to nearly 75% of these

00:10:20.799 --> 00:10:23.399
really complex 30 by 30 mazes. And the others?

00:10:23.460 --> 00:10:25.500
Also 0%. It's not just a little better. It's

00:10:25.500 --> 00:10:27.539
solving problems the big guys just can't. Yeah,

00:10:27.559 --> 00:10:29.639
I still wrestle with prompt drift myself sometimes,

00:10:29.919 --> 00:10:32.360
getting the AI to stay on track. So the idea

00:10:32.360 --> 00:10:34.059
of a tiny model having this kind of consistent

00:10:34.059 --> 00:10:36.340
step-by-step reasoning, it's really fascinating.

00:10:36.500 --> 00:10:37.840
It makes you wonder if we've been too focused

00:10:37.840 --> 00:10:40.929
on just scaling up. Whoa, okay, that's... That's

00:10:40.929 --> 00:10:44.610
genuinely remarkable. A tiny AI outperforming

00:10:44.610 --> 00:10:47.830
models vastly larger, solving problems they can't

00:10:47.830 --> 00:10:50.190
even touch. It really does challenge the whole

00:10:50.190 --> 00:10:53.789
bigger is better idea in AI scaling. That's kind

00:10:53.789 --> 00:10:56.129
of mind blowing. Exactly. It's early days, sure,

00:10:56.330 --> 00:10:59.049
and it's not general purpose like GPT yet, but

00:10:59.049 --> 00:11:01.049
it suggests advanced reasoning might not require

00:11:01.049 --> 00:11:04.590
trillions of tokens or massive compute. So this

00:11:04.590 --> 00:11:07.769
raises a huge question then. Could this... brain-like

00:11:07.769 --> 00:11:10.490
hierarchical approach fundamentally change

00:11:10.490 --> 00:11:12.990
how we build AI in the future? Are we looking

00:11:12.990 --> 00:11:15.429
at a different path entirely? It definitely points

00:11:15.429 --> 00:11:18.750
towards efficiency and maybe more novel ways

00:11:18.750 --> 00:11:20.669
of thinking emerging rather than just relying

00:11:20.669 --> 00:11:23.110
on sheer scale. Could unlock AI for new areas,

00:11:23.230 --> 00:11:25.210
maybe inspire totally different kinds of intelligence.

00:11:25.490 --> 00:11:26.970
Okay, let's try and pull this all together. What's

00:11:26.970 --> 00:11:29.309
the big picture here? We've seen AI agents evolve

00:11:29.309 --> 00:11:32.389
incredibly quickly from tools to full task automation,

00:11:32.750 --> 00:11:36.100
spreadsheets, complex reasoning. Yeah, what strikes

00:11:36.100 --> 00:11:38.240
me is just the speed and the variety of innovation.

00:11:38.659 --> 00:11:41.740
It's not just about making AI bigger anymore.

00:11:41.899 --> 00:11:45.360
It's smarter, more specialized, way more efficient

00:11:45.360 --> 00:11:48.440
sometimes, tackling really sophisticated problems

00:11:48.440 --> 00:11:51.360
with amazing precision. So connecting it all

00:11:51.360 --> 00:11:54.259
up, the signal seems clear. AI is rapidly changing

00:11:54.259 --> 00:11:56.539
how we work, how we solve problems, not just

00:11:56.539 --> 00:11:59.940
in huge systems, but through these focused, intelligent

00:11:59.940 --> 00:12:03.279
agents in our everyday tools and tasks. Absolutely.

00:12:03.720 --> 00:12:06.080
It's becoming embedded. So this whole deep dive

00:12:06.080 --> 00:12:08.419
leaves us with a final thought, maybe something

00:12:08.419 --> 00:12:11.879
for you, the listener, to chew on. As these AI

00:12:11.879 --> 00:12:15.200
agents get better and more integrated, what new

00:12:15.200 --> 00:12:17.379
roles, what new opportunities actually open up

00:12:17.379 --> 00:12:19.539
for humans? Right. How does our own creativity,

00:12:19.600 --> 00:12:22.460
our problem solving? How do we adapt and evolve

00:12:22.460 --> 00:12:24.740
with these powerful new partners? It's not just

00:12:24.740 --> 00:12:26.720
about replacement. It's about partnership, right?

00:12:26.820 --> 00:12:28.899
Shifting maybe from doing the task to defining

00:12:28.899 --> 00:12:31.320
the problem or overseeing the AI, moving from

00:12:31.320 --> 00:12:33.860
rote work to more creative ideation. Could

00:12:33.860 --> 00:12:35.899
be. We really encourage you to keep exploring

00:12:35.899 --> 00:12:38.399
this, keep asking questions, keep thinking about

00:12:38.399 --> 00:12:40.919
how all this tech is shaping our collective future

00:12:40.919 --> 00:12:43.399
and your own place in it. Thank you for joining

00:12:43.399 --> 00:12:45.840
us on this deep dive. Until next time, keep that

00:12:45.840 --> 00:12:47.759
curiosity alive. [Outro music]