WEBVTT

00:00:00.000 --> 00:00:02.740
We often talk about AI in terms of, you know,

00:00:02.740 --> 00:00:05.360
the algorithms, the models themselves. But maybe

00:00:05.360 --> 00:00:07.299
the real story, the one that's kind of mind blowing

00:00:07.299 --> 00:00:11.119
right now is the physical scale. Yeah. The infrastructure.

00:00:11.419 --> 00:00:13.480
Oh, absolutely. It's staggering. And there's

00:00:13.480 --> 00:00:15.980
this one stat about OpenAI that just puts

00:00:15.980 --> 00:00:17.699
everything into perspective. They're apparently

00:00:17.699 --> 00:00:20.120
aiming to build something they call an infrastructure

00:00:20.120 --> 00:00:23.559
factory. An infrastructure factory. Yeah. Designed

00:00:23.559 --> 00:00:27.079
to produce one gigawatt of AI compute every

00:00:27.079 --> 00:00:32.950
single week. One gigawatt weekly? I mean, it's hard

00:00:32.950 --> 00:00:35.130
to even picture that kind of pace. It really

00:00:35.130 --> 00:00:36.770
is. It means they're operating on a completely

00:00:36.770 --> 00:00:39.030
different level. If you try to contextualize

00:00:39.030 --> 00:00:41.429
that, you're basically looking at combining like...

00:00:41.679 --> 00:00:44.200
All the big NVIDIA deployments, plus the Tesla

00:00:44.200 --> 00:00:47.359
Gigafactory, plus major AWS data centers, all

00:00:47.359 --> 00:00:49.420
running at once. And they plan to just keep doing

00:00:49.420 --> 00:00:51.539
that. Yeah. Week after week. That seems to be

00:00:51.539 --> 00:00:53.740
the plan. Yeah. Just to handle the training and

00:00:53.740 --> 00:00:56.140
deployment needs for whatever comes next. It

00:00:56.140 --> 00:00:58.520
forces a total rethink of, well, compute economics.
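That pace is easier to feel with a little arithmetic. Here's a back-of-envelope sketch in Python; the per-accelerator wattage and the overhead factor are illustrative assumptions, not disclosed figures:

```python
# Back-of-envelope: roughly how many accelerators fit in one gigawatt?
# Both constants below are illustrative assumptions, not disclosed numbers.

WATTS_PER_GPU = 700        # assumed draw of one high-end accelerator
OVERHEAD_FACTOR = 1.5      # assumed cooling/networking/CPU overhead per GPU
GIGAWATT = 1_000_000_000   # watts

watts_per_deployed_gpu = WATTS_PER_GPU * OVERHEAD_FACTOR
gpus_per_gigawatt = GIGAWATT // watts_per_deployed_gpu

print(f"~{gpus_per_gigawatt:,.0f} GPUs per gigawatt of capacity")
```

Under those assumptions, a gigawatt works out to roughly a million deployed accelerators, every week.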

00:00:59.200 --> 00:01:01.359
Welcome to the Deep Dive. Today, we're going

00:01:01.359 --> 00:01:04.609
to dive into the latest industry sources, cut

00:01:04.609 --> 00:01:07.109
through some of the noise and pull out the key

00:01:07.109 --> 00:01:11.189
things you probably need to know. We'll be looking

00:01:11.189 --> 00:01:14.230
at some big corporate shifts, some very real

00:01:14.230 --> 00:01:16.170
practical threats that are emerging, and also

00:01:16.170 --> 00:01:18.450
this rising global competition, especially around

00:01:18.450 --> 00:01:21.189
the cost of performance. Right. So roadmap wise,

00:01:21.430 --> 00:01:24.409
first up, we'll unpack the new structure at OpenAI,

00:01:24.609 --> 00:01:27.040
what that means for their speed. Then we'll get

00:01:27.040 --> 00:01:28.920
into some immediate disruptions, things like

00:01:28.920 --> 00:01:32.420
fake AI receipts, workforce impacts, new tools.

00:01:32.739 --> 00:01:34.760
And finally, we absolutely have to talk about

00:01:34.760 --> 00:01:37.680
these super competitive Chinese models. They're

00:01:37.680 --> 00:01:40.079
beating Western models pretty badly on cost.

00:01:40.340 --> 00:01:43.659
Let's jump in. Segment one, OpenAI's new reality.

00:01:44.079 --> 00:01:47.700
OK, so focusing first on the, let's say, the

00:01:47.700 --> 00:01:50.579
strategic and financial side. The big news is

00:01:50.579 --> 00:01:53.159
OpenAI has finally finished this really complex

00:01:53.159 --> 00:01:56.010
legal reshuffle. They are now officially a

00:01:56.010 --> 00:01:58.150
for-profit company. That's right. The pivot is complete.

00:01:58.609 --> 00:02:01.250
But that ownership is still intentionally messy.

00:02:01.370 --> 00:02:02.750
You know, it's not like a standard corporate

00:02:02.750 --> 00:02:05.069
setup, which keeps people guessing. Yeah. We

00:02:05.069 --> 00:02:06.790
learned the original nonprofit foundation, it

00:02:06.790 --> 00:02:09.069
still owns 26% of the new thing, the OpenAI

00:02:09.069 --> 00:02:11.789
group. Right. And meanwhile, Microsoft, they

00:02:11.789 --> 00:02:14.789
are dug in deep. They've got about 27% of this

00:02:14.789 --> 00:02:17.569
new group. That stake is valued at, what, around

00:02:17.569 --> 00:02:22.259
$135 billion now? Wow. But the real power might

00:02:22.259 --> 00:02:24.960
be the exclusive model rights they hold. That

00:02:24.960 --> 00:02:28.599
runs until 2032. 2032. That's a long time. That

00:02:28.599 --> 00:02:30.599
gives Microsoft huge control over where this

00:02:30.599 --> 00:02:32.439
tech goes commercially, doesn't it? For the next

00:02:32.439 --> 00:02:34.860
decade, pretty much. Yeah. And speaking of valuation

00:02:34.860 --> 00:02:37.400
frenzy, here's a kind of interesting side note.

00:02:37.460 --> 00:02:39.860
Elon Musk apparently tried to buy them not long

00:02:39.860 --> 00:02:45.139
ago for $97 billion. $97 billion. Obviously,

00:02:45.159 --> 00:02:46.560
that didn't happen. Maybe it wasn't even enough.

00:02:46.599 --> 00:02:49.439
Who knows? But the why behind this whole for

00:02:49.439 --> 00:02:52.580
profit shift is key. They basically have to remove

00:02:52.580 --> 00:02:55.719
all the legal friction. The whole structure made

00:02:55.719 --> 00:02:58.120
it hard to raise the truly astronomical amounts

00:02:58.120 --> 00:02:59.919
of money they need for that physical infrastructure

00:02:59.919 --> 00:03:02.139
we talked about. So open the floodgates for investment.

00:03:02.560 --> 00:03:06.199
Scale up faster. Exactly. Which brings us back

00:03:06.199 --> 00:03:08.740
to that infrastructure goal. The infrastructure

00:03:08.740 --> 00:03:13.300
factory. One gigawatt per week. Whoa. I mean,

00:03:13.319 --> 00:03:17.520
just imagine trying to scale to handle, say, a

00:03:17.520 --> 00:03:20.740
billion queries consistently. That kind of investment,

00:03:20.900 --> 00:03:23.039
that output. It's removed all the hardware bottlenecks

00:03:23.039 --> 00:03:24.939
they've ever had, right? It unlocks speed in

00:03:24.939 --> 00:03:26.900
a way we haven't seen. And you see that urgency

00:03:26.900 --> 00:03:29.659
reflected internally, too. How so? Well, their

00:03:29.659 --> 00:03:32.280
chief scientist, Jakub Pachocki, recently suggested

00:03:32.280 --> 00:03:34.520
superintelligence might be less than 10 years

00:03:34.520 --> 00:03:36.699
away. Less than 10 years. Yeah, and that doesn't

00:03:36.699 --> 00:03:38.539
sound like a casual prediction. It sounds like

00:03:38.539 --> 00:03:40.300
it's based on the speed they're seeing right

00:03:40.300 --> 00:03:43.300
now internally. So the implication there is we

00:03:43.300 --> 00:03:46.379
could see big jumps in model performance, maybe

00:03:46.379 --> 00:03:49.189
even in the next six months. Before we even get

00:03:49.189 --> 00:03:52.030
to GPT-6. That seems to be the hint, yes. The

00:03:52.030 --> 00:03:53.990
training runs are moving faster than maybe even

00:03:53.990 --> 00:03:55.949
they predicted. Okay, so here's a question then.

00:03:56.669 --> 00:03:59.110
With all this money pouring in, this infrastructure,

00:03:59.330 --> 00:04:02.349
this speed, does that guarantee OpenAI market

00:04:02.349 --> 00:04:04.889
dominance long term? Well, the capital flow is

00:04:04.889 --> 00:04:08.150
huge, definitely. But those proprietary rights,

00:04:08.270 --> 00:04:10.629
the control aspects, they really complicate the

00:04:10.629 --> 00:04:12.689
market picture. Right. So the money's there,

00:04:12.830 --> 00:04:15.229
but control rights make it complicated. Exactly.

00:04:15.710 --> 00:04:18.629
OK, let's shift gears from the big money story

00:04:18.629 --> 00:04:22.810
to more immediate practical things hitting people.

00:04:22.810 --> 00:04:25.810
Segment two. And first up is something that's

00:04:25.810 --> 00:04:29.670
honestly a bit scary. The realism of these models

00:04:29.670 --> 00:04:32.980
is creating, well, a crisis in digital trust.

00:04:33.399 --> 00:04:35.839
We need to talk about fake AI receipts. Yeah,

00:04:35.920 --> 00:04:37.819
this is becoming a million dollar problem. Literally.

00:04:38.060 --> 00:04:40.160
We're not talking about clumsy Photoshop anymore.

00:04:40.300 --> 00:04:42.540
These are like professional level fakes. Good

00:04:42.540 --> 00:04:44.740
enough to fool corporate expense systems. So

00:04:44.740 --> 00:04:47.040
the takeaway is you just can't trust your eyes

00:04:47.040 --> 00:04:49.800
anymore. Not with digital documents. Pretty much.

00:04:49.819 --> 00:04:52.519
Visual evidence alone isn't enough. Assume verification

00:04:52.519 --> 00:04:55.800
is needed. Digital trust is eroding fast. Okay.

00:04:56.079 --> 00:04:58.279
At the same time, though, powerful creative tools

00:04:58.279 --> 00:05:00.889
are getting easier to access, right? Adobe just

00:05:00.889 --> 00:05:02.910
put AI assistants into Express and Photoshop.

00:05:03.110 --> 00:05:05.850
That's huge, yeah. Tools that used to need serious

00:05:05.850 --> 00:05:08.490
design skills, now usable with simple text prompts.

00:05:08.689 --> 00:05:11.129
It makes creativity more accessible, sure. But

00:05:11.129 --> 00:05:13.250
it also probably fuels that forgery problem we

00:05:13.250 --> 00:05:15.149
just mentioned. Exactly. Double-edged sword.

00:05:15.930 --> 00:05:20.110
And look at commerce integration. OpenAI and

00:05:20.110 --> 00:05:22.689
PayPal are partnering up. For instant checkout

00:05:22.689 --> 00:05:25.269
within ChatGPT. Right. So you could be chatting,

00:05:25.389 --> 00:05:27.350
the AI suggests something, and boom, you buy

00:05:27.350 --> 00:05:29.730
it right there. Conversation at checkout. Seamless.

00:05:29.870 --> 00:05:33.470
And speaking of access, look at India. They're

00:05:33.470 --> 00:05:37.250
getting 12 months of free ChatGPT Go access for

00:05:37.250 --> 00:05:39.970
a limited time. That's a big strategic play for

00:05:39.970 --> 00:05:43.870
market share. And this tech leap is causing weird...

00:05:44.110 --> 00:05:46.670
contrasts in the corporate world, too, like Amazon

00:05:46.670 --> 00:05:48.889
laying off thousands. Right. Thousands of people

00:05:48.889 --> 00:05:50.930
gone. But it's not because Amazon's struggling

00:05:50.930 --> 00:05:54.290
financially. No, they're doing it while massively

00:05:54.290 --> 00:05:56.850
increasing their spending on AI development.

00:05:57.009 --> 00:06:00.129
It's a really stark restructuring of the workforce

00:06:00.129 --> 00:06:02.470
happening right now. It really shows the tradeoff

00:06:02.470 --> 00:06:05.290
companies are making, swapping human capital

00:06:05.290 --> 00:06:08.389
for compute capital. And sometimes, you know,

00:06:08.410 --> 00:06:11.350
that compute capital doesn't quite behave. Yeah.

00:06:11.370 --> 00:06:14.360
Yeah. Slight chuckle. The community reactions

00:06:14.360 --> 00:06:17.439
are always quick when models stumble. I saw that

00:06:17.439 --> 00:06:20.699
post about ChatGPT Atlas being dragged into the

00:06:20.699 --> 00:06:23.360
bin after everyone initially called it top one.

00:06:23.579 --> 00:06:25.740
Oh yeah, that went viral. You know, that kind

00:06:25.740 --> 00:06:27.860
of hit home for me. I mean, I still wrestle with

00:06:27.860 --> 00:06:29.740
prompt drift myself sometimes. You figure something

00:06:29.740 --> 00:06:31.560
out one day, the perfect prompt, and the next

00:06:31.560 --> 00:06:35.439
day, poof. It just doesn't work the same. It's

00:06:35.439 --> 00:06:37.399
frustrating. It really is. It's just the nature

00:06:37.399 --> 00:06:40.620
of working on the bleeding edge, I guess. Unstable.

00:06:41.689 --> 00:06:43.589
Meanwhile, governments are trying to react to

00:06:43.589 --> 00:06:46.449
all this. The EU just launched a $5 billion fund.

00:06:46.970 --> 00:06:50.069
Right, for AI, quantum computing, semiconductors.

00:06:51.269 --> 00:06:53.550
Trying to catch up. Explicitly, yeah. Trying

00:06:53.550 --> 00:06:55.490
to match the scale of investment coming out of

00:06:55.490 --> 00:06:57.949
the U.S. and Asia. Okay, so thinking about those

00:06:57.949 --> 00:07:01.370
fake receipts again. Yeah. Given that risk, how

00:07:01.370 --> 00:07:03.870
much should we actually rely on digital confirmation

00:07:03.870 --> 00:07:07.560
for, say, financial stuff? I think we have to

00:07:07.560 --> 00:07:09.920
assume verification is needed outside the document

00:07:09.920 --> 00:07:11.680
itself. You can't just trust the digital file

00:07:11.680 --> 00:07:15.259
anymore. Right. Digital trust is eroding. Always

00:07:15.259 --> 00:07:18.980
cross verify the source. OK, let's move to our

00:07:18.980 --> 00:07:22.519
final segment, the cost frontier. This is where

00:07:22.519 --> 00:07:25.240
models from China are really shaking things up.

00:07:25.300 --> 00:07:27.560
Yeah, this is a potentially huge market shift

00:07:27.560 --> 00:07:30.019
coming from the east. We need to talk about a

00:07:30.019 --> 00:07:33.800
model called MiniMax M2. OK, MiniMax M2. What's

00:07:33.800 --> 00:07:36.319
the goal there? The goal is basically GPT-4

00:07:36.319 --> 00:07:38.540
level performance, but without the crazy high

00:07:38.540 --> 00:07:40.680
price tag you get with the big U.S. models.

00:07:40.920 --> 00:07:43.800
And the cost difference is big. Staggering. M2

00:07:43.800 --> 00:07:48.180
apparently runs at just 8% of the cost of running

00:07:48.180 --> 00:07:50.740
something comparable like Claude Sonnet. 8%?
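To make that 8% figure concrete, here's a toy cost comparison in Python. The per-million-token price and the monthly volume are made-up placeholders, not either model's actual pricing:

```python
# Illustrative cost comparison for the "8% of the cost" claim.
# Prices and volumes below are hypothetical placeholders, not real rates.

baseline_price_per_mtok = 10.00   # hypothetical: big dense model, $ per 1M tokens
m2_price_per_mtok = baseline_price_per_mtok * 0.08   # the claimed "8% of the cost"

monthly_tokens_m = 500            # hypothetical workload: 500M tokens per month
baseline_bill = baseline_price_per_mtok * monthly_tokens_m
m2_bill = m2_price_per_mtok * monthly_tokens_m

print(f"baseline: ${baseline_bill:,.2f}/mo vs M2-class: ${m2_bill:,.2f}/mo")
# At 8% of the price, the same budget buys 12.5x the tokens (1 / 0.08).
```

The interesting part isn't the dollar amounts, which are invented here, but the ratio: a 92% cost cut means the same budget runs over twelve times the inference volume.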

00:07:50.800 --> 00:07:53.639
Wow. That changes everything, doesn't it? How

00:07:53.639 --> 00:07:55.879
do they do that? Well, the key is the tech underneath.

00:07:56.060 --> 00:07:58.660
M2 is what's called a sparse mixture of experts

00:07:58.660 --> 00:08:02.300
model, an MoE. Okay, jargon alert. Mixture of

00:08:02.300 --> 00:08:04.100
experts. Can you break that down simply? Sure.

00:08:04.139 --> 00:08:06.759
Think of it like this. Instead of using its entire

00:08:06.759 --> 00:08:09.459
massive brain for every single question, an MoE

00:08:09.459 --> 00:08:12.480
model only activates a small, specialized part,

00:08:12.600 --> 00:08:15.019
like an expert team, that's best suited for that

00:08:15.019 --> 00:08:17.680
specific task. Ah, okay, so it's not firing up

00:08:17.680 --> 00:08:20.019
the whole engine every time. Exactly. Maybe it

00:08:20.019 --> 00:08:22.579
only uses, say, 10 billion parameters out of

00:08:22.579 --> 00:08:25.180
a much larger total for one query. That makes

00:08:25.180 --> 00:08:27.980
it way lighter, much faster, and radically more

00:08:27.980 --> 00:08:31.339
efficient than running a huge, dense model constantly.
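The routing idea described above can be sketched in a few lines of Python. This is a toy illustration, not MiniMax's actual implementation: a stand-in gate scores the experts, and only the top-k ever execute for a given query.

```python
# Toy sketch of sparse mixture-of-experts routing: a gate picks which
# expert sub-networks activate, so most parameters stay idle per query.
import random

NUM_EXPERTS = 8
TOP_K = 2  # only 2 of the 8 expert sub-networks run for any one query

def gate_scores(query: str) -> list[float]:
    # Stand-in for a learned router: deterministic pseudo-scores per query.
    rng = random.Random(query)
    return [rng.random() for _ in range(NUM_EXPERTS)]

def route(query: str) -> list[int]:
    scores = gate_scores(query)
    # Select the top-k experts; the other six never execute for this query.
    return sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]

active = route("summarize this invoice")
print(f"experts activated: {active} ({TOP_K} of {NUM_EXPERTS})")
```

In a real MoE model the gate is a learned layer over token embeddings rather than a hash of the query text, but the economics are the same: compute per query scales with the activated experts, not with the full parameter count.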

00:08:31.639 --> 00:08:33.220
And the practical result of that efficiency?

00:08:33.639 --> 00:08:37.049
It's fast, uses less memory, and crucially, you

00:08:37.049 --> 00:08:40.750
don't need those giant, super expensive GPU clusters

00:08:40.750 --> 00:08:43.429
to run it well. That just lowers the barrier

00:08:43.429 --> 00:08:46.190
to entry massively for innovators everywhere.

00:08:46.590 --> 00:08:48.190
Okay, and how does it actually perform? You said

00:08:48.190 --> 00:08:51.210
GPT-4 level. The benchmarks are really strong.

00:08:51.350 --> 00:08:54.049
It's reportedly outperforming older Claude 4 models

00:08:54.049 --> 00:08:56.909
in some areas, and it's nearly matching the newer

00:08:56.909 --> 00:08:59.909
Claude 4.5. So the big picture here is Chinese

00:08:59.909 --> 00:09:04.139
labs, MiniMax, DeepSeek, Moonshot. They're putting

00:09:04.139 --> 00:09:06.440
out high quality open models, models that compete

00:09:06.440 --> 00:09:08.759
performance wise with the West. But crush them

00:09:08.759 --> 00:09:11.059
on operational costs. Yes. These are good for

00:09:11.059 --> 00:09:13.159
what kinds of applications? They're ideal if

00:09:13.159 --> 00:09:15.539
you're building, say, agent workflows or custom

00:09:15.539 --> 00:09:17.620
code co-pilots, retrieval bots, specialized

00:09:17.620 --> 00:09:20.480
tool chains, anything where cost efficiency matters,

00:09:20.639 --> 00:09:22.620
especially if you want to self-host outside

00:09:22.620 --> 00:09:26.659
the big cloud platforms. Right. So probing question

00:09:26.659 --> 00:09:31.159
then. If the running cost drops by like 90 percent.

00:09:32.090 --> 00:09:34.490
What does that cost advantage really mean for

00:09:34.490 --> 00:09:37.149
companies trying to build new agents, maybe outside

00:09:37.149 --> 00:09:40.370
the orbit of AWS or Google Cloud? Well, fundamentally,

00:09:40.490 --> 00:09:42.490
it democratizes access, right? It shifts innovation

00:09:42.490 --> 00:09:45.870
power. Yeah. Away from just the giant labs, potentially

00:09:45.870 --> 00:09:48.809
towards smaller, more agile builders anywhere

00:09:48.809 --> 00:09:51.450
in the world. So cheaper models mean faster innovation

00:09:51.450 --> 00:09:54.529
and wider access for startups. Exactly. [Mid-roll

00:09:54.529 --> 00:09:57.309
sponsor read placeholder.] Okay, so wrapping

00:09:57.309 --> 00:09:59.549
up this deep dive, the main threads seem to be

00:09:59.549 --> 00:10:01.769
this incredible race for physical infrastructure,

00:10:02.009 --> 00:10:04.820
this hyper-financed build-out. You know, OpenAI's

00:10:04.820 --> 00:10:07.419
one gigawatt goal really symbolizes that. And

00:10:07.419 --> 00:10:09.440
that massive spending really sets the stage for

00:10:09.440 --> 00:10:11.620
the competitive shift we're seeing. These cost

00:10:11.620 --> 00:10:13.600
-effective open models, particularly from the

00:10:13.600 --> 00:10:16.700
East, like MiniMax M2, they're completely changing

00:10:16.700 --> 00:10:18.860
the definition of high performance for developers

00:10:18.860 --> 00:10:21.240
because now it's actually affordable. It feels

00:10:21.240 --> 00:10:23.220
like AI is becoming more distributed, moving

00:10:23.220 --> 00:10:25.750
beyond just a few huge labs. I think that's right.

00:10:25.929 --> 00:10:28.230
And connecting back to the physical buildout,

00:10:28.269 --> 00:10:30.230
the infrastructure, we talked about the scale,

00:10:30.450 --> 00:10:33.950
the 1GW factory, the Amazon layoffs linked to

00:10:33.950 --> 00:10:36.830
AI spending. There's one more piece here related

00:10:36.830 --> 00:10:39.730
to secrecy. Yeah, there are reports suggesting

00:10:39.730 --> 00:10:42.730
that nondisclosure agreements, NDAs, are being

00:10:42.730 --> 00:10:45.509
used quite aggressively, specifically to keep

00:10:45.509 --> 00:10:48.490
details about these new AI data centers, the

00:10:48.490 --> 00:10:51.110
tech, the operations hidden from the public,

00:10:51.169 --> 00:10:55.299
particularly in the U.S. Given the sheer scale

00:10:55.299 --> 00:10:58.179
and speed of all this building the equivalent

00:10:58.179 --> 00:11:00.080
of a massive data center complex every single

00:11:00.080 --> 00:11:02.639
week, what exactly are we not being told about

00:11:02.639 --> 00:11:04.860
the physical foundation supporting arguably the

00:11:04.860 --> 00:11:06.820
most powerful technology we've ever built? That's

00:11:06.820 --> 00:11:08.700
the question to maybe keep mulling over. What's

00:11:08.700 --> 00:11:10.419
behind the curtain of all this infrastructure?

00:11:10.840 --> 00:11:13.059
Indeed. Something to think about. Thanks for

00:11:13.059 --> 00:11:14.899
joining us for this deep dive. [Outro music.]
