WEBVTT

00:00:00.000 --> 00:00:03.500
When these massive AI models are running, whether

00:00:03.500 --> 00:00:05.540
you're training the next big thing or just serving

00:00:05.540 --> 00:00:08.960
billions of queries a day, there's this really

00:00:08.960 --> 00:00:11.220
simple question that every company has to ask

00:00:11.220 --> 00:00:15.140
itself. Are we trying to be the fastest? Or are

00:00:15.140 --> 00:00:17.320
we just trying to be the cheapest? Because right

00:00:17.320 --> 00:00:20.839
now, Google is... very clearly choosing to be

00:00:20.839 --> 00:00:23.859
the cheapest. And that whole strategy is their

00:00:23.859 --> 00:00:26.539
direct shot at Nvidia's dominance. And welcome

00:00:26.539 --> 00:00:29.320
to the deep dive. Today, we're digging into the

00:00:29.320 --> 00:00:31.600
latest intelligence on these new AI infrastructure

00:00:31.600 --> 00:00:34.810
wars. And it's... It's a really fascinating fight

00:00:34.810 --> 00:00:36.549
because the goal isn't just about winning on

00:00:36.549 --> 00:00:38.590
performance benchmarks anymore. It's about making

00:00:38.590 --> 00:00:41.310
AI compute so cheap that it just becomes the

00:00:41.310 --> 00:00:43.890
default. Exactly. So first, we're going to unpack

00:00:43.890 --> 00:00:46.270
that whole cost per token battle. It's Google's

00:00:46.270 --> 00:00:49.549
very specialized TPUs versus NVIDIA's GPUs that

00:00:49.549 --> 00:00:51.990
are everywhere. It's a fight for the very foundation

00:00:51.990 --> 00:00:54.390
of AI. Yeah, and after we break down that chip

00:00:54.390 --> 00:00:56.329
war, we've got a really fast-paced segment,

00:00:56.390 --> 00:00:58.270
some critical security alerts, a look at some

00:00:58.270 --> 00:01:00.729
surprising new consumer tech. I'm talking translation

00:01:00.729 --> 00:01:03.450
glasses. And we'll explore some new interactive

00:01:03.450 --> 00:01:05.769
learning tools that are coming online. And then

00:01:05.769 --> 00:01:08.049
finally, we're jumping into a huge medical breakthrough.

00:01:08.450 --> 00:01:10.969
This one is out of Harvard Medical School, a

00:01:10.969 --> 00:01:15.950
new AI tool called PopEVE. And this system is

00:01:15.950 --> 00:01:19.189
solving these really complex genetic mysteries

00:01:19.189 --> 00:01:21.849
that, you know, even the big one, AlphaMissense,

00:01:22.010 --> 00:01:24.189
couldn't quite get right. We'll look at the data,

00:01:24.250 --> 00:01:26.430
and it's pretty clear this is a huge step forward.

00:01:26.769 --> 00:01:28.790
So let's start right there with the hardware.

00:01:29.349 --> 00:01:31.310
Looking at Google versus NVIDIA, I mean, the

00:01:31.310 --> 00:01:33.689
sources we saw make it really clear. Google has

00:01:33.689 --> 00:01:36.629
completely given up on winning the benchmark

00:01:36.629 --> 00:01:38.430
race. They're going all in on cost supremacy.

00:01:38.930 --> 00:01:41.170
That's the pivot. That is the entire strategy.

00:01:41.840 --> 00:01:44.500
Google's goal is to make AI compute so cheap,

00:01:44.500 --> 00:01:46.959
so accessible, that other companies almost have

00:01:46.959 --> 00:01:49.640
no choice but to use Google Cloud. They want

00:01:49.640 --> 00:01:51.739
to win on price. And the key to their advantage,

00:01:51.980 --> 00:01:54.340
they own the whole stack. Right, the entire vertical

00:01:54.340 --> 00:01:56.859
stack, from the chip design itself, the TPUs,

00:01:56.859 --> 00:01:59.019
all the way to the data centers they run in,

00:01:59.079 --> 00:02:00.780
and then the software that sits on top of all

00:02:00.780 --> 00:02:02.659
of it. And that control just gives them this

00:02:02.659 --> 00:02:05.530
incredible power over pricing. Precisely. I mean,

00:02:05.530 --> 00:02:07.769
if you look at Nvidia's market right now, a huge

00:02:07.769 --> 00:02:09.469
chunk of their revenue comes from the margins

00:02:09.469 --> 00:02:12.090
on their hardware. We hear estimates of, what,

00:02:12.129 --> 00:02:15.310
70% markups on those high-end GPUs? Meanwhile,

00:02:15.550 --> 00:02:18.550
Google can basically sell its own custom TPUs

00:02:18.550 --> 00:02:20.830
through Google Cloud at cost, or maybe even below

00:02:20.830 --> 00:02:22.930
cost. And they can do that because they just

00:02:22.930 --> 00:02:24.710
make the money back on all the other services

00:02:24.710 --> 00:02:27.210
that are tied into that cloud ecosystem, or just

00:02:27.210 --> 00:02:29.949
on pure volume. And this is a total game-changer

00:02:29.949 --> 00:02:31.710
when you look at where the real spending is happening

00:02:31.710 --> 00:02:34.659
now. Absolutely. The market has shifted so dramatically.

00:02:34.819 --> 00:02:38.300
90%, maybe even more, of all the AI spending

00:02:38.300 --> 00:02:41.439
today. It's not for training the models. That's

00:02:41.439 --> 00:02:44.020
a one-time cost, basically. Most of the money

00:02:44.020 --> 00:02:46.460
is for inference. It's running the model every

00:02:46.460 --> 00:02:49.400
single time a user asks a question. So the main

00:02:49.400 --> 00:02:51.479
thing that these big cloud customers care about

00:02:51.479 --> 00:02:54.659
is no longer raw speed. It's not FLOPS, floating

00:02:54.659 --> 00:02:56.900
point operations per second, which is what training

00:02:56.900 --> 00:02:59.530
always focused on. Correct. The metric that truly

00:02:59.530 --> 00:03:02.449
matters now is the lowest possible cost per token,

00:03:02.590 --> 00:03:05.650
but at a massive scale. And Google has designed

00:03:05.650 --> 00:03:09.169
the TPU specifically for that one job. Drive

00:03:09.169 --> 00:03:12.150
the cost of inference down as low as it can possibly

00:03:12.150 --> 00:03:15.530
go. And this cost advantage is, well, it's causing

00:03:15.530 --> 00:03:18.050
real anxiety for NVIDIA. The buzz we're hearing

00:03:18.050 --> 00:03:21.729
is that huge players, think Meta, Anthropic,

00:03:22.509 --> 00:03:25.389
they might shift billions of dollars in compute

00:03:25.389 --> 00:03:28.250
spend over to Google's TPUs. Even if they only

00:03:28.250 --> 00:03:30.449
move a little bit of that, it could shave, what,

00:03:30.509 --> 00:03:33.389
10% off NVIDIA's AI revenue. That's an earthquake.

00:03:33.669 --> 00:03:35.969
It is. Now, NVIDIA's defense is strong, but it's

00:03:35.969 --> 00:03:38.030
entirely about the platform. They immediately

00:03:38.030 --> 00:03:40.870
come out and say TPUs are locked in, narrow purpose,

00:03:41.030 --> 00:03:43.430
inflexible. And what's so fascinating is how

00:03:43.430 --> 00:03:46.449
they use CUDA, their software ecosystem, as their

00:03:46.449 --> 00:03:49.210
shield. CUDA is the bedrock of their whole argument.

00:03:49.370 --> 00:03:51.849
It is. The combination of their GPUs and CUDA.

00:03:52.039 --> 00:03:54.039
Well, it works with almost any model. It handles

00:03:54.039 --> 00:03:56.280
training and inference, no problem. And it runs

00:03:56.280 --> 00:03:59.280
everywhere, any cloud or even on-prem. So NVIDIA

00:03:59.280 --> 00:04:01.500
is fighting a cost war with the platform argument.

00:04:01.620 --> 00:04:03.379
They're basically saying, we're the default.

00:04:03.479 --> 00:04:05.280
We're the flexible ecosystem you can bet your

00:04:05.280 --> 00:04:08.039
whole company on. So if that lock-in argument

00:04:08.039 --> 00:04:12.199
is so powerful, you know, the fear of getting

00:04:12.199 --> 00:04:15.580
stuck with one vendor like Google, what's stopping

00:04:15.580 --> 00:04:17.980
these big players, the Metas and Anthropics,

00:04:18.040 --> 00:04:21.410
from... just sticking with NVIDIA's flexibility

00:04:21.410 --> 00:04:23.790
for the long term? The massive cost savings.

00:04:23.850 --> 00:04:26.529
It just incentivizes adopting specialized hardware,

00:04:26.610 --> 00:04:29.029
even with those lock-in fears. Dollars talk.

00:04:29.329 --> 00:04:31.310
Especially when you're dealing with billions

00:04:31.310 --> 00:04:33.430
of user queries every day. It's a tough trade-off.

00:04:33.430 --> 00:04:35.889
Okay, switching gears. Let's hit some rapid-fire

00:04:35.889 --> 00:04:38.230
updates from the world of AI. Starting

00:04:38.230 --> 00:04:40.629
with security and trust, which this just keeps

00:04:40.629 --> 00:04:42.490
coming up in all the sources we see. Yeah, we

00:04:42.490 --> 00:04:44.769
saw a pretty big security alert about ChatGPT

00:04:44.769 --> 00:04:48.379
user data. And it wasn't OpenAI's main platform

00:04:48.379 --> 00:04:50.740
that got breached. It was a third-party partner

00:04:50.740 --> 00:04:53.279
with a sloppy configuration that exposed user

00:04:53.279 --> 00:04:56.399
data, emails, things like that. And this just

00:04:56.399 --> 00:04:58.180
brings up such a critical point about the whole

00:04:58.180 --> 00:05:01.379
AI ecosystem. As we weave these tools deeper

00:05:01.379 --> 00:05:03.939
into our businesses, our security isn't just

00:05:03.939 --> 00:05:06.139
about the main company anymore. It's tied to

00:05:06.139 --> 00:05:08.019
the weakest link in the chain, the partners,

00:05:08.180 --> 00:05:10.560
the APIs, all the extensions they use. It's a

00:05:10.560 --> 00:05:13.829
distributed risk. Exactly. Even if you have perfect

00:05:13.829 --> 00:05:17.410
internal controls, that one API gateway on a

00:05:17.410 --> 00:05:20.329
partner system can be the point of failure. And

00:05:20.329 --> 00:05:22.050
honestly, it's something I still wrestle with,

00:05:22.089 --> 00:05:25.550
you know, trying to manage security across, what,

00:05:25.610 --> 00:05:28.089
a dozen different APIs and platforms. It's just...

00:05:28.509 --> 00:05:30.769
It's incredibly difficult to keep a consistent

00:05:30.769 --> 00:05:33.589
security posture when you rely on that many outside

00:05:33.589 --> 00:05:35.670
integrations. That's a really important point.

00:05:36.209 --> 00:05:38.810
Meanwhile, this whole idea of trust is being

00:05:38.810 --> 00:05:41.550
tested by the tech itself. We all saw those Thanksgiving

00:05:41.550 --> 00:05:43.629
photos of Elon Musk and Mark Zuckerberg that

00:05:43.629 --> 00:05:45.230
just went everywhere. Oh, they were so convincing.

00:05:45.449 --> 00:05:48.290
But verified later as totally fake, made by a

00:05:48.290 --> 00:05:51.209
tool called Nano Banana Pro. It just goes to

00:05:51.209 --> 00:05:53.529
show you how easy it's becoming to create really

00:05:53.529 --> 00:05:56.730
high-quality, convincing misinformation. Deepfakes

00:05:56.730 --> 00:05:59.079
are here. On a more positive note, the

00:05:59.079 --> 00:06:01.139
tools for learning are getting way more interactive.

00:06:01.300 --> 00:06:03.180
I'm really excited about this. Gemini just

00:06:03.180 --> 00:06:05.399
rolled out these new interactive diagrams. And

00:06:05.399 --> 00:06:07.079
this is where the tech goes beyond just being

00:06:07.079 --> 00:06:09.420
like an audio textbook. If you're looking at

00:06:09.420 --> 00:06:12.199
a complex system, like a diagram of a cell or

00:06:12.199 --> 00:06:14.879
the digestive system, you can now tap on any

00:06:14.879 --> 00:06:18.259
specific part of it. And you instantly get a

00:06:18.259 --> 00:06:20.959
definition, a deep explanation, all the context

00:06:20.959 --> 00:06:22.959
just for that one little piece. It's like having

00:06:22.959 --> 00:06:25.579
a dynamic tutor for any complex image you see.

00:06:25.720 --> 00:06:27.899
And we're also seeing AI pop up in some surprising

00:06:27.899 --> 00:06:30.000
consumer hardware. Alibaba just dropped some

00:06:30.000 --> 00:06:32.399
AI glasses that are surprisingly cheap. They

00:06:32.399 --> 00:06:34.579
look pretty much like normal glasses, but they

00:06:34.579 --> 00:06:37.000
can scan prices and translate speech in real

00:06:37.000 --> 00:06:39.519
time as you're walking around. A practical use

00:06:39.519 --> 00:06:42.199
case. Finally. And then on the pure ambition

00:06:42.199 --> 00:06:45.360
side of things, IBM is launching a $500 million fund.

00:06:45.610 --> 00:06:49.050
And it's focused specifically on AI and quantum

00:06:49.050 --> 00:06:51.610
breakthroughs. Their goal is not small. They're

00:06:51.610 --> 00:06:53.889
aiming for a fault-tolerant quantum computer

00:06:53.889 --> 00:06:58.529
by 2029. Wait, by 2029? Whoa. I mean, imagine

00:06:58.529 --> 00:07:01.110
scaling that. In just four years. Fault-tolerant

00:07:01.110 --> 00:07:03.290
compute power at that level. That changes, well,

00:07:03.370 --> 00:07:05.990
it changes everything. Every industry from material

00:07:05.990 --> 00:07:08.550
science to finance. It changes what's even possible.

00:07:09.199 --> 00:07:11.379
Before we move on to genetics, we did see some

00:07:11.379 --> 00:07:13.579
really useful career advice from an ex-Meta

00:07:13.579 --> 00:07:15.980
director about how to break into the AI field.

00:07:16.139 --> 00:07:18.939
So beyond the usual talk about getting a PhD,

00:07:19.180 --> 00:07:21.019
what was the most valuable takeaway you saw from

00:07:21.019 --> 00:07:24.019
those tips? Practical experience. Solving real-world

00:07:24.019 --> 00:07:26.300
problems is way more critical than just

00:07:26.300 --> 00:07:29.300
having advanced degrees. That makes sense. Okay,

00:07:29.319 --> 00:07:31.860
so moving from compute and careers, let's jump

00:07:31.860 --> 00:07:34.399
into a genuine breakthrough in medicine. We're

00:07:34.399 --> 00:07:37.040
talking genetics. And the huge challenge of finding

00:07:37.040 --> 00:07:39.980
that one critical disease signal inside all of

00:07:39.980 --> 00:07:42.860
the harmless genetic noise. Right. So DeepMind's

00:07:42.860 --> 00:07:45.600
AlphaMissense made huge headlines for flagging

00:07:45.600 --> 00:07:49.180
potentially harmful DNA mutations, but just flagging

00:07:49.180 --> 00:07:51.779
a mutation. That's only the first step. The really

00:07:51.779 --> 00:07:54.199
hard part is figuring out which of those mutations

00:07:54.199 --> 00:07:56.899
actually causes a disease and which ones are

00:07:56.899 --> 00:07:59.480
just, you know, common variations that don't

00:07:59.480 --> 00:08:01.899
do anything, the background noise. And that difference,

00:08:01.920 --> 00:08:04.800
the signal versus the noise, that's exactly where

00:08:04.800 --> 00:08:07.079
this new AI from Harvard Medical School, PopEVE,

00:08:07.079 --> 00:08:10.879
is just proving to be incredible. PopEVE's

00:08:10.879 --> 00:08:13.139
accuracy really comes down to its method. It

00:08:13.139 --> 00:08:15.800
doesn't just look at human DNA. First, it analyzes

00:08:15.800 --> 00:08:18.040
mutation patterns across hundreds of thousands

00:08:18.040 --> 00:08:20.459
of different species. That gives it this massive

00:08:20.459 --> 00:08:23.339
evolutionary context. It's like checking a variant

00:08:23.339 --> 00:08:25.540
against the master blueprint for all of life,

00:08:25.540 --> 00:08:27.860
not just the latest human version. So it's basically

00:08:27.860 --> 00:08:30.160
asking, how important has this gene been over

00:08:30.160 --> 00:08:32.720
millions of years? If it's essential for a frog

00:08:32.720 --> 00:08:34.659
and a fish, it's probably pretty important for

00:08:34.659 --> 00:08:37.809
a human too. Precisely. And then it calibrates

00:08:37.809 --> 00:08:39.649
those evolutionary predictions against these

00:08:39.649 --> 00:08:43.529
huge databases of healthy human genomes. Just

00:08:43.529 --> 00:08:45.669
to double check if a variant is actually common

00:08:45.669 --> 00:08:48.470
in people who don't have the disease. The result

00:08:48.470 --> 00:08:51.110
is a much, much more reliable ranking system

00:08:51.110 --> 00:08:54.149
for doctors. And the numbers here are just dramatic.

00:08:55.340 --> 00:09:00.639
PopEVE cuts false positives by over 75% compared

00:09:00.639 --> 00:09:03.340
to AlphaMissense. That is a huge improvement

00:09:03.340 --> 00:09:05.620
in clarity for a diagnosis. Think about what

00:09:05.620 --> 00:09:08.360
that means for a patient. AlphaMissense flagged

00:09:08.360 --> 00:09:11.039
44% of healthy people as having harmful genetic

00:09:11.039 --> 00:09:13.960
variants. That creates so much unnecessary anxiety

00:09:13.960 --> 00:09:16.919
and confusion. PopEVE, after it gets rid of

00:09:16.919 --> 00:09:19.700
all those false alarms, only flags 11%. That

00:09:19.700 --> 00:09:21.799
reduction in noise is life-changing. And this

00:09:21.799 --> 00:09:23.840
is where the real -world impact is just stunning.

00:09:24.139 --> 00:09:26.600
Researchers... ran PopEVE on the data from 31,000

00:09:26.600 --> 00:09:29.659
undiagnosed children, all with severe developmental

00:09:29.659 --> 00:09:32.019
issues. These were cases that had stumped doctors

00:09:32.019 --> 00:09:34.580
for years. The results were immediate. I mean,

00:09:34.580 --> 00:09:37.220
just incredible. PopEVE solved one out of every

00:09:37.220 --> 00:09:39.759
three cases that had previously been unexplained.

00:09:39.899 --> 00:09:42.080
And it wasn't just confirming things we already

00:09:42.080 --> 00:09:45.659
knew. It flagged over 120 new genes that had

00:09:45.659 --> 00:09:47.600
never been linked to a human disease before.

00:09:47.879 --> 00:09:51.240
And get this, at least 24 of those have already

00:09:51.240 --> 00:09:53.240
been independently verified by other research

00:09:53.240 --> 00:09:56.179
teams. That's massive validation. It proves this

00:09:56.179 --> 00:09:58.980
AI is moving from just being an identifier to

00:09:58.980 --> 00:10:01.759
a real diagnostic discovery tool. It's actually

00:10:01.759 --> 00:10:05.080
accelerating research. So given how successful

00:10:05.080 --> 00:10:07.399
PopEVE has been with these rare developmental disorders,

00:10:07.799 --> 00:10:10.120
how quickly do you think this kind of method

00:10:10.120 --> 00:10:13.100
could be adopted for screening more common hereditary

00:10:13.100 --> 00:10:15.720
conditions? The high accuracy is going to rapidly

00:10:15.720 --> 00:10:18.000
accelerate its adoption for broad population

00:10:18.000 --> 00:10:20.740
screening and for personalized medicine. Wow.

00:10:20.840 --> 00:10:22.840
This has been a really comprehensive deep dive.

00:10:23.059 --> 00:10:25.100
We've covered the infrastructure battles, the

00:10:25.100 --> 00:10:27.519
security landscape, and now the frontier of medicine.

00:10:27.779 --> 00:10:30.179
Can we quickly recap the big ideas we hit today?

00:10:30.549 --> 00:10:33.149
The big strategy battle in AI compute, it's not

00:10:33.149 --> 00:10:35.549
about raw speed anymore. It's about owning the

00:10:35.549 --> 00:10:38.269
stack and winning on cost. Google is making a

00:10:38.269 --> 00:10:41.230
massive, very deliberate play for inference supremacy.

00:10:41.629 --> 00:10:44.250
And as these AI tools get deeper into our lives,

00:10:44.470 --> 00:10:46.889
our personal security is critically dependent

00:10:46.889 --> 00:10:49.029
on the reliability of all those third-party

00:10:49.029 --> 00:10:51.690
partners and APIs, not just the main company.

00:10:52.110 --> 00:10:56.169
And finally, AI, like PopEVE, is moving beyond

00:10:56.169 --> 00:10:59.710
simple prediction. It's becoming a genuine, verifiable

00:10:59.710 --> 00:11:02.950
tool for diagnostic discovery. It's drastically

00:11:02.950 --> 00:11:05.490
cutting down that painful, crucial diagnostic

00:11:05.490 --> 00:11:07.769
time for families who are dealing with these

00:11:07.769 --> 00:11:10.950
severe, unexplained illnesses. Which really brings

00:11:10.950 --> 00:11:12.649
us to our final thought for you to think about.

00:11:13.029 --> 00:11:16.210
If Harvard's AI can solve one in three previously

00:11:16.210 --> 00:11:19.470
unexplained genetic mysteries today, what percentage

00:11:19.470 --> 00:11:21.850
of major human illnesses, the common ones we

00:11:21.850 --> 00:11:24.230
all deal with, will be fully explained by these

00:11:24.230 --> 00:11:26.909
AI models in, say, the next five years? That's

00:11:26.909 --> 00:11:28.570
the question. And that's what's driving all this

00:11:28.570 --> 00:11:30.350
innovation forward. Thank you for sharing your

00:11:30.350 --> 00:11:32.110
sources with us for this deep dive. We really

00:11:32.110 --> 00:11:33.909
encourage you to explore these ideas further.

00:11:34.049 --> 00:11:35.570
We will catch you on the next deep dive.
