WEBVTT

00:00:00.000 --> 00:00:02.720
Okay, so we're back and ready to go deep on this

00:00:02.720 --> 00:00:08.879
one. It's NVIDIA's GTC March 2025 keynote. And,

00:00:08.879 --> 00:00:11.460
well, Jensen Huang did not disappoint, to say

00:00:11.460 --> 00:00:13.880
the least. Not even close. We've got the full

00:00:13.880 --> 00:00:16.420
keynote transcript, the slides, the demos. I

00:00:16.420 --> 00:00:18.640
mean, it was a firehose of information and some

00:00:18.640 --> 00:00:20.859
really, really big announcements in there. Absolutely.

00:00:21.440 --> 00:00:24.300
So the mission today is to cut through all that

00:00:24.300 --> 00:00:26.760
and really distill, you know, the key takeaways.

00:00:27.100 --> 00:00:29.320
What are the big moves NVIDIA is making? What

00:00:29.320 --> 00:00:31.839
does it mean for the industry as a whole? And

00:00:31.839 --> 00:00:33.780
most importantly, how does it connect back to

00:00:33.780 --> 00:00:35.590
what our listeners need to know? I think the

00:00:35.590 --> 00:00:37.850
biggest theme right from the start was just how

00:00:37.850 --> 00:00:40.450
intertwined everything's becoming. You know,

00:00:40.490 --> 00:00:42.670
AI obviously is at the center, but the way it's

00:00:42.670 --> 00:00:44.729
impacting graphics, data centers, cars, robotics,

00:00:45.090 --> 00:00:47.509
even basic scientific research. Yeah. It's all

00:00:47.509 --> 00:00:49.810
connected now. Right, right. And the speed at

00:00:49.810 --> 00:00:51.670
which it's all happening is, well, it's kind

00:00:51.670 --> 00:00:54.369
of mind blowing. So let's jump in. One of the

00:00:54.369 --> 00:00:55.710
first things that really caught my attention

00:00:55.710 --> 00:00:58.710
was this concept of AI as a new kind of factory,

00:00:58.869 --> 00:01:00.850
a token factory. Can you break that down a bit?

00:01:01.179 --> 00:01:03.359
Yeah, it's a really powerful analogy. I mean,

00:01:03.380 --> 00:01:05.659
think about it. Traditional factories, they take

00:01:05.659 --> 00:01:08.159
raw materials and produce physical goods. But

00:01:08.159 --> 00:01:10.459
these AI factories, they're producing something

00:01:10.459 --> 00:01:12.799
different, right? They're churning out tokens

00:01:12.799 --> 00:01:16.219
as their output. Tokens. So in simple terms,

00:01:16.359 --> 00:01:19.379
what exactly are these tokens? Essentially, think

00:01:19.379 --> 00:01:21.219
of them as the fundamental building blocks of

00:01:21.219 --> 00:01:24.239
information that AI models use. Like a token

00:01:24.239 --> 00:01:26.599
could represent a word, a piece of code, a pixel

00:01:26.659 --> 00:01:29.760
in an image. They're what allow AI to process

00:01:29.760 --> 00:01:31.879
and generate all these different kinds of data.

00:01:32.180 --> 00:01:34.219
And what's really amazing is just how versatile

00:01:34.219 --> 00:01:36.379
they are. Okay, that's starting to click. So

00:01:36.379 --> 00:01:39.519
the more complex the AI, the more tokens it's

00:01:39.519 --> 00:01:42.299
processing and generating. Exactly. And that

00:01:42.299 --> 00:01:44.379
connects to this whole evolution of AI that Jensen

00:01:44.379 --> 00:01:46.459
Huang talked about. We went from AI that was

00:01:46.459 --> 00:01:48.340
primarily about perception, like recognizing

00:01:48.340 --> 00:01:50.239
objects and images. Then we had generative AI,

00:01:50.459 --> 00:01:52.920
creating new content. Now we're moving into what

00:01:52.920 --> 00:01:55.730
he calls agentic AI. Agentic AI. What does that

00:01:55.730 --> 00:01:58.650
mean? Like AI with a sense of agency? Yeah, essentially.

00:01:58.790 --> 00:02:00.750
I mean, it goes beyond just generating outputs.

00:02:00.849 --> 00:02:04.890
It's AI that can reason, plan, make decisions,

00:02:05.049 --> 00:02:07.909
use tools that can actually take actions to achieve

00:02:07.909 --> 00:02:11.550
a goal. Wow. That's a big leap. And the next

00:02:11.550 --> 00:02:14.500
stage, he said, is physical AI. I mean, that

00:02:14.500 --> 00:02:16.180
sounds like something out of science fiction.

00:02:16.400 --> 00:02:19.240
Right. It's about bringing AI into the real world,

00:02:19.379 --> 00:02:21.780
you know, enabling it to truly understand and

00:02:21.780 --> 00:02:24.759
interact with the physical environment. Robotics

00:02:24.759 --> 00:02:27.060
is a huge part of that. OK, so we have this clear

00:02:27.060 --> 00:02:29.819
progression, but each of these steps comes with

00:02:29.819 --> 00:02:32.400
its own challenges, right? Like accuracy for

00:02:32.400 --> 00:02:36.039
perception, coherence for generative AI. What

00:02:36.039 --> 00:02:38.319
are the big hurdles for agentic and physical

00:02:38.319 --> 00:02:42.229
AI? For agentic AI, it's all about trust. I mean,

00:02:42.229 --> 00:02:44.030
if these AI agents are going to be making decisions,

00:02:44.229 --> 00:02:45.870
we need to be confident that they're making the

00:02:45.870 --> 00:02:47.389
right ones, that they're aligned with our values

00:02:47.389 --> 00:02:50.569
and that they won't go rogue. Right, right. That's

00:02:50.569 --> 00:02:52.389
a bit unsettling to think about. And for physical

00:02:52.389 --> 00:02:54.810
AI? Well, the real world is messy. I mean, it's

00:02:54.810 --> 00:02:57.330
unpredictable. So the challenge there is building

00:02:57.330 --> 00:02:59.650
AI that can handle all that complexity, that

00:02:59.650 --> 00:03:02.270
can adapt to changing conditions and still operate

00:03:02.270 --> 00:03:05.030
safely and effectively. And all of this, of course,

00:03:05.069 --> 00:03:07.729
requires massive amounts of computational power.

00:03:08.009 --> 00:03:10.789
I mean, Jensen Huang really hammered home the

00:03:10.789 --> 00:03:13.129
point that the need for compute is just exploding.

00:03:13.370 --> 00:03:15.590
Yeah, it's almost like a wake-up call. He said

00:03:15.590 --> 00:03:18.090
the computational demands for these new AI models,

00:03:18.389 --> 00:03:21.229
especially the agentic AI, it's something like

00:03:21.229 --> 00:03:23.430
100 times greater than what we were anticipating

00:03:23.430 --> 00:03:26.669
even a year ago. Wow. And what's driving that?

00:03:26.770 --> 00:03:29.669
Is it just the sheer size of these models? Well,

00:03:29.689 --> 00:03:31.430
that's part of it, but it's also the techniques

00:03:31.430 --> 00:03:33.129
they're using. Like he talked about chain of

00:03:33.129 --> 00:03:35.689
thought reasoning. Basically, it means the AI

00:03:35.689 --> 00:03:38.409
breaks down a problem into smaller steps, generating

00:03:38.409 --> 00:03:41.710
a ton more tokens in the process. So the smarter

00:03:41.710 --> 00:03:44.550
and more complex the AI gets, the more compute

00:03:44.550 --> 00:03:47.379
it needs. And that brings us to, well, NVIDIA's

00:03:47.379 --> 00:03:49.240
bread and butter, right? Accelerated computing,

00:03:49.680 --> 00:03:53.180
GPUs, the whole CUDA ecosystem. Absolutely. It's

00:03:53.180 --> 00:03:56.120
all about harnessing the power of parallel processing

00:03:56.120 --> 00:03:58.539
to tackle these massively complex computational

00:03:58.539 --> 00:04:02.080
tasks. And CUDA, that's NVIDIA's platform for

00:04:02.080 --> 00:04:03.939
making that happen. Okay. For those who might

00:04:03.939 --> 00:04:05.740
not be familiar with CUDA, can you give us the

00:04:05.740 --> 00:04:09.159
quick rundown? Sure. Think of CUDA as the language

00:04:09.159 --> 00:04:11.379
that lets developers write software that runs

00:04:11.379 --> 00:04:14.500
on NVIDIA GPUs. Unlike CPUs, which are good at

00:04:14.500 --> 00:04:17.560
doing a few things really fast, GPUs have thousands

00:04:17.560 --> 00:04:20.459
of smaller cores designed for massive parallel

00:04:20.459 --> 00:04:23.620
processing. So for tasks like AI training and

00:04:23.620 --> 00:04:26.680
inference, that's where GPUs really shine. And

00:04:26.680 --> 00:04:29.279
then there's all these CUDA-X libraries. It felt

00:04:29.279 --> 00:04:31.079
like they were announcing a new one every few

00:04:31.079 --> 00:04:32.879
minutes during the keynote. Can you give us a

00:04:32.879 --> 00:04:34.819
sense of what they are and why they're so important?

00:04:35.300 --> 00:04:37.720
Yeah. Think of them as specialized toolkits,

00:04:37.720 --> 00:04:40.680
right? Like they're built on top of CUDA and they're

00:04:40.680 --> 00:04:43.019
specifically optimized for accelerating different

00:04:43.019 --> 00:04:45.439
areas of science and industry. So instead of

00:04:45.439 --> 00:04:47.160
starting from scratch, developers can just plug

00:04:47.160 --> 00:04:49.509
in these libraries and boom, instant performance

00:04:49.509 --> 00:04:52.290
boost. And there's a library for almost everything,

00:04:52.389 --> 00:04:54.569
right? I mean, we saw cuNumeric for NumPy, cuLitho

00:04:54.569 --> 00:04:57.850
for chip manufacturing, Aerial for 5G. Right.

00:04:57.910 --> 00:05:00.089
And each one is addressing a very specific need.

00:05:00.209 --> 00:05:03.170
Like cuNumeric, that's a huge deal for data scientists.

00:05:03.670 --> 00:05:05.930
NumPy is like the foundation of scientific computing

00:05:05.930 --> 00:05:08.389
in Python. And cuNumeric makes it run seamlessly

00:05:08.389 --> 00:05:11.920
on GPUs with massive speedups. And cuLitho, that's

00:05:11.920 --> 00:05:13.779
the one for computational lithography, right?

00:05:13.899 --> 00:05:16.720
The process of making chips. That seemed to tie

00:05:16.720 --> 00:05:19.139
in really closely with this idea of AI factories

00:05:19.139 --> 00:05:21.379
that Jensen Huang kept talking about. Totally.

00:05:21.439 --> 00:05:24.379
His point was that every company that has a physical

00:05:24.379 --> 00:05:27.100
factory, they're also going to need an AI factory

00:05:27.100 --> 00:05:30.699
to design and optimize the processes in that

00:05:30.699 --> 00:05:33.329
physical factory. And cuLitho, it's a perfect

00:05:33.329 --> 00:05:35.949
example. It's accelerating the creation of those

00:05:35.949 --> 00:05:39.389
incredibly complex masks used to etch circuits

00:05:39.389 --> 00:05:41.970
onto silicon wafers. So it's almost like a factory

00:05:41.970 --> 00:05:44.509
within a factory, right? The AI factory powering

00:05:44.509 --> 00:05:46.810
the physical factory. Exactly. And he said it's

00:05:46.810 --> 00:05:48.209
not just chip manufacturing. It's going to be

00:05:48.209 --> 00:05:51.680
true for every industry: aerospace, automotive,

00:05:51.680 --> 00:05:54.439
pharmaceuticals, AI is becoming fundamental to

00:05:54.439 --> 00:05:56.300
the design and creation of physical products.

00:05:56.519 --> 00:05:58.319
That's a big statement. And then there was Aerial,

00:05:58.399 --> 00:06:01.220
which is enabling GPUs to basically function

00:06:01.220 --> 00:06:04.480
as 5G radios. How does that even work? It's pretty

00:06:04.480 --> 00:06:06.579
mind-blowing. Aerial is a library that allows

00:06:06.579 --> 00:06:09.019
GPUs to handle all the complex signal processing

00:06:09.019 --> 00:06:12.100
that's needed for 5G communication. And then

00:06:12.100 --> 00:06:15.160
they layer AI on top of that with AI-RAN, which

00:06:15.160 --> 00:06:17.970
stands for AI and Radio Access Networks.

00:06:17.970 --> 00:06:20.370
AI-RAN. So AI is being used to optimize the network

00:06:20.370 --> 00:06:23.689
in real time. Yeah. It can dynamically adjust

00:06:23.689 --> 00:06:26.709
things like power allocation, beamforming, all

00:06:26.709 --> 00:06:28.310
the things that affect the performance of the

00:06:28.310 --> 00:06:30.750
network. The goal is to make it more efficient,

00:06:30.930 --> 00:06:33.889
more responsive, and ultimately deliver a better

00:06:33.889 --> 00:06:36.420
user experience. That's fascinating. And it's

00:06:36.420 --> 00:06:38.800
another example of how AI is blurring the lines

00:06:38.800 --> 00:06:41.759
between hardware and software. Absolutely. And

00:06:41.759 --> 00:06:45.240
that trend continues with cuOpt, the library for

00:06:45.240 --> 00:06:47.600
numerical and mathematical optimization. Right.

00:06:47.879 --> 00:06:49.959
Optimization problems are everywhere, right?

00:06:50.019 --> 00:06:52.339
From logistics and supply chains to financial

00:06:52.339 --> 00:06:54.800
modeling. Exactly. It's all about finding the

00:06:54.800 --> 00:06:56.980
best solution out of a huge number of possibilities.

00:06:57.660 --> 00:07:00.379
And traditionally, those calculations can be

00:07:00.379 --> 00:07:03.620
extremely computationally intensive. cuOpt brings

00:07:03.620 --> 00:07:06.019
the power of GPUs to bear on those problems,

00:07:06.199 --> 00:07:08.699
significantly speeding up the process. And then

00:07:08.699 --> 00:07:10.879
there was the whole rapid fire section on all

00:07:10.879 --> 00:07:13.279
the other CUDA-X libraries, Parabricks for gene

00:07:13.279 --> 00:07:16.500
sequencing, MONAI for medical imaging, Earth-2

00:07:16.500 --> 00:07:18.399
for weather prediction. It was almost overwhelming,

00:07:18.579 --> 00:07:20.970
but in a good way. Yeah, it really showcased

00:07:20.970 --> 00:07:24.610
just how broad the impact of accelerated computing

00:07:24.610 --> 00:07:27.329
is. I mean, these libraries were enabling breakthroughs

00:07:27.329 --> 00:07:30.290
in critical areas like health care, climate science,

00:07:30.449 --> 00:07:32.990
drug discovery. It's incredible to see the potential.

00:07:33.189 --> 00:07:35.509
And they're even dipping their toes into quantum

00:07:35.509 --> 00:07:38.910
computing with cuQuantum and CUDA Quantum. They

00:07:38.910 --> 00:07:41.670
even had their first quantum day at GTC, which

00:07:41.670 --> 00:07:44.199
is pretty forward-looking. Definitely. Quantum

00:07:44.199 --> 00:07:47.019
computing is still early days, but NVIDIA is

00:07:47.019 --> 00:07:49.699
clearly taking it seriously. These libraries

00:07:49.699 --> 00:07:51.959
are providing tools for researchers to explore

00:07:51.959 --> 00:07:54.959
and develop quantum algorithms and to start integrating

00:07:54.959 --> 00:07:57.819
quantum processors with classical GPUs in hybrid

00:07:57.819 --> 00:08:00.779
systems. So NVIDIA is laying the groundwork for

00:08:00.779 --> 00:08:02.879
a future where quantum and classical computing

00:08:02.879 --> 00:08:05.399
work together. Exactly. And then there was cuDSS,

00:08:05.399 --> 00:08:07.819
the big announcement for sparse solvers

00:08:07.819 --> 00:08:10.860
in CAE, which is computer-aided engineering.

00:08:11.279 --> 00:08:13.079
Right. That one seemed to generate a lot of buzz.

00:08:13.259 --> 00:08:15.920
What's so significant about it? Well, sparse solvers,

00:08:15.980 --> 00:08:17.800
they're used in all sorts of engineering simulations,

00:08:17.800 --> 00:08:21.259
like how a bridge will withstand stress or how

00:08:21.259 --> 00:08:23.379
a new airplane design will behave in flight.

00:08:23.620 --> 00:08:26.959
And until recently, NVIDIA themselves were using

00:08:26.959 --> 00:08:29.180
general purpose computers to design their own

00:08:29.180 --> 00:08:31.519
accelerated computing hardware. Wow. So they

00:08:31.519 --> 00:08:34.399
were essentially using slower tools to design

00:08:34.399 --> 00:08:38.419
faster tools. Exactly. But with cuDSS, they've

00:08:38.419 --> 00:08:41.600
optimized those solvers for CUDA, which means they

00:08:41.600 --> 00:08:44.820
can run on GPUs. So now the entire process of

00:08:44.820 --> 00:08:47.100
engineering design, it's getting supercharged.

00:08:47.440 --> 00:08:50.039
Faster simulations, more complex designs. It's

00:08:50.039 --> 00:08:52.039
a big leap forward. And they also touched on

00:08:52.039 --> 00:08:55.139
cuDF for data frames and Warp for physics simulations

00:08:55.139 --> 00:08:57.059
in Python. I mean, the list just goes on and

00:08:57.059 --> 00:08:58.919
on. And that's the key takeaway, right? They're

00:08:58.919 --> 00:09:01.759
building this incredibly rich ecosystem of software

00:09:01.759 --> 00:09:05.639
tools, all powered by CUDA and GPUs. And because

00:09:05.639 --> 00:09:09.100
CUDA is so widely adopted, these tools are immediately

00:09:09.100 --> 00:09:11.620
valuable to millions of developers across all

00:09:11.620 --> 00:09:14.000
these different fields. It's why NVIDIA believes

00:09:14.000 --> 00:09:15.820
we've reached this tipping point of accelerated

00:09:15.820 --> 00:09:18.500
computing. Before we move on from CUDA, I wanted

00:09:18.500 --> 00:09:20.960
to touch on Halos, which is their framework for

00:09:20.960 --> 00:09:23.399
automotive safety. They didn't spend a ton of

00:09:23.399 --> 00:09:25.639
time on it, but given the growing importance

00:09:25.639 --> 00:09:28.080
of self-driving cars, it seems pretty crucial.

00:09:28.419 --> 00:09:31.120
Oh, absolutely. Halos, it's their way of making

00:09:31.120 --> 00:09:34.309
sure that self-driving cars are, well... safe.

00:09:34.570 --> 00:09:37.129
They're doing rigorous testing on every line

00:09:37.129 --> 00:09:39.629
of code, which is millions of lines, by the way.

00:09:39.750 --> 00:09:42.370
And they're building in principles of diversity,

00:09:42.750 --> 00:09:46.070
transparency and explainability. They even dedicated

00:09:46.070 --> 00:09:48.429
a whole workshop at GTC to it. So they're not

00:09:48.429 --> 00:09:50.409
just focused on the performance of self-driving

00:09:50.409 --> 00:09:53.029
systems, but also on their safety and trustworthiness.

00:09:53.090 --> 00:09:55.090
Exactly. It's not enough for these systems to

00:09:55.090 --> 00:09:57.110
just work. They have to work safely and predictably.

00:09:57.230 --> 00:09:59.230
OK, so we've covered the foundational layer,

00:09:59.389 --> 00:10:01.710
the software tools. Now let's get into the hardware.

00:10:02.450 --> 00:10:04.889
NVIDIA announced their new Blackwell architecture,

00:10:05.269 --> 00:10:07.149
and the big news was that they're essentially

00:10:07.149 --> 00:10:10.610
putting two GPUs in a single package. What's

00:10:10.610 --> 00:10:12.330
the thinking behind that? Well, it's all about

00:10:12.330 --> 00:10:14.730
increasing compute density, right? Putting more

00:10:14.730 --> 00:10:17.730
processing power into a smaller space. But the

00:10:17.730 --> 00:10:19.889
really interesting part is the disaggregated

00:10:19.889 --> 00:10:22.570
NVLink switch, which they're calling NVLink now.

00:10:22.929 --> 00:10:25.230
NVLink. So what does that mean and why is it

00:10:25.230 --> 00:10:28.600
important? So NVLink, that's NVIDIA's high-speed

00:10:28.600 --> 00:10:30.980
interconnect that allows GPUs to communicate

00:10:30.980 --> 00:10:33.879
with each other really fast. Before, it was integrated

00:10:33.879 --> 00:10:37.000
onto the motherboard. But now with NVLink, they've

00:10:37.000 --> 00:10:39.179
made it a separate switch. And that gives them

00:10:39.179 --> 00:10:41.419
a lot more flexibility. Flexibility in what way?

00:10:41.600 --> 00:10:44.200
Well, they can connect more GPUs together with

00:10:44.200 --> 00:10:46.779
higher bandwidth and with more simultaneous communication

00:10:46.779 --> 00:10:50.220
pathways. And that's crucial for scaling up these

00:10:50.220 --> 00:10:53.529
massive AI models. So MVLink is all about allowing

00:10:53.529 --> 00:10:55.610
the GPUs to talk to each other more efficiently.

00:10:55.830 --> 00:10:58.990
Exactly. And that, combined with the move towards

00:10:58.990 --> 00:11:01.590
liquid-cooled data centers, means they can pack

00:11:01.590 --> 00:11:03.929
a lot more compute into a single rack. I mean,

00:11:03.950 --> 00:11:05.809
they talked about achieving one exaflops in a

00:11:05.809 --> 00:11:08.669
single rack. One exaflops? That's mind-boggling.

00:11:08.669 --> 00:11:10.970
I mean, what does that even translate to in practical

00:11:10.970 --> 00:11:13.669
terms? It's an astronomical amount of computing

00:11:13.669 --> 00:11:16.529
power. And it's all made possible by liquid cooling,

00:11:16.919 --> 00:11:19.000
which is much more efficient at removing heat

00:11:19.000 --> 00:11:21.580
than traditional air cooling. So they're pushing

00:11:21.580 --> 00:11:24.340
the limits of both chip design and thermal management

00:11:24.340 --> 00:11:27.179
to achieve these massive scales. Absolutely.

00:11:27.399 --> 00:11:29.620
And it's not just about training these AI models.

00:11:29.940 --> 00:11:31.919
They also talked a lot about the importance of

00:11:31.919 --> 00:11:35.279
inference, which is actually using the trained

00:11:35.279 --> 00:11:37.820
models to generate outputs. Right. That's where

00:11:37.820 --> 00:11:39.879
the rubber meets the road, right? It's what actually

00:11:39.879 --> 00:11:42.720
delivers value to users. Exactly. And inference,

00:11:42.960 --> 00:11:45.559
especially for these new reasoning-based AI

00:11:45.559 --> 00:11:48.509
models, it can be incredibly computationally

00:11:48.509 --> 00:11:50.429
demanding. I mean, they gave this example of

00:11:50.429 --> 00:11:53.070
a wedding seating problem where a reasoning model

00:11:53.070 --> 00:11:56.509
used over 8,000 tokens compared to just under

00:11:56.509 --> 00:11:59.669
500 for a basic language model. Wow. So reasoning

00:11:59.669 --> 00:12:02.529
requires a lot more token generation, which translates

00:12:02.529 --> 00:12:05.669
to more compute. Exactly. And users expect those

00:12:05.669 --> 00:12:07.190
outputs quickly, right? They don't want to be

00:12:07.190 --> 00:12:09.350
waiting around. So you need a really powerful

00:12:09.350 --> 00:12:12.460
infrastructure to handle all that. And of course,

00:12:12.460 --> 00:12:14.500
these models are so big that they have to be

00:12:14.500 --> 00:12:18.220
split across multiple GPUs. They mentioned techniques

00:12:18.220 --> 00:12:21.440
like tensor parallelism, pipeline parallelism,

00:12:21.539 --> 00:12:24.840
expert parallelism. Can you give us a quick overview

00:12:24.840 --> 00:12:27.240
of what those are? Yeah, they're basically different

00:12:27.240 --> 00:12:29.600
ways of breaking down the model and its computations

00:12:29.600 --> 00:12:32.279
so that it can be distributed across multiple

00:12:32.279 --> 00:12:35.429
GPUs working together. Each technique has its

00:12:35.429 --> 00:12:38.490
own advantages and challenges, and the best approach

00:12:38.490 --> 00:12:41.950
often depends on the specific model and the hardware

00:12:41.950 --> 00:12:44.269
you're using. So it's a complex orchestration

00:12:44.269 --> 00:12:46.110
problem trying to manage all these different

00:12:46.110 --> 00:12:48.610
pieces. Totally. And that's where NVIDIA Dynamo

00:12:48.610 --> 00:12:50.649
comes in. Yeah. They're describing it as the

00:12:50.649 --> 00:12:53.470
operating system for these AI factories. An operating

00:12:53.470 --> 00:12:55.830
system for AI. What does that mean? Basically,

00:12:55.830 --> 00:12:58.029
it handles all the low-level stuff like scheduling

00:12:58.029 --> 00:13:01.330
workloads, routing data between GPUs, optimizing

00:13:01.330 --> 00:13:04.149
the distribution of compute resources. It's like

00:13:04.149 --> 00:13:06.210
the conductor of the orchestra, making sure everything

00:13:06.210 --> 00:13:08.370
runs smoothly. And they're making it open source,

00:13:08.429 --> 00:13:10.649
right? So other companies can build on top of

00:13:10.649 --> 00:13:12.809
it. Exactly. And they're collaborating with companies

00:13:12.809 --> 00:13:15.250
like Perplexity to really build out this ecosystem

00:13:15.250 --> 00:13:18.000
around Dynamo. OK, so we've got the hardware,

00:13:18.159 --> 00:13:20.500
we've got the software infrastructure. Now, what

00:13:20.500 --> 00:13:22.980
about the actual performance gains? I mean, they

00:13:22.980 --> 00:13:25.679
were throwing around some impressive numbers.

00:13:25.980 --> 00:13:28.220
Yeah, they showed Blackwell with NVLink 72 and

00:13:28.220 --> 00:13:30.740
Dynamo delivering up to 40 times faster token

00:13:30.740 --> 00:13:33.940
generation compared to Hopper, especially for

00:13:33.940 --> 00:13:36.659
those reasoning models. And the iso-power efficiency

00:13:36.659 --> 00:13:40.539
gains were around 25x. Iso-power. Can you explain

00:13:40.539 --> 00:13:42.360
what that means? Sure. Basically, it means they're

00:13:42.360 --> 00:13:44.980
getting 25 times the performance for the same

00:13:44.980 --> 00:13:47.019
amount of power consumption. So it's not just

00:13:47.019 --> 00:13:49.159
about raw speed. It's about efficiency, too.

00:13:49.240 --> 00:13:51.240
Right. Which is really important for data centers

00:13:51.240 --> 00:13:53.799
where power is a major cost factor. Absolutely.

00:13:54.059 --> 00:13:55.679
And then there was this whole discussion about

00:13:55.679 --> 00:13:59.559
digital twins and the role of NVIDIA Omniverse

00:13:59.559 --> 00:14:02.559
in designing and managing these AI factories.

00:14:02.759 --> 00:14:05.220
Can you unpack that a bit? What are digital twins

00:14:05.220 --> 00:14:07.820
and why are they important in this context? A

00:14:07.820 --> 00:14:10.460
digital twin, it's basically a virtual replica

00:14:10.460 --> 00:14:13.149
of a physical system. So in this case, it would

00:14:13.149 --> 00:14:16.129
be a virtual representation of the entire AI

00:14:16.129 --> 00:14:19.789
infrastructure, the servers, the GPUs, the networking,

00:14:19.990 --> 00:14:22.370
the cooling systems. And the idea is that you

00:14:22.370 --> 00:14:24.769
can use this digital twin to simulate different

00:14:24.769 --> 00:14:27.990
scenarios, optimize performance, and even identify

00:14:27.990 --> 00:14:30.289
potential problems before they happen in the

00:14:30.289 --> 00:14:32.539
real world. So it's like having a virtual test

00:14:32.539 --> 00:14:35.379
bed for your entire AI factory. Exactly. And

00:14:35.379 --> 00:14:38.639
NVIDIA Omniverse, that's their platform for creating

00:14:38.639 --> 00:14:41.059
and running these digital twins. They're partnering

00:14:41.059 --> 00:14:43.299
with companies like Vertiv and Schneider Electric,

00:14:43.539 --> 00:14:45.759
who are experts in data center infrastructure,

00:14:46.059 --> 00:14:48.639
to make these digital twins as realistic and

00:14:48.639 --> 00:14:50.820
useful as possible. OK, so they're really thinking

00:14:50.820 --> 00:14:53.100
holistically about the entire AI infrastructure

00:14:53.100 --> 00:14:55.840
from the chips to the software to the physical

00:14:55.840 --> 00:14:58.120
data center itself. Totally. And then there's

00:14:58.120 --> 00:15:00.279
the networking piece, which is critical for scaling

00:15:00.279 --> 00:15:02.600
these systems. They talked about their Spectrum

00:15:02.600 --> 00:15:05.279
X Ethernet solution and how Cisco is adopting

00:15:05.279 --> 00:15:07.799
it for enterprise AI. Cisco is a big name in

00:15:07.799 --> 00:15:10.179
networking. What's significant about this partnership?

00:15:10.559 --> 00:15:13.200
Well, it means that high performance networking

00:15:13.200 --> 00:15:17.149
for AI is becoming more mainstream. Cisco's

00:15:17.149 --> 00:15:19.629
putting Spectrum-X into their enterprise products,

00:15:19.929 --> 00:15:22.269
making it more accessible for companies that

00:15:22.269 --> 00:15:24.830
want to deploy AI at scale. And they also talked

00:15:24.830 --> 00:15:27.929
about the limitations of traditional copper interconnects

00:15:27.929 --> 00:15:30.110
at these massive scales. What's the alternative?

00:15:30.730 --> 00:15:33.309
They're betting on silicon photonics, which is

00:15:33.309 --> 00:15:36.669
essentially using light to transmit data instead

00:15:36.669 --> 00:15:38.889
of electrical signals. It offers much higher

00:15:38.889 --> 00:15:41.450
bandwidth, lower latency and lower power consumption.

00:15:41.750 --> 00:15:44.639
So it's like fiber optic cables on a chip. Yeah,

00:15:44.700 --> 00:15:47.600
sort of. They're co-packaging the optical transceivers

00:15:47.600 --> 00:15:49.720
directly with the processing chips, which is

00:15:49.720 --> 00:15:51.980
a pretty big engineering feat. And they highlighted

00:15:51.980 --> 00:15:54.500
this new type of modulator called a micro ring

00:15:54.500 --> 00:15:57.080
resonator or MRM. What's the advantage of that?

00:15:57.500 --> 00:16:00.629
MRMs are much smaller and more power efficient

00:16:00.629 --> 00:16:03.230
than traditional modulators, so they can pack

00:16:03.230 --> 00:16:05.370
more optical connections into the same space.

00:16:05.850 --> 00:16:08.330
It's a big step towards making silicon photonics

00:16:08.330 --> 00:16:11.090
practical for large scale data centers. OK, so

00:16:11.090 --> 00:16:12.909
they're laying the groundwork for the next generation

00:16:12.909 --> 00:16:15.850
of high performance networking. Now let's get

00:16:15.850 --> 00:16:18.509
to the roadmap. They talked about Blackwell Ultra

00:16:18.509 --> 00:16:21.389
coming in the second half of 2025, followed by

00:16:21.389 --> 00:16:24.529
Vera Rubin in the second half of 2026. What's

00:16:24.529 --> 00:16:27.149
the big deal about Rubin? Rubin is a complete

00:16:27.149 --> 00:16:30.139
architectural overhaul. They're introducing a

00:16:30.139 --> 00:16:34.980
new CPU, a new GPU, a new networking fabric called

00:16:34.980 --> 00:16:40.100
NVLink 144, and the latest HBM4 memory. It's

00:16:40.100 --> 00:16:42.279
like a whole new generation of their high-performance

00:16:42.279 --> 00:16:44.820
computing platform. And then there's Rubin Ultra

00:16:44.820 --> 00:16:47.840
in 2027, which promises even more extreme

00:16:47.840 --> 00:16:50.460
scale-up. They were talking about 15 exaflops of compute

00:16:50.460 --> 00:16:53.399
and 4,600 terabytes per second of bandwidth.

00:16:53.620 --> 00:16:55.799
Those are just insane numbers. Yeah, it's clear

00:16:55.799 --> 00:16:57.720
they're not slowing down. And they showed how

00:16:57.720 --> 00:16:59.659
the total cost of ownership is actually going

00:16:59.659 --> 00:17:02.740
down with each generation, even as the performance

00:17:02.740 --> 00:17:05.460
is skyrocketing. So these incredibly powerful

00:17:05.460 --> 00:17:07.539
capabilities are becoming more and more accessible.

00:17:07.960 --> 00:17:09.660
That's an important point. It's not just about

00:17:09.660 --> 00:17:11.400
pushing the limits of what's possible. It's about

00:17:11.400 --> 00:17:13.619
making those capabilities affordable and practical

00:17:13.619 --> 00:17:16.819
for a wider range of users. Exactly. And then

00:17:16.819 --> 00:17:19.220
they shifted gears to talk about AI for enterprise

00:17:19.220 --> 00:17:22.180
computing. They announced these new DGX systems,

00:17:22.440 --> 00:17:25.380
DGX Spark, which is like a personal AI development

00:17:25.380 --> 00:17:29.400
platform, and DGX Station, which is a super-powered

00:17:29.400 --> 00:17:31.220
workstation. What's the target market for those?

00:17:31.420 --> 00:17:32.980
I mean, who needs that much power at their desk?

00:17:33.160 --> 00:17:35.740
Well, DGX Spark, that's aimed at individual researchers

00:17:35.740 --> 00:17:38.160
and developers who want a dedicated AI development

00:17:38.160 --> 00:17:40.420
environment. It's powerful enough for serious

00:17:40.420 --> 00:17:43.079
work, but still relatively compact and affordable.

00:17:44.519 --> 00:17:46.920
DGX Station, that's for the real power users, people doing

00:17:46.920 --> 00:17:49.440
cutting-edge AI research or developing really

00:17:49.440 --> 00:17:51.920
demanding applications. It's like having a mini

00:17:51.920 --> 00:17:54.400
supercomputer on your desk. And they talked about

00:17:54.400 --> 00:17:57.160
reinventing storage for AI. What does that mean?

00:17:57.359 --> 00:17:59.460
It's all about moving away from traditional

00:17:59.460 --> 00:18:01.599
file-based storage to something that's more intelligent

00:18:01.599 --> 00:18:03.619
and aware of the meaning of the data. They're

00:18:03.619 --> 00:18:06.160
calling it semantics-based retrieval, and they're

00:18:06.160 --> 00:18:08.160
partnering with all the major storage vendors

00:18:08.160 --> 00:18:11.500
to make it happen. So it's about making the storage

00:18:11.500 --> 00:18:14.579
itself more AI-aware. Exactly. And then there's

00:18:14.579 --> 00:18:16.960
the open sourcing of their reasoning model as

00:18:16.960 --> 00:18:19.420
part of their NIM initiative. What's the significance

00:18:19.420 --> 00:18:22.869
of that? It's about democratizing access to

00:18:22.869 --> 00:18:26.089
these advanced AI capabilities. By making the

00:18:26.089 --> 00:18:29.069
model open source, they're allowing anyone to

00:18:29.069 --> 00:18:31.210
use it and build on top of it. So they're not

00:18:31.210 --> 00:18:32.849
just building the tools, they're also making

00:18:32.849 --> 00:18:35.190
the core technologies more accessible. Exactly.

00:18:35.410 --> 00:18:37.529
And finally, there was the big push into robotics.

00:18:38.190 --> 00:18:41.069
Jensen Huang declared that the time has come for

00:18:41.069 --> 00:18:44.109
robots, and they laid out their vision for how

00:18:44.109 --> 00:18:47.220
AI is going to revolutionize this field. They

00:18:47.220 --> 00:18:49.680
talked about this continuous loop of simulation,

00:18:50.099 --> 00:18:53.039
training, testing, and real-world experience.

00:18:53.259 --> 00:18:55.220
Can you walk us through that? It's about using

00:18:55.220 --> 00:18:57.880
simulation to generate massive amounts of data

00:18:57.880 --> 00:19:01.990
for training robot AI, then testing those AI

00:19:01.990 --> 00:19:04.569
policies in virtual environments before deploying

00:19:04.569 --> 00:19:07.670
them to real-world robots. Right. And then the

00:19:07.670 --> 00:19:09.529
data from the real world feeds back into the

00:19:09.529 --> 00:19:11.910
simulation, creating this continuous cycle of

00:19:11.910 --> 00:19:13.789
improvement. So it's like a virtual proving ground

00:19:13.789 --> 00:19:16.609
for robot AI. Exactly. And they announced these

00:19:16.609 --> 00:19:19.809
new tools to make that happen. Cosmos for generating

00:19:19.809 --> 00:19:22.250
synthetic data. Newton for physics simulation.

00:19:23.210 --> 00:19:25.390
And then there was a big announcement of Isaac

00:19:25.390 --> 00:19:28.569
GR00T N1, their generalist foundation model for

00:19:28.569 --> 00:19:30.829
humanoid robots. And they're open sourcing that

00:19:30.829 --> 00:19:33.150
as well, right? Yes, which is huge. It's like

00:19:33.150 --> 00:19:36.809
giving the entire robotics community access to

00:19:36.809 --> 00:19:39.890
this incredibly powerful starting point for building

00:19:39.890 --> 00:19:43.349
their own robots. So it's clear that NVIDIA is

00:19:43.759 --> 00:19:46.160
taking a very comprehensive and forward-looking

00:19:46.160 --> 00:19:49.279
approach to robotics. Totally. And that's really

00:19:49.279 --> 00:19:51.259
the takeaway from the entire keynote, right?

00:19:51.420 --> 00:19:53.299
They're not just focused on one piece of the

00:19:53.299 --> 00:19:55.019
puzzle. They're thinking about the whole stack,

00:19:55.220 --> 00:19:57.619
from the hardware to the software to the applications,

00:19:57.900 --> 00:19:59.839
and they're pushing the boundaries in every direction.

00:20:00.269 --> 00:20:02.210
It's an incredibly exciting time to be following

00:20:02.210 --> 00:20:04.329
this field. I mean, the pace of innovation is

00:20:04.329 --> 00:20:06.970
just relentless. It is. And it's not just about

00:20:06.970 --> 00:20:08.930
the technology itself. It's about the impact

00:20:08.930 --> 00:20:11.609
it's going to have on our lives. I mean, AI is

00:20:11.609 --> 00:20:13.690
going to transform everything from the way we

00:20:13.690 --> 00:20:15.930
work to the way we interact with the world around

00:20:15.930 --> 00:20:19.289
us. So for our listeners, the key takeaway is

00:20:19.289 --> 00:20:21.990
that this stuff matters. Even if you're not a

00:20:21.990 --> 00:20:25.210
developer or a data scientist, AI and accelerated

00:20:25.210 --> 00:20:27.509
computing are going to shape the future. And

00:20:27.509 --> 00:20:29.410
understanding the trends and the key players

00:20:29.609 --> 00:20:31.849
is going to be increasingly important. Absolutely.

00:20:32.289 --> 00:20:35.549
This keynote, it was a glimpse into that future,

00:20:35.670 --> 00:20:38.069
and it's clear that NVIDIA is playing a central

00:20:38.069 --> 00:20:40.130
role in shaping it. Well, on that note, we'll

00:20:40.130 --> 00:20:42.230
wrap things up here. As always, thanks for joining

00:20:42.230 --> 00:20:44.670
us for another deep dive. We'll be back soon

00:20:44.670 --> 00:20:47.410
to dissect another fascinating topic. Until then,

00:20:47.509 --> 00:20:50.369
keep exploring, keep learning, and stay curious.

00:20:50.430 --> 00:20:51.230
See you next time.