WEBVTT

00:00:00.000 --> 00:00:04.599
We are at a critical juncture. AI is moving very

00:00:04.599 --> 00:00:06.759
aggressively out of the cloud and into the physical

00:00:06.759 --> 00:00:09.660
world. Right. It's in our factories, our cars,

00:00:09.679 --> 00:00:12.140
and even in your medicine cabinet. It's what

00:00:12.140 --> 00:00:15.820
NVIDIA's CEO called the ChatGPT moment for physical

00:00:15.820 --> 00:00:19.140
AI. And that's not just an upgrade. This shift,

00:00:19.300 --> 00:00:21.719
it really changes everything about how we live

00:00:21.719 --> 00:00:24.519
and work. Welcome back to the Deep Dive. Today,

00:00:24.640 --> 00:00:27.320
we're unpacking a huge stack of sources, all

00:00:27.320 --> 00:00:30.539
coming out of CES 2026. And they're all focused

00:00:30.539 --> 00:00:34.079
on that one intersection, AI meeting the real

00:00:34.079 --> 00:00:37.240
physical world. Our mission today is to guide

00:00:37.240 --> 00:00:39.020
you through these massive breakthroughs. We're

00:00:39.020 --> 00:00:41.140
talking autonomous machines, the new chips that

00:00:41.140 --> 00:00:43.299
power them. And some really unexpected stuff,

00:00:43.479 --> 00:00:45.640
like how ChatGPT is being used in healthcare

00:00:45.640 --> 00:00:48.539
and the controversies that are already popping

00:00:48.539 --> 00:00:51.280
up. Okay, let's get into it. To understand this

00:00:51.280 --> 00:00:53.439
physical AI revolution, we had to start with

00:00:53.439 --> 00:00:56.679
the hardware, with the muscle and the mind. And

00:00:56.679 --> 00:00:58.899
that means NVIDIA. Yeah, Jensen Huang really

00:00:58.899 --> 00:01:01.659
set the tone. His big claim was the ChatGPT

00:01:01.659 --> 00:01:04.379
moment for physical AI is here. That's a bold

00:01:04.379 --> 00:01:06.099
statement. What does he mean by that exactly?

00:01:06.319 --> 00:01:08.500
He's trying to put physical automation on the

00:01:08.500 --> 00:01:10.799
same level as, you know, the huge productivity

00:01:10.799 --> 00:01:13.620
jump we all saw with language models. So if we

00:01:13.620 --> 00:01:16.260
strip away the marketing hype, what's a simple

00:01:16.260 --> 00:01:19.159
way to define physical AI? It's machines that

00:01:19.159 --> 00:01:22.700
don't just process data. They see, they reason

00:01:22.700 --> 00:01:25.040
about what they see, and then they act in the

00:01:25.040 --> 00:01:28.480
real world. So we're moving past. This is code

00:01:28.480 --> 00:01:31.000
that's actually running a factory or driving

00:01:31.000 --> 00:01:33.980
a car down the highway. Exactly. And the star

00:01:33.980 --> 00:01:36.099
of their show was this system called Alpamayo.

00:01:36.280 --> 00:01:38.700
Right. They called it the world's first thinking,

00:01:38.859 --> 00:01:42.140
reasoning, autonomous vehicle AI. That language.

00:01:42.620 --> 00:01:45.359
Thinking and reasoning. Is that real or is it

00:01:45.359 --> 00:01:47.299
just a good tagline? I think it's a legitimate

00:01:47.299 --> 00:01:49.579
leap. It's all about something called end-to-end

00:01:49.579 --> 00:01:52.260
learning. Okay. So with old autonomous driving

00:01:52.260 --> 00:01:54.760
systems, you had all these separate steps. The

00:01:54.760 --> 00:01:57.040
car sees, then it builds a map, then it identifies

00:01:57.040 --> 00:01:59.620
objects, then it plans a path. It's like a slow

00:01:59.620 --> 00:02:02.719
step-by-step assembly line. Right. And any

00:02:02.719 --> 00:02:05.180
error in one step just messes up everything that

00:02:05.180 --> 00:02:08.620
comes after. It skips a lot of that. It takes

00:02:08.620 --> 00:02:10.580
the raw video from the cameras and translates

00:02:10.580 --> 00:02:12.719
it directly into driving actions. Oh, it's learning

00:02:12.719 --> 00:02:15.020
more like a human does. You see something, you

00:02:15.020 --> 00:02:18.060
react. Precisely. It's much faster, more fluid.

00:02:18.199 --> 00:02:20.939
It can handle new situations it's never seen

00:02:20.939 --> 00:02:23.080
before because it's not just following a rigid

00:02:23.080 --> 00:02:25.319
set of rules. And they showed this in action.

00:02:25.479 --> 00:02:27.909
The sources mentioned a partnership with Mercedes-Benz,

00:02:27.909 --> 00:02:31.169
with a demo of the car handling tricky

00:02:31.169 --> 00:02:33.650
turns and even avoiding pedestrians. But that

00:02:33.650 --> 00:02:36.289
powerful mind needs the right hardware to run

00:02:36.289 --> 00:02:38.569
on. And that's where their new platform, Rubin,

00:02:38.590 --> 00:02:43.050
comes in. So Rubin, six new chips. What's the

00:02:43.050 --> 00:02:45.650
core idea here? What makes this a revolution

00:02:45.650 --> 00:02:49.110
and not just, you know, the next model up? Scalability.

00:02:49.250 --> 00:02:51.509
That's the whole game. The problem before was

00:02:51.509 --> 00:02:53.169
that the computers you needed for this kind of

00:02:53.169 --> 00:02:55.590
AI were too big, too hot. They used way too much

00:02:55.590 --> 00:02:59.020
power. Not practical for a million cars. Not

00:02:59.020 --> 00:03:00.960
at all. You can't put a data center in a Toyota.

00:03:01.240 --> 00:03:03.259
Rubin is designed to get all that thinking power

00:03:03.259 --> 00:03:05.520
into a much smaller, cheaper, and more efficient

00:03:05.520 --> 00:03:08.060
package. So it makes autonomous machines not

00:03:08.060 --> 00:03:10.819
just possible, but actually scalable across the

00:03:10.819 --> 00:03:13.039
whole economy. Exactly. And it's not just for

00:03:13.039 --> 00:03:15.020
cars. They made that super clear. Yeah, they

00:03:15.020 --> 00:03:17.379
brought the BD-1 droids from Star Wars out on

00:03:17.379 --> 00:03:19.460
stage. Which was a great visual, right? It's

00:03:19.460 --> 00:03:21.340
telling everyone this is for enterprise first.

00:03:21.560 --> 00:03:24.159
Think factory bots, delivery drones, construction

00:03:24.159 --> 00:03:26.599
equipment. So the business case is clear there.

00:03:26.800 --> 00:03:29.680
Much clearer. Sources are saying that while we

00:03:29.680 --> 00:03:32.939
might see some robotaxi pilots in 2027, these

00:03:32.939 --> 00:03:35.699
enterprise robots are going to be adopted way,

00:03:35.800 --> 00:03:38.240
way sooner. We have to talk about the human side

00:03:38.240 --> 00:03:40.599
of this, though. If this takes off, we're looking

00:03:40.599 --> 00:03:42.979
at millions of jobs being displaced. Driving,

00:03:43.120 --> 00:03:46.020
logistics, labor unions are definitely watching

00:03:46.020 --> 00:03:48.659
this. It's the critical, uncomfortable truth

00:03:48.659 --> 00:03:51.759
under all the cool tech. The scale of job replacement

00:03:51.759 --> 00:03:54.969
could be historic. You know, I still wrestle

00:03:54.969 --> 00:03:57.710
with prompt drift myself when asking AI for basic

00:03:57.710 --> 00:04:00.310
tasks. The idea of trusting that same kind of

00:04:00.310 --> 00:04:02.530
intelligence with an entire fleet of trucks,

00:04:02.729 --> 00:04:06.270
that's a huge leap of faith. So if the hardware

00:04:06.270 --> 00:04:08.789
is ready and the software can reason, what's

00:04:08.789 --> 00:04:11.349
the realistic timeline for seeing mass adoption

00:04:11.349 --> 00:04:14.669
of these automated enterprise systems? Enterprise

00:04:14.669 --> 00:04:17.350
robots will arrive faster than consumer robotaxis,

00:04:17.569 --> 00:04:21.319
probably before 2027. Okay, so zooming out from

00:04:21.319 --> 00:04:25.519
just NVIDIA. CES 2026 really showed this was

00:04:25.519 --> 00:04:29.519
the theme everywhere. Amazon, Google, Lego. Everyone

00:04:29.519 --> 00:04:32.079
is focused on AI meeting the physical world.

00:04:32.079 --> 00:04:34.839
Right. This is edge AI becoming a reality. The

00:04:34.839 --> 00:04:36.860
processing is happening right there on the device,

00:04:36.860 --> 00:04:39.800
not in some distant data center. Exactly. The intelligence

00:04:39.800 --> 00:04:42.500
is local. It's in the forklift. It's on your laptop.

00:04:42.500 --> 00:04:45.160
It's on a ring on your finger. Let's talk

00:04:45.160 --> 00:04:46.800
about some of those specific examples, because

00:04:46.800 --> 00:04:49.120
they really show how broad this is. Oh, the scope

00:04:49.120 --> 00:04:51.579
was amazing. You saw robot forklifts just zipping

00:04:51.579 --> 00:04:54.259
around warehouses on their own. And AI-powered

00:04:54.259 --> 00:04:57.000
Lego sets that, you know, use computer vision to

00:04:57.000 --> 00:04:58.879
give you custom instructions based on the bricks

00:04:58.879 --> 00:05:00.879
you have. And it's hitting consumer devices in

00:05:00.879 --> 00:05:04.319
a big way, too. A huge way. HP refreshed their

00:05:04.319 --> 00:05:07.939
OmniBook laptops with a ton of local AI power.

00:05:08.220 --> 00:05:10.660
This means it can do things like complex video

00:05:10.660 --> 00:05:13.420
editing or live transcription without needing

00:05:13.420 --> 00:05:15.680
to be connected to the Internet. The power is

00:05:15.680 --> 00:05:17.860
moving inside the machine. The one that really

00:05:17.860 --> 00:05:19.600
stood out to me, though, was the smart ring.

00:05:20.009 --> 00:05:22.610
The AI smart ring. Yeah. It has enough local

00:05:22.610 --> 00:05:25.269
compute to listen to your work meetings and summarize

00:05:25.269 --> 00:05:28.550
them for you in real time. An always-on,

00:05:28.550 --> 00:05:31.170
always-listening synthesizer on your hand. You can't

00:05:31.170 --> 00:05:33.629
get more edge than that. It's the ultimate localized

00:05:33.629 --> 00:05:37.269
integration. Whoa. Just imagine scaling that

00:05:37.269 --> 00:05:39.910
single AI architecture. It's managing a factory

00:05:39.910 --> 00:05:42.490
floor, then the logistics network, and now it's

00:05:42.490 --> 00:05:45.189
in all our gadgets. The scale of that is... It's

00:05:45.189 --> 00:05:47.170
just incredible. So looking across all these

00:05:47.170 --> 00:05:49.310
new devices, which one really holds the most

00:05:49.310 --> 00:05:53.209
surprising local AI power? The AI smart ring

00:05:53.209 --> 00:05:55.730
listening to meetings suggests maximum localized

00:05:55.730 --> 00:05:57.889
integration. All right, let's shift from hardware

00:05:57.889 --> 00:06:00.550
to the models themselves. Yeah. And the regulatory

00:06:00.550 --> 00:06:02.350
challenges that are coming with them. Yeah, on

00:06:02.350 --> 00:06:04.149
the image generation side, there was a big win

00:06:04.149 --> 00:06:06.470
for Alibaba. With their new model, Qwen-Image-2512.

00:06:06.470 --> 00:06:09.589
Right. And it's outperforming Google's

00:06:09.589 --> 00:06:12.670
Nano Banana Pro in quality tests. What's the

00:06:12.670 --> 00:06:15.220
big deal there? Beyond just making nicer pictures,

00:06:15.420 --> 00:06:18.339
what problem did it actually solve? Two big ones.

00:06:18.699 --> 00:06:21.519
First, it gets rid of that weird, uncanny AI

00:06:21.519 --> 00:06:24.459
look that a lot of generated images have. But

00:06:24.459 --> 00:06:26.959
second, and this is huge for anyone using it

00:06:26.959 --> 00:06:30.180
commercially, it can finally handle text perfectly.

00:06:30.500 --> 00:06:32.620
It's been a major hurdle. Why was it so hard

00:06:32.620 --> 00:06:35.930
for AIs to just... spell words correctly inside

00:06:35.930 --> 00:06:38.769
an image. It's about precision. The models understood

00:06:38.769 --> 00:06:41.610
the whole scene, the vibe, but they struggled

00:06:41.610 --> 00:06:44.930
to arrange pixels into stable, perfect letters.

00:06:45.370 --> 00:06:47.610
Qwen seems to have finally cracked that problem.

00:06:47.790 --> 00:06:49.949
So that's a clear technical win. Yeah. But then

00:06:49.949 --> 00:06:51.839
you have the other side of the coin. Grok from

00:06:51.839 --> 00:06:55.060
xAI, which caused a huge regulatory firestorm.

00:06:55.220 --> 00:06:57.180
Oh, yeah. That was the big controversy of the

00:06:57.180 --> 00:06:59.139
week. Its edit image feature was immediately

00:06:59.139 --> 00:07:01.560
used by people to create some really disturbing,

00:07:01.800 --> 00:07:04.279
awful content. The sources specifically mentioned

00:07:04.279 --> 00:07:06.459
undressed kids. It was appalling. Absolutely.

00:07:06.620 --> 00:07:09.639
And the EU regulators were furious. xAI had to

00:07:09.639 --> 00:07:11.439
come out and admit they messed up, that their

00:07:11.439 --> 00:07:13.779
safety filters just completely failed. It's a

00:07:13.779 --> 00:07:16.139
perfect example of how capability just races

00:07:16.139 --> 00:07:18.199
ahead of safety. But what's so fascinating is

00:07:18.199 --> 00:07:21.430
the contrast. While this public controversy is

00:07:21.430 --> 00:07:24.129
raging, what's happening behind the scenes? High-level

00:07:24.129 --> 00:07:26.470
government adoption. The sources say the

00:07:26.470 --> 00:07:28.350
U.S. War Department is planning to integrate

00:07:28.350 --> 00:07:31.569
Grok into their GenAI.mil system. For about

00:07:31.569 --> 00:07:35.930
3 million people. Rolling out early 2026. This

00:07:35.930 --> 00:07:39.389
is a massive defense deal for xAI happening right

00:07:39.389 --> 00:07:41.310
in the middle of a public safety crisis for the

00:07:41.310 --> 00:07:43.410
exact same model. It's an incredible tension.

00:07:43.709 --> 00:07:46.290
It really is. So does this rapid military adoption

00:07:46.290 --> 00:07:49.430
suggest that regulation is just lagging way behind

00:07:49.430 --> 00:07:52.470
capability? Yes. The pace of defense adoption

00:07:52.470 --> 00:07:55.569
significantly outruns any regulatory backlash.

00:07:55.930 --> 00:07:58.870
Let's shift gears completely now. Away from cars

00:07:58.870 --> 00:08:01.550
and defense to something very human. Health.

00:08:01.730 --> 00:08:04.449
The numbers coming out from OpenAI are staggering.

00:08:04.939 --> 00:08:07.279
Truly staggering. Over 40 million people are

00:08:07.279 --> 00:08:10.220
using ChatGPT every single day just to navigate

00:08:10.220 --> 00:08:13.339
health care. 40 million daily. Which means by

00:08:13.339 --> 00:08:15.480
default, it's already the biggest health assistant

00:08:15.480 --> 00:08:17.980
on the planet. And they never even set out to

00:08:17.980 --> 00:08:20.019
build one. What are people asking it? Is it,

00:08:20.040 --> 00:08:22.980
you know? I have a cough. What is it? It's actually

00:08:22.980 --> 00:08:24.980
more administrative, which I think is the most

00:08:24.980 --> 00:08:27.279
interesting part. They're getting between 1.6

00:08:27.279 --> 00:08:31.339
and 1.9 million insurance questions every single

00:08:31.339 --> 00:08:34.059
week. Every week? Yeah. People are asking it

00:08:34.059 --> 00:08:36.360
to compare insurance plans, help them appeal

00:08:36.360 --> 00:08:39.240
denied claims, figure out confusing medical bills.

00:08:39.480 --> 00:08:41.700
So it's become a bureaucratic assistant for the

00:08:41.700 --> 00:08:44.029
health care system. Exactly. And the timing is

00:08:44.029 --> 00:08:46.870
really telling. 70% of these chats are happening

00:08:46.870 --> 00:08:49.509
outside of normal clinic hours. It's become the

00:08:49.509 --> 00:08:51.909
midnight second opinion for people. And the sources

00:08:51.909 --> 00:08:54.529
pointed out how critical this is for rural users.

00:08:55.169 --> 00:08:58.889
600,000 messages a week from areas with low

00:08:58.889 --> 00:09:01.950
doctor access. The AI is filling a massive gap

00:09:01.950 --> 00:09:04.070
in the system. And with the end of some Affordable

00:09:04.070 --> 00:09:07.710
Care Act subsidies, you might see even more uninsured

00:09:07.710 --> 00:09:09.710
people turning to this as their first option.

00:09:09.850 --> 00:09:12.769
It's driven by necessity. It's not all bad news,

00:09:12.850 --> 00:09:15.830
though. The data did show a huge positive. The

00:09:15.830 --> 00:09:18.750
AI is actually catching errors, errors in billing

00:09:18.750 --> 00:09:21.350
and claims that even trained humans sometimes

00:09:21.350 --> 00:09:24.049
miss. That's the incredible potential. But you

00:09:24.049 --> 00:09:26.289
have to pivot immediately to the risk. Of course.

00:09:26.610 --> 00:09:28.929
Lawsuits have already started over bad advice.

00:09:29.269 --> 00:09:32.190
So OpenAI is now managing a de facto medical

00:09:32.190 --> 00:09:34.889
platform, whether they like it or not. The users

00:09:34.889 --> 00:09:37.490
decided what it was for. Which means the warning

00:09:37.490 --> 00:09:40.100
has to be said loud and clear. Absolutely. You

00:09:40.100 --> 00:09:42.360
have to double-check any high-stakes health

00:09:42.360 --> 00:09:45.059
advice you get from an AI. The risk is just too

00:09:45.059 --> 00:09:47.740
high to blindly trust it. With that incredible

00:09:47.740 --> 00:09:51.360
volume, 40 million users a day, is it being used

00:09:51.360 --> 00:09:54.159
more for administrative or for diagnostic advice?

00:09:54.539 --> 00:09:57.120
The use skews heavily administrative, answering

00:09:57.120 --> 00:09:59.419
millions of insurance and billing questions weekly.

00:09:59.820 --> 00:10:02.360
Okay, so let's tie this all together. We have

00:10:02.360 --> 00:10:05.240
NVIDIA's new hardware. AI getting embedded in

00:10:05.240 --> 00:10:08.259
everything, these regulatory fights, and ChatGPT

00:10:08.259 --> 00:10:10.980
accidentally becoming a health navigator. What's

00:10:10.980 --> 00:10:13.299
the big idea? The biggest takeaway for me is

00:10:13.299 --> 00:10:16.179
that AI is fundamentally shifting. It's not just

00:10:16.179 --> 00:10:18.519
a digital tool anymore. It's becoming a core

00:10:18.519 --> 00:10:20.700
part of our physical and societal infrastructure.

00:10:21.100 --> 00:10:23.899
Right. NVIDIA gave us the chips, the Rubin platform,

00:10:24.059 --> 00:10:26.799
and the autonomous mind, Alpamayo, to make that

00:10:26.799 --> 00:10:29.639
physical AI scalable. And we're seeing that scale

00:10:29.639 --> 00:10:32.460
everywhere, from factory floors to that smart

00:10:32.460 --> 00:10:35.340
ring on your finger. Total integration is here.

00:10:35.419 --> 00:10:37.559
And at the same time, in the digital world, these

00:10:37.559 --> 00:10:41.159
general models like ChatGPT are just taking on

00:10:41.159 --> 00:10:43.879
these huge high-stakes roles that no one planned

00:10:43.879 --> 00:10:46.379
for. We're seeing atoms and bits just rapidly

00:10:46.379 --> 00:10:49.100
converging. And the impact of that, the job losses,

00:10:49.159 --> 00:10:51.820
the ethical problems, but also the life-saving

00:10:51.820 --> 00:10:54.299
potential, it's all accelerating way faster than

00:10:54.299 --> 00:10:56.620
our ability to even understand it, let alone

00:10:56.620 --> 00:10:58.820
regulate it. The world is changing under our

00:10:58.820 --> 00:11:01.159
feet faster than policy can keep up. We really

00:11:01.159 --> 00:11:02.940
appreciate you taking this deep dive with us.

00:11:03.340 --> 00:11:04.980
Understanding this gives you a real shortcut

00:11:04.980 --> 00:11:07.259
to seeing where technology is heading next. So

00:11:07.259 --> 00:11:09.700
a final thought for you to chew on. Given that

00:11:09.700 --> 00:11:13.139
5% of all ChatGPT chats globally are about

00:11:13.139 --> 00:11:16.120
health, a high-stakes field it took over by accident,

00:11:16.360 --> 00:11:20.080
what's the next domain? Is it law? Finance? Education?

00:11:20.519 --> 00:11:23.399
What will a general AI accidentally become the

00:11:23.399 --> 00:11:25.340
dominant player in next just because of sheer

00:11:25.340 --> 00:11:27.080
user need? Something to think about.
