WEBVTT

00:00:00.000 --> 00:00:03.459
Imagine a world where the software on your screen

00:00:03.459 --> 00:00:06.639
isn't just mimicking conversation, but it's actually

00:00:06.639 --> 00:00:09.480
thinking. I mean, really contemplating a problem.

00:00:10.419 --> 00:00:12.580
Elon Musk recently dropped a statistic that,

00:00:12.599 --> 00:00:15.339
frankly, it just stops you in your tracks. He

00:00:15.339 --> 00:00:17.859
believes there is a 10% probability, one in

00:00:17.859 --> 00:00:21.219
10, that his new model, Grok 5, will be as smart

00:00:21.219 --> 00:00:23.420
as a human being. And just to give you a sense

00:00:23.420 --> 00:00:26.989
of the gravity there, that single claim... was

00:00:26.989 --> 00:00:29.850
reportedly enough to trigger an emergency strategy

00:00:29.850 --> 00:00:32.789
meeting at OpenAI. Wow. They didn't just shrug

00:00:32.789 --> 00:00:35.350
it off as marketing hype. They hit the accelerator.

00:00:35.710 --> 00:00:37.570
Welcome back to the Deep Dive. Today, we are

00:00:37.570 --> 00:00:42.590
unpacking xAI, Grok 5, and this whole race to

00:00:42.590 --> 00:00:44.789
artificial general intelligence. It feels like

00:00:44.789 --> 00:00:46.770
every week there's a new model, a new claim, a

00:00:46.770 --> 00:00:49.109
new benchmark. But looking at the source material

00:00:49.109 --> 00:00:51.600
we have today, this one. It feels different.

00:00:51.859 --> 00:00:53.899
It does. It feels less like a software update

00:00:53.899 --> 00:00:57.320
and more like a philosophical shift in how we

00:00:57.320 --> 00:00:59.600
approach intelligence. It absolutely is. And

00:00:59.600 --> 00:01:02.619
today, we are going to map out exactly why. We

00:01:02.619 --> 00:01:04.319
have a lot of ground to cover. OK, so what's

00:01:04.319 --> 00:01:07.159
the plan? First, we're going to talk about Colossus,

00:01:07.540 --> 00:01:09.980
this massive hardware scaling happening right

00:01:09.980 --> 00:01:13.019
now in Memphis, Tennessee. Right. We'll break

00:01:13.019 --> 00:01:16.239
down the three specific ways Grok 5 is fundamentally

00:01:16.239 --> 00:01:19.079
different from Chat, GPT, and Gemini. We'll look

00:01:19.079 --> 00:01:22.560
at the architecture. the six trillion parameter

00:01:22.560 --> 00:01:26.000
structure aiming for AGI, and then we'll get

00:01:26.000 --> 00:01:29.760
practical: coding, markets, image generation. And

00:01:29.760 --> 00:01:31.459
of course, the part that always feels a bit like

00:01:31.459 --> 00:01:33.280
science fiction, the integration with Tesla and

00:01:33.280 --> 00:01:35.159
the Optimus robots. Putting the brain inside

00:01:35.159 --> 00:01:37.099
the body. That's the end game. So let's start

00:01:37.099 --> 00:01:39.459
with the timeline, because in the world of Musk

00:01:39.459 --> 00:01:42.200
companies, timelines can be... well, let's call

00:01:42.200 --> 00:01:44.780
them aspirational. For sure. But we have a confirmed

00:01:44.780 --> 00:01:48.260
window now. We do. We do. Musk confirmed in November

00:01:48.260 --> 00:01:52.019
2025 that the target release for Grok 5 is the

00:01:52.019 --> 00:01:54.700
first quarter of 2026. So we're looking at January,

00:01:54.739 --> 00:01:57.239
February, March. OK. And what's interesting is

00:01:57.239 --> 00:02:00.000
that despite the history of delays with rockets

00:02:00.000 --> 00:02:03.640
or cars, xAI seems incredibly confident about

00:02:03.640 --> 00:02:06.540
this one. So why the confidence? Usually when

00:02:06.540 --> 00:02:09.060
you get this close to a release of this magnitude,

00:02:09.460 --> 00:02:11.939
you start hearing about delays, but the sources

00:02:11.939 --> 00:02:14.120
suggest the hardware is already humming. Because

00:02:14.120 --> 00:02:16.599
it's not just code anymore, it's physics. This

00:02:16.599 --> 00:02:18.340
is where we have to talk about Colossus. Right,

00:02:18.379 --> 00:02:20.800
the supercomputer. It's the name of the supercomputer,

00:02:21.099 --> 00:02:24.159
xAI built in Memphis. And when I say supercomputer,

00:02:24.300 --> 00:02:26.460
I don't mean a server room with blinking lights.

00:02:26.539 --> 00:02:29.340
I mean a facility that consumes as much electricity

00:02:29.340 --> 00:02:32.000
as a small city. A small city, just to train

00:02:32.000 --> 00:02:35.389
an AI. It's staggering. To get Grok 5 to that

00:02:35.389 --> 00:02:38.789
level of intelligence, they are moving from using

00:02:38.789 --> 00:02:42.770
100,000 GPUs, which is already a mind-bending

00:02:42.770 --> 00:02:46.050
amount of compute, to trying to connect 1 million

00:02:46.050 --> 00:02:48.389
GPUs. OK, so let's unpack that scale. I think

00:02:48.389 --> 00:02:51.250
for most people, 1 million GPUs is just a number.

00:02:51.770 --> 00:02:54.069
What does that actually represent in terms of

00:02:54.069 --> 00:02:57.009
engineering? Think of a GPU like a tiny brain

00:02:57.009 --> 00:03:00.680
or a neuron. Your home computer usually has one.

00:03:01.219 --> 00:03:03.020
Colossus is trying to link a million of them

00:03:03.020 --> 00:03:06.740
together into a single cohesive system. It is arguably

00:03:06.740 --> 00:03:09.139
one of the biggest engineering projects in history.

00:03:09.879 --> 00:03:12.939
It's like trying to wire up a synthetic neocortex.

00:03:13.039 --> 00:03:14.900
And from what I understand, the challenge isn't

00:03:14.900 --> 00:03:16.939
just plugging them in. It's the latency, right?

00:03:17.099 --> 00:03:19.740
The speed of communication between them. That

00:03:19.740 --> 00:03:22.099
is the bottleneck. It's a networking nightmare.

00:03:22.680 --> 00:03:25.240
If you have a brain where the neurons take five

00:03:25.240 --> 00:03:27.219
seconds to talk to each other, you don't have

00:03:27.219 --> 00:03:29.819
intelligence. You have a disconnected mess. You

00:03:29.819 --> 00:03:31.460
have to get these chips to talk to each other

00:03:31.460 --> 00:03:34.180
instantly, almost at the speed of light. That's

00:03:34.180 --> 00:03:37.319
why we have the delay until 2026. It's not about

00:03:37.319 --> 00:03:39.539
the software learning. It's about physically

00:03:39.539 --> 00:03:42.039
building the brain big enough to hold the thoughts.

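To make the scale concrete, here is a back-of-envelope sketch of the power jump from 100,000 GPUs to 1 million. The roughly 1 kW per GPU (chip plus cooling and networking overhead) is an illustrative assumption, not a confirmed xAI figure.

```python
# Back-of-envelope sketch of the GPU scaling jump described above.
# The per-GPU wattage is an illustrative assumption, not a confirmed spec.
WATTS_PER_GPU = 1_000  # assumed ~1 kW per GPU including cooling/networking overhead

def cluster_power_mw(gpu_count: int, watts_per_gpu: int = WATTS_PER_GPU) -> float:
    """Total electrical draw in megawatts for a cluster of identical GPUs."""
    return gpu_count * watts_per_gpu / 1_000_000

print(cluster_power_mw(100_000))    # 100.0 MW
print(cluster_power_mw(1_000_000))  # 1000.0 MW -- gigawatt scale, a small city's load
```

Under these assumptions, the 10x jump in chips is also a 10x jump in power, which is why the "small city" comparison keeps coming up.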
00:03:42.280 --> 00:03:44.680
It reminds me of that paper, The Bitter Lesson

00:03:44.680 --> 00:03:47.500
by Rich Sutton. The idea that we keep trying

00:03:47.500 --> 00:03:49.780
to be clever with programming, but eventually

00:03:49.780 --> 00:03:52.180
brute force scaling, just adding more compute,

00:03:52.460 --> 00:03:56.180
always wins. That's the bet xAI is making. They

00:03:56.180 --> 00:03:58.599
are betting the farm on the idea that if you

00:03:58.599 --> 00:04:01.039
build a big enough computer, intelligence just

00:04:01.039 --> 00:04:04.340
emerges. It's not magic, it's scale. So why do

00:04:04.340 --> 00:04:07.280
we actually need a million chips linked together?

00:04:07.360 --> 00:04:09.460
To create an intelligence smarter than anything

00:04:09.460 --> 00:04:11.840
previously seen. That leads us perfectly into

00:04:11.840 --> 00:04:14.699
the software itself. Because once you build this

00:04:14.699 --> 00:04:18.180
massive brain, what makes it different from what

00:04:18.180 --> 00:04:21.300
we already have? If I'm sitting there with ChatGPT

00:04:21.300 --> 00:04:23.819
open in one tab and Grok 5 in another, what's

00:04:23.819 --> 00:04:26.240
the tangible difference? There are a few core

00:04:26.240 --> 00:04:28.899
differentiators, but the biggest one is time.

00:04:29.730 --> 00:04:33.509
Most AI models, like ChatGPT or Gemini, operate

00:04:33.509 --> 00:04:36.529
like a library book. A library book? How so?

00:04:36.610 --> 00:04:38.649
Well, think about it. If you pick up a book printed

00:04:38.649 --> 00:04:41.649
in 2023, it has zero idea what happened yesterday.

00:04:41.689 --> 00:04:44.170
Right. It's frozen. It's frozen in time. Those

00:04:44.170 --> 00:04:46.750
models rely on training data that has a cutoff

00:04:46.750 --> 00:04:48.910
date. Now, they are getting better at searching

00:04:48.910 --> 00:04:51.550
Google to patch that hole, but it's often slow,

00:04:51.720 --> 00:04:54.160
clunky, or a bit messy. I have noticed that you

00:04:54.160 --> 00:04:56.019
ask about a news event from this morning and

00:04:56.019 --> 00:04:58.420
it sometimes hallucinates or just says, I don't

00:04:58.420 --> 00:05:01.199
have that information. Precisely. Grok 5 is different

00:05:01.199 --> 00:05:04.060
because it lives inside the X platform. It has

00:05:04.060 --> 00:05:06.160
a direct pipeline to the global conversation.

00:05:06.300 --> 00:05:09.019
So it's reading posts in real time. Instantly.

00:05:09.339 --> 00:05:12.199
When news breaks, an earthquake, a stock market

00:05:12.199 --> 00:05:14.879
crash, a political scandal, people post it on

00:05:14.879 --> 00:05:18.889
X instantly. Grok reads that. It doesn't wait

00:05:18.889 --> 00:05:21.250
for a news article to be written and edited and

00:05:21.250 --> 00:05:23.810
published and then indexed by Google. It sees

00:05:23.810 --> 00:05:26.649
the raw data as it happens. That is a massive

00:05:26.649 --> 00:05:28.649
advantage for anyone who needs to be current.

00:05:28.790 --> 00:05:30.990
It's the difference between a historian and a

00:05:30.990 --> 00:05:33.689
news anchor. And it's drinking from the fire

00:05:33.689 --> 00:05:35.759
hose. It has to filter the noise, obviously,

00:05:36.279 --> 00:05:37.759
but the signal is there before it's anywhere

00:05:37.759 --> 00:05:39.339
else. But there's another difference that people

00:05:39.339 --> 00:05:41.500
talk about a lot, and that's the personality,

00:05:41.819 --> 00:05:44.519
the uncensored aspect. Truth-seeking mode, yeah.

00:05:44.600 --> 00:05:47.259
Yeah. And I have to admit, I sometimes feel this

00:05:47.259 --> 00:05:49.779
frustration. I feel like I wrestle with other

00:05:49.779 --> 00:05:51.839
AIs when I ask a slightly sensitive question.

00:05:51.920 --> 00:05:55.160
You know the response. As an AI language model,

00:05:55.399 --> 00:05:58.579
I cannot provide an opinion. It feels very corporate.

00:05:58.639 --> 00:06:02.370
It feels very safe, and that is by design. OpenAI

00:06:02.370 --> 00:06:04.810
and Google have a massive enterprise reputation

00:06:04.810 --> 00:06:06.610
to protect. They have shareholders. They have

00:06:06.610 --> 00:06:08.769
corporate clients. They optimize for safety.

00:06:09.230 --> 00:06:12.529
xAI is taking a different approach. Grok 5 is

00:06:12.529 --> 00:06:15.110
designed to be less strict. It treats the user

00:06:15.110 --> 00:06:18.370
like an adult. So fun mode isn't just a gimmick.

00:06:18.629 --> 00:06:20.829
No, it's a philosophy. Yeah. It allows the model

00:06:20.829 --> 00:06:23.949
to be sarcastic, to make jokes, but more importantly,

00:06:24.370 --> 00:06:27.029
to give answers that might be considered controversial

00:06:27.029 --> 00:06:30.089
or sensitive without immediately shutting down.

00:06:30.199 --> 00:06:33.399
It's trying to provide the raw truth. Or at least

00:06:33.399 --> 00:06:35.879
the available information rather than a sanitized

00:06:35.879 --> 00:06:38.720
version of it. So is being uncensored actually

00:06:38.720 --> 00:06:41.680
a utility though, or is it just a gimmick? It's

00:06:41.680 --> 00:06:44.500
essential for users wanting raw truth over corporate

00:06:44.500 --> 00:06:47.100
safety. I can see that. If you are a researcher

00:06:47.100 --> 00:06:50.079
or a journalist, you don't want a tool that filters

00:06:50.079 --> 00:06:52.800
reality. You want the data, even if it's ugly.

00:06:53.019 --> 00:06:54.560
Exactly. You want to know what the sentiment

00:06:54.560 --> 00:06:57.199
actually is, not what the AI thinks the sentiment

00:06:57.199 --> 00:06:59.100
should be. That makes sense. It's a tool for

00:06:59.100 --> 00:07:01.879
deep thinking, not just, you know, drafting HR

00:07:01.879 --> 00:07:04.379
emails. And speaking of deep thinking, we have

00:07:04.379 --> 00:07:08.220
to talk about the Holy Grail, AGI, artificial

00:07:08.220 --> 00:07:10.680
general intelligence. This is the heavy terminology.

00:07:10.819 --> 00:07:13.199
Let's define it quickly. So narrow AI is good

00:07:13.199 --> 00:07:15.920
at one thing, playing chess, recognizing faces.

00:07:16.439 --> 00:07:19.000
AGI is good at everything. It learns like a human.

00:07:19.100 --> 00:07:23.160
Right. And Musk's 10% probability claim is about

00:07:23.160 --> 00:07:26.500
this specific threshold. Can Grok 5 look at

00:07:26.500 --> 00:07:28.819
a problem it has never seen before, no training

00:07:28.819 --> 00:07:31.500
data on it, and just figure out the solution

00:07:31.500 --> 00:07:34.240
using logic? And how do they plan to get there?

00:07:34.439 --> 00:07:36.920
We talked about the chips, but what is the strategy?

00:07:37.240 --> 00:07:39.970
The strategy is brute force scaling. Imagine

00:07:39.970 --> 00:07:41.730
you're trying to build a library that contains

00:07:41.730 --> 00:07:43.930
the answer to every question. You just keep adding

00:07:43.930 --> 00:07:46.790
floors. You just keep adding floors. In AI, we

00:07:46.790 --> 00:07:48.889
measure this in parameters. And parameters are

00:07:48.889 --> 00:07:51.670
like the synapses in the brain, right? The connections.

00:07:51.910 --> 00:07:54.089
Yeah, exactly. To give you a sense of scale,

00:07:54.550 --> 00:07:57.990
GPT-4 is estimated to have around 1.7 trillion

00:07:57.990 --> 00:08:01.459
parameters. 1.7 trillion. Grok 5 is expected

00:08:01.459 --> 00:08:04.160
to have around 6 trillion parameters. Oh, that

00:08:04.160 --> 00:08:06.560
is a massive jump. That's not just a linear increase.

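One quick way to feel that jump: the memory just to hold the weights. The parameter counts below are the estimates quoted in the conversation, and the 2 bytes per parameter assumes 16-bit precision, a common but unconfirmed choice.

```python
# Rough memory footprint of model weights at the parameter counts quoted above.
# 2 bytes/param assumes fp16/bf16 precision -- an assumption, not a known spec.
BYTES_PER_PARAM = 2

def weights_terabytes(params: float) -> float:
    """Terabytes needed just to store the raw weights."""
    return params * BYTES_PER_PARAM / 1e12

print(weights_terabytes(1.7e12))  # ~3.4 TB for a GPT-4-scale model
print(weights_terabytes(6.0e12))  # ~12.0 TB for a 6-trillion-parameter model
```

That is storage for the weights alone, before training states or activations, which is part of why inference and training both demand clusters rather than single machines.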
00:08:06.759 --> 00:08:09.360
That's exponential complexity. It is. And the

00:08:09.360 --> 00:08:12.579
bet xAI is making is that by making

00:08:12.579 --> 00:08:14.660
the brain significantly bigger and feeding it

00:08:14.660 --> 00:08:17.259
more information, intelligence will naturally

00:08:17.259 --> 00:08:19.360
emerge. So they're not programming it to be smart.

00:08:19.759 --> 00:08:22.300
No, they're building a structure complex enough

00:08:22.300 --> 00:08:24.899
for smartness to happen. But does making the

00:08:24.899 --> 00:08:28.000
model bigger actually create understanding? XAI

00:08:28.000 --> 00:08:30.439
believes intelligence naturally emerges from

00:08:30.439 --> 00:08:32.679
massive scale. And they are focusing heavily

00:08:32.679 --> 00:08:36.759
on reasoning versus just autocomplete. What's

00:08:36.759 --> 00:08:40.779
the distinction there? Most current AIs are essentially

00:08:40.779 --> 00:08:43.059
guessing the next word in a sentence. They're

00:08:43.059 --> 00:08:45.679
hyper-advanced autocomplete. Right. Grok 5 is

00:08:45.679 --> 00:08:49.519
being trained to stop, think, and check its logic.

00:08:50.039 --> 00:08:52.259
If you ask it a math problem, it doesn't just

00:08:52.259 --> 00:08:54.019
guess the answer. It works through the steps.

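The "work through the steps, then check" idea can be sketched in a few lines of toy code. This illustrates the pattern researchers call chain of thought; it is a sketch of the concept, not how Grok 5 is actually implemented.

```python
# Toy illustration of "stop, think, and check": record explicit intermediate
# steps for a multi-step problem, then verify the result before returning it.
# This sketches the chain-of-thought idea, not Grok 5's actual internals.
def solve_with_steps(a: int, b: int, c: int):
    """Compute a * b + c, keeping a visible chain of intermediate steps."""
    steps = []
    product = a * b
    steps.append(f"{a} * {b} = {product}")
    total = product + c
    steps.append(f"{product} + {c} = {total}")
    # self-check: re-derive the answer independently before returning it
    assert total == a * b + c, "logic check failed"
    return total, steps

answer, chain = solve_with_steps(12, 7, 5)
print(answer)  # 89
print(chain)   # ['12 * 7 = 84', '84 + 5 = 89']
```

The contrast with "autocomplete" is the explicit intermediate state: the answer falls out of recorded steps that can be inspected and verified, rather than being guessed in one shot.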
00:08:54.559 --> 00:08:57.440
If you ask for code, it checks the logic of that

00:08:57.440 --> 00:08:59.159
code before giving it to you. It's the difference

00:08:59.159 --> 00:09:02.080
between memorizing a textbook and actually understanding

00:09:02.080 --> 00:09:05.139
the physics. Precisely. It's simulating an internal

00:09:05.139 --> 00:09:07.019
monologue. It's what researchers call a chain

00:09:07.019 --> 00:09:09.259
of thought. And if they pull it off at six trillion

00:09:09.259 --> 00:09:12.059
parameters, that internal monologue gets very

00:09:12.059 --> 00:09:15.019
sophisticated. Okay, so we have this massive,

00:09:15.259 --> 00:09:17.740
sarcastic, truth-seeking brain being built in

00:09:17.740 --> 00:09:20.480
Memphis, but let's bring it down to Earth. For

00:09:20.480 --> 00:09:22.259
the person listening who isn't trying to solve

00:09:22.259 --> 00:09:25.039
the theory of relativity, how do I actually use

00:09:25.039 --> 00:09:27.960
this? It comes down to a few very powerful use

00:09:27.960 --> 00:09:30.799
cases. The first is handling large documents.

00:09:31.360 --> 00:09:33.879
Grok 5 is going to have a massive context window,

00:09:34.299 --> 00:09:37.940
likely around 256,000 tokens. Which translates

00:09:37.940 --> 00:09:41.440
to what in the real world? A book. A very thick

00:09:41.440 --> 00:09:44.120
book. Yeah. Or a massive stack of legal contracts.

00:09:44.419 --> 00:09:47.299
You could upload a 300 page PDF and say, find

00:09:47.299 --> 00:09:49.580
every mention of financial risk in this contract

00:09:49.580 --> 00:09:51.779
and tell me if any of them contradict each other.

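A rough sanity check of that claim: will a 300-page contract fit in a 256,000-token window in one pass? The 4-characters-per-token ratio is a common rule of thumb for English text, not an exact figure, and the page length is an assumption.

```python
# Rough check of whether a long document fits in a large context window.
# 4 chars/token is a common heuristic, not exact; 256,000 tokens is the
# window size quoted above; ~3,000 chars/page is an assumed page length.
CONTEXT_WINDOW_TOKENS = 256_000
CHARS_PER_TOKEN = 4

def estimated_tokens(text_chars: int) -> int:
    """Very rough token estimate from character count."""
    return text_chars // CHARS_PER_TOKEN

doc_chars = 300 * 3_000  # a ~300-page contract
tokens = estimated_tokens(doc_chars)
print(tokens)                           # 225000
print(tokens <= CONTEXT_WINDOW_TOKENS)  # True -- fits in one pass, no chunking
```

Under these assumptions the whole contract fits with room to spare, which is exactly the "no chopping the document into pieces" advantage described above.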
00:09:51.980 --> 00:09:54.639
That's huge because right now with some models

00:09:54.639 --> 00:09:56.860
you have to chop the document up into pieces

00:09:56.860 --> 00:09:59.240
or you hit a limit. Right. Grok 5 can hold the

00:09:59.240 --> 00:10:02.120
whole thing in memory at once. It sees the big

00:10:02.120 --> 00:10:04.399
picture. It understands the narrative arc of

00:10:04.399 --> 00:10:07.009
the data, not just isolated sentences. And what

00:10:07.009 --> 00:10:08.529
about the market research side? You mentioned

00:10:08.529 --> 00:10:10.929
the X integration earlier. This is a game changer

00:10:10.929 --> 00:10:13.610
for businesses. Imagine asking, what is the sentiment

00:10:13.610 --> 00:10:16.730
on X right now regarding the Apple launch? Instead

00:10:16.730 --> 00:10:18.529
of getting a summary of articles from yesterday,

00:10:19.070 --> 00:10:21.830
Grok 5 scans the live posts from the last hour.

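The live-sentiment use case can be sketched as a simple tally over recent posts. Everything here is invented for illustration: the keyword lists, the sample posts, and the approach itself; a real pipeline would pull posts via the X API and use a proper sentiment model rather than keyword matching.

```python
# Toy sketch of tallying sentiment over a stream of recent posts.
# Keyword lists and sample posts are invented for illustration; a real
# system would use the X API and a trained sentiment model.
POSITIVE = {"love", "great", "amazing"}
NEGATIVE = {"complaining", "terrible", "worse"}

def tally_sentiment(posts: list[str]) -> dict:
    """Bucket each post as positive, negative, or neutral by keyword match."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for post in posts:
        words = set(post.lower().split())
        if words & POSITIVE:
            counts["positive"] += 1
        elif words & NEGATIVE:
            counts["negative"] += 1
        else:
            counts["neutral"] += 1
    return counts

sample = [
    "Everyone seems to love the new camera",
    "Battery life is terrible on this one",
    "Picked mine up at the store today",
]
print(tally_sentiment(sample))  # {'positive': 1, 'negative': 1, 'neutral': 1}
```

The point of the real system is the freshness of the input, not the scoring: the same tally run over posts from the last hour instead of yesterday's articles is what turns it into live market intelligence.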
00:10:22.250 --> 00:10:24.429
Wow. It can tell you people are complaining about

00:10:24.429 --> 00:10:27.389
the battery life or everyone loves the new camera.

00:10:27.669 --> 00:10:30.210
It's real time market intelligence. I can see

00:10:30.210 --> 00:10:32.730
that being addictive for investors. You're getting

00:10:32.730 --> 00:10:34.990
the signal before the market moves. Absolutely.

00:10:35.470 --> 00:10:37.590
And then there's coding. If you're a developer,

00:10:37.789 --> 00:10:41.230
Grok 5 acts like a senior engineer. Because of

00:10:41.230 --> 00:10:43.889
that large memory, it can look at your entire

00:10:43.889 --> 00:10:46.509
project structure, not just one script. So you

00:10:46.509 --> 00:10:48.409
could ask it to rewrite something for speed.

00:10:48.950 --> 00:10:51.009
Exactly. You could say, rewrite this Python script

00:10:51.009 --> 00:10:53.610
for speed. And it understands the context of

00:10:53.610 --> 00:10:56.679
the whole application. And finally, images. I

00:10:56.679 --> 00:11:00.019
saw a note about Grok Imagine. Yes. The struggle

00:11:00.019 --> 00:11:03.080
with AI images has always been text. You ask

00:11:03.080 --> 00:11:05.460
for a picture of a shop with a sign, and the

00:11:05.460 --> 00:11:08.120
sign usually has alien hieroglyphics on it. Yeah,

00:11:08.299 --> 00:11:10.559
just jumbled letters. Grok 5 understands the

00:11:10.559 --> 00:11:13.059
world better, so it can actually spell correctly

00:11:13.059 --> 00:11:15.970
inside the image. You can ask for a futuristic

00:11:15.970 --> 00:11:18.950
Tokyo Cafe with a neon sign that says Grok Cafe,

00:11:19.350 --> 00:11:22.169
and it will actually spell Grok Cafe. Which sounds

00:11:22.169 --> 00:11:24.570
simple, but is technically very difficult. It

00:11:24.570 --> 00:11:26.769
implies the model understands what letters are.

00:11:27.129 --> 00:11:29.629
Conceptually, yes. Not just visually. So which

00:11:29.629 --> 00:11:31.870
user actually benefits the most from this tool?

00:11:32.250 --> 00:11:34.330
Investors and researchers needing real-time

00:11:34.330 --> 00:11:37.129
data and logic. Before we move to the final segment,

00:11:37.169 --> 00:11:39.149
I want to take a brief moment to thank our partners

00:11:39.149 --> 00:11:42.480
who make this deep dive possible. Okay, we're

00:11:42.480 --> 00:11:45.019
back. We've covered the brain. Now I want to

00:11:45.019 --> 00:11:47.659
talk about the body. Because Elon Musk isn't

00:11:47.659 --> 00:11:49.620
just building a chatbot to live in your browser.

00:11:50.259 --> 00:11:53.159
He's building a mind that can move. This is where

00:11:53.159 --> 00:11:55.019
we connect to the bigger picture of the Musk

00:11:55.019 --> 00:11:58.580
ecosystem. The first physical container for Grok

00:11:58.580 --> 00:12:03.340
5 is the car. The Tesla! Yes. Right now, voice

00:12:03.340 --> 00:12:07.539
commands in cars are... let's be honest, pretty

00:12:07.539 --> 00:12:11.289
basic. Call mom. Navigate home. I didn't understand

00:12:11.289 --> 00:12:14.250
that. Playing music by Cher. Exactly. It's frustrating.

00:12:14.710 --> 00:12:17.289
With Grok 5 integrated, the car becomes a smart

00:12:17.289 --> 00:12:19.250
companion. So what does that look like? You could

00:12:19.250 --> 00:12:20.769
be driving through a town you've never visited

00:12:20.769 --> 00:12:22.490
and ask, Grok, tell me a story about the history

00:12:22.490 --> 00:12:24.590
of this town. Or find me a restaurant with a

00:12:24.590 --> 00:12:26.870
4.5-star rating that is open right now and actually

00:12:26.870 --> 00:12:29.929
has parking. It turns the drive into a conversation.

00:12:30.429 --> 00:12:32.509
But it's more than just a concierge, isn't it?

00:12:32.730 --> 00:12:35.409
It's about the car understanding the world. It

00:12:35.409 --> 00:12:38.929
is. But the even wilder application is Optimus,

00:12:39.289 --> 00:12:41.610
the humanoid robot. This is the part that really

00:12:41.610 --> 00:12:44.049
captures the imagination. We've seen the videos

00:12:44.049 --> 00:12:46.190
of the robots folding shirts or walking around,

00:12:46.610 --> 00:12:49.970
but they still look a bit... robotic. It's stiff,

00:12:50.450 --> 00:12:53.330
because currently they are dumb in the cognitive

00:12:53.330 --> 00:12:55.750
sense. They follow programming. They're executing

00:12:55.750 --> 00:12:58.929
a script. Got it. But if you put Grok 5's brain

00:12:58.929 --> 00:13:02.049
into Optimus, the robot can understand abstract

00:13:02.049 --> 00:13:05.039
concepts. You could say, go to the kitchen and

00:13:05.039 --> 00:13:07.460
make a sandwich. And the robot has to understand

00:13:07.460 --> 00:13:09.519
what a kitchen is, what a sandwich is, where

00:13:09.519 --> 00:13:12.299
the bread is kept. It has to understand the concept

00:13:12.299 --> 00:13:15.120
of a knife. Not just the coordinates of a tool,

00:13:15.539 --> 00:13:18.100
but what a knife is for. It figures that out

00:13:18.100 --> 00:13:21.100
by thinking, using that massive training data.

00:13:21.279 --> 00:13:23.659
It provides the common sense that robots have

00:13:23.659 --> 00:13:26.120
always lacked. Exactly. And where does it get

00:13:26.120 --> 00:13:27.899
that common sense? It gets it from the cars,

00:13:27.980 --> 00:13:30.220
right? From the fleet. That's the closed loop.

00:13:30.330 --> 00:13:33.169
This is the secret weapon xAI has. They have

00:13:33.169 --> 00:13:35.370
millions of Teslas driving around with cameras,

00:13:35.710 --> 00:13:38.190
recording video of physics, of human behavior,

00:13:38.409 --> 00:13:41.370
of how the world moves. That data trains the

00:13:41.370 --> 00:13:44.070
model. The car teaches the robot how to see.

00:13:44.350 --> 00:13:46.769
It's incredible potential. But I have to play

00:13:46.769 --> 00:13:49.570
the skeptic for a second. We talked about Colossus

00:13:49.570 --> 00:13:52.210
consuming the power of a small city. Yes. If

00:13:52.210 --> 00:13:54.929
we have millions of these robots and cars all

00:13:54.929 --> 00:13:58.440
accessing this super intelligence. The energy

00:13:58.440 --> 00:14:01.879
demand must be astronomical. It is a massive challenge.

00:14:02.179 --> 00:14:04.340
This is the dark side of the moon here. The more

00:14:04.340 --> 00:14:06.340
intelligent the model, the more energy it consumes.

00:14:07.059 --> 00:14:09.860
Colossus in Memphis is already straining the

00:14:09.860 --> 00:14:12.080
grid. So what's the solution? Musk is betting

00:14:12.080 --> 00:14:15.799
on solar and batteries, Tesla energy, to solve

00:14:15.799 --> 00:14:19.149
this. But it's a race. Can we generate green

00:14:19.149 --> 00:14:22.149
energy faster than these AI brains consume it?

00:14:22.289 --> 00:14:25.509
So is this just software or something physically

00:14:25.509 --> 00:14:28.490
tangible? It is the future mind for cars and

00:14:28.490 --> 00:14:30.909
humanoid robots. It really reframes how you look

00:14:30.909 --> 00:14:32.710
at a Tesla driving down the street. It's just

00:14:32.710 --> 00:14:35.049
waiting for its brain upgrade. It is. It's hardware

00:14:35.049 --> 00:14:37.350
waiting for the soul. So let's bring this all

00:14:37.350 --> 00:14:38.929
home. We've covered a huge amount of ground.

00:14:39.129 --> 00:14:41.330
We have these three giants in the arena now.

00:14:41.350 --> 00:14:43.419
We do. And it's important to see where they sit.

00:14:43.639 --> 00:14:46.019
You have OpenAI with GPT-5 coming, which is

00:14:46.019 --> 00:14:48.659
the enterprise standard. It's safe. It's corporate.

00:14:48.860 --> 00:14:51.919
It's reliable. The suit and tie. Exactly. Then

00:14:51.919 --> 00:14:54.059
you have Google with Gemini. That's the ecosystem

00:14:54.059 --> 00:14:56.720
play. It's integrated into your Docs, your Gmail,

00:14:56.799 --> 00:14:59.039
your Android phone. It's all about convenience.

00:14:59.200 --> 00:15:03.580
And then xAI with Grok. The truth seeker. It's

00:15:03.580 --> 00:15:06.980
about raw power, real time access to the world

00:15:06.980 --> 00:15:10.179
via X, and deep reasoning without the safety

00:15:10.179 --> 00:15:12.700
wheels. Regardless of which one wins, or if they

00:15:12.700 --> 00:15:14.480
all win in different ways, what does this mean

00:15:14.480 --> 00:15:17.259
for us? It forces the whole industry to evolve.

00:15:17.480 --> 00:15:20.480
Even if Grok 5 doesn't hit that 10% chance of

00:15:20.480 --> 00:15:23.919
being human level smart, just the attempt pushes

00:15:23.919 --> 00:15:26.539
OpenAI and Google to be better. It accelerates

00:15:26.539 --> 00:15:28.779
everything. So here's my challenge to you, the

00:15:28.779 --> 00:15:31.580
listener. We are expecting this release in Q1

00:15:31.580 --> 00:15:34.879
2026. When Grok 5 drops, don't just read the

00:15:34.879 --> 00:15:37.779
headlines. Test it. Absolutely. Compare it. Ask

00:15:37.779 --> 00:15:40.340
ChatGPT a question, then ask Grok the same question.

00:15:40.759 --> 00:15:42.600
See which reasoning style fits your brain better.

00:15:42.960 --> 00:15:45.200
Are you looking for the safe summary or the raw

00:15:45.200 --> 00:15:47.399
data? It's going to be a fascinating year. The

00:15:47.399 --> 00:15:49.360
car you drive and the robot you might one day

00:15:49.360 --> 00:15:51.559
own are just waiting for this mind to wake up.

00:15:51.620 --> 00:15:53.539
And when it does, we'll be here to talk about

00:15:53.539 --> 00:15:55.379
it. I can't wait. Thanks for diving in with us.

00:15:55.460 --> 00:15:56.279
We'll catch you in the next one.
