WEBVTT

00:00:00.000 --> 00:00:04.599
Imagine a single machine, just one. But this

00:00:04.599 --> 00:00:07.000
machine draws more electrical power than the

00:00:07.000 --> 00:00:09.660
entire city of San Francisco. Wow. And that's

00:00:09.660 --> 00:00:11.919
not, you know, science fiction. As of right now,

00:00:12.060 --> 00:00:15.359
January 19th, 2026, it is a physical reality

00:00:15.359 --> 00:00:18.039
humming away on the grid. That is the kind of

00:00:18.039 --> 00:00:19.679
stat that just stops you in your tracks. Yeah.

00:00:19.719 --> 00:00:21.359
I mean, we are not talking about a server farm

00:00:21.359 --> 00:00:23.600
anymore. We're talking about energy use on the

00:00:23.600 --> 00:00:26.140
scale of a, well, a small country. Welcome to

00:00:26.140 --> 00:00:28.859
the Deep Dive. Today we're navigating a world

00:00:28.859 --> 00:00:31.920
of some pretty wild extremes. We've reached a

00:00:31.920 --> 00:00:34.539
point where the infrastructure for intelligence

00:00:34.539 --> 00:00:37.420
is becoming truly titanic. A literal city-sized

00:00:37.420 --> 00:00:40.079
brain. Exactly. And yet at the exact same moment

00:00:40.079 --> 00:00:42.179
people are, what, fleeing to the countryside to

00:00:42.179 --> 00:00:44.460
knit sweaters. It's just we have the biggest

00:00:44.460 --> 00:00:46.979
computers in history and also a massive spike

00:00:46.979 --> 00:00:50.159
in yarn sales. It's a fascinating dichotomy.

00:00:50.280 --> 00:00:52.820
It is. And we're going to unpack that tension.

00:00:52.960 --> 00:00:55.140
We have a stack of reports today covering the

00:00:55.140 --> 00:00:58.920
activation of xAI's Colossus 2, this strange

00:00:58.920 --> 00:01:02.079
cultural thing called the analog backlash, and

00:01:02.079 --> 00:01:04.760
some really interesting research from DeepMind

00:01:04.760 --> 00:01:07.599
on how to keep these massive systems safe without

00:01:07.599 --> 00:01:10.099
going broke. Plus, on a much more practical level,

00:01:10.219 --> 00:01:12.739
we really need to talk about why copying and

00:01:12.739 --> 00:01:15.819
pasting from an AI still completely ruins your

00:01:15.819 --> 00:01:17.900
formatting. Yes. Let's start with the heavy metal,

00:01:17.980 --> 00:01:19.819
the hardware. We're looking at these reports

00:01:19.819 --> 00:01:23.260
on xAI, Elon Musk's AI company, and this new

00:01:23.260 --> 00:01:26.420
cluster, Colossus 2. Cluster just feels like

00:01:26.420 --> 00:01:28.739
such a small word for what this actually is.

00:01:28.859 --> 00:01:31.859
I mean, to set the scene, xAI has just activated

00:01:31.859 --> 00:01:35.000
a one-gigawatt supercomputer. One gigawatt. Yeah.

00:01:35.099 --> 00:01:37.920
And it's purpose built for one thing, training

00:01:37.920 --> 00:01:40.400
the Grok series of models. And just for context,

00:01:40.540 --> 00:01:42.879
that's roughly the output of a large nuclear

00:01:42.879 --> 00:01:45.480
reactor, right? Exactly. They have built a machine

00:01:45.480 --> 00:01:47.920
that effectively requires a nuclear reactor's

00:01:47.920 --> 00:01:50.840
worth of juice just to turn on. And the timeline

00:01:50.840 --> 00:01:53.900
is just while Colossus 1 took, what, about 122

00:01:53.900 --> 00:01:56.400
days to bring online, Colossus 2 followed immediately

00:01:56.400 --> 00:01:58.859
after. Just bang, bang. And looking at the specs

00:01:58.859 --> 00:02:01.000
here, it's not even stopping at one gigawatt.

00:02:01.079 --> 00:02:03.120
Oh, no, that's just the starting line. The reports

00:02:03.120 --> 00:02:05.159
are saying it's already set to expand to one

00:02:05.159 --> 00:02:08.080
and a half gigawatts by April and then hit two

00:02:08.080 --> 00:02:10.650
gigawatts soon after that. The logistics of that

00:02:10.650 --> 00:02:12.650
are, well, they're mind-bending. It has to sit

00:02:12.650 --> 00:02:15.050
on dedicated grid lines. It has its own on-site

00:02:15.050 --> 00:02:18.909
substations. And the heat, just imagine the thermodynamics

00:02:18.909 --> 00:02:22.990
of, what, 555,000 GPUs running at full tilt.

00:02:23.169 --> 00:02:25.250
Yeah, the cooling systems alone must be an engineering

00:02:25.250 --> 00:02:28.849
marvel. It's liquid cooling on just a ridiculous

00:02:28.849 --> 00:02:32.710
scale. And you mentioned the GPUs, 555,000 of

00:02:32.710 --> 00:02:34.909
them. The procurement number is just eye-watering.

00:02:35.180 --> 00:02:37.699
The cost for that hardware alone is estimated

00:02:37.699 --> 00:02:41.580
at roughly $18 billion. $18 billion. That suddenly

00:02:41.580 --> 00:02:44.240
explains why xAI blew right past their original

00:02:44.240 --> 00:02:46.800
$15 billion funding goal. Yeah, if you want to

00:02:46.800 --> 00:02:48.599
play at this table, the buy-in is astronomical.
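
As a quick back-of-envelope check on those figures (the division is ours, not from the reports; only the $18 billion and the 555,000-GPU count come from the sources above):

```python
# Rough per-GPU cost implied by the reported procurement figures.
hardware_cost = 18e9   # ~$18 billion for the GPU order (reported)
gpu_count = 555_000    # reported accelerator count for Colossus 2

per_gpu = hardware_cost / gpu_count
print(f"~${per_gpu:,.0f} per GPU")  # works out to roughly $32,000 each
```

That per-unit figure is in the ballpark of what a single high-end datacenter accelerator is widely understood to cost, which makes the headline number hang together.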

00:02:48.979 --> 00:02:51.520
They ended up raising a $20 billion Series E,

00:02:51.719 --> 00:02:54.280
backed by, you know, all the heavy hitters, Fidelity,

00:02:54.400 --> 00:02:57.879
Nvidia, Cisco. And the stated mission here, the

00:02:57.879 --> 00:03:00.900
philosophy behind spending all this money, is

00:03:00.900 --> 00:03:03.580
to understand the universe. They claim that to

00:03:03.580 --> 00:03:05.460
do that, you need the world's biggest brain.

00:03:05.719 --> 00:03:07.960
Right. And practically speaking, this machine

00:03:07.960 --> 00:03:11.120
is what's going to power Grok 4, Grok Voice,

00:03:11.300 --> 00:03:14.219
and the upcoming Grok 5. Technologically, though,

00:03:14.300 --> 00:03:16.319
what stands out to me is that they've kind of

00:03:16.319 --> 00:03:19.300
hit a ceiling, but not on hardware. It's a software

00:03:19.300 --> 00:03:21.960
ceiling. Right. The reports say that the global

00:03:21.960 --> 00:03:25.199
batch sizes for training are now limited by optimization,

00:03:25.500 --> 00:03:28.139
not by compute. They effectively have enough

00:03:28.139 --> 00:03:30.599
hardware to do continuous trillion parameter

00:03:30.599 --> 00:03:33.699
training. It's just there. It's brute force on

00:03:33.699 --> 00:03:36.949
a celestial scale. Which, I have to ask: does

00:03:36.949 --> 00:03:38.849
building a brain the size of a city actually

00:03:38.849 --> 00:03:41.370
guarantee understanding? Or are we just building

00:03:41.370 --> 00:03:43.490
the most expensive pattern matching machine in

00:03:43.490 --> 00:03:45.870
human history? That is the multibillion dollar

00:03:45.870 --> 00:03:47.389
question, isn't it? But I guess when you have

00:03:47.389 --> 00:03:49.810
half a million GPUs, the strategy is pretty clear

00:03:49.810 --> 00:03:51.810
that quantity might just have a quality all its

00:03:51.810 --> 00:03:54.430
own. The hope is that understanding just kind

00:03:54.430 --> 00:03:56.349
of emerges from the math if you make the math

00:03:56.349 --> 00:03:59.889
big enough. OK, so while xAI is building the

00:03:59.889 --> 00:04:02.620
digital equivalent of a Dyson sphere, the rest

00:04:02.620 --> 00:04:04.939
of us, the human world, we seem to be reacting

00:04:04.939 --> 00:04:07.199
in a very, very different way. Yeah. Welcome

00:04:07.199 --> 00:04:10.740
to what some are calling the 2026 paradox. On

00:04:10.740 --> 00:04:13.819
one side, the digital economy is absolutely on

00:04:13.819 --> 00:04:17.459
fire. OpenAI has revealed they hit $20 billion

00:04:17.459 --> 00:04:21.060
in annualized revenue for 2025. That's triple

00:04:21.060 --> 00:04:23.439
the year before. The numbers are vertical. Totally.

00:04:23.519 --> 00:04:26.740
And Gemini API traffic doubled in just five months.

00:04:26.839 --> 00:04:28.680
But then you look at the cultural data. And you

00:04:28.680 --> 00:04:31.540
get yarn. You get yarn. I'm not kidding. Sales

00:04:31.540 --> 00:04:34.939
of knitting kits at Michaels have jumped 1,200%.

00:04:35.599 --> 00:04:38.199
Wow. Searches for "analog lifestyle" have

00:04:38.199 --> 00:04:40.939
just exploded. It's this weird vibe where for

00:04:40.939 --> 00:04:43.540
a lot of people, 2026 feels a lot like 1996.

00:04:44.079 --> 00:04:46.540
It feels like a retreat. Like while the algorithms

00:04:46.540 --> 00:04:48.399
are eating the Internet, people are just craving

00:04:48.399 --> 00:04:50.660
something tactile, something they can, you know,

00:04:50.680 --> 00:04:52.839
control from start to finish without a prompt

00:04:52.839 --> 00:04:55.079
window. Well, there's some real economic anxiety

00:04:55.079 --> 00:04:57.540
under that, too. Economists are warning that

00:04:57.540 --> 00:04:59.839
2026 is the year this all hits the political

00:04:59.839 --> 00:05:03.139
radar. AI isn't necessarily taking everyone's

00:05:03.139 --> 00:05:06.540
job overnight. What it is doing is deleting the

00:05:06.540 --> 00:05:08.819
on-ramp. The entry-level positions. Exactly.

00:05:08.819 --> 00:05:11.060
All the junior work is being done by the models

00:05:11.060 --> 00:05:13.319
now. So it's getting harder and harder for beginners

00:05:13.319 --> 00:05:15.379
to break into these industries. You have this

00:05:15.379 --> 00:05:18.319
squeeze where the digital elite are generating

00:05:18.319 --> 00:05:20.899
massive revenue, and a lot of other people are

00:05:20.899 --> 00:05:23.000
feeling locked out. And they knit. They knit.

00:05:23.199 --> 00:05:25.939
It's a way to feel competent, I think. When you

00:05:25.939 --> 00:05:28.040
knit a scarf, you know exactly how it was made.

00:05:28.139 --> 00:05:31.680
There are no hallucinations in wool. No hallucinations

00:05:31.680 --> 00:05:34.560
in wool. I like that. But for those of us who

00:05:34.560 --> 00:05:37.399
do have to work in this digital storm, we can't

00:05:37.399 --> 00:05:39.540
just go knit all day. We have to use these tools.

00:05:39.980 --> 00:05:43.740
And honestly, it's still kind of a mess. It really

00:05:43.740 --> 00:05:45.160
is. I was reading through the source material

00:05:45.160 --> 00:05:48.060
on perfectly formatted reports, and it just...

00:05:48.670 --> 00:05:50.910
it hit home. We talk about superintelligence,

00:05:50.910 --> 00:05:53.149
but most of us are just struggling to get a document

00:05:53.149 --> 00:05:55.810
to look right. You spend hours on a prompt, you

00:05:55.810 --> 00:05:58.269
get great text, you paste it into Google Docs,

00:05:58.269 --> 00:06:02.129
and it just explodes. The formatting nightmare.

00:06:02.449 --> 00:06:05.529
Giant bold text where a header should be. Bullet

00:06:05.529 --> 00:06:08.009
points that turn into random dashes. It just

00:06:08.009 --> 00:06:10.550
looks so amateur. I still wrestle with prompt

00:06:10.550 --> 00:06:12.829
drift myself. You spend like three hours trying

00:06:12.829 --> 00:06:14.490
to get the headers just right, and the model

00:06:14.490 --> 00:06:17.310
just gives you bold text instead of proper H1

00:06:17.310 --> 00:06:20.370
tags. It's infuriating. The friction is so real.

00:06:20.589 --> 00:06:23.269
The whole promise of AI is speed. But if you

00:06:23.269 --> 00:06:25.810
spend 20 minutes reformatting font sizes and

00:06:25.810 --> 00:06:27.850
bullet points, you haven't actually saved any

00:06:27.850 --> 00:06:30.399
time. So what's the fix? The sources mentioned

00:06:30.399 --> 00:06:33.500
some new AI-powered tools. Well, Google has rolled

00:06:33.500 --> 00:06:35.899
out this personal intelligence thing in Gemini.

00:06:36.279 --> 00:06:39.439
It's supposed to act like a, quote, weirdly

00:06:39.439 --> 00:06:41.959
well-informed best friend that remembers your context.

00:06:42.240 --> 00:06:44.019
But for the formatting problem specifically,

00:06:44.259 --> 00:06:46.480
the solution seems to be learning how to force

00:06:46.480 --> 00:06:49.000
the output. What do you mean by force it? You

00:06:49.000 --> 00:06:51.759
have to stop asking nicely. You need to be incredibly

00:06:51.759 --> 00:06:54.319
specific about the underlying code. You have

00:06:54.319 --> 00:06:56.639
to tell it: output this in Markdown that maps

00:06:56.639 --> 00:07:00.899
to the specific H1 and H2 tags. Or you use these

00:07:00.899 --> 00:07:03.519
new bridge tools like Noodle Seed or Flow Genie

00:07:03.519 --> 00:07:06.699
that structure the data before it even hits the

00:07:06.699 --> 00:07:09.279
document. So we have to learn to speak the machine's

00:07:09.279 --> 00:07:11.399
language just to get it to speak ours properly.

00:07:11.579 --> 00:07:14.040
Pretty much. Until the AI can intuitively understand

00:07:14.040 --> 00:07:16.839
your company's brand style guide, you have to

00:07:16.839 --> 00:07:20.110
manually override its default chaos. It really

00:07:20.110 --> 00:07:22.029
brings up a question of utility then. If we have

00:07:22.029 --> 00:07:24.589
to fight the AI to format a simple page, are

00:07:24.589 --> 00:07:27.350
we really more productive yet? Only if you master

00:07:27.350 --> 00:07:29.670
those force formatting prompts. Otherwise, yeah,

00:07:29.730 --> 00:07:31.790
it's often just a very messy copy-paste job.
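
To make that "force the output" idea concrete, here's a minimal sketch (ours, not from any of the tools mentioned in the episode): a prompt fragment that pins the model to real Markdown heading syntax, plus a small check that flags the classic failure mode of bold text standing in for a header.

```python
import re

# Hypothetical prompt fragment: spell out the exact Markdown you want
# instead of asking nicely for "nice formatting".
FORCE_FORMAT_PROMPT = (
    "Output plain Markdown only. Use '# ' for the title (H1), "
    "'## ' for section headers (H2), and '- ' for bullets. "
    "Never simulate a header with bold text."
)

def fake_headers(markdown: str) -> list[tuple[int, str]]:
    """Return (line_number, text) for lines that are nothing but
    **bold text** -- the usual sign of a header the model faked."""
    flagged = []
    for n, line in enumerate(markdown.splitlines(), start=1):
        if re.fullmatch(r"\*\*[^*]+\*\*", line.strip()):
            flagged.append((n, line.strip()))
    return flagged
```

Running `fake_headers` on a draft tells you whether the model obeyed the heading contract before you paste anything into Google Docs.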

00:07:32.029 --> 00:07:33.829
Now, speaking of looking under the hood, let's

00:07:33.829 --> 00:07:35.850
pivot to something arguably more critical than

00:07:35.850 --> 00:07:39.709
font sizes, safety. With models as big as Colossus

00:07:39.709 --> 00:07:43.290
2, the question of control is, it's massive.

00:07:43.569 --> 00:07:45.529
Yeah, and DeepMind just dropped a paper on this

00:07:45.529 --> 00:07:47.610
that is... Honestly, a game changer. This is

00:07:47.610 --> 00:07:50.649
the activation probes paper? Yes. And if you

00:07:50.649 --> 00:07:52.930
care about AI safety, but you also care about,

00:07:53.009 --> 00:07:56.810
you know, not lighting money on fire, this is

00:07:56.810 --> 00:07:58.990
huge. The problem they're really solving here

00:07:58.990 --> 00:08:02.569
is cost, right? Monitoring these huge models

00:08:02.569 --> 00:08:05.550
for safety usually costs a fortune. An insane

00:08:05.550 --> 00:08:08.350
amount. Traditionally, to check if an AI is being

00:08:08.350 --> 00:08:10.889
toxic or dangerous, you have to run its output

00:08:10.889 --> 00:08:13.959
through another AI to check it. It's like hiring

00:08:13.959 --> 00:08:17.079
a supervisor to watch every single employee

00:08:17.079 --> 00:08:19.699
24/7. And that supervisor demands a big salary.

00:08:19.860 --> 00:08:22.720
A huge one in compute power. So DeepMind's solution

00:08:22.720 --> 00:08:25.439
is to stop watching the output and start watching

00:08:25.439 --> 00:08:27.839
the brain itself. So as it's thinking. Precisely.

00:08:28.089 --> 00:08:30.449
As the AI is processing a request, it's doing

00:08:30.449 --> 00:08:32.870
all this internal matrix math. So DeepMind developed

00:08:32.870 --> 00:08:35.529
what they call activation probes. And you can

00:08:35.529 --> 00:08:38.809
think of them like tiny sensors or like a tiny

00:08:38.809 --> 00:08:41.090
brain scan. They scan the model's internal activity

00:08:41.090 --> 00:08:42.750
while it's thinking. So it's like reading the

00:08:42.750 --> 00:08:44.669
mind before the mouth opens. That's the perfect

00:08:44.669 --> 00:08:47.090
analogy. It's eavesdropping on the inner monologue.

00:08:47.350 --> 00:08:49.250
And because it's just reading data that is already

00:08:49.250 --> 00:08:50.990
being computed anyway, it's incredibly cheap.

00:08:51.070 --> 00:08:54.269
How cheap are we talking? We're talking 10,000

00:08:54.269 --> 00:08:56.450
times cheaper than running a full safety monitor

00:08:56.450 --> 00:09:00.309
on top. 10,000 times. That effectively makes

00:09:00.309 --> 00:09:03.549
safety free or close to it. Basically. And it

00:09:03.549 --> 00:09:06.149
works using a few clever layers. They use these

00:09:06.149 --> 00:09:08.450
multi-max probes to pick out signals in huge

00:09:08.450 --> 00:09:11.149
prompts, up to a million tokens. And then something

00:09:11.149 --> 00:09:13.629
called rolling mean attention to filter out all

00:09:13.629 --> 00:09:15.690
the noise and spot, you know, the toxic thoughts.

00:09:16.110 --> 00:09:18.309
And what if a probe isn't sure about something?

00:09:18.769 --> 00:09:21.799
Then... And only then does it call in the supervisor.

00:09:22.080 --> 00:09:24.720
It's a cascade system. If the probe sees a red

00:09:24.720 --> 00:09:27.440
flag, it sends just that little snippet to a

00:09:27.440 --> 00:09:29.899
lightweight model like Gemini Flash to get a

00:09:29.899 --> 00:09:31.820
second opinion. So you cut your overall safety

00:09:31.820 --> 00:09:34.919
costs by 50x or more without really losing reliability.
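
The cascade described here can be sketched in a few lines (a toy illustration with made-up thresholds, not DeepMind's actual implementation): the cheap probe score settles the clear cases, and only the ambiguous middle band pays for the heavier model's second opinion.

```python
def cascade_monitor(probe_score: float, second_opinion,
                    low: float = 0.2, high: float = 0.8) -> str:
    """Decide from the cheap activation-probe score when it's confident;
    escalate only ambiguous cases to an expensive checker."""
    if probe_score >= high:
        return "block"   # probe confident: unsafe
    if probe_score <= low:
        return "allow"   # probe confident: safe
    # Ambiguous band: send just this case to the heavyweight model.
    return "block" if second_opinion() else "allow"
```

Because most traffic falls outside the ambiguous band, the expensive checker runs on only a small fraction of requests, which is where the claimed cost savings come from.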

00:09:35.629 --> 00:09:37.409
That's a brilliant piece of engineering. It's

00:09:37.409 --> 00:09:39.409
moving safety from an external policing layer

00:09:39.409 --> 00:09:42.029
to an internal awareness layer. We're essentially

00:09:42.029 --> 00:09:44.409
installing a conscience into the AI that costs

00:09:44.409 --> 00:09:47.269
almost nothing to run. A lightweight moral compass

00:09:47.269 --> 00:09:49.710
embedded right there in the matrix math. I want

00:09:49.710 --> 00:09:52.250
to take just a brief pause here. We've covered

00:09:52.250 --> 00:09:54.789
city -sized computers, the return of knitting,

00:09:54.990 --> 00:09:57.129
the struggle of formatting a simple document,

00:09:57.309 --> 00:10:00.289
and now the internal conscience of the machine.

00:10:00.690 --> 00:10:06.309
We'll be right back. And we are back. We've definitely

00:10:06.309 --> 00:10:08.710
covered a lot of ground today. Let's try to synthesize

00:10:08.710 --> 00:10:10.210
this for a moment, connect some of these dots.

00:10:10.509 --> 00:10:13.009
Okay, the big picture. We've got this massive

00:10:13.009 --> 00:10:15.230
convergence happening. On one side, you have

00:10:15.230 --> 00:10:18.250
a company like xAI. They are building physical

00:10:18.250 --> 00:10:20.470
infrastructure at a scale that challenges the

00:10:20.470 --> 00:10:24.909
power grids of major American cities. 1.5 gigawatts.

00:10:25.979 --> 00:10:29.679
$18 billion. Pure brute force. And then on the

00:10:29.679 --> 00:10:31.299
complete other end of the spectrum, you have

00:10:31.299 --> 00:10:33.360
DeepMind. Right. They're looking at the microscopic

00:10:33.360 --> 00:10:35.799
level. They're finding these tiny efficiencies,

00:10:36.080 --> 00:10:39.220
inserting tiny, cheap probes into the math to

00:10:39.220 --> 00:10:41.320
make these things safe without bankrupting the

00:10:41.320 --> 00:10:42.899
companies that run them. So you have this macro

00:10:42.899 --> 00:10:44.940
expansion and this micro optimization happening

00:10:44.940 --> 00:10:48.019
at the exact same time. Exactly. And who is sandwiched

00:10:48.019 --> 00:10:50.679
in the middle of all this silicon and code? Us.

00:10:51.289 --> 00:10:54.149
The humans. Yeah, the humans who are generating

00:10:54.149 --> 00:10:57.549
billions in revenue for OpenAI, but also the

00:10:57.549 --> 00:10:59.929
humans who are feeling totally overwhelmed by

00:10:59.929 --> 00:11:03.149
it all. The humans buying all that yarn. It's

00:11:03.149 --> 00:11:05.289
a real tug of war. We're building these, you

00:11:05.289 --> 00:11:08.110
know, gods in the cloud. But down here on Earth,

00:11:08.210 --> 00:11:10.870
we just want a nice sweater and a document that

00:11:10.870 --> 00:11:13.370
formats correctly. And honestly, the fact that

00:11:13.370 --> 00:11:15.169
we can't get the document to format correctly

00:11:15.169 --> 00:11:17.330
is probably why we're all knitting the sweater.

00:11:17.490 --> 00:11:20.529
It's a need to feel competent. To feel like

00:11:20.529 --> 00:11:22.970
we can actually finish a task. I think that's

00:11:22.970 --> 00:11:25.470
going to be the theme in 2026. Competence in

00:11:25.470 --> 00:11:28.350
an age of automation. Before we wrap up, for

00:11:28.350 --> 00:11:30.570
anyone listening who is still stuck in that formatting

00:11:30.570 --> 00:11:33.509
loop. Yes, please do yourself a favor. Look up

00:11:33.509 --> 00:11:35.370
the formatting guides for these models. Learn

00:11:35.370 --> 00:11:37.590
the force formatting prompts. I'm telling you,

00:11:37.629 --> 00:11:40.389
it's a small skill, but it will change your daily

00:11:40.389 --> 00:11:42.850
quality of life. And as you head out into your

00:11:42.850 --> 00:11:44.809
week, I want to leave you with one final thought

00:11:44.809 --> 00:11:48.559
on that analog backlash. Is this surge in yarn

00:11:48.559 --> 00:11:52.220
sales just a temporary fad, a little blip? Or

00:11:52.220 --> 00:11:54.740
are we seeing the beginning of a permanent bifurcation,

00:11:55.059 --> 00:11:57.600
a real split between a digital elite who live

00:11:57.600 --> 00:12:00.460
entirely in the stream and a kind of analog resistance

00:12:00.460 --> 00:12:03.019
that chooses to disconnect? That is something

00:12:03.019 --> 00:12:05.460
to think about. Are we heading for a world where

00:12:05.460 --> 00:12:08.700
true luxury just means being offline? I think

00:12:08.700 --> 00:12:10.840
we might be. Thanks for diving in with us. See

00:12:10.840 --> 00:12:11.320
you in the deep end.
