WEBVTT

00:00:00.000 --> 00:00:02.960
The next massive AI data center isn't being built

00:00:02.960 --> 00:00:05.179
in a remote desert. It's being installed right

00:00:05.179 --> 00:00:07.839
next to your AC unit, and it might just pay your

00:00:07.839 --> 00:00:10.859
utility bill. It is an absolutely wild concept

00:00:10.859 --> 00:00:12.619
to wrap your head around. Welcome back to the

00:00:12.619 --> 00:00:14.220
Deep Dive. I'm really glad you're joining us

00:00:14.220 --> 00:00:17.160
today. We have some truly fascinating ground

00:00:17.160 --> 00:00:19.800
to cover. The physical landscape of artificial

00:00:19.800 --> 00:00:22.839
intelligence is fundamentally shifting. We're

00:00:22.839 --> 00:00:25.219
looking at the hard limits of global computation,

00:00:25.460 --> 00:00:28.519
and we're seeing how engineers are quietly bypassing

00:00:28.519 --> 00:00:31.019
them. Yeah. Today we are exploring the distributed

00:00:31.019 --> 00:00:34.100
frontier of AI. We'll look at how supercomputers

00:00:34.100 --> 00:00:36.359
are moving into residential homes. We'll track

00:00:36.359 --> 00:00:38.880
the explosive agentic shift happening right now.

00:00:39.079 --> 00:00:41.679
This shift spans from simple Excel sheets to

00:00:41.679 --> 00:00:44.500
orbital satellites. And we'll unpack a massive

00:00:44.500 --> 00:00:47.240
new software breakthrough. It allows models to

00:00:47.240 --> 00:00:50.399
swallow 12 million words in a single pass. Let's

00:00:50.399 --> 00:00:52.079
start by looking at the physical infrastructure.

00:00:52.259 --> 00:00:54.600
We all know the AI boom is draining the power

00:00:54.600 --> 00:00:57.299
grid. Training these frontier models takes massive

00:00:57.299 --> 00:01:00.079
amounts of raw electricity. But building bigger

00:01:00.079 --> 00:01:02.399
power plants takes many, many years. So what

00:01:02.399 --> 00:01:04.519
if we just use the power capacity already wired

00:01:04.519 --> 00:01:06.659
into our homes? Right. And that is the exact

00:01:06.659 --> 00:01:10.379
premise of a startup called Span. They just announced

00:01:10.379 --> 00:01:13.120
a massive new infrastructure partnership. They're

00:01:13.120 --> 00:01:16.159
teaming up with NVIDIA and the home builder Pulte

00:01:16.159 --> 00:01:17.780
Group. They're launching something called the

00:01:17.780 --> 00:01:21.579
XFRA Distributed Data Center. Yeah, the XFRA

00:01:21.579 --> 00:01:23.780
Distributed Data Center nodes. I was reviewing

00:01:23.780 --> 00:01:26.120
the hardware specs earlier today. Each individual

00:01:26.120 --> 00:01:28.819
node is an absolute beast of a machine. They

00:01:28.819 --> 00:01:33.680
pack 16 NVIDIA RTX PRO 6000 Blackwell GPUs. Right,

00:01:33.760 --> 00:01:35.739
and they use the liquid-cooled server edition

00:01:35.739 --> 00:01:38.599
specifically. Exactly. They also include top-tier

00:01:38.599 --> 00:01:42.379
AMD EPYC processors. And they feature 3

00:01:42.379 --> 00:01:46.540
terabytes of RAM per node. That is a staggering

00:01:46.540 --> 00:01:48.939
amount of local computing power. Why do we need

00:01:48.939 --> 00:01:51.140
three terabytes of memory sitting in a backyard?

00:01:51.400 --> 00:01:53.760
Well, it really comes down to loading massive

00:01:53.760 --> 00:01:56.140
parameter models natively. You need vast memory

00:01:56.140 --> 00:01:58.420
to hold these giant neural networks. So we're

00:01:58.420 --> 00:02:00.579
putting supercomputer-level hardware directly

00:02:00.579 --> 00:02:03.620
into suburban neighborhoods. But the energy distribution

00:02:03.620 --> 00:02:06.219
is the real puzzle here. The copper wires in

00:02:06.219 --> 00:02:09.300
our streets have hard physical limits. How does

00:02:09.300 --> 00:02:11.800
a normal house power a massive supercomputer?

00:02:12.539 --> 00:02:14.439
This is where the smart panel technology comes

00:02:14.439 --> 00:02:17.379
in. The U.S. electrical grid was designed for

00:02:17.379 --> 00:02:21.139
peak theoretical loads. Most homes only use about

00:02:21.139 --> 00:02:24.180
40% of their electrical capacity. Wow. Yeah,

00:02:24.240 --> 00:02:26.300
we have all this unused power just sitting there.

00:02:26.819 --> 00:02:29.800
Span's smart panel identifies that exact spare

00:02:29.800 --> 00:02:32.659
electricity dynamically. It monitors the home's

00:02:32.659 --> 00:02:35.680
power draw in milliseconds? Exactly. The panel

00:02:35.680 --> 00:02:38.780
detects when your oven or clothes dryer turns

00:02:38.780 --> 00:02:41.360
off. It then funnels that exact spare electrical

00:02:41.360 --> 00:02:45.060
capacity into the XFRA node. The node uses that

00:02:45.060 --> 00:02:47.939
diverted power to run complex cloud AI tasks.

00:02:48.259 --> 00:02:50.080
And the homeowner doesn't even notice the power

00:02:50.080 --> 00:02:52.340
shifting around. They don't notice a single thing

00:02:52.340 --> 00:02:54.479
changing. It operates entirely in the background

00:02:54.479 --> 00:02:56.719
of their daily lives. And the trade-off for

00:02:56.719 --> 00:02:59.240
the homeowner is incredibly compelling. In many

00:02:59.240 --> 00:03:01.500
of these markets, hosting a node brings massive

00:03:01.500 --> 00:03:03.520
benefits. You get completely free electricity

00:03:03.520 --> 00:03:06.340
and free gigabit internet. Span essentially pays

00:03:06.340 --> 00:03:08.849
your monthly utility bills for you. Yep. They

00:03:08.849 --> 00:03:10.770
cover your bills in exchange for using your wall

00:03:10.770 --> 00:03:12.590
space. They just tap into the spare electricity

00:03:12.590 --> 00:03:15.009
your house wasn't using anyway. It's like Airbnb

00:03:15.009 --> 00:03:18.289
for your home's unused electrical capacity. That's

00:03:18.289 --> 00:03:20.889
a perfect way to visualize the mechanism. And

00:03:20.889 --> 00:03:22.909
they're already rolling this infrastructure out

00:03:22.909 --> 00:03:25.150
right now. They're deploying them in new construction

00:03:25.150 --> 00:03:28.610
communities with PulteGroup. Think about the

00:03:28.610 --> 00:03:31.129
massive advantage in actual deployment speed.

00:03:31.270 --> 00:03:34.219
Instead of building one giant 100-megawatt

00:03:34.219 --> 00:03:36.560
centralized data center, they can just deploy

00:03:36.560 --> 00:03:39.599
8,000 of these distributed mini nodes. It is

00:03:39.599 --> 00:03:42.400
six times faster to build this way. And it comes

00:03:42.400 --> 00:03:44.639
in at one-fifth of the traditional infrastructure

00:03:44.639 --> 00:03:47.500
cost. I've been thinking about

00:03:47.500 --> 00:03:49.879
the broader grid impact here. Is this actually

00:03:49.879 --> 00:03:52.360
safe and scalable? Or does putting a supercomputer

00:03:52.360 --> 00:03:55.560
next to my dryer create localized grid failures?

00:03:55.900 --> 00:03:58.360
It is a completely valid engineering concern

00:03:58.360 --> 00:04:01.400
to have. But the system actually balances local

00:04:01.400 --> 00:04:04.539
loads dynamically. Span smart panels act like

00:04:04.539 --> 00:04:06.939
highly intelligent traffic cops. They ensure

00:04:06.939 --> 00:04:09.560
the node only draws power when the house absolutely

00:04:09.560 --> 00:04:12.360
doesn't need it. This prevents any localized

00:04:12.360 --> 00:04:14.800
transformers from blowing out. So it balances

00:04:14.800 --> 00:04:17.220
local loads instead of draining the central grid.

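NOTE
Editor's sketch, not Span's actual panel logic: the load balancing described in the surrounding cues can be pictured as a headroom calculation.
The function name, the 2 kW safety margin, and the 200 A / 240 V service are assumptions for illustration only.
The 100 MW and 8,000-node figures are the ones quoted in the episode.
```python
def node_power_budget_kw(service_capacity_kw: float, household_draw_kw: float, safety_margin_kw: float = 2.0) -> float:
    """Return the power (kW) a co-located compute node may draw right now."""
    # Headroom is what the service can deliver minus what the house is using,
    # minus a reserve kept free for sudden household loads (assumed value).
    headroom_kw = service_capacity_kw - household_draw_kw - safety_margin_kw
    # Never return a negative budget: the house always has priority.
    return max(0.0, headroom_kw)
# Episode figures: a 100 MW site replaced by 8,000 nodes gives the average node size.
per_node_kw = 100_000 / 8_000
# Hypothetical 200 A, 240 V service (48 kW) with the house drawing 40% of capacity.
capacity_kw = 200 * 240 / 1000
budget_kw = node_power_budget_kw(capacity_kw, 0.4 * capacity_kw)
print(f"average node size: {per_node_kw} kW, fits in current headroom: {budget_kw >= per_node_kw}")
```
When the household draw rises (an oven or dryer switching on), the same call returns a smaller budget, and zero once the house needs the full service.
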
00:04:17.379 --> 00:04:19.699
Precisely. We've been so worried about massive

00:04:19.699 --> 00:04:22.579
data centers sucking the grid dry. Span is proving

00:04:22.579 --> 00:04:24.560
we can just use the capacity we've already built.

00:04:24.660 --> 00:04:27.019
It turns your home from a passive cost center

00:04:27.019 --> 00:04:31.610
into a revenue-generating asset. So hardware

00:04:31.610 --> 00:04:34.410
is moving directly into our suburban homes, but

00:04:34.410 --> 00:04:36.629
the physical distribution of compute is pushing

00:04:36.629 --> 00:04:39.550
much further out. Hardware is distributing all

00:04:39.550 --> 00:04:42.290
the way into the vacuum of space, and it's backed

00:04:42.290 --> 00:04:45.430
by truly astronomical amounts of corporate capital.

00:04:45.850 --> 00:04:47.970
The financial scale of this shift is honestly

00:04:47.970 --> 00:04:50.250
hard to comprehend. We're seeing numbers that

00:04:50.250 --> 00:04:53.050
fundamentally redefine corporate spending. Anthropic

00:04:53.050 --> 00:04:55.670
just committed to a massive new global infrastructure

00:04:55.670 --> 00:04:59.589
deal. They're spending $200 billion on Google

00:04:59.589 --> 00:05:02.730
Cloud and AI chips. $200 billion over just five

00:05:02.730 --> 00:05:05.310
years? That is roughly $40 billion every single

00:05:05.310 --> 00:05:08.069
year. Yeah. That deal represents over 40% of

00:05:08.069 --> 00:05:11.040
Google Cloud's entire revenue backlog. The global

00:05:11.040 --> 00:05:13.680
AI infrastructure war is completely exploding

00:05:13.680 --> 00:05:16.319
right now. They are buying up silicon and energy

00:05:16.319 --> 00:05:19.379
contracts at an unprecedented pace. But they

00:05:19.379 --> 00:05:21.319
aren't just looking at traditional cloud servers

00:05:21.319 --> 00:05:24.259
on Earth. Anthropic just announced an incredible

00:05:24.259 --> 00:05:27.500
partnership with SpaceX AI. They secured access

00:05:27.500 --> 00:05:30.439
to the Colossus One massive AI supercomputer.

00:05:30.980 --> 00:05:34.360
And here's the truly mind-bending part of that

00:05:34.360 --> 00:05:37.220
specific partnership. Both organizations are

00:05:37.220 --> 00:05:40.180
officially exploring orbital AI compute in space.

00:05:40.639 --> 00:05:43.660
We're talking about massive data centers orbiting

00:05:43.660 --> 00:05:47.019
the planet. Whoa. Imagine scaling orbital AI

00:05:47.019 --> 00:05:49.600
compute across thousands of satellites. Right.

00:05:49.699 --> 00:05:51.519
It completely changes the physical limits of

00:05:51.519 --> 00:05:53.699
our infrastructure. The cold vacuum of space

00:05:53.699 --> 00:05:56.459
solves massive thermal cooling problems. And

00:05:56.459 --> 00:05:58.759
this endless hardware scale is powering a fundamental

00:05:58.759 --> 00:06:01.800
software shift. AI is no longer just a simple

00:06:01.800 --> 00:06:03.860
chatbot answering questions. It is evolving

00:06:03.860 --> 00:06:06.199
into an active, autonomous digital operator.

00:06:06.600 --> 00:06:08.620
Yeah, the agentic shift is fully underway across

00:06:08.620 --> 00:06:10.540
the entire industry. I want to unpack what an

00:06:10.540 --> 00:06:13.079
agent actually is. A chatbot just predicts the

00:06:13.079 --> 00:06:15.540
next logical word in a sentence. An agent sets

00:06:15.540 --> 00:06:18.319
goals, uses external tools, and executes actions

00:06:18.319 --> 00:06:21.360
autonomously. Exactly. OpenAI is reportedly building

00:06:21.360 --> 00:06:24.000
a dedicated AI phone right now. They expect early

00:06:24.000 --> 00:06:26.040
hardware production to start by the year 2027.

00:06:26.300 --> 00:06:28.839
It completes background tasks, understands your

00:06:28.839 --> 00:06:31.579
goals, and works like a true operator. It is

00:06:31.579 --> 00:06:33.959
not just a standard mobile device anymore. And

00:06:33.959 --> 00:06:36.040
we're seeing this deep agentic integration in

00:06:36.040 --> 00:06:38.600
the workplace, too. Anthropic just launched 10

00:06:38.600 --> 00:06:41.980
ready-to-use autonomous agents. They are built

00:06:41.980 --> 00:06:44.920
specifically for the finance and insurance sectors.

00:06:45.139 --> 00:06:47.259
These specific agents are doing highly complex

00:06:47.259 --> 00:06:49.839
professional work. They can screen dense KYC

00:06:49.839 --> 00:06:52.180
files for banking compliance. They can review

00:06:52.180 --> 00:06:54.680
massive corporate earnings reports in seconds.

00:06:54.740 --> 00:06:57.079
Wow. They can even build complex presentation

00:06:57.079 --> 00:07:00.160
decks completely from scratch. ChatGPT is also

00:07:00.160 --> 00:07:02.819
moving directly into our daily corporate workflows.

00:07:03.149 --> 00:07:05.910
It now works natively inside Microsoft Excel

00:07:05.910 --> 00:07:08.649
and Google Sheets. You can use it to build complex

00:07:08.649 --> 00:07:11.129
financial formulas automatically. It reads raw

00:07:11.129 --> 00:07:13.509
data, generates insights, and formats the entire

00:07:13.509 --> 00:07:15.449
spreadsheet. They're rolling out a free beta

00:07:15.449 --> 00:07:18.589
for paid users right now. And OpenAI is aggressively

00:07:18.589 --> 00:07:20.750
pushing this integration into higher education.

00:07:21.189 --> 00:07:23.949
They just introduced the ChatGPT Futures Class

00:07:23.949 --> 00:07:26.910
of 2026 program. Right. They gave 26 student

00:07:26.910 --> 00:07:30.189
builders $10,000 grants. And they gave them

00:07:30.189 --> 00:07:33.660
full, unlimited access to frontier AI models.

00:07:33.939 --> 00:07:36.459
It is the first generation to start and finish

00:07:36.459 --> 00:07:39.660
college with GPT. We're studying how students

00:07:39.660 --> 00:07:42.759
leverage frontier AI throughout their entire

00:07:42.759 --> 00:07:45.620
degree. We're watching the baseline of human

00:07:45.620 --> 00:07:49.279
productivity shift in real time. But this

00:07:49.279 --> 00:07:52.220
deep integration introduces some very strange

00:07:52.220 --> 00:07:55.040
new vulnerabilities. We're trusting autonomous

00:07:55.040 --> 00:07:57.560
agents with highly sensitive personal financial

00:07:57.560 --> 00:08:00.399
data. The security risks are evolving just as

00:08:00.399 --> 00:08:03.560
fast as the model capabilities. Researchers just

00:08:03.560 --> 00:08:06.339
exposed a completely new AI scam trick last week.

00:08:06.459 --> 00:08:08.899
They found a brilliant way to manipulate agents

00:08:08.899 --> 00:08:11.740
using hidden instructions. Yeah, the hidden malicious

00:08:11.740 --> 00:08:13.980
instructions were written entirely in Morse code,

00:08:14.199 --> 00:08:16.620
just simple dots and dashes buried deeply inside

00:08:16.620 --> 00:08:18.699
standard text files. And the terrifying part

00:08:18.699 --> 00:08:21.060
is that some agents actually followed them. The

00:08:21.060 --> 00:08:23.160
AI recognized the Morse code formatting perfectly.

00:08:23.439 --> 00:08:26.439
It then bypassed its safety filters and executed

00:08:26.439 --> 00:08:28.779
the hidden malicious commands. I still wrestle

00:08:28.779 --> 00:08:30.920
with prompt drift myself, so agents getting hacked

00:08:30.920 --> 00:08:33.139
by Morse code is terrifying. It sounds like a

00:08:33.139 --> 00:08:35.940
strange plot from a science fiction novel. But

00:08:35.940 --> 00:08:38.200
this is the harsh reality of deploying autonomous

00:08:38.200 --> 00:08:41.559
agents. Large language models process information

00:08:41.559 --> 00:08:44.340
through mathematical tokenization. They don't

00:08:44.340 --> 00:08:46.320
read English the way human beings do. Right.

00:08:46.600 --> 00:08:48.759
They see patterns, and Morse code is just another

00:08:48.759 --> 00:08:51.419
mathematical pattern. We are exposing ourselves

00:08:51.419 --> 00:08:55.080
to entirely new cryptographic attack vectors.

00:08:55.320 --> 00:08:57.759
And the traditional legal system is desperately

00:08:57.759 --> 00:09:00.320
trying to catch up. Apple just agreed to pay

00:09:00.320 --> 00:09:04.659
$250 million. They're settling a massive class

00:09:04.659 --> 00:09:06.759
action lawsuit regarding artificial intelligence

00:09:06.759 --> 00:09:09.419
right now. The underlying lawsuit was tied to

00:09:09.419 --> 00:09:12.889
claims about specific iPhone capabilities. Buyers

00:09:12.889 --> 00:09:15.429
claimed they were misled about Siri's newly advertised

00:09:15.429 --> 00:09:18.830
AI features. Apple agreed to the massive financial

00:09:18.830 --> 00:09:22.029
settlement to avoid a lengthy trial. They settled

00:09:22.029 --> 00:09:23.789
this massive case even though they admitted no

00:09:23.789 --> 00:09:26.730
official wrongdoing. With agents

00:09:26.730 --> 00:09:29.009
acting autonomously in our spreadsheets and phones,

00:09:29.309 --> 00:09:32.009
who takes the fall when a Morse code hack successfully

00:09:32.009 --> 00:09:35.649
steals data? That is the defining legal question

00:09:35.649 --> 00:09:38.789
of this entire new era. The traditional paradigm

00:09:38.789 --> 00:09:41.230
of software security is shifting rapidly right

00:09:41.230 --> 00:09:44.759
now. Historically, the end user was heavily responsible

00:09:44.759 --> 00:09:47.759
for their own data security. But when the AI

00:09:47.759 --> 00:09:50.519
makes autonomous background decisions, the liability

00:09:50.519 --> 00:09:53.940
fundamentally moves. The legal burden falls heavily

00:09:53.940 --> 00:09:56.820
onto the developers building the underlying models.

00:09:57.120 --> 00:09:59.720
Ah, so security liability shifts entirely from

00:09:59.720 --> 00:10:02.539
user to the AI developer. Exactly. The tech companies

00:10:02.539 --> 00:10:04.620
building this intelligence have to legally secure

00:10:04.620 --> 00:10:07.519
it. They are creating autonomous actors, so they

00:10:07.519 --> 00:10:10.529
hold the ultimate responsibility. We're

00:10:10.529 --> 00:10:13.190
back. We've explored supercomputers sitting right

00:10:13.190 --> 00:10:15.610
next to our air conditioners. We've tracked autonomous

00:10:15.610 --> 00:10:18.070
agents running our daily financial spreadsheets.

00:10:18.090 --> 00:10:20.929
But for these complex agents to work effectively,

00:10:21.190 --> 00:10:23.789
they need vast memory. They need to instantly

00:10:23.789 --> 00:10:26.950
recall massive amounts of contextual data. And

00:10:26.950 --> 00:10:28.669
that brings us to an incredible new software

00:10:28.669 --> 00:10:31.409
breakthrough. It perfectly matches the explosive

00:10:31.409 --> 00:10:33.370
scale of the physical hardware we discussed.

00:10:33.669 --> 00:10:37.049
A company called SubQuadratic just launched a

00:10:37.049 --> 00:10:39.570
fascinating new AI model. They're calling this

00:10:39.570 --> 00:10:43.190
powerful new model SubQ. It's being described

00:10:43.190 --> 00:10:47.289
as a 12 million token memory hack killer. SubQ

00:10:47.289 --> 00:10:51.230
can cleanly swallow 12 million tokens in a single

00:10:51.230 --> 00:10:54.950
analytical pass. That is a truly incomprehensible

00:10:54.950 --> 00:10:57.830
amount of raw contextual information. It's like

00:10:57.830 --> 00:11:00.509
feeding an entire corporate code repository into

00:11:00.509 --> 00:11:03.490
one single prompt. Right. And the engineering

00:11:03.490 --> 00:11:06.190
team behind this model is absolutely world-class.

00:11:06.470 --> 00:11:09.330
They hail from Meta, Google DeepMind, and Oxford

00:11:09.330 --> 00:11:11.600
University. They've built an architecture that

00:11:11.600 --> 00:11:13.759
fundamentally changes the core processing economics.

00:11:14.200 --> 00:11:16.419
The massive cost difference is what really stands

00:11:16.419 --> 00:11:19.799
out to me here. It costs just $8 to run a massive

00:11:19.799 --> 00:11:23.340
context task on SubQ. That is a staggering price

00:11:23.340 --> 00:11:25.240
drop for working software developers. If you

00:11:25.240 --> 00:11:27.559
run that exact same massive task on traditional

00:11:27.559 --> 00:11:30.440
frontier models, it is wildly expensive and incredibly

00:11:30.440 --> 00:11:33.480
slow to process. It costs roughly $2,600 on

00:11:33.480 --> 00:11:36.740
those older legacy systems. And SubQ runs 52

00:11:36.740 --> 00:11:38.899
times faster than the current industry standard.

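NOTE
Editor's arithmetic check on the figures quoted in the surrounding cues ($8 versus roughly $2,600 for the same task).
The variable names are illustrative. The episode does not say how large the task was, so no per-token price is derived here.
```python
subq_cost_usd = 8.0       # figure quoted in the episode
legacy_cost_usd = 2600.0  # "roughly $2,600", also as quoted
# How many times cheaper the quoted SubQ run is.
cost_ratio = legacy_cost_usd / subq_cost_usd
# Fraction of the legacy cost that is saved.
savings_fraction = 1.0 - subq_cost_usd / legacy_cost_usd
print(f"cost ratio: {cost_ratio:.0f}x, savings: {savings_fraction:.1%}")
```
Note that the quoted cost ratio is larger than the quoted 52x speedup, so the two claims measure different things.
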
00:11:39.320 --> 00:11:41.679
Let's unpack exactly how they achieved this massive

00:11:41.679 --> 00:11:44.200
leap in efficiency. They use a highly optimized

00:11:44.200 --> 00:11:47.679
selective attention neural architecture. Traditional

00:11:47.679 --> 00:11:50.299
models calculate the relationship between every

00:11:50.299 --> 00:11:53.559
single word in a document. That requires massive

00:11:53.559 --> 00:11:56.039
computing power as the document gets larger and

00:11:56.039 --> 00:11:58.940
longer. Their system only looks at the token

00:11:58.940 --> 00:12:01.799
relationships that actually matter. It drops

00:12:01.799 --> 00:12:03.980
irrelevant contextual connections instantly.

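NOTE
Editor's sketch under stated assumptions: the episode does not describe SubQ's architecture in detail, so this compares only the counting argument.
Full attention scores every token pair, which grows with the square of the input length.
A selective scheme that keeps at most k candidates per token grows linearly in the input length.
The budget k = 4096 is hypothetical and is not a SubQ figure.
```python
def full_attention_pairs(n_tokens: int) -> int:
    """Token pairs scored when every token attends to every token."""
    return n_tokens * n_tokens
def selective_attention_pairs(n_tokens: int, k_per_token: int) -> int:
    """Token pairs scored when each token attends to at most k others."""
    return n_tokens * min(k_per_token, n_tokens)
n = 12_000_000  # context length mentioned in the episode
k = 4096        # hypothetical per-token budget (assumption)
reduction = full_attention_pairs(n) // selective_attention_pairs(n, k)
print(f"pairs scored drop by a factor of about {reduction}")
```
Doubling n quadruples the first count but only doubles the second, which is why the gap widens as documents get longer.
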
00:12:04.259 --> 00:12:07.240
That is how they achieve such incredible, groundbreaking

00:12:07.240 --> 00:12:10.240
processing speeds. We should define a few important

00:12:10.240 --> 00:12:12.620
architectural concepts here. Flash attention

00:12:12.620 --> 00:12:15.539
is a widely used method to speed up AI memory

00:12:15.539 --> 00:12:17.659
processing. Right. It is a critical optimization

00:12:17.659 --> 00:12:19.919
technique for modern hardware architectures.

00:12:19.960 --> 00:12:21.879
And we also need to talk about traditional retrieval

00:12:21.879 --> 00:12:24.779
systems. The industry has relied heavily on complex

00:12:24.779 --> 00:12:27.740
RAG pipelines for years now. These are systems

00:12:27.740 --> 00:12:30.259
that fetch outside data to help AI answer

00:12:30.259 --> 00:12:32.679
questions. You build a vector database to search

00:12:32.679 --> 00:12:35.399
for small text fragments. Yeah, those traditional

00:12:35.399 --> 00:12:37.899
document retrieval systems are incredibly complex

00:12:37.899 --> 00:12:40.980
and fragile. But SubQ might make those complicated

00:12:40.980 --> 00:12:44.340
RAG pipelines a thing of the past. You simply

00:12:44.340 --> 00:12:47.320
don't need to fetch small chunks of outside data

00:12:47.320 --> 00:12:50.340
anymore. You just feed the entire massive database

00:12:50.340 --> 00:12:53.759
directly into the working model. It's like stacking

00:12:53.759 --> 00:12:55.840
Lego blocks of data, but instead of building

00:12:55.840 --> 00:12:58.019
piece by piece, you dump the whole bucket and

00:12:58.019 --> 00:13:00.519
the AI instantly sees the final castle. That

00:13:00.519 --> 00:13:02.899
visual perfectly captures the leap in processing

00:13:02.899 --> 00:13:05.940
capability. And the verified benchmark performance

00:13:05.940 --> 00:13:09.820
of SubQ is simply incredible. Most standard models

00:13:09.820 --> 00:13:12.360
lose the plot as the input files get bigger.

00:13:12.480 --> 00:13:15.059
They suffer from the classic needle in a haystack

00:13:15.059 --> 00:13:17.700
problem. Right. They completely forget important

00:13:17.700 --> 00:13:20.220
information hidden deeply in the middle of the

00:13:20.220 --> 00:13:23.299
text. But SubQ hit 92% recall at 12 million

00:13:23.299 --> 00:13:26.080
tokens. It remembers almost everything you feed

00:13:26.080 --> 00:13:29.480
into its working context window. Just for comparative

00:13:29.480 --> 00:13:31.500
context, we could look at the major industry

00:13:31.500 --> 00:13:35.389
competitors. Gemini 3.1 Pro struggled significantly

00:13:35.389 --> 00:13:38.470
on similar massive multi-needle tests. The exact

00:13:38.470 --> 00:13:41.450
same degradation is true for OpenAI's GPT-5.4.

00:13:41.450 --> 00:13:45.210
They simply could not maintain accurate information

00:13:45.210 --> 00:13:48.789
recall at that massive scale. SubQ is absolutely

00:13:48.789 --> 00:13:51.269
crushing the established giants on these specific

00:13:51.269 --> 00:13:53.250
benchmarks. Yeah, they've already launched a

00:13:53.250 --> 00:13:55.350
specialized developer tool called SubQ Code.

00:13:55.549 --> 00:13:58.190
It's a command line interface agent built directly

00:13:58.190 --> 00:14:00.830
for professional software engineers. It can load

00:14:00.830 --> 00:14:03.690
your entire complex code base in just one single

00:14:03.690 --> 00:14:06.169
pass. And they are certainly not stopping at

00:14:06.169 --> 00:14:08.889
12 million tokens. The engineering team is officially

00:14:08.889 --> 00:14:11.909
targeting a 100 million token context window.

00:14:12.110 --> 00:14:14.169
They want to hit that massive milestone by the

00:14:14.169 --> 00:14:16.629
end of 2026. We have seen so-called transformer

00:14:16.629 --> 00:14:19.309
killers hyped up before. Alternate architectural

00:14:19.309 --> 00:14:21.769
models like Mamba tried to dethrone the standard

00:14:21.769 --> 00:14:24.110
transformer architecture. But SubQ genuinely

00:14:24.110 --> 00:14:26.210
feels like a permanent shift in the landscape.

00:14:26.450 --> 00:14:29.090
It's launching with a highly robust,

00:14:29.090 --> 00:14:32.129
production-ready developer API today. If you're a software

00:14:32.129 --> 00:14:34.350
engineer building agents, this is a massive win.

00:14:34.570 --> 00:14:37.370
I have to wonder about the long-term

00:14:37.370 --> 00:14:41.200
architectural implications here. If an

00:14:41.200 --> 00:14:45.220
AI can perfectly recall 12 million tokens for

00:14:45.220 --> 00:14:48.220
eight bucks, does this completely kill the need

00:14:48.220 --> 00:14:51.100
for complex retrieval systems? It fundamentally

00:14:51.100 --> 00:14:53.759
rewrites how developers will build AI applications

00:14:53.759 --> 00:14:56.649
moving forward. You simply won't need clunky

00:14:56.649 --> 00:14:59.409
vector databases holding tiny fragments of text.

00:14:59.610 --> 00:15:02.289
You won't need complicated routing logic to find

00:15:02.289 --> 00:15:04.730
the right source document. You'll just load the

00:15:04.730 --> 00:15:06.970
entire required operational context directly

00:15:06.970 --> 00:15:09.269
into the working memory. Right. Massive context

00:15:09.269 --> 00:15:12.210
windows will just replace complex data retrieval

00:15:12.210 --> 00:15:14.629
pipelines entirely. It simplifies the entire

00:15:14.629 --> 00:15:17.029
software development process for engineers globally.

00:15:17.350 --> 00:15:19.690
Let's step back and look at the larger big picture

00:15:19.690 --> 00:15:22.649
here. The core physical bottlenecks of artificial

00:15:22.649 --> 00:15:25.070
intelligence are shattering simultaneously right

00:15:25.070 --> 00:15:27.909
now. We're seeing hardware limits being bypassed

00:15:27.909 --> 00:15:30.149
in incredibly creative and distributed ways.

00:15:30.470 --> 00:15:32.590
We are actively capturing the unused electrical

00:15:32.590 --> 00:15:34.929
capacity of our suburban homes. We are looking

00:15:34.929 --> 00:15:37.169
at deploying massive computing nodes in the cold

00:15:37.169 --> 00:15:39.789
vacuum of space. The physical deployment constraints

00:15:39.789 --> 00:15:42.950
that held us back are rapidly disappearing. And

00:15:42.950 --> 00:15:45.450
the software processing limits are evaporating

00:15:45.450 --> 00:15:48.120
just as fast today. New architectural models

00:15:48.120 --> 00:15:50.779
like SubQ can process entire repositories of

00:15:50.779 --> 00:15:54.320
human knowledge in seconds. We are rapidly transitioning

00:15:54.320 --> 00:15:57.679
away from the passive, simple chatbot era. We're

00:15:57.679 --> 00:16:00.840
actively unleashing highly autonomous, ubiquitous

00:16:00.840 --> 00:16:03.639
software agents into the real world. This digital

00:16:03.639 --> 00:16:05.960
intelligence is moving directly into our Excel

00:16:05.960 --> 00:16:08.000
sheets and our phones. It is running in your

00:16:08.000 --> 00:16:12.490
backyard, your devices, and in orbit. Which

00:16:12.490 --> 00:16:14.350
brings me to a final thought for you to ponder

00:16:14.350 --> 00:16:17.009
today. If your home's AC unit, your phone, and

00:16:17.009 --> 00:16:19.190
satellites in orbit are all continuously running

00:16:19.190 --> 00:16:22.029
massive AI agents that can instantly recall millions

00:16:22.029 --> 00:16:25.210
of data points, where does human decision-making

00:16:25.210 --> 00:16:27.710
actually sit in the loop? Thank you so much for

00:16:27.710 --> 00:16:29.750
exploring this deeply with us today. We will

00:16:29.750 --> 00:16:31.330
see you next time. You might be sitting right

00:16:31.330 --> 00:16:33.169
next to your air conditioner.