WEBVTT

00:00:00.000 --> 00:00:03.040
Right now, an algorithm that was originally designed

00:00:03.040 --> 00:00:07.700
to play 1970s Atari games is predicting the path

00:00:07.700 --> 00:00:10.279
of the hurricane outside your window. Yeah, it

00:00:10.279 --> 00:00:13.740
really is. It is quietly optimizing the thermal

00:00:13.740 --> 00:00:16.600
cooling of the data center, storing your photos,

00:00:17.219 --> 00:00:19.579
and it's mapping the three-dimensional structures

00:00:19.579 --> 00:00:22.120
of the microscopic proteins inside your own body.

00:00:22.429 --> 00:00:25.250
It's just a complete paradigm shift. I mean,

00:00:25.390 --> 00:00:27.370
when you look at the systems built by Google

00:00:27.370 --> 00:00:29.410
DeepMind, you aren't just looking at a smarter

00:00:29.410 --> 00:00:32.750
search bar or like a text generator. You are

00:00:32.750 --> 00:00:35.070
looking at a fundamentally different way of engineering

00:00:35.070 --> 00:00:38.049
intelligence. Exactly. And welcome to this deep

00:00:38.049 --> 00:00:40.390
dive into the comprehensive history of Google

00:00:40.390 --> 00:00:43.500
DeepMind. We are pulling from a massive stack

00:00:43.500 --> 00:00:46.439
of source material today to trace how an AI lab

00:00:46.439 --> 00:00:48.840
in the UK went from bouncing a pixelated ball

00:00:48.840 --> 00:00:51.619
in Breakout to literally winning a Nobel Prize

00:00:51.619 --> 00:00:54.240
in Chemistry. And generating real-time virtual

00:00:54.240 --> 00:00:57.219
worlds. Yes. But to understand how any of this

00:00:57.219 --> 00:00:59.460
is actually possible, we have to look at their

00:00:59.460 --> 00:01:01.460
foundational philosophy. Because they didn't

00:01:01.460 --> 00:01:03.420
start by trying to teach a machine, you know,

00:01:03.740 --> 00:01:05.400
all the rules of the universe. No, they didn't.

00:01:05.480 --> 00:01:07.599
They just taught it how to learn. What's fascinating

00:01:07.599 --> 00:01:09.959
here is that if you go back to their founding,

00:01:10.200 --> 00:01:13.719
right, in 2010. By Demis Hassabis, Shane Legg,

00:01:13.920 --> 00:01:16.799
and Mustafa Suleyman. Right, exactly. The prevailing

00:01:16.799 --> 00:01:19.980
AI logic at that time was highly prescriptive.

00:01:20.200 --> 00:01:22.299
Like, if you wanted a machine to play chess,

00:01:22.739 --> 00:01:25.019
you programmed in the value of every single piece,

00:01:25.620 --> 00:01:28.239
the exact geometry of the board, specific opening

00:01:28.239 --> 00:01:30.739
strategies. You essentially gave it a heavy

00:01:30.739 --> 00:01:32.840
top-down instruction manual. Yeah, exactly. But

00:01:32.840 --> 00:01:35.200
DeepMind completely discarded the manual. They

00:01:35.200 --> 00:01:37.260
utilized reinforcement learning instead. And

00:01:37.260 --> 00:01:40.579
their laboratory for this wasn't some supercomputer

00:01:40.579 --> 00:01:44.099
simulating global economics. It was retro video

00:01:44.099 --> 00:01:48.299
games: Pong, Space Invaders, Breakout. The classics.

00:01:48.500 --> 00:01:50.700
Right, they would feed the AI raw pixels, just

00:01:50.700 --> 00:01:52.819
the changing colors on the screen. The AI had

00:01:52.819 --> 00:01:55.239
no concept of what a paddle was or that a laser

00:01:55.239 --> 00:01:57.400
destroys an alien. Right, it had no context at

00:01:57.400 --> 00:01:59.959
all. None. It was just given a controller and

00:01:59.959 --> 00:02:02.140
a single mathematical objective, which was basically

00:02:02.140 --> 00:02:04.319
just make the score integer go up. Which forces

00:02:04.319 --> 00:02:06.879
the system to build its own internal representations

00:02:06.879 --> 00:02:09.680
of the world. Because at first, the AI just flails

00:02:09.680 --> 00:02:12.219
around. It's just smashing buttons. Literally.

00:02:12.439 --> 00:02:15.199
It outputs random controller movements. But eventually,

00:02:15.360 --> 00:02:17.560
a random movement causes the paddle to hit the

00:02:17.560 --> 00:02:20.080
ball and the score goes up. So the reinforcement

00:02:20.080 --> 00:02:22.539
learning algorithm registers a positive reward

00:02:22.539 --> 00:02:25.240
signal. Oh, wow. Yeah. And it mathematically

00:02:25.240 --> 00:02:28.020
updates its own neural pathways to say, well,

00:02:28.280 --> 00:02:30.680
whatever sequence of pixel states led to that

00:02:30.680 --> 00:02:33.659
action, do that again. OK, let's unpack this.

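NOTE
A minimal sketch of the reward loop described in this exchange, assuming a toy
three-position paddle game rather than real Atari pixels. It is a tabular
stand-in, not the deep network trained on raw frames, and the names ACTIONS,
train, and greedy_policy are hypothetical.
```python
import random
ACTIONS = (-1, 0, 1)  # hypothetical controller moves: left, stay, right
def train(steps=5000, epsilon=0.2, lr=0.1, seed=0):
    """Learn which move to make in each state from the reward signal alone."""
    rng = random.Random(seed)
    # q[state][action] is the running estimate of the reward for that move.
    q = {s: {a: 0.0 for a in ACTIONS} for s in ACTIONS}
    for _ in range(steps):
        state = rng.choice(ACTIONS)  # where the ball is relative to the paddle
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore: press a random button
        else:
            action = max(ACTIONS, key=lambda a: q[state][a])  # exploit
        reward = 1.0 if action == state else 0.0  # the score goes up on a hit
        # Nudge the estimate toward the observed reward ("do that again").
        q[state][action] += lr * (reward - q[state][action])
    return q
def greedy_policy(q):
    """Pick the highest-valued move for every state."""
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in q}
```
Calling greedy_policy(train()) should yield a policy that moves the paddle
toward the ball in every state, even though no rule about paddles was coded.
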
00:02:34.240 --> 00:02:36.180
Because it's like handing a toddler a joystick,

00:02:36.960 --> 00:02:39.560
absolutely refusing to explain the rules, leaving

00:02:39.560 --> 00:02:41.819
the room, and then returning three days later

00:02:41.819 --> 00:02:43.819
to find they are the undisputed world champion.

00:02:44.000 --> 00:02:46.099
That's a great way to put it. It just grinds

00:02:46.099 --> 00:02:49.259
through millions of iterations, failing, adjusting,

00:02:49.639 --> 00:02:51.699
failing again. But I look at them starting with

00:02:51.699 --> 00:02:54.639
Space Invaders, and it just feels like, I don't

00:02:54.639 --> 00:02:57.030
know, a massive waste of computing power. You

00:02:57.030 --> 00:02:59.430
mean, why not start with something real? Yeah.

00:02:59.889 --> 00:03:02.689
If the ultimate goal is artificial general intelligence

00:03:02.689 --> 00:03:06.490
solving real human problems, why hide in a virtual

00:03:06.490 --> 00:03:10.310
sandbox? Why not tackle real world data immediately?

00:03:10.590 --> 00:03:13.050
Well, because physical reality is impossibly

00:03:13.050 --> 00:03:16.389
noisy. I mean, it is ambiguous. Physics are completely

00:03:16.389 --> 00:03:19.169
unpredictable and success is incredibly hard

00:03:19.169 --> 00:03:21.229
to define in a mathematical equation. Right.

00:03:21.250 --> 00:03:23.750
That makes sense. A video game, however, offers

00:03:23.750 --> 00:03:27.819
a pristine state space. It is a perfectly contained

00:03:27.819 --> 00:03:30.599
universe with a crystal-clear win state. Which

00:03:30.599 --> 00:03:32.800
lets them isolate the learning part. Exactly.

00:03:32.860 --> 00:03:35.639
It proved their core thesis, which was that complex

00:03:35.639 --> 00:03:37.960
cognitive processes like long-term planning

00:03:37.960 --> 00:03:41.120
and pattern recognition can be synthesized entirely

00:03:41.120 --> 00:03:43.699
from scratch through self-correction. And they

00:03:43.699 --> 00:03:45.860
really pushed that thesis to its absolute limit

00:03:45.860 --> 00:03:49.300
with Go. They moved from the strict 2D parameters

00:03:49.300 --> 00:03:52.539
of Atari to the near -infinite game tree complexity

00:03:52.539 --> 00:03:55.599
of the Go board. Yes. In 2016, their AlphaGo

00:03:55.500 --> 00:03:59.020
system beat Lee Sedol, who is a nine-dan professional

00:03:59.020 --> 00:04:02.039
human champion, four to one. A huge moment. But

00:04:02.039 --> 00:04:03.759
the detail in the sources that really anchors

00:04:03.759 --> 00:04:06.259
this whole concept is what happened next: AlphaGo

00:04:06.259 --> 00:04:08.419
Zero. Right, because the original AlphaGo was

00:04:08.419 --> 00:04:10.979
initially trained on millions of human matches

00:04:10.979 --> 00:04:14.120
to get like a baseline understanding of strategy.

00:04:14.439 --> 00:04:18.120
But AlphaGo Zero used zero human data. It was

00:04:18.120 --> 00:04:20.220
just given the rules of Go and simply played

00:04:20.220 --> 00:04:23.220
against itself. And within three days, it beat

00:04:23.220 --> 00:04:26.800
the original AlphaGo 100 to 0. It's unbelievable.

00:04:27.120 --> 00:04:29.439
It didn't just learn the game. It discovered

00:04:29.439 --> 00:04:32.519
strategies that human masters had never even

00:04:32.519 --> 00:04:35.500
conceived of in 3,000 years of playing the game.

00:04:35.560 --> 00:04:38.069
And that is the inflection point. By mastering

00:04:38.069 --> 00:04:40.050
these environments purely through self-play

00:04:40.050 --> 00:04:42.350
and reinforcement learning, they weren't just

00:04:42.350 --> 00:04:45.470
building like a board game savant. They had built

00:04:45.470 --> 00:04:48.670
a generalized optimization engine. So the natural

00:04:48.670 --> 00:04:50.810
engineering question then became, well, if we

00:04:50.810 --> 00:04:53.370
can map the perfect virtual state of a game board,

00:04:53.870 --> 00:04:56.850
can we apply this exact same learning mechanism

00:04:56.850 --> 00:05:00.029
to the messy physical infrastructure of the real

00:05:00.029 --> 00:05:01.829
world? Here's where it gets really interesting

00:05:01.829 --> 00:05:03.579
for you listening. because this brings us to

00:05:03.579 --> 00:05:06.199
Google's own data centers. These are massive,

00:05:06.639 --> 00:05:09.360
multimillion-dollar server farms that generate

00:05:09.360 --> 00:05:12.379
an unbelievable amount of heat. Oh, yeah. And

00:05:12.379 --> 00:05:14.720
in 2014, a Google engineer essentially looked

00:05:14.720 --> 00:05:17.120
at AlphaGo and said, hey, can we use this to

00:05:17.120 --> 00:05:19.560
play the cooling systems like a video game? Just

00:05:19.560 --> 00:05:21.240
treat the whole building like Pong. Exactly.

00:05:21.379 --> 00:05:24.970
They brought DeepMind in, fed the AI all the

00:05:24.970 --> 00:05:28.009
historical sensor data, temperatures, pump speeds,

00:05:28.209 --> 00:05:31.689
power usage, and just told it to optimize for

00:05:31.689 --> 00:05:34.430
energy efficiency. Specifically, they optimized

00:05:34.430 --> 00:05:36.829
for a metric called power usage effectiveness,

00:05:37.370 --> 00:05:40.810
or PUE. And what the AI started recommending

00:05:40.810 --> 00:05:44.689
was, well, deeply unintuitive to the human engineers

00:05:44.689 --> 00:05:47.290
on site. Right. The sources note the AI figured

00:05:47.290 --> 00:05:49.490
out how to exploit winter conditions to produce

00:05:49.490 --> 00:05:52.389
colder-than-normal water. Yeah. Which fundamentally

00:05:52.389 --> 00:05:55.310
altered the standard operating procedures. The

00:05:55.310 --> 00:05:57.750
human engineers were apparently highly skeptical.

00:05:57.970 --> 00:06:00.250
Understandably so. But they let it run, and it

00:06:00.250 --> 00:06:02.550
reduced the energy used for cooling by up to

00:06:02.550 --> 00:06:05.569
30 percent. At Google's scale, that is a monumental

00:06:05.569 --> 00:06:07.889
drop in power consumption. If we connect this

00:06:07.889 --> 00:06:10.550
to the bigger picture, why was the AI able to

00:06:10.550 --> 00:06:12.569
find those efficiencies when human engineers

00:06:12.569 --> 00:06:15.040
couldn't? Because it lacks human heuristics.

00:06:15.480 --> 00:06:17.579
Human operators, even world-class engineers,

00:06:17.819 --> 00:06:20.139
rely on rules of thumb. An engineer might look

00:06:20.139 --> 00:06:21.839
at a cooling valve and think, well, we never

00:06:21.839 --> 00:06:25.160
opened that past 40% in December. The manuals

00:06:25.160 --> 00:06:27.339
say it's inefficient. The AI doesn't know what

00:06:27.339 --> 00:06:30.279
December is. And it hasn't read the manual. It

00:06:30.279 --> 00:06:33.160
only sees the raw mathematical relationship between

00:06:33.160 --> 00:06:36.170
the valve state and the PUE metric. It lacks

00:06:36.170 --> 00:06:38.350
our blind spots. But I keep thinking about the

00:06:38.350 --> 00:06:40.870
sheer physical risk here. Yeah. It's one thing

00:06:40.870 --> 00:06:43.610
to lose a virtual game of StarCraft. Sure. It

00:06:43.610 --> 00:06:46.490
is an entirely different reality to hand over

00:06:46.490 --> 00:06:49.009
the thermal controls of a global data center

00:06:49.009 --> 00:06:52.069
to an algorithm. If it decides to try a random

00:06:52.069 --> 00:06:54.490
action just to see what happens, it could melt

00:06:54.490 --> 00:06:56.209
down the servers. Which is why they didn't just

00:06:56.209 --> 00:06:58.889
hand over the keys, they implemented strict safety

00:06:58.889 --> 00:07:01.389
bounds. Think of it as a mathematical fence.

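NOTE
A rough sketch of the "mathematical fence" idea in this exchange: the agent
proposes a setting and a hard-coded check accepts or rejects it. The class
SafetyBounds, the function apply_safety_filter, and the temperature limits
are all hypothetical; the source does not describe the real system's
interface or thresholds.
```python
from dataclasses import dataclass
@dataclass(frozen=True)
class SafetyBounds:
    min_setpoint_c: float  # hypothetical lower limit, degrees Celsius
    max_setpoint_c: float  # hypothetical upper limit, degrees Celsius
def apply_safety_filter(proposed_c, current_c, bounds):
    """Accept the agent's proposal only if it stays inside the bounds."""
    if bounds.min_setpoint_c <= proposed_c <= bounds.max_setpoint_c:
        return proposed_c
    return current_c  # reject: keep the last known-safe setpoint
```
With SafetyBounds(18.0, 27.0), a proposal of 22.0 passes through unchanged,
while a proposal of 35.0 is rejected and the current setpoint is kept. The
filter sits outside the learner, so exploration never reaches the hardware
unchecked.
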
00:07:01.730 --> 00:07:03.689
The reinforcement learning agent was free to

00:07:03.689 --> 00:07:05.529
experiment and adjust the cooling parameters,

00:07:05.930 --> 00:07:08.089
but its outputs were filtered through a

00:07:08.089 --> 00:07:10.629
hard-coded safety system. Ah, I see. Yeah. If the

00:07:10.629 --> 00:07:12.970
AI suggested a temperature curve that exceeded

00:07:12.970 --> 00:07:15.870
hardware safety limits, the system simply rejected

00:07:15.870 --> 00:07:18.550
it. It's guided optimization. And they scaled

00:07:18.550 --> 00:07:21.819
this optimization way beyond server rooms. By

00:07:21.819 --> 00:07:24.680
mid-2025, DeepMind launched the Weather Lab.

00:07:24.980 --> 00:07:27.399
They took neural networks and trained them on

00:07:27.399 --> 00:07:30.879
45 years of historical global weather data. And

00:07:30.879 --> 00:07:34.379
during the 2025 Atlantic hurricane season, this

00:07:34.379 --> 00:07:37.620
AI systematically outperformed the U.S. National

00:07:37.620 --> 00:07:40.180
Weather Service in predicting cyclone tracks

00:07:40.180 --> 00:07:43.860
and intensity up to 15 days in advance. To appreciate

00:07:43.860 --> 00:07:46.540
how massive that is, you have to understand how

00:07:46.540 --> 00:07:49.100
traditional weather forecasting works. The National

00:07:49.100 --> 00:07:52.779
Weather Service uses incredibly complex deterministic

00:07:52.779 --> 00:07:55.339
physics models. Supercomputers running simulations,

00:07:55.560 --> 00:07:58.040
right? Exactly. Fluid dynamics, thermodynamics,

00:07:58.319 --> 00:08:00.699
the Navier-Stokes equations. But DeepMind's

00:08:00.699 --> 00:08:02.660
Weather Lab approached it completely differently.

00:08:02.860 --> 00:08:04.819
It didn't try to simulate the physics at all.

00:08:04.959 --> 00:08:07.560
Wait, really? Yeah. It treated 45 years of weather

00:08:07.560 --> 00:08:10.360
data as an incredibly complex pattern recognition

00:08:10.360 --> 00:08:13.079
game. It learned the hidden correlation between

00:08:13.079 --> 00:08:15.519
atmospheric pressure in one hemisphere and wind

00:08:15.519 --> 00:08:17.759
shear in another. Wow. Correlations that are

00:08:17.759 --> 00:08:19.779
just too complex for traditional physics engines

00:08:19.779 --> 00:08:22.459
to compute in real time. So the AI is mapping

00:08:22.459 --> 00:08:26.100
the macro patterns of the globe. But if an architecture

00:08:26.100 --> 00:08:28.360
is powerful enough to decode the atmosphere,

00:08:29.139 --> 00:08:32.120
it can also be turned inward, right, to decode

00:08:32.120 --> 00:08:34.659
the microscopic building blocks of life itself.

00:08:34.919 --> 00:08:37.360
And this is the leap from applied engineering

00:08:37.360 --> 00:08:40.980
to pure scientific discovery. We are talking

00:08:40.980 --> 00:08:44.230
about protein folding. For 50 years, this was

00:08:44.230 --> 00:08:46.809
one of the grandest challenges in molecular biology.

00:08:46.889 --> 00:08:49.850
It really was. Because proteins are the machinery

00:08:49.850 --> 00:08:52.509
of life, and they start as one-dimensional strings

00:08:52.509 --> 00:08:55.789
of amino acids. But to do their job, they fold

00:08:55.789 --> 00:08:58.429
into highly complex three-dimensional shapes.

00:08:58.490 --> 00:09:00.870
Right. And knowing that shape tells you exactly

00:09:00.870 --> 00:09:03.769
what the protein does, which is the key to designing

00:09:03.769 --> 00:09:06.529
targeted medicines, curing diseases, or even

00:09:06.529 --> 00:09:09.710
engineering enzymes that eat plastic. But calculating

00:09:09.710 --> 00:09:13.070
how that 1D string will fold into 3D space is

00:09:13.070 --> 00:09:15.649
computationally staggering. I mean, the number

00:09:15.649 --> 00:09:18.309
of possible configurations for a single protein

00:09:18.309 --> 00:09:20.990
is greater than the number of atoms in the observable

00:09:20.990 --> 00:09:23.629
universe. That's insane. It is. Traditional methods

00:09:23.629 --> 00:09:25.570
like X-ray crystallography could take years

00:09:25.570 --> 00:09:27.870
to map just one structure. Enter AlphaFold.

00:09:28.649 --> 00:09:32.029
By 2020, DeepMind largely solved the structural

00:09:32.029 --> 00:09:35.220
prediction problem. By 2022, they released a

00:09:35.220 --> 00:09:37.179
database containing the predicted structures

00:09:37.179 --> 00:09:40.639
of over 200 million proteins. Which is virtually

00:09:40.639 --> 00:09:43.100
every protein known to human science. Yeah. And

00:09:43.100 --> 00:09:46.039
then in May 2024, they released AlphaFold 3,

00:09:46.159 --> 00:09:48.159
which went even further, predicting how those

00:09:48.159 --> 00:09:51.610
proteins interact with DNA and RNA. Amazing.

00:09:51.929 --> 00:09:54.769
It was a breakthrough so seismic that Demis Hassabis

00:09:54.769 --> 00:09:58.049
and John Jumper were awarded the 2024 Nobel Prize

00:09:58.049 --> 00:10:01.090
in Chemistry. It compressed decades of biological

00:10:01.090 --> 00:10:03.870
research into a matter of days. And what is so

00:10:03.870 --> 00:10:06.269
profound is that the system isn't just mimicking

00:10:06.269 --> 00:10:09.990
known biology. It is actively inferring the underlying

00:10:09.990 --> 00:10:12.110
biophysical rules from the data it was trained

00:10:12.110 --> 00:10:14.879
on. But biology isn't the only foundational code

00:10:14.879 --> 00:10:17.379
they hacked. In 2023, they released a system

00:10:17.379 --> 00:10:19.940
called AlphaDev, and they pointed it at computer

00:10:19.940 --> 00:10:22.220
code itself, specifically sorting algorithms.

00:10:22.419 --> 00:10:24.720
Yes, sorting algorithms. These are the fundamental

00:10:24.720 --> 00:10:26.840
instructions computers use to organize data,

00:10:26.980 --> 00:10:29.899
and they're used trillions of times a day. AlphaDev

00:10:29.899 --> 00:10:31.759
discovered a new sorting algorithm for short

00:10:31.759 --> 00:10:34.639
sequences that was up to 70% faster. It was

00:10:34.639 --> 00:10:37.460
so undeniably superior that it was integrated

00:10:37.460 --> 00:10:41.080
into the C++ standard library, which hadn't updated

00:10:41.080 --> 00:10:44.320
those algorithms in over a decade. But, and this

00:10:44.320 --> 00:10:46.360
is the part that genuinely breaks my brain, if

00:10:46.360 --> 00:10:48.299
human computer scientists have been staring at

00:10:48.299 --> 00:10:51.100
these exact sorting algorithms since the 1990s.

00:10:51.360 --> 00:10:53.779
Analyzing every single line of assembly code.

00:10:54.019 --> 00:10:57.600
Exactly. How does an AI just find a 70% faster

00:10:57.600 --> 00:11:00.360
route? What is it seeing that the entire computer

00:11:00.360 --> 00:11:03.159
science community missed? It sees the code the

00:11:03.159 --> 00:11:06.059
exact same way AlphaGo saw the Go board. AlphaDev

00:11:06.059 --> 00:11:08.759
doesn't read code like a human programmer. A

00:11:08.759 --> 00:11:11.279
human thinks in sequential logic, right? We build

00:11:11.279 --> 00:11:13.659
algorithms based on what makes logical sense

00:11:13.659 --> 00:11:16.500
to our brains, but DeepMind set up the assembly

00:11:16.500 --> 00:11:18.980
language of the CPU as a game state. Oh, I see.

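NOTE
A toy illustration of the game framing in this exchange, where moves are
instructions and a win is a correct sort in the fewest operations. Two loud
assumptions: compare-and-swap steps stand in for real assembly instructions,
and exhaustive search stands in for the learned search. The names MOVES, run,
sorts_everything, and shortest_program are hypothetical.
```python
from itertools import permutations, product
N = 3  # sort three values, a deliberately tiny "board"
# Each move compares two slots and swaps them if they are out of order.
MOVES = [(i, j) for i in range(N) for j in range(i + 1, N)]
def run(program, values):
    """Execute a sequence of compare-and-swap moves on a copy of values."""
    v = list(values)
    for i, j in program:
        if v[i] > v[j]:
            v[i], v[j] = v[j], v[i]
    return v
def sorts_everything(program):
    """Win condition: every possible input ordering ends up sorted."""
    return all(run(program, p) == sorted(p) for p in permutations(range(N)))
def shortest_program(max_len=4):
    """Try programs from shortest to longest and return the first winner."""
    for length in range(max_len + 1):
        for program in product(MOVES, repeat=length):
            if sorts_everything(program):
                return list(program)
    return None
```
For three values the search settles on a three-move program, since no
two-move program sorts every input. Brute force stops being feasible quickly
as programs grow, which is the gap a learned search is meant to close.
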
00:11:19.100 --> 00:11:21.500
The moves were the swapping, adding, or deleting

00:11:21.500 --> 00:11:24.179
of individual machine instructions. The win state

00:11:24.179 --> 00:11:26.299
was sorting the data correctly with the fewest

00:11:26.299 --> 00:11:28.159
possible operations. So it was just playing a

00:11:28.159 --> 00:11:31.149
game of efficiency. Yes, and in doing so, it

00:11:31.149 --> 00:11:33.350
found a shortcut. It discovered that you could

00:11:33.350 --> 00:11:35.830
literally leave out a specific instruction that

00:11:35.830 --> 00:11:38.429
humans always assumed was logically mandatory

00:11:38.429 --> 00:11:41.830
to ensure the data was sorted. The AI proved

00:11:41.830 --> 00:11:44.669
it wasn't mandatory at all. It found an invisible

00:11:44.669 --> 00:11:46.909
pathway through the assembly code because it

00:11:46.909 --> 00:11:49.230
was completely unburdened by human assumptions

00:11:49.230 --> 00:11:52.230
about how code should be written. It proves that

00:11:52.230 --> 00:11:55.269
these systems can discover entirely new mathematically

00:11:55.269 --> 00:11:57.929
verifiable truths. Okay, so we've tracked this

00:11:57.929 --> 00:12:00.490
evolution from an AI that learns virtual rules

00:12:00.490 --> 00:12:03.000
to an AI that optimizes physical reality

00:12:03.000 --> 00:12:06.240
to an AI that discovers new biological and mathematical

00:12:06.240 --> 00:12:10.000
truths. But over the last two years, the fundamental

00:12:10.000 --> 00:12:13.240
nature of what DeepMind outputs has changed.

00:12:13.600 --> 00:12:16.940
Because in April 2023, they merged with Google's

00:12:16.940 --> 00:12:19.279
Brain division, pooling all their resources to

00:12:19.279 --> 00:12:22.139
create Google DeepMind. And this catalyzed the

00:12:22.139 --> 00:12:24.720
generative explosion, the Gemini era. We moved

00:12:24.720 --> 00:12:27.299
from systems designed to decode existing realities

00:12:27.299 --> 00:12:29.360
to systems designed to generate entirely new

00:12:29.360 --> 00:12:31.870
ones. So what does this all mean for you? The

00:12:31.870 --> 00:12:34.049
timeline here is moving at absolute breakneck

00:12:34.049 --> 00:12:36.710
speed for those of you listening today, March

00:12:36.710 --> 00:12:40.929
26th, 2026. Hard to keep up. Very. A year ago,

00:12:41.049 --> 00:12:43.529
we saw Gemini 2.5, which introduced reasoning

00:12:43.529 --> 00:12:46.110
tokens, giving the model the ability to internally

00:12:46.110 --> 00:12:49.029
think and verify its logic before outputting

00:12:49.029 --> 00:12:53.029
text. By November 2025, Gemini 3 Pro was woven

00:12:53.029 --> 00:12:55.730
into the fabric of Google Search. Right. But

00:12:55.730 --> 00:12:57.490
the text is almost the least interesting part

00:12:57.490 --> 00:13:01.820
now. Because in May 2025, they dropped Veo 3, generating

00:13:01.820 --> 00:13:04.399
high-definition, synchronized video, dialogue,

00:13:04.580 --> 00:13:08.200
and ambient sound. And then in August 2025, Genie

00:13:08.200 --> 00:13:11.600
3 launched. Genie 3 is a massive conceptual leap.

00:13:11.820 --> 00:13:13.299
Right, because it doesn't just draw a picture.

00:13:13.360 --> 00:13:15.500
You give Genie 3 a text prompt or a single image,

00:13:15.700 --> 00:13:18.220
and it generates an interactive, real-time 3D

00:13:18.220 --> 00:13:20.379
virtual world. You can literally step into it

00:13:20.379 --> 00:13:22.220
and control the environment. And just yesterday,

00:13:22.299 --> 00:13:25.679
March 25th, 2026, they released Lyria 3 Pro,

00:13:25.820 --> 00:13:28.480
which generates structurally aware,

00:13:28.480 --> 00:13:30.600
symphonic-level music. It's incredible. But how do we

00:13:30.600 --> 00:13:33.399
even contextualize this leap? I mean, how does

00:13:33.399 --> 00:13:36.240
an architecture designed to just predict the

00:13:36.240 --> 00:13:38.659
next word in a sentence suddenly figure out

00:13:38.659 --> 00:13:41.899
how to generate a real-time 3D physics engine?

00:13:42.100 --> 00:13:44.399
Well, it comes down to the shift toward polyvalent

00:13:44.399 --> 00:13:48.500
or multimodal models. Early on, DeepMind built

00:13:48.500 --> 00:13:51.809
a model called Gato. Gato was trained on text,

00:13:52.129 --> 00:13:54.730
image captions, and robotic arm movements all

00:13:54.730 --> 00:13:57.809
simultaneously. It could perform over 600 different

00:13:57.809 --> 00:14:00.590
tasks using the exact same neural weights. Wow.

00:14:00.870 --> 00:14:03.210
What DeepMind realized is that if you scale that

00:14:03.210 --> 00:14:05.950
architecture massively, the model stops just

00:14:05.950 --> 00:14:08.269
predicting pixels or text and starts inferring

00:14:08.269 --> 00:14:10.830
a world model. A world model, meaning it actually

00:14:10.830 --> 00:14:13.210
understands the physics behind the image. Exactly.

00:14:13.409 --> 00:14:16.129
To accurately generate a video of a glass falling

00:14:16.129 --> 00:14:19.169
off a table, the model has to implicitly learn

00:14:19.169 --> 00:14:21.580
the concepts of gravity, mass, and fragility.

00:14:21.840 --> 00:14:25.340
With Genie 3, it uses latent action models. It

00:14:25.340 --> 00:14:28.320
infers how an environment should react to a user's

00:14:28.320 --> 00:14:30.639
input based on billions of examples of physical

00:14:30.639 --> 00:14:33.519
interactions. The insight here is that the boundary

00:14:33.519 --> 00:14:36.960
between calculating data and creating media has

00:14:36.960 --> 00:14:39.620
completely dissolved. It is a total democratization

00:14:39.620 --> 00:14:42.460
of creation. But that democratization forces

00:14:42.460 --> 00:14:45.320
an incredible responsibility onto you, the listener.

00:14:45.860 --> 00:14:48.679
If Genie 3 can spin up a photorealistic reality

00:14:48.559 --> 00:14:52.360
on the fly, and Lyria 3 Pro can generate human

00:14:52.360 --> 00:14:56.080
emotion through a symphony. I mean, the critical

00:14:56.080 --> 00:14:57.919
thinking required to navigate what is synthetic

00:14:57.919 --> 00:15:00.379
and what is authentic is now a daily survival

00:15:00.379 --> 00:15:02.960
skill. Yeah, absolutely. And that friction between

00:15:02.960 --> 00:15:05.799
incredible capability and human reality brings

00:15:05.799 --> 00:15:07.980
us to the ethical guardrails. Because when you

00:15:07.980 --> 00:15:10.679
have the power to parse biological data and generate

00:15:10.679 --> 00:15:13.539
realities, the collision with human privacy is

00:15:13.539 --> 00:15:15.649
almost inevitable. This raises an important question

00:15:15.649 --> 00:15:18.730
because innovation almost always outpaces regulation,

00:15:19.110 --> 00:15:21.610
and DeepMind has had to navigate severe structural

00:15:21.610 --> 00:15:23.929
failures regarding data privacy. The clearest

00:15:23.929 --> 00:15:26.429
example in the sources is the NHS data sharing

00:15:26.429 --> 00:15:29.850
controversy. Wait. In 2016, a hospital network

00:15:29.850 --> 00:15:33.250
in London, the Royal Free London NHS Foundation

00:15:33.250 --> 00:15:35.929
Trust, partnered with DeepMind to build an app

00:15:35.929 --> 00:15:40.049
called Streams. The goal was noble, right? Alert

00:15:40.049 --> 00:15:42.230
doctors instantly when a patient was at risk

00:15:42.230 --> 00:15:44.870
of acute kidney injury. Good intentions. To build

00:15:44.870 --> 00:15:47.529
it, the Trust gave DeepMind access to an estimated

00:15:47.529 --> 00:15:50.740
1.6 million patient records. But this wasn't

00:15:50.740 --> 00:15:53.559
just anonymized kidney data. It included highly

00:15:53.559 --> 00:15:56.700
sensitive historical logs, HIV diagnoses, depression

00:15:56.700 --> 00:15:59.019
treatments, abortion records. Right. And while

00:15:59.019 --> 00:16:01.039
the app functioned beautifully, I mean, doctors

00:16:01.039 --> 00:16:03.840
reported it saved massive amounts of time and

00:16:03.840 --> 00:16:06.779
improved care. The UK's Information Commission's

00:16:06.779 --> 00:16:08.759
office eventually stepped in and ruled that the

00:16:08.759 --> 00:16:10.779
hospital failed to comply with the Data Protection

00:16:10.779 --> 00:16:13.220
Act. Yeah. The patients had never been adequately

00:16:13.220 --> 00:16:15.100
informed that their entire medical histories

00:16:15.100 --> 00:16:17.840
were being handed over to a massive tech corporation

00:16:17.840 --> 00:16:20.259
for algorithm development. It is a profound

00:16:20.259 --> 00:16:23.019
lesson in alignment. The objective function was

00:16:23.019 --> 00:16:25.759
"stop kidney failure," but the system surrounding

00:16:25.759 --> 00:16:28.779
it lacked the parameters for human consent. It's

00:16:28.779 --> 00:16:31.019
like building a Ferrari but completely forgetting

00:16:31.019 --> 00:16:33.659
to install the brakes. Or, you know, training a

00:16:33.659 --> 00:16:35.720
brilliant medical savant who can diagnose any

00:16:35.720 --> 00:16:38.019
disease flawlessly, but failing to teach them

00:16:38.019 --> 00:16:40.419
the concept of doctor-patient confidentiality.

00:16:40.419 --> 00:16:43.240
Exactly. In response to the legal fallout, DeepMind

00:16:43.240 --> 00:16:46.120
established a dedicated ethics and society unit

00:16:46.120 --> 00:16:50.059
and more recently in 2024, they implemented what

00:16:50.059 --> 00:16:53.580
they call a robot constitution for their AI agents.

00:16:53.620 --> 00:16:56.820
Right. It is literally inspired by Isaac Asimov's

00:16:56.820 --> 00:16:59.799
three laws of robotics. Rule one is essentially

00:16:59.799 --> 00:17:03.480
do no harm to a human being. Right. But structurally,

00:17:03.519 --> 00:17:05.460
how does that actually work? I mean, can a set

00:17:05.460 --> 00:17:08.079
of text-based rules actually contain a system

00:17:08.079 --> 00:17:11.160
as smart as Gemini 3 Pro? How do you map a philosophical

00:17:11.160 --> 00:17:13.160
rule into a loss function? That is the bleeding

00:17:13.160 --> 00:17:15.670
edge of constitutional AI. You don't just paste

00:17:15.670 --> 00:17:18.089
the text into the code. During the reinforcement

00:17:18.089 --> 00:17:20.369
learning phase, the AI generates thousands of

00:17:20.369 --> 00:17:22.470
potential responses to prompts. OK. And then

00:17:22.470 --> 00:17:25.529
another AI or a human rater scores those responses

00:17:25.529 --> 00:17:27.809
strictly based on whether they violate the constitution.

00:17:28.269 --> 00:17:30.690
The model is mathematically penalized for generating

00:17:30.690 --> 00:17:33.009
harmful outputs and rewarded for aligning with

00:17:33.009 --> 00:17:35.430
the principles. Oh, I see. Yeah, it bakes the

00:17:35.430 --> 00:17:37.250
ethical constraints directly into the neural

00:17:37.250 --> 00:17:39.109
weights before the model is ever released to

00:17:39.109 --> 00:17:41.450
the public. It's an attempt to mathematically

00:17:41.450 --> 00:17:44.769
encode human values, and it's a fitting capstone

00:17:44.769 --> 00:17:47.269
to the journey we've tracked today. We've gone

00:17:47.269 --> 00:17:50.849
from an AI trying to maximize a score in a 1970s

00:17:50.849 --> 00:17:53.789
arcade game to an AI mapping the proteins of

00:17:53.789 --> 00:17:56.450
the human body to models that generate entire

00:17:56.450 --> 00:17:59.049
realities governed by a digital constitution.

00:17:59.279 --> 00:18:02.500
It is the evolution of a system learning to navigate

00:18:02.500 --> 00:18:05.240
and ultimately shape our world. But there is

00:18:05.240 --> 00:18:08.339
one final piece of the puzzle buried deep in

00:18:08.339 --> 00:18:11.119
the source material. A release from May 2025

00:18:11.119 --> 00:18:14.460
called AlphaEvolve. AlphaEvolve is an evolutionary

00:18:14.460 --> 00:18:17.500
coding agent. It uses large language models to

00:18:17.500 --> 00:18:20.140
actively design new algorithms. It generates

00:18:20.140 --> 00:18:22.940
variations of code, tests them, selects the most

00:18:22.940 --> 00:18:25.539
efficient candidates, and then iteratively rewrites

00:18:25.539 --> 00:18:28.079
them to be even better. It is an AI acting as

00:18:28.079 --> 00:18:30.420
its own architect. And that is the thought we

00:18:30.420 --> 00:18:33.140
want to leave you with today. We've spent this

00:18:33.140 --> 00:18:35.420
entire deep dive looking at how humans have guided

00:18:35.420 --> 00:18:37.839
the learning of these models. But if DeepMind

00:18:37.839 --> 00:18:40.359
has built a system that can iteratively improve

00:18:40.359 --> 00:18:42.299
its own source code faster than a human ever

00:18:42.299 --> 00:18:45.180
could, well, what happens when AlphaEvolve points

00:18:45.180 --> 00:18:47.819
its optimization engine at itself? It's a massive

00:18:47.819 --> 00:18:50.259
unknown. If it begins writing variations of its

00:18:50.259 --> 00:18:53.019
own architecture, the software you use next week

00:18:53.019 --> 00:18:55.759
might be built by an intelligence that no human

00:18:55.759 --> 00:18:58.869
engineer fully comprehends. We are rapidly approaching

00:18:58.869 --> 00:19:00.710
the moment where the machine writing the code

00:19:00.710 --> 00:19:02.829
becomes exponentially smarter than the human

00:19:02.829 --> 00:19:05.009
who wrote the machine. Thank you for taking this

00:19:05.009 --> 00:19:08.269
deep dive with us. Keep exploring, keep interrogating

00:19:08.269 --> 00:19:10.750
the technology around you, and we will see you

00:19:10.750 --> 00:19:11.109
next time.
